
Showing 1–50 of 2,628 results for author: Xie, X

  1. arXiv:2604.14978

    math.CO

    Counting tight Hamilton cycles in Dirac hypergraphs

    Authors: Felix Joos, Xinyue Xie

    Abstract: Suppose $G$ is a $k$-uniform hypergraph on $n$ vertices such that every $(k-1)$-subset $S$ of $V(G)$ belongs to at least $δn$ edges, where $δ> 1/2$. Let $Ψ(G)$ denote the number of tight Hamilton cycles in $G$, that is, cyclic orderings of $V(G)$ in which every $k$ consecutive vertices form an edge. We prove that $\logΨ(G)\ge kh(G)-n\log{n\choose k-1}+n\log n-n\log e-o(n)$, where $h(G)$ is the hyp…

    Submitted 16 April, 2026; originally announced April 2026.

    Comments: 23 pages + 9 pages appendix for general l-cycles

  2. arXiv:2604.14975

    stat.CO math.NA stat.AP stat.ML

    Theta-regularized Kriging: Modelling and Algorithms

    Authors: Xuelin Xie, Xiliang Lu

    Abstract: To obtain more accurate model parameters and improve prediction accuracy, we proposed a regularized Kriging model that penalizes the hyperparameter theta in the Gaussian stochastic process, termed the Theta-regularized Kriging. We derived the optimization problem for this model from a maximum likelihood perspective. Additionally, we presented specific implementation details for the iterative proce…

    Submitted 16 April, 2026; originally announced April 2026.

    Journal ref: Applied Mathematical Modelling, Vol. 136, 115627 (2024)

  3. arXiv:2604.14712

    cs.AI

    SGA-MCTS: Decoupling Planning from Execution via Training-Free Atomic Experience Retrieval

    Authors: Xin Xie, Dongyun Xue, Wuguannan Yao, Mingxiao Feng, Wengang Zhou, Xiang Qi, Houqiang Li, Peng Zhang

    Abstract: LLM-powered systems require complex multi-step decision-making abilities to solve real-world tasks, yet current planning approaches face a trade-off between the high latency of inference-time search and the limited generalization of supervised fine-tuning. To address this limitation, we introduce SGA-MCTS, a framework that casts LLM planning as non-parametric retrieval. Offline, we levera…

    Submitted 16 April, 2026; originally announced April 2026.

  4. arXiv:2604.14116

    cs.AI cs.CL

    TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

    Authors: Zerun Ma, Guoqiang Wang, Xinchen Xie, Yicheng Chen, He Du, Bowen Li, Yanan Sun, Wenran Liu, Kai Chen, Yining Li

    Abstract: While Large Language Models (LLMs) have empowered AI research agents to perform isolated scientific tasks, automating complex, real-world workflows, such as LLM training, remains a significant challenge. In this paper, we introduce TREX, a multi-agent system that automates the entire LLM training life-cycle. By orchestrating collaboration between two core modules-the Researcher and the Executor-th…

    Submitted 15 April, 2026; originally announced April 2026.

  5. arXiv:2604.13942

    cs.RO

    Goal2Skill: Long-Horizon Manipulation with Adaptive Planning and Reflection

    Authors: Zhen Liu, Xinyu Ning, Zhe Hu, Xinxin Xie, Weize Li, Zhipeng Tang, Chongyu Wang, Zejun Yang, Hanlin Wang, Yitong Liu, Zhongzhu Pu

    Abstract: Recent vision-language-action (VLA) systems have demonstrated strong capabilities in embodied manipulation. However, most existing VLA policies rely on limited observation windows and end-to-end action prediction, which makes them brittle in long-horizon, memory-dependent tasks with partial observability, occlusions, and multi-stage dependencies. Such tasks require not only precise visuomotor cont…

    Submitted 15 April, 2026; originally announced April 2026.

  6. arXiv:2604.13725

    cs.SE

    On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

    Authors: Jia Feng, Zhanyue Qin, Cuiyun Gao, Ruiqi Wang, Chaozheng Wang, Yingwei Ma, Xiaoyuan Xie

    Abstract: Repository-level code intelligence tasks require large language models (LLMs) to process long, multi-file contexts. Such inputs introduce three challenges: crucial context can be obscured by noise, context can be truncated due to limited windows, and inference latency increases. Context compression mitigates these risks by condensing inputs. While studied in NLP, its applicability to code tasks remains largely u…

    Submitted 15 April, 2026; originally announced April 2026.

    Comments: Work in progress

  7. arXiv:2604.13559

    cs.SE

    WebMAC: A Multi-Agent Collaborative Framework for Scenario Testing of Web Systems

    Authors: Zhenyu Wan, Gong Chen, Qing Huang, Xiaoyuan Xie

    Abstract: Scenario testing is an important technique for detecting errors in web systems. Testers draft test scenarios and convert them into test scripts for execution. Early methods relied on testers to convert test scenarios into test scripts. Recent LLM-based scenario testing methods can generate test scripts from natural language descriptions of test scenarios. However, these methods are not only limite…

    Submitted 15 April, 2026; originally announced April 2026.

  8. arXiv:2604.13463

    cs.SE

    From Exploration to Specification: LLM-Based Property Generation for Mobile App Testing

    Authors: Yiheng Xiong, Shiwen Song, Bo Ma, Ting Su, Xiaofei Xie

    Abstract: Mobile apps often suffer from functional bugs that do not cause crashes but instead manifest as incorrect behaviors under specific user interactions. Such bugs are difficult to detect automatically because they often lack explicit test oracles. Property-based testing can effectively expose them by checking intended behavioral properties under diverse interactions. However, its use largely depends…

    Submitted 15 April, 2026; originally announced April 2026.

  9. arXiv:2604.12968

    cs.LG cs.CV

    Evolution of Optimization Methods: Algorithms, Scenarios, and Evaluations

    Authors: Tong Zhang, Jiangning Zhang, Zhucun Xue, Juntao Jiang, Yicheng Xu, Chengming Xu, Teng Hu, Xingyu Xie, Xiaobin Hu, Yabiao Wang, Yong Liu, Shuicheng Yan

    Abstract: Balancing convergence speed, generalization capability, and computational efficiency remains a core challenge in deep learning optimization. First-order gradient descent methods, epitomized by stochastic gradient descent (SGD) and Adam, serve as the cornerstone of modern training pipelines. However, large-scale model training, stringent differential privacy requirements, and distributed learning p…

    Submitted 14 April, 2026; originally announced April 2026.

  10. arXiv:2604.12851

    cs.CY

    Can Persona-Prompted LLMs Emulate Subgroup Values? An Empirical Analysis of Generalisability and Fairness in Cultural Alignment

    Authors: Bryan Chen Zhengyu Tan, Zhengyuan Liu, Xiaoyuan Yi, Jing Yao, Xing Xie, Nancy F. Chen, Roy Ka-Wei Lee

    Abstract: Despite their global prevalence, many Large Language Models (LLMs) are aligned to a monolithic, often Western-centric set of values. This paper investigates the more challenging task of fine-grained value alignment: examining whether LLMs can emulate the distinct cultural values of demographic subgroups. Using Singapore as a case study and the World Values Survey (WVS), we examine the value landsc…

    Submitted 14 April, 2026; originally announced April 2026.

    Comments: ACL 2026

  11. arXiv:2604.12600

    cs.CV math.NA

    Spatial-Spectral Adaptive Fidelity and Noise Prior Reduction Guided Hyperspectral Image Denoising

    Authors: Xuelin Xie, Xiliang Lu, Zhengshan Wang, Yang Zhang, Long Chen

    Abstract: The core challenge of hyperspectral image denoising is striking the right balance between data fidelity and noise prior modeling. Most existing methods place too much emphasis on the intrinsic priors of the image while overlooking diverse noise assumptions and the dynamic trade-off between fidelity and priors. To address these issues, we propose a denoising framework that integrates noise prior re…

    Submitted 14 April, 2026; originally announced April 2026.

  12. arXiv:2604.12344

    astro-ph.IM cs.AI

    FRTSearch: Unified Detection and Parameter Inference of Fast Radio Transients using Instance Segmentation

    Authors: Bin Zhang, Yabiao Wang, Xiaoyao Xie, Shanping You, Xuhong Yu, Qiuhua Li, Hongwei Li, Shaowen Du, Chenchen Miao, Dengke Zhou, Jianhua Fang, Jiafu Wu, Pei Wang, Di Li

    Abstract: The exponential growth of data from modern radio telescopes presents a significant challenge to traditional single-pulse search algorithms, which are computationally intensive and prone to high false-positive rates due to Radio Frequency Interference (RFI). In this work, we introduce FRTSearch, an end-to-end framework unifying the detection and physical characterization of Fast Radio Transients (F…

    Submitted 14 April, 2026; originally announced April 2026.

    Comments: Accepted for publication in The Astrophysical Journal Supplement Series (ApJS)

  13. arXiv:2604.10963

    cs.AI

    Delving Aleatoric Uncertainty in Medical Image Segmentation via Vision Foundation Models

    Authors: Ruiyang Li, Fang Liu, Licheng Jiao, Xinglin Xie, Jiayao Hao, Shuo Li, Xu Liu, Jingyi Yang, Lingling Li, Puhua Chen, Wenping Ma

    Abstract: Medical image segmentation supports clinical workflows by precisely delineating anatomical structures and lesions. However, medical image datasets suffer from acquisition noise and annotation ambiguity, causing pervasive data uncertainty that substantially undermines model robustness. Existing research focuses primarily on model architectural improvements and predictive reli…

    Submitted 12 April, 2026; originally announced April 2026.

  14. arXiv:2604.08311

    cs.IT

    On quadratic binomial vectorial functions with maximal bent components

    Authors: Xianhong Xie, Yi Ouyang, Shenxing Zhang

    Abstract: Assume $n=2m\geq 2$ and let $F(x)=x^{d_1}+x^{d_2}$ be a binomial vectorial function over $\F_{2^n}$ possessing the maximal number (i.e. $2^n-2^m$) of bent components. Suppose the $2$-adic Hamming weights $\wt_2(d_1)$ and $\wt_2(d_2)$ are both at most $2$, we prove that $F(x)$ is affine equivalent to either $x^{2^m+1}$ or $x^{2^i}(x+x^{2^m})$, provided that \[ \ell(n):=\min_{γ:~\F_2(γ)=\F_{2^n}}…

    Submitted 9 April, 2026; originally announced April 2026.

  15. arXiv:2604.08034

    cs.CV

    Rotation Equivariant Convolutions in Deformable Registration of Brain MRI

    Authors: Arghavan Rezvani, Kun Han, Anthony T. Wu, Pooya Khosravi, Xiaohui Xie

    Abstract: Image registration is a fundamental task that aligns anatomical structures between images. While CNNs perform well, they lack rotation equivariance - a rotated input does not produce a correspondingly rotated output. This hinders performance by failing to exploit the rotational symmetries inherent in anatomical structures, particularly in brain MRI. In this work, we integrate rotation-equivariant…

    Submitted 9 April, 2026; originally announced April 2026.

    Comments: Accepted at the 2026 International Symposium on Biomedical Imaging (ISBI) Poster 4-page paper presentation

  16. arXiv:2604.07999

    cs.LG

    Benchmarking Deep Learning for Future Liver Remnant Segmentation in Colorectal Liver Metastasis

    Authors: Anthony T. Wu, Arghavan Rezvani, Kela Liu, Roozbeh Houshyar, Pooya Khosravi, Whitney Li, Xiaohui Xie

    Abstract: Accurate segmentation of the future liver remnant (FLR) is critical for surgical planning in colorectal liver metastases (CRLM) to prevent fatal post-hepatectomy liver failure. However, this segmentation task is technically challenging due to complex resection boundaries, convoluted hepatic vasculature and diffuse metastatic lesions. A primary bottleneck in developing automated AI tools has been t…

    Submitted 9 April, 2026; originally announced April 2026.

    Comments: Accepted at the 2026 International Symposium on Biomedical Imaging (ISBI) Oral 4-page paper presentation

    ACM Class: I.2.1

  17. arXiv:2604.07701

    physics.optics

    Controllable Chirality Sorting of Particles via Topological Optical Quasiparticles

    Authors: Hao Zhang, Xi Xie, Yijie Shen

    Abstract: The manipulation and sorting of chiral nanoparticles are of fundamental importance in multidisciplinary fields ranging from biochemistry to nanophotonics. In this study, we propose a novel and controllable chirality sorting mechanism for continuous particle separation using focused topological optical quasiparticles. Specifically, we investigate the sorting dynamics driven by tight-focused optical…

    Submitted 8 April, 2026; originally announced April 2026.

  18. arXiv:2604.06210

    cs.CL cs.AI cs.CY cs.LG

    Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook

    Authors: Jaehyeok Lee, Xiaoyuan Yi, Jing Yao, Hyunjin Hwang, Roy Ka-Wei Lee, Xing Xie, JinYeong Bak

    Abstract: As LLMs are globally deployed, aligning their cultural value orientations is critical for safety and user engagement. However, existing benchmarks face the Construct-Composition-Context ($C^3$) challenge: relying on discriminative, multiple-choice formats that probe value knowledge rather than true orientations, overlook subcultural heterogeneity, and mismatch with real-world open-ended generation…

    Submitted 8 April, 2026; v1 submitted 16 March, 2026; originally announced April 2026.

  19. arXiv:2604.05339

    cs.CL

    Human Values Matter: Investigating How Misalignment Shapes Collective Behaviors in LLM Agent Communities

    Authors: Xiangxu Zhang, Jiamin Wang, Qinlin Zhao, Hanze Guo, Linzhuo Li, Jing Yao, Xiao Zhou, Xiaoyuan Yi, Xing Xie

    Abstract: As LLMs become increasingly integrated into human society, evaluating their orientations on human values from social science has drawn growing attention. Nevertheless, it is still unclear why human values matter for LLMs, especially in LLM-based multi-agent systems, where group-level failures may accumulate from individually misaligned actions. We ask whether misalignment with human values alters…

    Submitted 6 April, 2026; originally announced April 2026.

  20. arXiv:2604.05015

    cs.CV

    Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

    Authors: Chaoyou Fu, Haozhi Yuan, Yuhao Dong, Yi-Fan Zhang, Yunhang Shen, Xiaoxing Hu, Xueying Li, Jinsen Su, Chengwu Long, Xiaoyao Xie, Yongkang Xie, Xiawu Zheng, Xue Yang, Haoyu Cao, Yunsheng Wu, Ziwei Liu, Xing Sun, Caifeng Shan, Ran He

    Abstract: With the rapid advancement of video understanding, existing benchmarks are becoming increasingly saturated, exposing a critical discrepancy between inflated leaderboard scores and real-world model capabilities. To address this widening gap, we introduce Video-MME-v2, a comprehensive benchmark designed to rigorously evaluate the robustness and faithfulness of video understanding. To systematically…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: Homepage: https://video-mme-v2.netlify.app/

  21. arXiv:2604.04804

    cs.CL cs.AI cs.IR cs.LG cs.MA

    SkillX: Automatically Constructing Skill Knowledge Bases for Agents

    Authors: Chenxi Wang, Zhuoyun Yu, Xin Xie, Wuguannan Yao, Runnan Fang, Shuofei Qiao, Kexin Cao, Guozhou Zheng, Xiang Qi, Peng Zhang, Shumin Deng

    Abstract: Learning from experience is critical for building capable large language model (LLM) agents, yet prevailing self-evolving paradigms remain inefficient: agents learn in isolation, repeatedly rediscover similar behaviors from limited experience, resulting in redundant exploration and poor generalization. To address this problem, we propose SkillX, a fully automated framework for constructing a \text…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: Work in progress

  22. arXiv:2604.03244

    cs.AI cs.CY cs.DB

    Position: Science of AI Evaluation Requires Item-level Benchmark Data

    Authors: Han Jiang, Susu Zhang, Xiaoyuan Yi, Xing Xie, Ziang Xiao

    Abstract: AI evaluations have become the primary evidence for deploying generative AI systems across high-stakes domains. However, current evaluation paradigms often exhibit systemic validity failures. These issues, ranging from unjustified design choices to misaligned metrics, remain intractable without a principled framework for gathering validity evidence and conducting granular diagnostic analysis. In t…

    Submitted 26 February, 2026; originally announced April 2026.

  23. arXiv:2604.02630

    cond-mat.mtrl-sci

    Evolution from Landau Quantization to Discrete Scale Invariance Revealed by Quantum Oscillations in Topological Materials

    Authors: Jiayi Yang, Nannan Tang, Yunxing Li, Jiawei Luo, Huakun Zuo, Gangjian Jin, Ziqiao Wang, Haiwen Liu, Yanzhao Liu, Donghui Guo, XinCheng Xie, Jian Wang, Huichao Wang

    Abstract: Dirac materials have been a unique solid state platform for exploring relativistic quantum phenomena including supercritical atomic collapse, which leads to emergent discrete scale symmetry and log-periodic quantum oscillations. In the relativistic regime, the fundamental effect in quantum electrodynamics, vacuum polarization, can further modulate the atomic collapse-like state by screening bare cha…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: 19 pages, 4 figures

  24. arXiv:2604.00948

    math.NA

    Physics-informed neural networks for solving two-phase flow problems with moving interfaces

    Authors: Qijia Zhai, Pengtao Sun, Xiaoping Xie, Xingwen Zhu, Chen-Song Zhang

    Abstract: In this paper, a meshfree method using physics-informed neural networks (PINNs) is developed for solving two-phase flow problems with moving interfaces, where two immiscible fluids bearing different material properties are separated by a dynamically evolving interface and interact with each other through interface conditions. Two kinds of distinct scenarios of interface motion are addressed: the…

    Submitted 1 April, 2026; originally announced April 2026.

  25. arXiv:2604.00704

    cs.CR cs.AI cs.SE

    AutoEG: Exploiting Known Third-Party Vulnerabilities in Black-Box Web Applications

    Authors: Ruozhao Yang, Mingfei Cheng, Gelei Deng, Junjie Wang, Tianwei Zhang, Xiaofei Xie

    Abstract: Large-scale web applications are widely deployed with complex third-party components, inheriting security risks arising from component vulnerabilities. Security assessment is therefore required to determine whether such known vulnerabilities remain practically exploitable in real applications. Penetration testing is a widely adopted approach that validates exploitability by launching concrete atta…

    Submitted 1 April, 2026; originally announced April 2026.

    Comments: 21 pages, 18 figures

  26. arXiv:2603.30045

    cs.CV

    OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation

    Authors: Yuheng Liu, Xin Lin, Xinke Li, Baihan Yang, Chen Wang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Hao Tan, Kai Zhang, Xiaohui Xie, Zifan Shi, Yiwei Hu

    Abstract: Modeling scenes using video generation models has garnered growing research interest in recent years. However, most existing approaches rely on perspective video models that synthesize only limited observations of a scene, leading to issues of completeness and global consistency. We propose OmniRoam, a controllable panoramic video generation framework that exploits the rich per-frame scene coverag…

    Submitted 31 March, 2026; originally announced March 2026.

    Comments: Code is available at https://github.com/yuhengliu02/OmniRoam

  27. arXiv:2603.29852

    cs.GR cs.AI cs.CV

    VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing

    Authors: Juan Rodriguez, Haotian Zhang, Abhay Puri, Tianyang Zhang, Rishav Pramanik, Meng Lin, Xiaoqing Xie, Marco Terral, Darsh Kaushik, Aly Shariff, Perouz Taslakian, Spandana Gella, Sai Rajeswar, David Vazquez, Christopher Pal, Marco Pedersoli

    Abstract: We introduce VectorGym, a comprehensive benchmark suite for Scalable Vector Graphics (SVG) that spans generation from text and sketches, complex editing, and visual understanding. VectorGym addresses the lack of realistic, challenging benchmarks aligned with professional design workflows. Our benchmark comprises four tasks with expert human-authored annotations: the novel Sketch2SVG task (VG-Sketc…

    Submitted 22 February, 2026; originally announced March 2026.

  28. arXiv:2603.28760

    cs.CV cs.RO

    SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

    Authors: Patrick Rim, Kevin Harris, Braden Copple, Shangchen Han, Xu Xie, Ivan Shugurov, Sizhe An, He Wen, Alex Wong, Tomas Hodan, Kun He

    Abstract: Accurate 3D understanding of human hands and objects during manipulation remains a significant challenge for egocentric computer vision. Existing hand-object interaction datasets are predominantly captured in controlled studio settings, which limits both environmental diversity and the ability of models trained on such data to generalize to real-world scenarios. To address this challenge, we intro…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: CVPR 2026

  29. arXiv:2603.28342

    cs.CL cs.LG

    Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization

    Authors: He Du, Qiming Ge, Jiakai Hu, Aijun Yang, Zheng Cai, Zixian Huang, Sheng Yuan, Qinxiu Cheng, Xinchen Xie, Yicheng Chen, Yining Li, Jiaxing Xie, Huanan Dong, Yaguang Wu, Xiangjun Huang, Jian Yang, Hui Wang, Bowen Zhou, Bowen Li, Qipeng Guo, Kai Chen

    Abstract: We present Kernel-Smith, a framework for high-performance GPU kernel and operator generation that combines a stable evaluation-driven evolutionary agent with an evolution-oriented post-training recipe. On the agent side, Kernel-Smith maintains a population of executable candidates and iteratively improves them using an archive of top-performing and diverse programs together with structured executi…

    Submitted 30 March, 2026; originally announced March 2026.

  30. arXiv:2603.28042

    cs.AR

    MCPT-Solver: An Monte Carlo Algorithm Solver Using MTJ Devices for Particle Transport Problems

    Authors: Siqing Fu, Lizhou Wu, Tiejun Li, Xuchao Xie, Chunyuan Zhang, Sheng Ma, Jianmin Zhang, Yuhan Tang, Jixuan Tang

    Abstract: Monte Carlo particle transport problems play a vital role in scientific computing, but solving them on existing von Neumann architectures suffers from random branching and irregular memory access, causing computing inefficiency due to a fundamental mismatch between stochastic algorithms and deterministic hardware. To bridge this gap, we propose MCPT-Solver, a spin-based hardware true random number…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: 10 pages, 14 figures

  31. arXiv:2603.27703

    cs.CL cs.LG

    KAT-Coder-V2 Technical Report

    Authors: Fengxiang Li, Han Zhang, Haoyang Huang, Jinghui Wang, Jinhua Hao, Kun Yuan, Mengtong Li, Minglei Zhang, Pengcheng Xu, Wenhao Zhuang, Yizhen Shao, Zongxian Feng, Can Tang, Chao Wang, Chengxiao Tong, Fan Yang, Gang Xiong, Haixuan Gao, Han Gao, Hao Wang, Haochen Liu, Hongliang Sun, Jiabao Li, Jingwen Chang, Jun Du, et al. (21 additional authors not shown)

    Abstract: We present KAT-Coder-V2, an agentic coding model developed by the KwaiKAT team at Kuaishou. KAT-Coder-V2 adopts a "Specialize-then-Unify" paradigm that decomposes agentic coding into five expert domains - SWE, WebCoding, Terminal, WebSearch, and General - each undergoing independent supervised fine-tuning and reinforcement learning, before being consolidated into a single model via on-policy disti…

    Submitted 29 March, 2026; originally announced March 2026.

    Comments: 22 pages, 7 figures

  32. arXiv:2603.27667

    cs.SD cs.AI

    EvA: An Evidence-First Audio Understanding Paradigm for LALMs

    Authors: Xinyuan Xie, Shunian Chen, Zhiheng Liu, Yuhao Zhang, Zhiqiang Lv, Liyin Liang, Benyou Wang

    Abstract: Large Audio Language Models (LALMs) still struggle in complex acoustic scenes because they often fail to preserve task-relevant acoustic evidence before reasoning begins. We call this failure the evidence bottleneck: state-of-the-art systems show larger deficits in evidence extraction than in downstream reasoning, suggesting that the main limitation lies in upstream perception rather than reasonin…

    Submitted 29 March, 2026; originally announced March 2026.

  33. arXiv:2603.27594

    math.NA

    Stability Analysis of Monolithic Globally Divergence-Free ALE-HDG Methods for Fluid-Structure Interaction

    Authors: Shuaijun Liu, Xiaoping Xie

    Abstract: In this paper, we propose two monolithic fully discrete finite element methods for fluid-structure interaction (FSI) based on a novel Piola-type Arbitrary Lagrangian-Eulerian (ALE) mapping. For the temporal discretization, we apply the backward Euler method to both the non-conservative and conservative formulations. For the spatial discretization, we adopt arbitrary order hybridizable discontinuou…

    Submitted 8 April, 2026; v1 submitted 29 March, 2026; originally announced March 2026.

  34. arXiv:2603.27012

    cs.RO cs.AI

    UMI-Underwater: Learning Underwater Manipulation without Underwater Teleoperation

    Authors: Hao Li, Long Yin Chung, Jack Goler, Ryan Zhang, Xiaochi Xie, Huy Ha, Shuran Song, Mark Cutkosky

    Abstract: Underwater robotic grasping is difficult due to degraded, highly variable imagery and the expense of collecting diverse underwater demonstrations. We introduce a system that (i) autonomously collects successful underwater grasp demonstrations via a self-supervised data collection pipeline and (ii) transfers grasp knowledge from on-land human demonstrations through a depth-based affordance represen…

    Submitted 27 March, 2026; originally announced March 2026.

  35. arXiv:2603.26091

    cs.SE

    Search-Induced Issues in Web-Augmented LLM Code Generation: Detecting and Repairing Error-Inducing Pages

    Authors: Guoqing Wang, Zeyu Sun, Xiaofei Xie, Yizhou Chen, Yanchao Tan, Yifan Zhao, Dan Hao

    Abstract: Web-augmented large language models (LLMs) offer promising capabilities for automatic code generation. However, integrating live web search exposes models to unreliable or malicious content, leading to Search-Induced Issues (SII), a novel failure mode in which external pages mislead LLMs into producing incorrect code. This paper presents a comprehensive empirical study of the prevalence and impact…

    Submitted 27 March, 2026; originally announced March 2026.

    Comments: 12 pages, 5 figures

  36. arXiv:2603.25040

    cs.LG cs.CL cs.CV

    Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

    Authors: Yicheng Zou, Dongsheng Zhu, Lin Zhu, Tong Zhu, Yunhua Zhou, Peiheng Zhou, Xinyu Zhou, Dongzhan Zhou, Zhiwang Zhou, Yuhao Zhou, Bowen Zhou, Zhanping Zhong, Zhijie Zhong, Haiteng Zhao, Penghao Zhao, Xiaomeng Zhao, Zhiyuan Zhao, Yechen Zhang, Jin Zhang, Wenwei Zhang, Hongjie Zhang, Zhuo Zhang, Wenlong Zhang, Bo Zhang, Chao Zhang, et al. (152 additional authors not shown)

    Abstract: We introduce Intern-S1-Pro, the first one-trillion-parameter scientific multimodal foundation model. Scaling to this unprecedented size, the model delivers a comprehensive enhancement across both general and scientific domains. Beyond stronger reasoning and image-text understanding capabilities, its intelligence is augmented with advanced agent capabilities. Simultaneously, its scientific expertis…

    Submitted 2 April, 2026; v1 submitted 26 March, 2026; originally announced March 2026.

  37. arXiv:2603.24102

    cond-mat.mes-hall cond-mat.other

    Electron Dynamics Reconstruction and Nontrivial Transport by Acoustic Waves

    Authors: Zi-Qian Zhou, Zhi-Fan Zhang, Cong Xiao, Hua Jiang, X. C. Xie

    Abstract: Surface acoustic waves (SAWs) have become a popular driving source in modern condensed matter physics, but most existing theories simplify them as electric fields and ignore the non-uniform Brillouin zone folding effect. We develop a semiclassical framework and reconstruct the electron dynamics by treating SAW as a quasi-periodic potential modulating electronic momentum distribution. This framework nat…

    Submitted 25 March, 2026; originally announced March 2026.

    Comments: 8 pages, 2 figures

  38. arXiv:2603.23510

    cs.CL cs.AI

    Visuospatial Perspective Taking in Multimodal Language Models

    Authors: Jonathan Prunty, Seraphina Zhang, Patrick Quinn, Jianxun Lian, Xing Xie, Lucy Cheke

    Abstract: As multimodal language models (MLMs) are increasingly used in social and collaborative settings, it is crucial to evaluate their perspective-taking abilities. Existing benchmarks largely rely on text-based vignettes or static scene understanding, leaving visuospatial perspective-taking (VPT) underexplored. We adapt two evaluation tasks from human studies: the Director Task, assessing VPT in a refe…

    Submitted 4 March, 2026; originally announced March 2026.

  39. arXiv:2603.21222

    cs.CV

    A Large-Scale Remote Sensing Dataset and VLM-based Algorithm for Fine-Grained Road Hierarchy Classification

    Authors: Ting Han, Xiangyi Xie, Yiping Chen, Yumeng Du, Jin Ma, Aiguang Li, Jiaan Liu, Yin Gao

    Abstract: In this work, we present SYSU-HiRoads, a large-scale hierarchical road dataset, and RoadReasoner, a vision-language-geometry framework for automatic multi-grade road mapping from remote sensing imagery. SYSU-HiRoads is built from GF-2 imagery covering 3631 km² in Henan Province, China, and contains 1079 image tiles at 0.8 m spatial resolution. Each tile is annotated with dense road masks, vectoriz…

    Submitted 22 March, 2026; originally announced March 2026.

  40. arXiv:2603.20751

    math.OC

    Local Convergence Analysis of ADMM for Nonconvex Composite Optimization

    Authors: Xiyuan Xie, Lihua Yang, Qia Li

    Abstract: In this paper, we study the local convergence of the standard ADMM scheme for a class of nonconvex composite problems arising from modern imaging and machine learning models. This problem is constrained by a closed convex set, while its objective is the sum of a continuously differentiable (possibly nonconvex) smooth term and a polyhedral convex nonsmooth term composed with a linear mapping. Our a…

    Submitted 21 March, 2026; originally announced March 2026.

    MSC Class: 90C26; 90C30; 65K05

  41. arXiv:2603.19627

    astro-ph.SR

    Precise parameter determination of the open cluster NGC 1647 via asteroseismology of p-mode pulsators

    Authors: Mingfeng Qin, Jian-Ning Fu, Weikai Zong, Tianqi Cang, Antonio Frasca, Gang Meng, Xiran Xie

    Abstract: Asteroseismology of member pulsators provides a robust physical constraint on cluster parameters by linking internal stellar structures to the global properties of the host cluster. However, the parameters of NGC 1647 remain poorly constrained due to limited investigation, a situation that cluster asteroseismology can significantly refine. In this study, we identified 271 high confidential cluste…

    Submitted 20 March, 2026; originally announced March 2026.

  42. arXiv:2603.17684  [pdf, ps, other

    cs.CV

    Does YOLO Really Need to See Every Training Image in Every Epoch?

    Authors: Xingxing Xie, Jiahua Dong, Junwei Han, Gong Cheng

    Abstract: YOLO detectors are known for their fast inference speed, yet training them remains unexpectedly time-consuming due to their exhaustive pipeline that processes every training image in every epoch, even when many images have already been sufficiently learned. This stands in clear contrast to the efficiency suggested by the "You Only Look Once" philosophy. This naturally raises an important questio… ▽ More

    Submitted 18 March, 2026; originally announced March 2026.

    Comments: Accepted to CVPR 2026

  43. Facial beauty prediction fusing transfer learning and broad learning system

    Authors: Junying Gan, Xiaoshan Xie, Yikui Zhai, Guohui He, Chaoyun Mai, Heng Luo

    Abstract: Facial beauty prediction (FBP) is an important and challenging problem in the fields of computer vision and machine learning. Not only is it prone to overfitting due to the lack of large-scale, effective data, but it is also difficult to quickly build robust and effective facial beauty evaluation models because of the variability of facial appearance and the complexity of human perception. Tra… ▽ More

    Submitted 13 March, 2026; originally announced March 2026.

  44. arXiv:2603.15224  [pdf, ps, other

    hep-ph

    Gluon TMDs for tensor polarized deuteron in a spectator model

    Authors: Xiupeng Xie, Dian-Yong Chen, Zhun Lu

    Abstract: We present a model calculation of the transverse-momentum-dependent distributions (TMDs) for gluons in a tensor-polarized deuteron. Our model is based on the assumption that an on-shell deuteron can emit a time-like off-shell gluon, while the remaining system is treated as a single on-shell spectator particle whose mass can take on a continuous range of real values, described by a spectral functio… ▽ More

    Submitted 16 March, 2026; originally announced March 2026.

    Comments: 16 pages, 5 figures

  45. arXiv:2603.15031  [pdf, ps, other

    cs.CL

    Attention Residuals

    Authors: Kimi Team, Guangyu Chen, Yu Zhang, Jianlin Su, Weixin Xu, Siyuan Pan, Yaoyu Wang, Yucheng Wang, Guanduo Chen, Bohong Yin, Yutian Chen, Junjie Yan, Ming Wei, Y. Zhang, Fanqing Meng, Chao Hong, Xiaotong Xie, Shaowei Liu, Enzhe Lu, Yunpeng Tai, Yanru Chen, Xin Men, Haiqing Guo, Y. Charles, Haoyu Lu , et al. (12 additional authors not shown)

    Abstract: Residual connections with PreNorm are standard in modern LLMs, yet they accumulate all layer outputs with fixed unit weights. This uniform aggregation causes uncontrolled hidden-state growth with depth, progressively diluting each layer's contribution. We propose Attention Residuals (AttnRes), which replaces this fixed accumulation with softmax attention over preceding layer outputs, allowing each… ▽ More

    Submitted 16 March, 2026; originally announced March 2026.

    Comments: attnres tech report

  46. arXiv:2603.14956  [pdf, ps, other

    cs.LG

    SFedHIFI: Fire Rate-Based Heterogeneous Information Fusion for Spiking Federated Learning

    Authors: Ran Tao, Qiugang Zhan, Shantian Yang, Xiurui Xie, Qi Tian, Guisong Liu

    Abstract: Spiking Federated Learning (SFL) has been widely studied because of the energy efficiency of Spiking Neural Networks (SNNs). However, existing SFL methods require model homogeneity and assume all clients have sufficient computational resources, resulting in the exclusion of some resource-constrained clients. To address the prevalent system heterogeneity in real-world scenarios, enabling heterogeneous SF… ▽ More

    Submitted 16 March, 2026; originally announced March 2026.

    Comments: 9 pages, 1 figure

  47. arXiv:2603.13506  [pdf, ps, other

    cs.CV

    LibraGen: Playing a Balance Game in Subject-Driven Video Generation

    Authors: Jiahao Zhu, Shanshan Lao, Lijie Liu, Gen Li, Tianhao Qi, Wei Han, Bingchuan Li, Fangfang Liu, Zhuowei Chen, Tianxiang Ma, Qian HE, Yi Zhou, Xiaohua Xie

    Abstract: With the advancement of video generation foundation models (VGFMs), customized generation, particularly subject-to-video (S2V), has attracted growing attention. However, a key challenge lies in balancing the intrinsic priors of a VGFM, such as motion coherence, visual aesthetics, and prompt alignment, with its newly derived S2V capability. Existing methods often neglect this balance by enhancing o… ▽ More

    Submitted 17 March, 2026; v1 submitted 13 March, 2026; originally announced March 2026.

  48. arXiv:2603.13500  [pdf, ps, other

    cs.CV

    ActionPlan: Future-Aware Streaming Motion Synthesis via Frame-Level Action Planning

    Authors: Eric Nazarenus, Chuqiao Li, Yannan He, Xianghui Xie, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: We present ActionPlan, a unified motion diffusion framework that bridges real-time streaming with high-quality offline generation within a single model. The core idea is to introduce a per-frame action plan: the model predicts frame-level text latents that act as dense semantic anchors throughout denoising, and uses them to denoise the full motion sequence with combined semantic and motion cues. T… ▽ More

    Submitted 13 March, 2026; originally announced March 2026.

    Comments: Project page: https://coral79.github.io/ActionPlan/

  49. arXiv:2603.13411  [pdf, ps, other

    cs.SE cs.HC

    Human in the Loop for Fuzz Testing: Literature Review and the Road Ahead

    Authors: Jiongchi Yu, Xiaolin Wen, Sizhe Cheng, Xiaofei Xie, Qiang Hu, Yong Wang

    Abstract: Fuzz testing is one of the most effective techniques for detecting bugs and vulnerabilities in software. However, as the basis of fuzz testing, automated heuristics often fail to uncover deep or complex vulnerabilities. As a result, the performance of fuzz testing remains limited. One promising way to address this limitation is to integrate human expert guidance into the paradigm of fuzz testing.… ▽ More

    Submitted 12 March, 2026; originally announced March 2026.

    Comments: 23 pages

  50. arXiv:2603.12126  [pdf, ps, other

    cs.CV cs.LG

    Hoi3DGen: Generating High-Quality Human-Object-Interactions in 3D

    Authors: Agniv Sharma, Xianghui Xie, Tom Fischer, Eddy Ilg, Gerard Pons-Moll

    Abstract: Modeling and generating 3D human-object interactions from text is crucial for applications in AR, XR, and gaming. Existing approaches often rely on score distillation from text-to-image models, but their results suffer from the Janus problem and do not follow text prompts faithfully due to the scarcity of high-quality interaction data. We introduce Hoi3DGen, a framework that generates high-quality… ▽ More

    Submitted 12 March, 2026; originally announced March 2026.