
Showing 1–50 of 13,192 results for author: Li, Y

Searching in archive cs.
  1. arXiv:2604.04932  [pdf, ps, other]

    cs.CL

    Beyond the Final Actor: Modeling the Dual Roles of Creator and Editor for Fine-Grained LLM-Generated Text Detection

    Authors: Yang Li, Qiang Sheng, Zhengjia Wang, Yehan Yang, Danding Wang, Juan Cao

    Abstract: The misuse of large language models (LLMs) requires precise detection of synthetic text. Existing works mainly follow binary or ternary classification settings, which can only distinguish pure human/LLM text or collaborative text at best. This remains insufficient for the nuanced regulation, as the LLM-polished human text and humanized LLM text often trigger different policy consequences. In this…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: ACL 2026 Accepted Paper

  2. arXiv:2604.04297  [pdf, ps, other]

    cs.AI

    PanLUNA: An Efficient and Robust Query-Unified Multimodal Model for Edge Biosignal Intelligence

    Authors: Marija Zelic, Anna Tegon, Yawei Li, Thorir Mar Ingolfsson, Luca Benini

    Abstract: Physiological foundation models (FMs) have shown promise for biosignal representation learning, yet most remain confined to a single modality such as EEG, ECG, or PPG, largely because paired multimodal datasets are scarce. In this paper, we present PanLUNA, a compact 5.4M-parameter pan-modal FM that jointly processes EEG, ECG, and PPG within a single shared encoder. Extending LUNA's channel-unific…

    Submitted 5 April, 2026; originally announced April 2026.

    Comments: 5 pages, 5 tables, 1 figure, preprint

  3. arXiv:2604.04135  [pdf, ps, other]

    cs.CV

    NTIRE 2026 3D Restoration and Reconstruction in Real-world Adverse Conditions: RealX3D Challenge Results

    Authors: Shuhong Liu, Chenyu Bao, Ziteng Cui, Xuangeng Chu, Bin Ren, Lin Gu, Xiang Chen, Mingrui Li, Long Ma, Marcos V. Conde, Radu Timofte, Yun Liu, Ryo Umagami, Tomohiro Hashimoto, Zijian Hu, Yuan Gan, Tianhan Xu, Yusuke Kurose, Tatsuya Harada, Junwei Yuan, Gengjia Chang, Xining Ge, Mache You, Qida Cao, Zeliang Li, et al. (81 additional authors not shown)

    Abstract: This paper presents a comprehensive review of the NTIRE 2026 3D Restoration and Reconstruction (3DRR) Challenge, detailing the proposed methods and results. The challenge seeks to identify reconstruction pipelines that are robust under real-world adverse conditions, specifically extreme low-light and smoke-degraded environments, as captured by our RealX3D benchmark. A total of 279 participa…

    Submitted 5 April, 2026; originally announced April 2026.

  4. arXiv:2604.03774  [pdf, ps, other]

    cs.CV cs.AI

    When Does Multimodal AI Help? Diagnostic Complementarity of Vision-Language Models and CNNs for Spectrum Management in Satellite-Terrestrial Networks

    Authors: Yuanhang Li

    Abstract: The adoption of vision-language models (VLMs) for wireless network management is accelerating, yet no systematic understanding exists of where these large foundation models outperform lightweight convolutional neural networks (CNNs) for spectrum-related tasks. This paper presents the first diagnostic comparison of VLMs and CNNs for spectrum heatmap understanding in non-terrestrial network and terr…

    Submitted 4 April, 2026; originally announced April 2026.

    Comments: 10 pages, 4 figures

  5. arXiv:2604.03630  [pdf, ps, other]

    cs.AI q-bio.QM

    A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery and Clinical Prediction

    Authors: Jinxi Xiang, Siyu Hou, Yuchen Li, Ryan Quinton, Xiaoming Zhang, Feyisope Eweje, Xiangde Luo, Yijiang Chen, Zhe Li, Colin Bergstrom, Ted Kim, Sierra Willens, Francesca Maria Olguin, Matthew Abikenari, Andrew Heider, Sanjeeth Rajaram, Joel Neal, Maximilian Diehn, Xiang Zhou, Ruijiang Li

    Abstract: Spatial transcriptomics (ST) enables gene expression mapping within anatomical context but remains costly and low-throughput. Hematoxylin and eosin (H&E) staining offers rich morphology yet lacks molecular resolution. We present STORM (Spatial Transcriptomics and histOlogy Representation Model), a foundation model trained on 1.2 million spatia…

    Submitted 4 April, 2026; originally announced April 2026.

    Comments: 29 pages, 5 figures. This manuscript is a work in progress; further updates and revisions will be posted as they become available

  6. arXiv:2604.03624  [pdf, ps, other]

    cs.AR cs.FL cs.LO

    Efficient Solving for Dynamic Data Structure Constraint Satisfaction Problem

    Authors: Nanbing Li, Weijie Peng, Jin Luo, Shuai Wang, Yihui Li, Jun Fang, Yun Liang

    Abstract: Functional verification plays a central role in ensuring the correctness of modern integrated circuit designs, where constrained-random verification is widely adopted to generate diverse stimuli under high-level constraints. In industrial verification environments, constraint solving increasingly involves dynamic data structures whose shape and content are determined at runtime, causing the sets o…

    Submitted 4 April, 2026; originally announced April 2026.

  7. arXiv:2604.03562  [pdf, ps, other]

    cs.AI

    When Adaptive Rewards Hurt: Causal Probing and the Switching-Stability Dilemma in LLM-Guided LEO Satellite Scheduling

    Authors: Yuanhang Li

    Abstract: Adaptive reward design for deep reinforcement learning (DRL) in multi-beam LEO satellite scheduling is motivated by the intuition that regime-aware reward weights should outperform static ones. We systematically test this intuition and uncover a switching-stability dilemma: near-constant reward weights (342.1 Mbps) outperform carefully-tuned dynamic weights (103.3 ± 96.8 Mbps) because PPO requires…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: 8 pages, 3 figures

  8. arXiv:2604.03444  [pdf, ps, other]

    cs.LG cs.CL

    Olmo Hybrid: From Theory to Practice and Back

    Authors: William Merrill, Yanhong Li, Tyler Romero, Anej Svete, Caia Costello, Pradeep Dasigi, Dirk Groeneveld, David Heineman, Bailey Kuehl, Nathan Lambert, Jacob Morrison, Luca Soldaini, Finbarr Timbers, Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi, Ashish Sabharwal

    Abstract: Recent work has demonstrated the potential of non-transformer language models, especially linear recurrent neural networks (RNNs) and hybrid models that mix recurrence and attention. Yet there is no consensus on whether the potential benefits of these new architectures justify the risk and effort of scaling them up. To address this, we provide evidence for the advantages of hybrid models over pure…

    Submitted 3 April, 2026; originally announced April 2026.

  9. arXiv:2604.03414  [pdf, ps, other]

    cs.CV

    KiToke: Kernel-based Interval-aware Token Compression for Video Large Language Models

    Authors: Haifeng Huang, Yang Li

    Abstract: Video Large Language Models (Video LLMs) achieve strong performance on video understanding tasks but suffer from high inference costs due to the large number of visual tokens. We propose KiToke, a training-free, query-agnostic token compression approach that reduces spatiotemporal redundancy while preserving critical visual information. Our method estimates token diversity globally using a kernel-…

    Submitted 3 April, 2026; originally announced April 2026.

  10. arXiv:2604.03302  [pdf, ps, other]

    cs.CV cs.AI

    Beyond Static Vision: Scene Dynamic Field Unlocks Intuitive Physics Understanding in Multi-modal Large Language Models

    Authors: Nanxi Li, Xiang Wang, Yuanjie Chen, Haode Zhang, Hong Li, Yong-Lu Li

    Abstract: While Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in image and video understanding, their ability to comprehend the physical world has become an increasingly important research focus. Despite their improvements, current MLLMs struggle significantly with high-level physics reasoning. In this work, we investigate the first step of physical reasoning, i.e., intu…

    Submitted 30 March, 2026; originally announced April 2026.

  11. arXiv:2604.03212  [pdf, ps, other]

    cs.CV

    ProtoFlow: Mitigating Forgetting in Class-Incremental Remote Sensing Segmentation via Low-Curvature Prototype Flow

    Authors: Jiekai Wu, Rong Fu, Chuangqi Li, Zijian Zhang, Guangxin Wu, Hao Zhang, Shiyin Lin, Jianyuan Ni, Yang Li, Dongxu Zhang, Amir H. Gandomi, Simon Fong, Pengbin Feng

    Abstract: Remote sensing segmentation in real deployment is inherently continual: new semantic categories emerge, and acquisition conditions shift across seasons, cities, and sensors. Despite recent progress, many incremental approaches still treat training steps as isolated updates, which leaves representation drift and forgetting insufficiently controlled. We present ProtoFlow, a time-aware prototype dyna…

    Submitted 3 April, 2026; originally announced April 2026.

  12. arXiv:2604.03198  [pdf, ps, other]

    cs.CV

    The Eleventh NTIRE 2026 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Hang Guo, Yan Shu, Jiaqi Ma, Ziteng Cui, Shuhong Liu, Guofeng Mei, Lei Sun, Zongwei Wu, Fahad Shahbaz Khan, Salman Khan, Radu Timofte, Yawei Li, Hongyuan Yu, Pufan Xu, Chen Wu, Long Peng, Jiaojiao Yi, Siyang Yi, Yuning Cui, Jingyuan Xia, Xing Mou, Keji He, Jinlin Wu, Zongang Gao, et al. (38 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2026 challenge on efficient single-image super-resolution with a focus on the proposed solutions and results. The aim of this challenge is to devise a network that reduces one or several aspects, such as runtime, parameters, and FLOPs, while maintaining PSNR of around 26.90 dB on the DIV2K_LSDIR_valid dataset, and 26.99 dB on the DIV2K_LSDIR_test dataset. The challenge…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: CVPR 2026 NTIRE Workshop Paper, Efficient Super Resolution Technical Report

  13. arXiv:2604.03144  [pdf, ps, other]

    cs.AR cs.AI cs.CL

    InCoder-32B-Thinking: Industrial Code World Model for Thinking

    Authors: Jian Yang, Wei Zhang, Jiajun Wu, Junhang Cheng, Tuney Zheng, Fanglin Xu, Weicheng Gu, Lin Jing, Yaxin Du, Joseph Li, Yizhi Li, Yan Xing, Chuan Hao, Ran Tao, Ruihao Gong, Aishan Liu, Zhoujun Li, Mingjie Tang, Chenghua Lin, Siheng Chen, Wayne Xin Zhao, Xianglong Liu, Ming Zhou, Bryan Dai, Weifeng Lv

    Abstract: Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reasoning traces showing how engineers reason about hardware constraints and timing semantics. In this work, we propose InCoder-32B-Thinking, trained on the data from the Error-driven Chain-of-Thought (ECoT) synthesis framework with an industrial code world model (ICWM) to generate reasoning tra…

    Submitted 3 April, 2026; originally announced April 2026.

  14. arXiv:2604.03139  [pdf, ps, other]

    cs.RO

    FSUNav: A Cerebrum-Cerebellum Architecture for Fast, Safe, and Universal Zero-Shot Goal-Oriented Navigation

    Authors: Mingao Tan, Yiyang Li, Shanze Wang, Xinming Zhang, Wei Zhang

    Abstract: Current vision-language navigation methods face substantial bottlenecks regarding heterogeneous robot compatibility, real-time performance, and navigation safety. Furthermore, they struggle to support open-vocabulary semantic generalization and multimodal task inputs. To address these challenges, this paper proposes FSUNav: a Cerebrum-Cerebellum architecture for fast, safe, and universal zero-shot…

    Submitted 3 April, 2026; originally announced April 2026.

  15. arXiv:2604.03134  [pdf, ps, other]

    cs.CV

    SD-FSMIS: Adapting Stable Diffusion for Few-Shot Medical Image Segmentation

    Authors: Meihua Li, Yang Zhang, Weizhao He, Hu Qu, Yisong Li

    Abstract: Few-Shot Medical Image Segmentation (FSMIS) aims to segment novel object classes in medical images using only minimal annotated examples, addressing the critical challenges of data scarcity and domain shifts prevalent in medical imaging. While Diffusion Models (DM) excel in visual tasks, their potential for FSMIS remains largely unexplored. We propose that the rich visual priors learned by large-s…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: CVPR 2026

  16. arXiv:2604.03081  [pdf, ps, other]

    cs.CR cs.AI cs.CL

    Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

    Authors: Yubin Qu, Yi Liu, Tongcheng Geng, Gelei Deng, Yuekang Li, Leo Yu Zhang, Ying Zhang, Lei Ma

    Abstract: LLM-based coding agents extend their capabilities via third-party agent skills distributed through open marketplaces without mandatory security review. Unlike traditional packages, these skills are executed as operational directives with system-level privileges, so a single malicious skill can compromise the host. Prior work has not examined whether supply-chain attacks can directly hijack an agen…

    Submitted 3 April, 2026; originally announced April 2026.

  17. arXiv:2604.03070  [pdf, ps, other]

    cs.CR cs.AI

    Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

    Authors: Zhihao Chen, Ying Zhang, Yi Liu, Gelei Deng, Yuekang Li, Yanjun Zhang, Jianting Ning, Leo Yu Zhang, Lei Ma, Zhiqiang Li

    Abstract: Third-party skills extend LLM agents with powerful capabilities but often handle sensitive credentials in privileged environments, making leakage risks poorly understood. We present the first large-scale empirical study of this problem, analyzing 17,022 skills (sampled from 170,226 on SkillsMP) using static analysis, sandbox testing, and manual inspection. We identify 520 vulnerable skills with 1,…

    Submitted 3 April, 2026; originally announced April 2026.

  18. arXiv:2604.03037  [pdf, ps, other]

    cs.RO cs.AI cs.CV

    ARM: Advantage Reward Modeling for Long-Horizon Manipulation

    Authors: Yiming Mao, Zixi Yu, Weixin Mao, Yinhao Li, Qirui Hu, Zihan Lan, Minzhao Zhu, Hua Chen

    Abstract: Long-horizon robotic manipulation remains challenging for reinforcement learning (RL) because sparse rewards provide limited guidance for credit assignment. Practical policy improvement thus relies on richer intermediate supervision, such as dense progress rewards, which are costly to obtain and ill-suited to non-monotonic behaviors such as backtracking and recovery. To address this, we propose Ad…

    Submitted 3 April, 2026; originally announced April 2026.

  19. arXiv:2604.02947  [pdf, ps, other]

    cs.AI

    AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

    Authors: Yunhao Feng, Yifan Ding, Yingshui Tan, Xingjun Ma, Yige Li, Yutao Wu, Yifeng Gao, Kun Zhai, Yanming Guo

    Abstract: Computer-use agents extend language models from text generation to persistent action over tools, files, and execution environments. Unlike chat systems, they maintain state across interactions and translate intermediate outputs into concrete actions. This creates a distinct safety challenge in that harmful behavior may emerge through sequences of individually plausible steps, including intermediat…

    Submitted 3 April, 2026; originally announced April 2026.

  20. arXiv:2604.02934  [pdf, ps, other]

    cs.CV

    PolyReal: A Benchmark for Real-World Polymer Science Workflows

    Authors: Wanhao Liu, Weida Wang, Jiaqing Xie, Suorong Yang, Jue Wang, Benteng Chen, Guangtao Mei, Zonglin Yang, Shufei Zhang, Yuchun Mo, Lang Cheng, Jin Zeng, Houqiang Li, Wanli Ouyang, Yuqiang Li

    Abstract: Multimodal Large Language Models (MLLMs) excel in general domains but struggle with complex, real-world science. We posit that polymer science, an interdisciplinary field spanning chemistry, physics, biology, and engineering, is an ideal high-stakes testbed due to its diverse multimodal data. Yet, existing benchmarks related to polymer science largely overlook real-world workflows, limiting their…

    Submitted 3 April, 2026; originally announced April 2026.

  21. arXiv:2604.02923  [pdf, ps, other]

    cs.CL cs.AI

    Council Mode: Mitigating Hallucination and Bias in LLMs via Multi-Agent Consensus

    Authors: Shuai Wu, Xue Li, Yanna Feng, Yufang Li, Zhijun Wang

    Abstract: Large Language Models (LLMs), particularly those employing Mixture-of-Experts (MoE) architectures, have achieved remarkable capabilities across diverse natural language processing tasks. However, these models frequently suffer from hallucinations -- generating plausible but factually incorrect content -- and exhibit systematic biases that are amplified by uneven expert activation during inference.…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: 13 pages, 8 figures, technical report

  22. arXiv:2604.02753  [pdf, ps, other]

    cs.CV

    DeCo-DETR: Decoupled Cognition DETR for efficient Open-Vocabulary Object Detection

    Authors: Siheng Wang, Yanshu Li, Bohan Hu, Zhengdao Li, Haibo Zhan, Linshan Li, Weiming Liu, Ruizhi Qian, Guangxin Wu, Hao Zhang, Jifeng Shen, Piotr Koniusz, Zhengtao Yao, Junhao Dong, Qiang Sun

    Abstract: Open-vocabulary Object Detection (OVOD) enables models to recognize objects beyond predefined categories, but existing approaches remain limited in practical deployment. On the one hand, multimodal designs often incur substantial computational overhead due to their reliance on text encoders at inference time. On the other hand, tightly coupled training objectives introduce a trade-off between clos…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: Accepted at ICLR 2026

  23. arXiv:2604.02624  [pdf]

    physics.optics cs.CV cs.NE physics.app-ph

    Wavelength-multiplexed massively parallel diffractive optical information storage and image projection

    Authors: Che-Yung Shen, Yuhang Li, Cagatay Isil, Jingxi Li, Leon Lenk, Tianyi Gan, Guangdong Ma, Fazil Onuralp Ardic, Mona Jarrahi, Aydogan Ozcan

    Abstract: We introduce a wavelength-multiplexed massively parallel diffractive information storage platform composed of dielectric surfaces that are structurally optimized at the wavelength scale using deep learning to store and project thousands of distinct image patterns, each assigned to a unique wavelength. Through numerical simulations in the visible spectrum, we demonstrated that our wavelength-multip…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: 28 Pages, 8 Figures

  24. arXiv:2604.02618  [pdf, ps, other]

    cs.AI

    OntoKG: Ontology-Oriented Knowledge Graph Construction with Intrinsic-Relational Routing

    Authors: Yitao Li, Zhanlin Liu, Anuranjan Pandey, Muni Srikanth

    Abstract: Organizing a large-scale knowledge graph into a typed property graph requires structural decisions -- which entities become nodes, which properties become edges, and what schema governs these choices. Existing approaches embed these decisions in pipeline code or extract relations ad hoc, producing schemas that are tightly coupled to their construction process and difficult to reuse for downstream…

    Submitted 2 April, 2026; originally announced April 2026.

  25. arXiv:2604.02482  [pdf, ps, other]

    cs.LG

    SEDGE: Structural Extrapolated Data Generation

    Authors: Kun Zhang, Jiaqi Sun, Yiqing Li, Ignavier Ng, Namrata Deka, Shaoan Xie

    Abstract: This paper proposes a framework for Structural Extrapolated Data GEneration (SEDGE) based on suitable assumptions on the underlying data generating process. We provide conditions under which data satisfying new specifications can be generated reliably, together with the approximate identifiability of the distribution of such data under certain "conservative" assumptions. On the algorithmic side,…

    Submitted 2 April, 2026; originally announced April 2026.

  26. arXiv:2604.02368  [pdf, ps, other]

    cs.AI cs.CL

    Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation

    Authors: Xue Liu, Xin Ma, Yuxin Ma, Yongchang Peng, Duo Wang, Zhoufutu Wen, Ge Zhang, Kaiyuan Zhang, Xinyu Chen, Tianci He, Jiani Hou, Liang Hu, Ziyun Huang, Yongzhe Hui, Jianpeng Jiao, Chennan Ju, Yingru Kong, Yiran Li, Mengyun Liu, Luyao Ma, Fei Ni, Yiqing Ni, Yueyan Qiu, Yanle Ren, Zilin Shi, et al. (9 additional authors not shown)

    Abstract: As Large Language Models (LLMs) exhibit plateauing performance on conventional benchmarks, a pivotal challenge persists: evaluating their proficiency in complex, open-ended tasks characterizing genuine expert-level cognition. Existing frameworks suffer from narrow domain coverage, reliance on generalist tasks, or self-evaluation biases. To bridge this gap, we present XpertBench, a high-fidelity be…

    Submitted 6 April, 2026; v1 submitted 27 March, 2026; originally announced April 2026.

  27. arXiv:2604.02190  [pdf, ps, other]

    cs.CV cs.RO

    UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving

    Authors: Yongkang Li, Lijun Zhou, Sixu Yan, Bencheng Liao, Tianyi Yan, Kaixin Xiong, Long Chen, Hongwei Xie, Bing Wang, Guang Chen, Hangjun Ye, Wenyu Liu, Haiyang Sun, Xinggang Wang

    Abstract: Vision-Language-Action (VLA) models have recently emerged in autonomous driving, with the promise of leveraging rich world knowledge to improve the cognitive capabilities of driving systems. However, adapting such models for driving tasks currently faces a critical dilemma between spatial perception and semantic reasoning. Consequently, existing VLA systems are forced into suboptimal compromises:…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: code has been released at https://github.com/xiaomi-research/unidrivevla

  28. arXiv:2604.02029  [pdf, ps, other]

    cs.AI

    The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

    Authors: Xinlei Yu, Zhangquan Chen, Yongbo He, Tianyu Fu, Cheng Yang, Chengming Xu, Yue Ma, Xiaobin Hu, Zhe Cao, Jie Xu, Guibin Zhang, Jiale Tao, Jiayi Zhang, Siyuan Ma, Kaituo Feng, Haojie Huang, Youxing Li, Ronghao Chen, Huacan Wang, Chenglin Wu, Zikun Su, Xiaogang Xu, Kelu Yao, Kun Wang, Chen Gao, et al. (12 additional authors not shown)

    Abstract: Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift is driven by the structural limitations of expli…

    Submitted 2 April, 2026; originally announced April 2026.

  29. arXiv:2604.02022  [pdf, ps, other]

    cs.AI

    ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety

    Authors: Yu Li, Haoyu Luo, Yuejin Xie, Yuqian Fu, Zhonghao Yang, Shuai Shao, Qihan Ren, Wanying Qu, Yanwei Fu, Yujiu Yang, Jing Shao, Xia Hu, Dongrui Liu

    Abstract: Evaluating the safety of LLM-based agents is increasingly important because risks in realistic deployments often emerge over multi-step interactions rather than isolated prompts or final responses. Existing trajectory-level benchmarks remain limited by insufficient interaction diversity, coarse observability of safety failures, and weak long-horizon realism. We introduce ATBench, a trajectory-leve…

    Submitted 2 April, 2026; originally announced April 2026.

  30. arXiv:2604.01884  [pdf, ps, other]

    cs.CV

    GS^2: Graph-based Spatial Distribution Optimization for Compact 3D Gaussian Splatting

    Authors: Xianben Yang, Tao Wang, Yuxuan Li, Yi Jin, Haibin Ling

    Abstract: 3D Gaussian Splatting (3DGS) has demonstrated breakthrough performance in novel view synthesis and real-time rendering. Nevertheless, its practicality is constrained by the high memory cost due to a huge number of Gaussian points. Many pruning-based 3DGS variants have been proposed for memory saving, but often compromise spatial consistency and may lead to rendering artifacts. To address this issu…

    Submitted 2 April, 2026; originally announced April 2026.

  31. arXiv:2604.01874  [pdf, ps, other]

    quant-ph cs.IT math-ph

    Transversal non-Clifford gates on almost-good quantum LDPC and quantum locally testable codes

    Authors: Yiming Li, Zimu Li, Zi-Wen Liu

    Abstract: We exhibit nontrivial transversal logical multi-controlled-$Z$ gates on $[\![N,\Theta(N),\tilde{\Theta}(N)]\!]$ quantum low-density parity-check codes and $[\![N,\Theta(N),\tilde{\Theta}(N)]\!]$ quantum locally testable codes with soundness $\tilde{\Theta}(1)$, combining nearly optimal code parameters with fault-tolerant non-Clifford gates for the first time. Remarkably, our proofs are almost entirely algebraic-topological, sho…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: 30 pages

  32. arXiv:2604.01723  [pdf, ps, other]

    cs.RO cs.AI

    Causal Scene Narration with Runtime Safety Supervision for Vision-Language-Action Driving

    Authors: Yun Li, Yidu Zhang, Simon Thompson, Ehsan Javanmardi, Manabu Tsukada

    Abstract: Vision-Language-Action (VLA) models for autonomous driving must integrate diverse textual inputs, including navigation commands, hazard warnings, and traffic state descriptions, yet current systems often present these as disconnected fragments, forcing the model to discover on its own which environmental constraints are relevant to the current maneuver. We introduce Causal Scene Narration (CSN), w…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: 18 pages, 6 figures, 4 tables

  33. arXiv:2604.01670  [pdf, ps, other]

    cs.AI

    Hierarchical Memory Orchestration for Personalized Persistent Agents

    Authors: Junming Liu, Yifei Sun, Weihua Cheng, Haodong Lei, Yuqi Li, Yirong Chen, Ding Wang

    Abstract: While long-term memory is essential for intelligent agents to maintain consistent historical awareness, the accumulation of extensive interaction data often leads to performance bottlenecks. Naive storage expansion increases retrieval noise and computational latency, overwhelming the reasoning capacity of models deployed on constrained personal devices. To address this, we propose Hierarchical Mem…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: 10 pages, 5 figures, 7 tables

  34. arXiv:2604.01667  [pdf, ps, other]

    cs.AI cs.CV

    M3D-BFS: a Multi-stage Dynamic Fusion Strategy for Sample-Adaptive Multi-Modal Brain Network Analysis

    Authors: Rui Dong, Xiaotong Zhang, Jiaxing Li, Yueying Li, Jiayin Wei, Youyong Kong

    Abstract: Multi-modal fusion, which integrates information from different modalities, is of great significance in neuroscience and can achieve better performance than uni-modal methods in downstream tasks. Current multi-modal fusion methods in brain networks, which mainly focus on structural connectivity (SC) and functional connectivity (FC) modalities, are static in nature. They feed different samples into t…

    Submitted 2 April, 2026; originally announced April 2026.

  35. arXiv:2604.01635  [pdf, ps, other]

    cs.CR

    Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations

    Authors: Yue Li, Linying Xue, Kaiqing Lin, Hanyu Quan, Dongdong Lin, Hui Tian, Hongxia Wang, Bin Wang

    Abstract: Recent advances in GAN and diffusion models have significantly improved the realism and controllability of facial deepfake manipulation, raising serious concerns regarding privacy, security, and identity misuse. Proactive defenses attempt to counter this threat by injecting adversarial perturbations into images before manipulation takes place. However, existing approaches remain limited in effecti…

    Submitted 2 April, 2026; originally announced April 2026.

  36. arXiv:2604.01520  [pdf, ps, other]

    cs.AI

    LLM Agents as Social Scientists: A Human-AI Collaborative Platform for Social Science Automation

    Authors: Lei Wang, Yuanzi Li, Jinchao Wu, Heyang Gao, Xiaohe Bo, Xu Chen, Ji-Rong Wen

    Abstract: Traditional social science research often requires designing complex experiments across vast methodological spaces and depends on real human participants, making it labor-intensive, costly, and difficult to scale. Here we present S-Researcher, an LLM-agent-based platform that assists researchers in conducting social science research more efficiently and at greater scale by "siliconizing" both the…

    Submitted 1 April, 2026; originally announced April 2026.

  37. arXiv:2604.01502  [pdf, ps, other]

    stat.ML cs.LG

    Non-monotonicity in Conformal Risk Control

    Authors: Tareq Aldirawi, Yun Li, Wenge Guo

    Abstract: Conformal risk control (CRC) provides distribution-free guarantees for controlling the expected loss at a user-specified level. Existing theory typically assumes that the loss decreases monotonically with a tuning parameter that governs the size of the prediction set. This assumption is often violated in practice, where losses may behave non-monotonically due to competing objectives such as covera…

    Submitted 1 April, 2026; originally announced April 2026.

    Comments: 38 pages, 6 figures, 3 tables

  38. arXiv:2604.01397  [pdf, ps, other]

    cs.DC

    EXaCTz: Guaranteed Extremum Graph and Contour Tree Preservation for Distributed- and GPU-Parallel Lossy Compression

    Authors: Yuxiao Li, Mingze Xia, Xin Liang, Bei Wang, Hanqi Guo

    Abstract: This paper introduces EXaCTz, a parallel algorithm that concurrently preserves extremum graphs and contour trees in lossy-compressed scalar field data. While error-bounded lossy compression is essential for large-scale scientific simulations and workflows, existing topology-preserving methods suffer from (1) a significant throughput disparity, where topology correction speeds are on the order of M…

    Submitted 1 April, 2026; originally announced April 2026.

  39. arXiv:2604.01158  [pdf, ps, other]

    cs.RO

    SMASH: Mastering Scalable Whole-Body Skills for Humanoid Ping-Pong with Egocentric Vision

    Authors: Junli Ren, Yinghui Li, Kai Zhang, Penglin Fu, Haoran Jiang, Yixuan Pan, Guangjun Zeng, Tao Huang, Weizhong Guo, Peng Lu, Tianyu Li, Jingbo Wang, Li Chen, Hongyang Li, Ping Luo

    Abstract: Existing humanoid table tennis systems remain limited by their reliance on external sensing and their inability to achieve agile whole-body coordination for precise task execution. These limitations stem from two core challenges: achieving low-latency and robust onboard egocentric perception under fast robot motion, and obtaining sufficiently diverse task-aligned strike motions for learning precis…

    Submitted 1 April, 2026; originally announced April 2026.

  40. arXiv:2604.00865  [pdf, ps, other]

    cs.IR

    Doctor-RAG: Failure-Aware Repair for Agentic Retrieval-Augmented Generation

    Authors: Shuguang Jiao, Chengkai Huang, Shuhan Qi, Xuan Wang, Yifan Li, Lina Yao

    Abstract: Agentic Retrieval-Augmented Generation (Agentic RAG) has become a widely adopted paradigm for multi-hop question answering and complex knowledge reasoning, where retrieval and reasoning are interleaved at inference time. As reasoning trajectories grow longer, failures become increasingly common. Existing approaches typically address such failures by either stopping at diagnostic analysis or rerunn…

    Submitted 1 April, 2026; originally announced April 2026.

  41. arXiv:2604.00830  [pdf, ps, other]

    cs.LG cs.AI

    Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies

    Authors: Zhanzhi Lou, Hui Chen, Yibo Li, Qian Wang, Bryan Hooi

    Abstract: Test-Time Learning (TTL) enables language agents to iteratively refine their performance through repeated interactions with the environment at inference time. At the core of TTL is an adaptation policy that updates the actor policy based on experience from previous episodes, thereby improving future behavior. Existing methods rely on fixed, hand-crafted adaptation policies rather than optimizing t…

    Submitted 2 April, 2026; v1 submitted 1 April, 2026; originally announced April 2026.

  42. arXiv:2604.00821  [pdf, ps, other]

    cs.LG

    Optimal Brain Decomposition for Accurate LLM Low-Rank Approximation

    Authors: Yuhang Li, Donghyun Lee, Ruokai Yin, Priyadarshini Panda

    Abstract: Low-rank decomposition has emerged as an important problem in Large Language Model (LLM) fine-tuning and inference. Through Singular Value Decomposition (SVD), the weight matrix can be factorized into low-rank spaces optimally. Previously, a common practice was to decompose the weight in the activation-whitened space, and then achieve satisfying results. In this work, we propose Optimal Brain Deco…

    Submitted 1 April, 2026; originally announced April 2026.

  43. arXiv:2604.00627  [pdf, ps, other]

    cs.CR

    When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

    Authors: Jiaqing Li, Zhibo Zhang, Shide Zhou, Yuxi Li, Tianlong Yu, Kailong Wang

    Abstract: Model merging has emerged as a powerful technique for combining specialized capabilities from multiple fine-tuned LLMs without additional training costs. However, the security implications of this widely-adopted practice remain critically underexplored. In this work, we reveal that model merging introduces a novel attack surface that can be systematically exploited to compromise safety alignment.…

    Submitted 1 April, 2026; originally announced April 2026.

  44. arXiv:2604.00591  [pdf, ps, other]

    cs.CC cs.DS math.PR

    On the average-case complexity landscape for Tensor-Isomorphism-complete problems over finite fields

    Authors: Tiange Li, Yinan Li, Youming Qiao, Dacheng Tao, Yingjie Wang

    Abstract: In Grochow and Qiao (SIAM J. Comput., 2021), the complexity class Tensor Isomorphism (TI) was introduced and isomorphism problems for groups, algebras, and polynomials were shown to be TI-complete. In this paper, we study average-case algorithms for several TI-complete problems over finite fields, including algebra isomorphism, matrix code conjugacy, and $4$-tensor isomorphism. Our main results…

    Submitted 1 April, 2026; originally announced April 2026.

    Comments: 45 pages

  45. arXiv:2604.00344  [pdf, ps, other]

    cs.CL stat.AP

    Agent Q-Mix: Selecting the Right Action for LLM Multi-Agent Systems through Reinforcement Learning

    Authors: Eric Hanchen Jiang, Levina Li, Rui Sun, Xiao Liang, Yubei Li, Yuchen Wu, Haozheng Luo, Hengli Li, Zhi Zhang, Zhaolu Kang, Kai-Wei Chang, Ying Nian Wu

    Abstract: Large Language Models (LLMs) have shown remarkable performance in completing various tasks. However, solving complex problems often requires the coordination of multiple agents, raising a fundamental question: how to effectively select and interconnect these agents. In this paper, we propose Agent Q-Mix, a reinforcement learning framework that reformulates topology selection as a cooperat…

    Submitted 31 March, 2026; originally announced April 2026.

  46. arXiv:2604.00282  [pdf, ps, other]

    cs.HC

    Not Just Duolingo: Supporting Immigrant Language Preservation Through Family-Based Play

    Authors: Alejandro Ciuba, Zheng YY Li, Aakash Gautam

    Abstract: For immigrants, language preservation is crucial to maintain their identity, but the process of immigration can put a strain on a community's ability to do so. We interviewed eight Nepali immigrants to understand barriers to language preservation across sociopolitical contexts in Nepal and immigrant life in the United States. Participants described strong motivation but limited institutional suppo…

    Submitted 31 March, 2026; originally announced April 2026.

    Comments: CHI 2026

  47. arXiv:2604.00268  [pdf, ps, other]

    cs.CC cs.DS

    The Mystery Deepens: On the Query Complexity of Tarski Fixed Points

    Authors: Xi Chen, Yuhao Li, Mihalis Yannakakis

    Abstract: We give an $O(\log^2 n)$-query algorithm for finding a Tarski fixed point over the $4$-dimensional lattice $[n]^4$, matching the $\Omega(\log^2 n)$ lower bound of [EPRY20]. Additionally, our algorithm yields an $O(\log^{\lceil (k-1)/3\rceil+1} n)$-query algorithm for any constant $k$, improving the previous best upper bound $O(\log^{\lceil (k-1)/2\rceil+1} n)$ of [CL22]. Our algorithm uses a new…

    Submitted 31 March, 2026; originally announced April 2026.

  48. arXiv:2603.30025  [pdf, ps, other]

    cs.CL

    ContextClaim: A Context-Driven Paradigm for Verifiable Claim Detection

    Authors: Yufeng Li, Rrubaa Panchendrarajan, Arkaitz Zubiaga

    Abstract: Verifiable claim detection asks whether a claim expresses a factual statement that can, in principle, be assessed against external evidence. As an early filtering stage in automated fact-checking, it plays an important role in reducing the burden on downstream verification components. However, existing approaches to claim detection, whether based on check-worthiness or verifiability, rely solely o…

    Submitted 31 March, 2026; originally announced March 2026.

  49. arXiv:2603.29957  [pdf, ps, other]

    cs.SE cs.LG

    Think Anywhere in Code Generation

    Authors: Xue Jiang, Tianyu Zhang, Ge Li, Mengyang Liu, Taozhi Chen, Zhenhua Xu, Binhua Li, Wenpin Jiao, Zhi Jin, Yongbin Li, Yihong Dong

    Abstract: Recent advances in reasoning Large Language Models (LLMs) have primarily relied on upfront thinking, where reasoning occurs before final answer. However, this approach suffers from critical limitations in code generation, where upfront thinking is often insufficient as problems' full complexity only reveals itself during code implementation. Moreover, it cannot adaptively allocate reasoning effort…

    Submitted 2 April, 2026; v1 submitted 31 March, 2026; originally announced March 2026.

  50. arXiv:2603.29944  [pdf, ps, other]

    quant-ph cs.AI

    Four Generations of Quantum Biomedical Sensors

    Authors: Xin Jin, Priyam Srivastava, Ronghe Wang, Yuqing Li, Jonathan Beaumariage, Tom Purdy, M. V. Gurudev Dutt, Kang Kim, Kaushik Seshadreesan, Junyu Liu

    Abstract: Quantum sensing technologies offer transformative potential for ultra-sensitive biomedical sensing, yet their clinical translation remains constrained by classical noise limits and a reliance on macroscopic ensembles. We propose a unifying generational framework to organize the evolving landscape of quantum biosensors based on their utilization of quantum resources. First-generation devices utiliz…

    Submitted 2 April, 2026; v1 submitted 31 March, 2026; originally announced March 2026.

    Comments: 22 pages, 5 figures, 6 tables