
Showing 1–50 of 6,245 results for author: Zhang, M

  1. arXiv:2604.08337  [pdf, ps, other]

    cs.CV cs.AI

    InstAP: Instance-Aware Vision-Language Pre-Train for Spatial-Temporal Understanding

    Authors: Ashutosh Kumar, Rajat Saini, Jingjing Pan, Mustafa Erdogan, Mingfang Zhang, Betty Le Dem, Norimasa Kobori, Quan Kong

    Abstract: Current vision-language pre-training (VLP) paradigms excel at global scene understanding but struggle with instance-level reasoning due to global-only supervision. We introduce InstAP, an Instance-Aware Pre-training framework that jointly optimizes global vision-text alignment and fine-grained, instance-level contrastive alignment by grounding textual mentions to specific spatial-temporal regions.…

    Submitted 9 April, 2026; originally announced April 2026.

  2. arXiv:2604.08304  [pdf, ps, other]

    cs.CR cs.AI

    Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

    Authors: Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li, Nicole Hu, Jason Chen Zhang, Qing Li, Lei Chen

    Abstract: Retrieval-augmented generation (RAG) significantly enhances large language models (LLMs) but introduces novel security risks through external knowledge access. While existing studies cover various RAG vulnerabilities, they often conflate inherent LLM risks with those specifically introduced by RAG. In this paper, we propose that secure RAG is fundamentally about the security of the external knowle…

    Submitted 9 April, 2026; originally announced April 2026.

  3. arXiv:2604.08281  [pdf, ps, other]

    cs.CL

    When to Trust Tools? Adaptive Tool Trust Calibration For Tool-Integrated Math Reasoning

    Authors: Ruotao Xu, Yixin Ji, Yu Luo, Jinpeng Li, Dong Li, Peifeng Li, Juntao Li, Min Zhang

    Abstract: Large reasoning models (LRMs) have achieved strong performance enhancement through scaling test time computation, but due to the inherent limitations of the underlying language models, they still have shortcomings in tasks that require precise computation and extensive knowledge reserves. Tool-Integrated Reasoning (TIR) has emerged as a promising paradigm that incorporates tool call and execution…

    Submitted 9 April, 2026; originally announced April 2026.

  4. arXiv:2604.07958  [pdf, ps, other]

    cs.CV

    ImVideoEdit: Image-learning Video Editing via 2D Spatial Difference Attention Blocks

    Authors: Jiayang Xu, Fan Zhuo, Majun Zhang, Changhao Pan, Zehan Wang, Siyu Chen, Xiaoda Yang, Tao Jin, Zhou Zhao

    Abstract: Current video editing models often rely on expensive paired video data, which limits their practical scalability. In essence, most video editing tasks can be formulated as a decoupled spatiotemporal process, where the temporal dynamics of the pretrained model are preserved while spatial content is selectively and precisely modified. Based on this insight, we propose ImVideoEdit, an efficient frame…

    Submitted 9 April, 2026; originally announced April 2026.

  5. arXiv:2604.07922  [pdf, ps, other]

    cs.AI cs.CL

    SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking

    Authors: Weiyang Huang, Xuefeng Bai, Kehai Chen, Xinyang Chen, Yibin Chen, Weili Guan, Min Zhang

    Abstract: Large Reasoning Models (LRMs) have revolutionized complex problem-solving, yet they exhibit a pervasive "overthinking", generating unnecessarily long reasoning chains. While current solutions improve token efficiency, they often sacrifice fine-grained control or risk disrupting the logical integrity of the reasoning process. To address this, we introduce Stepwise Adaptive Thinking (SAT), a framewo…

    Submitted 9 April, 2026; originally announced April 2026.

    Comments: accepted to ACL2026 main conference

  6. arXiv:2604.07812  [pdf, ps, other]

    cs.CV

    HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models

    Authors: Qihui Zhu, Tao Zhang, Yuchen Wang, Zijian Wen, Mengjie Zhang, Shuangwu Chen, Xiaobin Tan, Jian Yang, Yang Liu, Zhenhua Dong, Xianzhi Yu, Yinfei Pan

    Abstract: In multimodal large language models (MLLMs), the surge of visual tokens significantly increases the inference time and computational overhead, making them impractical for real-time or resource-constrained applications. Visual token pruning is a promising strategy for reducing the cost of MLLM inference by removing redundant visual tokens. Existing research usually assumes that all attention heads…

    Submitted 9 April, 2026; originally announced April 2026.

    Comments: CVPR 2026

  7. arXiv:2604.07800  [pdf]

    physics.bio-ph

    Influence of Plaque Characteristics on Stent Biomechanical Outcomes - A Case Study on Double Kissing Crush Coronary Stenting

    Authors: Andrea Colombo, Dario Carbonarob, Mingzi Zhang, Chi Shen, Ankush Kapoor, Nigel Jepson, Claudio Chiastra, Susann Beier

    Abstract: Background Double Kissing (DK) Crush is a two-stent technique for complex coronary bifurcation lesions, yet the biomechanical influence of plaque on its performance remains poorly understood. This study developed a computational biomechanical model of the DK-Crush procedure to quantify how plaque presence and composition affect procedural outcomes and the performance of Xience Sierra and Orsiro st…

    Submitted 9 April, 2026; originally announced April 2026.

  8. arXiv:2604.07765  [pdf, ps, other]

    cs.CV

    RemoteAgent: Bridging Vague Human Intents and Earth Observation with RL-based Agentic MLLMs

    Authors: Liang Yao, Shengxiang Xu, Fan Liu, Chuanyi Zhang, Bishun Yao, Rui Min, Yongjun Li, Chaoqian Ouyang, Shimin Di, Min-Ling Zhang

    Abstract: Earth Observation (EO) systems are essentially designed to support domain experts who often express their requirements through vague natural language rather than precise, machine-friendly instructions. Depending on the specific application scenario, these vague queries can demand vastly different levels of visual precision. Consequently, a practical EO AI system must bridge the gap between ambiguo…

    Submitted 8 April, 2026; originally announced April 2026.

  9. arXiv:2604.07717  [pdf]

    cs.CL cs.AI

    Detecting HIV-Related Stigma in Clinical Narratives Using Large Language Models

    Authors: Ziyi Chen, Yasir Khan, Mengyuan Zhang, Cheng Peng, Mengxian Lyu, Yiyang Liu, Krishna Vaddiparti, Robert L Cook, Mattia Prosperi, Yonghui Wu

    Abstract: Human immunodeficiency virus (HIV)-related stigma is a critical psychosocial determinant of health for people living with HIV (PLWH), influencing mental health, engagement in care, and treatment outcomes. Although stigma-related experiences are documented in clinical narratives, there is a lack of off-the-shelf tools to extract and categorize them. This study aims to develop a large language model…

    Submitted 8 April, 2026; originally announced April 2026.

  10. arXiv:2604.07394  [pdf, ps, other]

    cs.LG cs.CL

    Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference

    Authors: Quantong Qiu, Zhiyi Hong, Yi Yang, Haitian Wang, Kebin Liu, Qingqing Dang, Juntao Li, Min Zhang

    Abstract: The quadratic computational complexity of standard attention mechanisms presents a severe scalability bottleneck for LLMs in long-context scenarios. While hybrid attention mechanisms combining Full Attention (FA) and Sparse Attention (SA) offer a potential solution, existing methods typically rely on static allocation ratios that fail to accommodate the variable retrieval demands of different task…

    Submitted 8 April, 2026; originally announced April 2026.

  11. arXiv:2604.06787  [pdf, ps, other]

    cs.CL

    When Is Thinking Enough? Early Exit via Sufficiency Assessment for Efficient Reasoning

    Authors: Yang Xiang, Yixin Ji, Ruotao Xu, Dan Qiao, Zheming Yang, Juntao Li, Min Zhang

    Abstract: Large reasoning models (LRMs) have achieved remarkable performance in complex reasoning tasks, driven by their powerful inference-time scaling capability. However, LRMs often suffer from overthinking, which results in substantial computational redundancy and significantly reduces efficiency. Early-exit methods aim to mitigate this issue by terminating reasoning once sufficient evidence has been ge…

    Submitted 8 April, 2026; originally announced April 2026.

    Comments: ACL 2026 Main Conference

  12. arXiv:2604.06747  [pdf]

    cs.AI

    TurboAgent: An LLM-Driven Autonomous Multi-Agent Framework for Turbomachinery Aerodynamic Design

    Authors: Juan Du, Yueteng Wu, Pan Zhao, Yuze Liu, Min Zhang, Xiaobin Xu, Xinglong Zhang

    Abstract: The aerodynamic design of turbomachinery is a complex and tightly coupled multi-stage process involving geometry generation, performance prediction, optimization, and high-fidelity physical validation. Existing intelligent design approaches typically focus on individual stages or rely on loosely coupled pipelines, making fully autonomous end-to-end design challenging. To address this issue, this s…

    Submitted 8 April, 2026; v1 submitted 8 April, 2026; originally announced April 2026.

  13. arXiv:2604.05712  [pdf, ps, other]

    hep-ex

    Precise measurement of the CKM angle $γ$ with a novel approach

    Authors: The BESIII, LHCb Collaborations: M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, H. R. Bao, X. L. Bao, M. Barbagiovanni, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, et al. (1936 additional authors not shown)

    Abstract: A measurement of the CKM angle $γ$ is performed by applying a novel, unbinned, model-independent approach to datasets of electron-positron collisions collected by the BESIII experiment and proton-proton collisions by the LHCb experiment, corresponding to integrated luminosities of 8 fb$^{-1}$ and 9 fb$^{-1}$, respectively. The $C\!P$-violating phase $γ$ is determined from…

    Submitted 7 April, 2026; originally announced April 2026.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/5991/ (LHCb public pages)

    Report number: LHCb-PAPER-2025-064, CERN-EP-2026-068

  14. arXiv:2604.05701  [pdf, ps, other]

    hep-ex

    Measurement of the CKM angle $γ$ in $B^{\pm} \rightarrow D(\rightarrow K^{0}_{\rm S} h^{\prime+}h^{\prime-})h^{\pm}$ decays with a novel approach

    Authors: The BESIII, LHCb Collaborations: M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, H. R. Bao, X. L. Bao, M. Barbagiovanni, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, et al. (1936 additional authors not shown)

    Abstract: A measurement of the CKM angle $γ$ and related strong-phase parameters is performed using a novel, model-independent approach in ${B^{\pm}\rightarrow D(\rightarrow K^{0}_{\rm S} h^{\prime+}h^{\prime-}) h^{\pm}}$ decays, where $h^{(\prime)} \equiv π, K$. The analysis uses a joint data sample of electron-positron collisions collected by the BESIII experiment at the Beijing Electron-Positron Collider…

    Submitted 7 April, 2026; originally announced April 2026.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3989/ (LHCb public pages)

    Report number: LHCb-PAPER-2025-063, CERN-EP-2026-067

  15. arXiv:2604.05682  [pdf, ps, other]

    cs.IT

    Non-GRS type MDS and AMDS codes from extended TGRS codes

    Authors: Meiying Zhang, Shudi Yang, Yanbin Zheng

    Abstract: Maximum distance separable (MDS) and almost maximum distance separable (AMDS) codes have been widely used in various fields such as communication systems, data storage, and quantum codes because of their algebraic properties and excellent error-correcting capabilities. In this paper, we construct a class of extended twisted generalized Reed-Solomon (TGRS) codes and determine the necessary and suff…

    Submitted 7 April, 2026; originally announced April 2026.

  16. arXiv:2604.05430  [pdf, ps, other]

    cs.RO

    Synergizing Efficiency and Reliability for Continuous Mobile Manipulation

    Authors: Chengkai Wu, Ruilin Wang, Yixin Zeng, Jiayuan Wang, Mingjie Zhang, Guiyong Zheng, Qun Niu, Juepeng Zheng, Jun Ma, Boyu Zhou

    Abstract: Humans seamlessly fuse anticipatory planning with immediate feedback to perform successive mobile manipulation tasks without stopping, achieving both high efficiency and reliability. Replicating this fluid and reliable behavior in robots remains fundamentally challenging, not only due to conflicts between long-horizon planning and real-time reactivity, but also because excessively pursuing efficie…

    Submitted 7 April, 2026; originally announced April 2026.

    Comments: 33 pages, 26 figures, 4 tables. Video: https://www.bilibili.com/video/BV1YWP4zxEQD

  17. arXiv:2604.05005  [pdf, ps, other]

    cs.CY cs.AI cs.CL

    EduIllustrate: Towards Scalable Automated Generation Of Multimodal Educational Content

    Authors: Shuzhen Bi, Mingzi Zhang, Zhuoxuan Li, Xiaolong Wang, Keqian Li, Aimin Zhou

    Abstract: Large language models are increasingly used as educational assistants, yet evaluation of their educational capabilities remains concentrated on question-answering and tutoring tasks. A critical gap exists for multimedia instructional content generation -- the ability to produce coherent, diagram-rich explanations that combine geometrically accurate visuals with step-by-step reasoning. We present E…

    Submitted 6 April, 2026; originally announced April 2026.

  18. arXiv:2604.04986  [pdf, ps, other]

    cs.LG

    Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model

    Authors: Zesheng Yao, Zhen-Hua Wan, Canjun Yang, Qingchao Xia, Mengqi Zhang

    Abstract: Model-free deep reinforcement learning (DRL) methods suffer from poor sample efficiency. To overcome this limitation, this work introduces an adaptive reduced-order-model (ROM)-based reinforcement learning framework for active flow control. In contrast to conventional actor--critic architectures, the proposed approach leverages a ROM to estimate the gradient information required for controller opt…

    Submitted 4 April, 2026; originally announced April 2026.

    Comments: 43 pages, 26 figures

  19. arXiv:2604.04783  [pdf, ps, other]

    cs.CR cs.AR

    GPU Acceleration of TFHE-Based High-Precision Nonlinear Layers for Encrypted LLM Inference

    Authors: Guoci Chen, Xiurui Pan, Qiao Li, Bo Mao, Congming Gao, Chengying Huan, Mingzhe Zhang, Jie Zhang

    Abstract: Deploying large language models (LLMs) as cloud services raises privacy concerns as inference may leak sensitive data. Fully Homomorphic Encryption (FHE) allows computation on encrypted data, but current FHE methods struggle with efficient and precise nonlinear function evaluation. Specifically, CKKS-based approaches require high-degree polynomial approximations, which are costly when target preci…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: 11 pages, 7 figures

  20. arXiv:2604.04074  [pdf, ps, other]

    cs.AI cs.LG

    FactReview: Evidence-Grounded Reviews with Literature Positioning and Execution-Based Claim Verification

    Authors: Hang Xu, Ling Yue, Chaoqian Ouyang, Yuchen Liu, Libin Zheng, Shaowu Pan, Shimin Di, Min-Ling Zhang

    Abstract: Peer review in machine learning is under growing pressure from rising submission volume and limited reviewer time. Most LLM-based reviewing systems read only the manuscript and generate comments from the paper's own narrative. This makes their outputs sensitive to presentation quality and leaves them weak when the evidence needed for review lies in related work or released code. We present FactRev…

    Submitted 7 April, 2026; v1 submitted 5 April, 2026; originally announced April 2026.

  21. arXiv:2604.04009  [pdf, ps, other]

    cs.SE

    Benchmarking and Evaluating VLMs for Software Architecture Diagram Understanding

    Authors: Shuyin Ouyang, Jie M. Zhang, Jingzhi Gong, Gunel Jahangirova, Mohammad Reza Mousavi, Jack Johns, Beum Seuk Lee, Adam Ziolkowski, Botond Virginas, Joost Noppen

    Abstract: Software architecture diagrams are important design artifacts for communicating system structure, behavior, and data organization throughout the software development lifecycle. Although recent progress in large language models has substantially advanced code-centric software engineering tasks such as code generation, testing, and maintenance, the ability of modern vision-language models (VLMs) to…

    Submitted 5 April, 2026; originally announced April 2026.

  22. arXiv:2604.03964  [pdf, ps, other]

    cs.AI

    SKILLFOUNDRY: Building Self-Evolving Agent Skill Libraries from Heterogeneous Scientific Resources

    Authors: Shuaike Shen, Wenduo Cheng, Mingqian Ma, Alistair Turcan, Martin Jinye Zhang, Jian Ma

    Abstract: Modern scientific ecosystems are rich in procedural knowledge across repositories, APIs, scripts, notebooks, documentation, databases, and papers, yet much of this knowledge remains fragmented across heterogeneous artifacts that agents cannot readily operationalize. This gap between abundant scientific know-how and usable agent capabilities is a key bottleneck for building effective scientific age…

    Submitted 5 April, 2026; originally announced April 2026.

  23. arXiv:2604.03563  [pdf, ps, other]

    astro-ph.GA

    SPURS: Evidence for Clumpy Neutral Envelopes and Ionized IGM Surrounding Little Red Dots in Abell 2744 from Ultra-Deep Rest-UV Spectroscopy

    Authors: Mengtao Tang, Daniel P. Stark, Charlotte A. Mason, Zuyi Chen, Harley Katz, Max Gronke, Lukas J. Furtak, Seok-Jun Chang, Jorryt Matthee, Lily Whitler, Adi Zitrin, Ryan Endsley, Viola Gelli, Tamojeet Roychowdhury, Peter Senchyna, Michael W. Topping, Meng Zhang

    Abstract: Rest-frame ultraviolet (UV) spectra of Little Red Dots (LRDs) often show Ly$α$ emission. Along with broad Balmer emission, LRDs are expected to produce broad Ly$α$ emission. However, the large column density of neutral gas invoked to explain the Balmer break should significantly redshift and further broaden the Ly$α$ line, making it challenging to detect without sensitive, moderate-resolution spec…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: 31 pages, 15 figures, 4 tables, submitted to ApJ

  24. arXiv:2604.03259  [pdf, ps, other]

    cs.CY

    From Pre-trained Models to Large Language Models: A Comprehensive Survey of AI-Driven Psychological Computing

    Authors: Huiyao Chen, Ruimeng Liu, Yan Luo, Jiawen Zhang, Meishan Zhang, Baotian Hu, Min Zhang

    Abstract: The intersection of artificial intelligence and psychological science has experienced remarkable growth, with annual publications expanding from 859 papers in 2000 to 29,979 by 2025. However, this rapid evolution has created methodological fragmentation where similar computational techniques are independently developed across isolated psychological domains. This survey introduces the first systema…

    Submitted 12 March, 2026; originally announced April 2026.

    Comments: 56 pages, Psychological Computing with AI

    MSC Class: 68U35; ACM Class: K.4.2

  25. arXiv:2604.03021  [pdf, ps, other]

    q-bio.NC

    Temporal structure of the language hierarchy within small cortical patches

    Authors: Julien Gadonneix, Mingfang Zhang, Jérémy Rapin, Linnea Evanson, Pierre Bourdillon, Jean-Rémi King

    Abstract: Speech production requires the rapid coordination of a complex hierarchy of linguistic units, transforming a semantic representation into a precise sequence of articulatory movements. To unravel the neural mechanisms underlying this feat, we leverage recordings from eight 3.2 x 3.2 mm 64-microelectrode arrays implanted in the motor cortex and inferior frontal gyrus of two patients tasked to produc…

    Submitted 3 April, 2026; originally announced April 2026.

  26. arXiv:2604.02843  [pdf]

    cond-mat.supr-con cond-mat.str-el

    High-energy electronic excitations in La3Ni2O7 by time-resolved optical spectroscopy

    Authors: Junzhi Zhu, Mengwu Huo, Yubin Wang, Yuxin Zhai, Lili Hu, Haiyun Huang, Xiu Zhang, Baixu Xiang, Mengdi Zhang, Yusong Gan, Zhiyuan An, Meng Wang, Qihua Xiong, Haiyun Liu

    Abstract: Recently, high-temperature superconductivity has been established in bilayer La3Ni2O7, which exhibits a density-wave (DW) transition at ~ 150 K under ambient pressure. The DW order is believed to be linked to superconductivity, as it is suppressed upon the emergence of superconductivity at high pressures. Here, we explore the ultrafast dynamics of high-energy electronic excitations from 10 K to ro…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: 4 figures, 12 pages

  27. ChatSVA: Bridging SVA Generation for Hardware Verification via Task-Specific LLMs

    Authors: Lik Tung Fu, Jie Zhou, Shaokai Ren, Mengli Zhang, Jia Xiong, Hugo Jiang, Nan Guan, Xi Wang, Jun Yang

    Abstract: Functional verification consumes over 50% of the IC development lifecycle, where SystemVerilog Assertions (SVAs) are indispensable for formal property verification and enhanced simulation-based debugging. However, manual SVA authoring is labor-intensive and error-prone. While Large Language Models (LLMs) show promise, their direct deployment is hindered by low functional accuracy and a severe scar…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: Accepted by DAC 2026

  28. arXiv:2604.02759  [pdf, ps, other]

    cs.RO

    OMNI-PoseX: A Fast Vision Model for 6D Object Pose Estimation in Embodied Tasks

    Authors: Michael Zhang, Wei Ying, Fangwen Chen, Shifeng Bai, Hanwen Kang

    Abstract: Accurate 6D object pose estimation is a fundamental capability for embodied agents, yet remains highly challenging in open-world environments. Many existing methods often rely on closed-set assumptions or geometry-agnostic regression schemes, limiting their generalization, stability, and real-time applicability in robotic systems. We present OMNI-PoseX, a vision foundation model that introduces a…

    Submitted 3 April, 2026; originally announced April 2026.

  29. arXiv:2604.02730  [pdf, ps, other]

    astro-ph.GA astro-ph.SR

    PhDLspec: physical-prior embedded deep learning method for spectroscopic determination of stellar labels in high-dimensional parameter space

    Authors: Tianmin Wu, Maosheng Xiang, Jianrong Shi, Meng Zhang, Lanya Mou, Hong-Liang Yan, A-Li Luo

    Abstract: Unlocking the full physical information encoded in low-resolution spectra poses a significant challenge for astronomical survey analysis. Such a task demands modeling spectra and optimizing astrophysical parameters in high-dimensional space, as a consequence of line blending. Here we present PhDLspec -- a deep learning framework embedded with physical priors for stellar spectra modeling and analys…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: Accepted for publication in The Astrophysical Journal. 28 pages, 16 figures. Data and code are available at Zenodo

  30. arXiv:2604.02686  [pdf, ps, other]

    cs.LG cs.AI

    Beyond Semantic Manipulation: Token-Space Attacks on Reward Models

    Authors: Yuheng Zhang, Mingyue Huo, Minghao Zhu, Mengxue Zhang, Nan Jiang

    Abstract: Reward models (RMs) are widely used as optimization targets in reinforcement learning from human feedback (RLHF), yet they remain vulnerable to reward hacking. Existing attacks mainly operate within the semantic space, constructing human-readable adversarial outputs that exploit RM biases. In this work, we introduce a fundamentally different paradigm: Token Mapping Perturbation Attack (TOMPA), a f…

    Submitted 2 April, 2026; originally announced April 2026.

  31. arXiv:2604.02647  [pdf, ps, other]

    cs.SE

    Runtime Execution Traces Guided Automated Program Repair with Multi-Agent Debate

    Authors: Jiaqing Wu, Tong Wu, Manqing Zhang, Yunwei Dong, Bo Shen

    Abstract: Automated Program Repair (APR) struggles with complex logic errors and silent failures. Current LLM-based APR methods are mostly static, relying on source code and basic test outputs, which fail to accurately capture complex runtime behaviors and dynamic data dependencies. While incorporating runtime evidence like execution traces exposes concrete state transitions, a single LLM interpreting this…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: 12 pages, 4 figures, 8 tables

    ACM Class: D.2.5; I.2.2

  32. arXiv:2604.01826  [pdf, ps, other]

    cs.CV

    SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers

    Authors: Xiang Yang, Feifei Li, Mi Zhang, Geng Hong, Xiaoyu You, Min Yang

    Abstract: Recent Text-to-Image (T2I) models based on rectified-flow transformers (e.g., SD3, FLUX) achieve high generative fidelity but remain vulnerable to unsafe semantics, especially when triggered by multi-token interactions. Existing mitigation methods largely rely on fine-tuning or attention modulation for concept unlearning; however, their expensive computational overhead and design tailored to U-Net…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: CVPR26

  33. arXiv:2604.01538  [pdf]

    cs.CL cs.AI

    Countering Catastrophic Forgetting of Large Language Models for Better Instruction Following via Weight-Space Model Merging

    Authors: Mengxian Lyu, Cheng Peng, Ziyi Chen, Mengyuan Zhang, Jieting Li Lu, Yonghui Wu

    Abstract: Large language models have been adopted in the medical domain for clinical documentation to reduce clinician burden. However, studies have reported that LLMs often "forget" a significant amount of instruction-following ability when fine-tuned using a task-specific medical dataset, a critical challenge in adopting general-purpose LLMs for clinical applications. This study presents a model merging f…

    Submitted 1 April, 2026; originally announced April 2026.

  34. arXiv:2604.01092  [pdf, ps, other]

    cs.CR cs.AR cs.NI

    LightGuard: Transparent WiFi Security via Physical-Layer LiFi Key Bootstrapping

    Authors: Shiqi Xu, Yuyang Du, Mingyue Zhang, Hongwei Cui, Soung Chang Liew

    Abstract: WiFi is inherently vulnerable to eavesdropping because RF signals may penetrate many physical boundaries, such as walls and floors. LiFi, by contrast, is an optical method confined to line-of-sight and blocked by opaque surfaces. We present LightGuard, a dual-link architecture built on this insight: cryptographic key establishment can be offloaded from WiFi to a physically confined LiFi channel to…

    Submitted 1 April, 2026; originally announced April 2026.

  35. arXiv:2604.00835  [pdf, ps, other]

    cs.CL

    Agentic Tool Use in Large Language Models

    Authors: Jinchao Hu, Meizhi Zhong, Kehai Chen, Xuefeng Bai, Min Zhang

    Abstract: Large language models are increasingly being deployed as autonomous agents yet their real world effectiveness depends on reliable tools for information retrieval, computation and external action. Existing studies remain fragmented across tasks, tool types, and training settings, lacking a unified view of how tool-use methods differ and evolve. This paper organizes the literature into three paradig…

    Submitted 1 April, 2026; originally announced April 2026.

  36. arXiv:2604.00702  [pdf, ps, other]

    cs.SE cs.CR

    Enhancing REST API Fuzzing with Access Policy Violation Checks and Injection Attacks

    Authors: Omur Sahin, Man Zhang, Andrea Arcuri

    Abstract: Due to their widespread use in industry, several techniques have been proposed in the literature to fuzz REST APIs. Existing fuzzers for REST APIs have been focusing on detecting crashes (e.g., 500 HTTP server error status code). However, security vulnerabilities can have major drastic consequences on existing cloud infrastructures. In this paper, we propose a series of novel automated oracles a…

    Submitted 1 April, 2026; originally announced April 2026.

  37. arXiv:2604.00368  [pdf, ps, other]

    cs.DC

    TENT: A Declarative Slice Spraying Engine for Performant and Resilient Data Movement in Disaggregated LLM Serving

    Authors: Feng Ren, Ruoyu Qin, Teng Ma, Shangming Cai, Zheng Liu, Chao Lei, Dejiang Zhu, Ke Yang, Zheming Li, Jialei Cui, Weixiao Huang, Yikai Zhao, Yineng Zhang, Hao Wu, Xiang Gao, Yuhao Fu, Jinlei Jiang, Yongwei Wu, Mingxing Zhang

    Abstract: Modern GPU clusters are built upon a complex hierarchy of heterogeneous interconnects, ranging from multi-rail RDMA to proprietary fabrics such as Multi-Node NVLink and Ascend UB. Orchestrating these diverse links effectively remains a critical challenge in disaggregated LLM serving. Operating Mooncake TE on thousands of GPUs exposed a critical limitation shared by existing frameworks: imperative,…

    Submitted 31 March, 2026; originally announced April 2026.

  38. arXiv:2603.29854  [pdf, ps, other]

    hep-ex

    First energy scan measurement of $e^{+}e^{-}\to K^{+}K^{-}$ around the $ψ(2S)$ resonance

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, X. L. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (683 additional authors not shown)

    Abstract: We report the first measurement of the $e^{+}e^{-}\to K^{+}K^{-}$ cross sections around the $ψ(2S)$ resonance using the energy scan method. The analysis is based on $e^{+}e^{-}$ collision data corresponding to an integrated luminosity of 495~pb$^{-1}$ collected with the BESIII detector at BEPCII. By analyzing the cross section line-shape, we extract the relative phase $Φ$ between the strong and el…

    Submitted 31 March, 2026; originally announced March 2026.

    Comments: 9 pages, 4 figures

  39. arXiv:2603.29667  [pdf, ps, other]

    physics.flu-dyn

    Instabilities in flow through and around a circular array of cylinders

    Authors: Huaibao Zhang, Yongliang Yang, Guangxue Wang, Mengqi Zhang

    Abstract: This paper presents results of two-dimensional direct numerical simulations (DNS) and global linear stability analyses (based on mean flow and base flow) of a viscous incompressible flow past a circular array of cylinders with six-fold rotational symmetry. Six cylinder arrays, with varied patch density $φ= N_c (d/D)^2$ (with $N_c$ cylinders of diameter $d$ within a patch of diameter $D$) is invest…

    Submitted 31 March, 2026; originally announced March 2026.

  40. arXiv:2603.29620  [pdf, ps, other]

    cs.CV cs.MM

    Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

    Authors: Shuang Chen, Quanxin Shou, Hangting Chen, Yucheng Zhou, Kaituo Feng, Wenbo Hu, Yi-Fan Zhang, Yunlong Lin, Wenxuan Huang, Mingyang Song, Dasen Dai, Bolin Jiang, Manyuan Zhang, Shi-Xue Zhang, Zhengkai Jiang, Lucas Wang, Zhao Zhong, Yu Cheng, Nanyun Peng

    Abstract: Unified multimodal models provide a natural and promising architecture for understanding diverse and complex real-world knowledge while generating high-quality images. However, they still rely primarily on frozen parametric knowledge, which makes them struggle with real-world image generation involving long-tail and knowledge-intensive concepts. Inspired by the broad success of agents on real-worl…

    Submitted 1 April, 2026; v1 submitted 31 March, 2026; originally announced March 2026.

    Comments: Project Page: https://github.com/shawn0728/Unify-Agent

  41. arXiv:2603.29587  [pdf, ps, other]

    cs.GR

    Style-Instructed Mask-Free Virtual Try On

    Authors: Mengqi Zhang, Qi Li, Mehmet Saygin Seyfioglu, Karim Bouyarmane

    Abstract: Virtual Try-On is a promising research area with broad applications in e-commerce and everyday life, enabling users to visualize garments on themselves or others before purchase. Most existing methods depend on predefined or user-specified masks to guide garment placement, but their performance is highly sensitive to mask quality, often causing misalignment or artifacts, and introduces redundant s…

    Submitted 4 February, 2026; originally announced March 2026.

    Comments: Project page: https://smf-vto.github.io

  42. arXiv:2603.29407  [pdf, ps, other]

    cs.LG cs.AI

    Hybrid Quantum-Classical Spatiotemporal Forecasting for 3D Cloud Fields

    Authors: Fu Wang, Qifeng Lu, Xinyu Long, Meng Zhang, Xiaofei Yang, Weijia Cao, Xiaowen Chu

    Abstract: Accurate forecasting of three-dimensional (3D) cloud fields is important for atmospheric analysis and short-range numerical weather prediction, yet it remains challenging because cloud evolution involves cross-layer interactions, nonlocal dependencies, and multiscale spatiotemporal dynamics. Existing spatiotemporal prediction models based on convolutions, recurrence, or attention often rely on loc…

    Submitted 31 March, 2026; originally announced March 2026.

  43. arXiv:2603.28767  [pdf, ps, other]

    cs.CV

    Gen-Searcher: Reinforcing Agentic Search for Image Generation

    Authors: Kaituo Feng, Manyuan Zhang, Shuang Chen, Yunlong Lin, Kaixuan Fan, Yilei Jiang, Hongyu Li, Dian Zheng, Chenyang Wang, Xiangyu Yue

    Abstract: Recent image generation models have shown strong capabilities in generating high-fidelity and photorealistic images. However, they are fundamentally constrained by frozen internal knowledge, and thus often fail in real-world scenarios that are knowledge-intensive or require up-to-date information. In this paper, we present Gen-Searcher, the first attempt to train a search-augmented image generat…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: Project page: https://gen-searcher.vercel.app Code: https://github.com/tulerfeng/Gen-Searcher

  44. arXiv:2603.28560  [pdf, ps, other]

    cs.CV

    Curriculum-Guided Myocardial Scar Segmentation for Ischemic and Non-ischemic Cardiomyopathy

    Authors: Nivetha Jayakumar, Jonathan Pan, Shuo Wang, Bishow Paudel, Nisha Hosadurg, Cristiane C. Singulane, Sivam Bhatt, Amit R. Patel, Miaomiao Zhang

    Abstract: Identification and quantification of myocardial scar are important for the diagnosis and prognosis of cardiovascular diseases. However, reliable scar segmentation from Late Gadolinium Enhancement Cardiac Magnetic Resonance (LGE-CMR) images remains a challenge due to variations in contrast enhancement across patients, suboptimal imaging conditions such as post-contrast washout, and inconsistencies in gr…

    Submitted 30 March, 2026; originally announced March 2026.

  45. arXiv:2603.28458  [pdf, ps, other]

    cs.LG cs.AI

    HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

    Authors: Yufei Xu, Fanxu Meng, Fan Jiang, Yuxuan Wang, Ruijie Zhou, Zhaohui Wang, Jiexi Wu, Zhixin Pan, Xiaojuan Tang, Wenjie Pei, Tongxuan Liu, Di Yin, Xing Sun, Muhan Zhang

    Abstract: Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained key selection by scoring every historical key for each query through a lightweight indexer, then computing attention only on the selected subset. While the downstream sparse attention itself scales favorably, the indexer must still scan the entire prefix for every query, introducing an per…

    Submitted 6 April, 2026; v1 submitted 30 March, 2026; originally announced March 2026.

  46. arXiv:2603.28452  [pdf, ps, other]

    cs.SE

    Detecting and Mitigating Flakiness in REST API Fuzzing

    Authors: Man Zhang, Chongyang Shen, Andrea Arcuri, Tao Yue

    Abstract: Test flakiness is a common problem in industry, which hinders the reliability of automated build and testing workflows. Most existing research on test flakiness has focused on unit and small-scale integration tests. In contrast, flakiness in system-level testing, such as testing of REST APIs, is comparatively under-explored. A large body of literature has been dedicated to the topic of fuzzing REST…

    Submitted 30 March, 2026; originally announced March 2026.

  47. arXiv:2603.28362  [pdf]

    cs.RO cond-mat.mtrl-sci cond-mat.soft physics.app-ph

    A Foldable and Agile Soft Electromagnetic Robot for Multimodal Navigation in Confined and Unstructured Environments

    Authors: Zhihao Lv, Xiaoyong Zhang, Mengfan Zhang, Xiaoyu Song, Xingyue Liu, Yide Liu, Shaoxing Qu, Guoyong Mao

    Abstract: Multimodal locomotion is crucial for an animal's adaptability in unstructured wild environments. Similarly, in the human gastrointestinal tract, characterized by viscoelastic mucus, complex rugae, and narrow sphincters like the cardia, multimodal locomotion is also essential for a small-scale soft robot to conduct tasks. Here, we introduce a small-scale compact, foldable, and robust soft electroma…

    Submitted 30 March, 2026; originally announced March 2026.

  48. arXiv:2603.28279  [pdf]

    physics.soc-ph

    The electricity system value of the local acceptance of onshore wind in Europe

    Authors: James Price, Guillermo Valenzuela-Venegas, Oskar Vågerö, Marianne Zeyringer, Monika Bucha, Ruihong Chen, Adrienne Etard, Andrea N. Hahmann, Alena Lohrmann, Russell McKenna, Christian Mikovits, Evangelos Panos, Meixi Zhang, Luis Ramirez Camargo

    Abstract: The large-scale deployment of wind power is central to Europe's energy transition but faces challenges due to its social and environmental impacts on communities. Here we assess how the tolerance of local stakeholders to such impacts translates across spatial scales to shape the cost and design of the continent's net-zero electricity system using a soft-linked modelling framework. We find that low…

    Submitted 30 March, 2026; originally announced March 2026.

  49. arXiv:2603.28232  [pdf, ps, other]

    hep-ex hep-ph

    Observation of $Λ^+_c\to nπ^+η$ and search for $Λ^+_c\to na_0(980)^+$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, X. L. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, et al. (722 additional authors not shown)

    Abstract: By analysing 6.1 ${\rm fb}^{-1}$ of data collected at center-of-mass energies between $\sqrt{s}=4.600$ and 4.843 $\rm GeV$ with the BESIII detector at the BEPCII collider, we observe the decay $Λ_c^+\to nπ^+η$ for the first time with a statistical significance of $9.5σ$. The ratio of branching fractions $\mathcal{B}(Λ_c^+\to nπ^+η)/\mathcal{B}(Λ_c^+\to Λπ^+η)$ is measured to be…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: 25 pages, 6 figures

  50. arXiv:2603.27989  [pdf]

    cond-mat.mtrl-sci quant-ph

    Graphitic-C3N4/TiO2(B) S-scheme Heterojunctions for Efficient Photocatalytic H2 Production and Organic Pollution Degradation

    Authors: Xiaoyi Zhou, Min Zhang, Qiushi Wang, Shiwen Du, Xuedong Jing, Zhenyi Zhang

    Abstract: Achieving both broad solar-spectrum absorption and strong redox capability is critical for semiconductor photocatalysts in environmental remediation and energy conversion. Herein, an S-scheme heterojunction photocatalyst is constructed by coupling TiO2(B) nanorods with g-C3N4 nanosheets. Its well-matched band structure extends light absorption from the UV to the visible region and enables efficien…

    Submitted 29 March, 2026; originally announced March 2026.

    Comments: 28 pages, 10 figures