
Showing 1–50 of 1,914 results for author: Ma, W

  1. arXiv:2604.14474  [pdf]

    cs.LG

    Scouting By Reward: VLM-TO-IRL-Driven Player Selection For Esports

    Authors: Qing Yan, Wenyu Yang, Yufei Wang, Wenhao Ma, Linchong Hu, Yifei Jin, Anton Dahbura

    Abstract: Traditional esports scouting workflows rely heavily on manual video review and aggregate performance metrics, which often fail to capture the nuanced decision-making patterns necessary to determine if a prospect fits a specific tactical archetype. To address this, we reframe style-based player evaluation in esports as an Inverse Reinforcement Learning (IRL) problem. In this paper, we introduce a n…

    Submitted 15 April, 2026; originally announced April 2026.

  2. arXiv:2604.12821  [pdf, ps, other]

    cs.CY

    Detecting and Enhancing Intellectual Humility in Online Political Discourse

    Authors: Samantha D'Alonzo, Rachel Chen, Weidong Zhang, Melody Yu, Jasmine Mangat, Ivory Yang, Weicheng Ma, Martin Saveski, Soroush Vosoughi, Nabeel Gillani

    Abstract: Intellectual humility (IH), a recognition of one's own intellectual limitations, can reduce polarization and foster more understanding across lines of difference. Yet little work explores how IH can be systematically defined, measured, evaluated, and enhanced in spaces that often lack it the most: online political discussions. In this paper, we seek to bridge these gaps by exploring two questions: 1…

    Submitted 14 April, 2026; originally announced April 2026.

    Comments: In Proceedings of ICWSM 2026

  3. arXiv:2604.11709  [pdf, ps, other]

    cs.AI

    A Mamba-Based Multimodal Network for Multiscale Blast-Induced Rapid Structural Damage Assessment

    Authors: Wanli Ma, Sivasakthy Selvakumaran, Dain G. Farrimond, Adam A. Dennis, Samuel E. Rigby

    Abstract: Accurate and rapid structural damage assessment (SDA) is crucial for post-disaster management, helping responders prioritise resources, plan rescues, and support recovery. Traditional field inspections, though precise, are limited by accessibility, safety risks, and time constraints, especially after large explosions. Machine learning with remote sensing has emerged as a scalable solution for rapi…

    Submitted 13 April, 2026; originally announced April 2026.

  4. arXiv:2604.10963  [pdf, ps, other]

    cs.AI

    Delving Aleatoric Uncertainty in Medical Image Segmentation via Vision Foundation Models

    Authors: Ruiyang Li, Fang Liu, Licheng Jiao, Xinglin Xie, Jiayao Hao, Shuo Li, Xu Liu, Jingyi Yang, Lingling Li, Puhua Chen, Wenping Ma

    Abstract: Medical image segmentation supports clinical workflows by precisely delineating anatomical structures and lesions. However, medical image datasets suffer from acquisition noise and annotation ambiguity, causing pervasive data uncertainty that substantially undermines model robustness. Existing research focuses primarily on model architectural improvements and predictive reli…

    Submitted 12 April, 2026; originally announced April 2026.

  5. arXiv:2604.10700  [pdf, ps, other]

    eess.IV

    VCC-DSA: A Novel Vascular Consistency Constrained DSA Imaging Model for Motion Artifact Suppression

    Authors: Rongjun Ge, Weilong Mao, Jian Lu, Rong Yan, Yikun Zhang, Peng Yuan, Jun Xiang, Hui Tang, Guanyu Yang, Yudong Zhang, Yang Chen, Shuo Li

    Abstract: Digital Subtraction Angiography (DSA) is a clinically significant imaging technique and the gold standard for diagnosing cerebrovascular disease. However, artifacts caused by the motion of high-attenuation tissues such as bones, teeth, and catheters seriously reduce the visibility of blood vessels. This paper presents a novel Vascular Consistency Constrained DSA Imaging Model (VCC-DSA) for robust mot…

    Submitted 12 April, 2026; originally announced April 2026.

  6. arXiv:2604.10477  [pdf, ps, other]

    astro-ph.HE

    Probing the Origin of Magnetar X-ray Polarization Diversity: A Multi-wavelength Geometrical Study of 1E 1547.0-5408 and 1E 2259+586

    Authors: Biao-Peng Li, Zhi-Fu Gao, Wen-Qi Ma, Wei-Feng Zhang

    Abstract: The exceptionally high X-ray polarization recently detected in the magnetar 1E 1547.0-5408 is considered a strong candidate signature of quantum electrodynamic vacuum birefringence, an interpretation that hinges critically on the source's viewing geometry. This stark contrast to the typically lower polarization degrees seen in other magnetars prompts a fundamental question: to what extent does vie…

    Submitted 12 April, 2026; originally announced April 2026.

    Comments: 12 pages, 7 figures

  7. arXiv:2604.10207  [pdf]

    cond-mat.supr-con cond-mat.str-el physics.optics

    Ultrafast decoupling of the pseudogap from superconductivity in a pressurized cuprate

    Authors: Yanghao Meng, Wenjin Mao, Liucheng Chen, Elbert E. M. Chia, Yifeng Yang, Jianlin Luo, Lin Zhao, Xingjiang Zhou, Xiaohui Yu, Xinbo Wang

    Abstract: The relationship between the pseudogap and superconductivity remains a central puzzle in the physics of cuprates. Hydrostatic pressure provides a clean tuning parameter free from chemical disorder, yet probing the microscopic energy scales of these phases under compression has remained experimentally challenging. Here, we utilize ultrafast optical spectroscopy to construct the high-pressure phase…

    Submitted 11 April, 2026; originally announced April 2026.

    Comments: 29 pages, 10 figures

  8. arXiv:2604.10110  [pdf, ps, other]

    cs.AI

    Trust Your Memory: Verifiable Control of Smart Homes through Reinforcement Learning with Multi-dimensional Rewards

    Authors: Kai-Yuan Guo, Jiang Wang, Renjie Zhao, Tianyi Wang, Wandong Mao, Yu Gao, Mou Xiao Feng, Yi Xu

    Abstract: Large Language Models (LLMs) have become a key foundation for enabling personalized smart home experiences. While existing studies have explored how smart home assistants understand user queries to control devices in real time, their ability to perform memory-driven device control remains challenging from both evaluation and methodological perspectives. In terms of evaluation, existing benchmarks…

    Submitted 11 April, 2026; originally announced April 2026.

  9. arXiv:2604.09525  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci cond-mat.str-el

    High-temperature superconductivity in Nd$_{0.85}$Sr$_{0.15}$NiO$_2$ membranes under pressure

    Authors: Yonghun Lee, Mengnan Wang, Xin Wei, Yijun Yu, Wendy L. Mao, Yu Lin, Harold Y. Hwang

    Abstract: Lattice compression has emerged as a fundamental tuning parameter for nickelate superconductivity. Pressure acts as a trigger to induce superconductivity in bulk Ruddlesden-Popper nickelates. For infinite-layer nickelate thin films, compressive epitaxial strain and rare-earth ion chemical pressure have been used to substantially enhance the superconducting transition temperature ($T_c$). Efforts t…

    Submitted 10 April, 2026; originally announced April 2026.

    Comments: 18 pages, 3 figures

  10. arXiv:2604.09196  [pdf, ps, other]

    quant-ph

    Pontryagin's Principle for Leakage-Immune Adiabatic Quantum State Transfer

    Authors: Xiao-Yu Dong, Xi-Lai Wang, Wen-Long Ma

    Abstract: The standard stimulated Raman adiabatic passage (STIRAP) protocol enables high-fidelity quantum state transfer in an ideal three-level system via adiabatic following of a dark state evolution. However, in practical systems with more energy levels, control pulses with finite spectral selectivity often couple the three-level subspace to the remaining subspace, introducing leakage that fundamentally…

    Submitted 13 April, 2026; v1 submitted 10 April, 2026; originally announced April 2026.

    Comments: 17 pages, 5 figures

  11. arXiv:2604.08559  [pdf, ps, other]

    cs.CL cs.AI

    Medical Reasoning with Large Language Models: A Survey and MR-Bench

    Authors: Xiaohan Ren, Chenxiao Fan, Wenyin Ma, Hongliang He, Chongming Gao, Xiaoyan Zhao, Fuli Feng

    Abstract: Large language models (LLMs) have achieved strong performance on medical exam-style tasks, motivating growing interest in their deployment in real-world clinical settings. However, clinical decision-making is inherently safety-critical, context-dependent, and conducted under evolving evidence. In such situations, reliable LLM performance depends not on factual recall alone, but on robust medical r…

    Submitted 17 March, 2026; originally announced April 2026.

  12. arXiv:2604.08064  [pdf, ps, other]

    cs.AI

    ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models

    Authors: Chonghan Qin, Xiachong Feng, Weitao Ma, Xiaocheng Feng, Lingpeng Kong

    Abstract: Existing memory benchmarks for LLM agents evaluate explicit recall of facts, yet overlook implicit memory where experience becomes automated behavior without conscious retrieval. This gap is critical: effective assistants must automatically apply learned procedures or avoid failed actions without explicit reminders. We introduce ImplicitMemBench, the first systematic benchmark evaluating implicit…

    Submitted 15 April, 2026; v1 submitted 9 April, 2026; originally announced April 2026.

    Comments: Accepted to ACL 2026 Main Conference

  13. arXiv:2604.07331  [pdf, ps, other]

    cs.RO cs.AI cs.CV

    RoSHI: A Versatile Robot-oriented Suit for Human Data In-the-Wild

    Authors: Wenjing Margaret Mao, Jefferson Ng, Luyang Hu, Daniel Gehrig, Antonio Loquercio

    Abstract: Scaling up robot learning will likely require human data containing rich and long-horizon interactions in the wild. Existing approaches for collecting such data trade off portability, robustness to occlusion, and global consistency. We introduce RoSHI, a hybrid wearable that fuses low-cost sparse IMUs with the Project Aria glasses to estimate the full 3D pose and body shape of the wearer in a metr…

    Submitted 8 April, 2026; originally announced April 2026.

    Comments: 8 pages, 4 figures. *Equal contribution by first three authors. Project webpage: https://roshi-mocap.github.io/

  14. arXiv:2604.06936  [pdf, ps, other]

    math.OC

    Adaptive Distributionally Robust Optimal Control with Bayesian Ambiguity Sets

    Authors: Wentao Ma, Zhiping Chen, Huifu Xu, Enlu Zhou

    Abstract: In stochastic optimal control (SOC), uncertainty may arise from incomplete knowledge of the true probability distribution of the underlying environment, which is known as Knightian or epistemic uncertainty. Distributionally robust optimal control (DROC) models are subsequently proposed to tackle this source of uncertainty. While such models are effective in some practical applications, most existi…

    Submitted 9 April, 2026; v1 submitted 8 April, 2026; originally announced April 2026.

  15. arXiv:2604.05506  [pdf]

    cond-mat.supr-con

    Visualizing the interplay of dual electronic nematicities in kagome superconductors

    Authors: Yunmei Zhang, Jun Zhan, Ping Wu, Yun-Peng Huang, Qixiao Yuan, Hongyu Li, Zhuying Wang, Wanru Ma, Shuikang Yu, Kunming Zhang, Wanlin Cheng, Deshu Chen, Minrui Chen, Tao Wu, Ziji Xiang, Xianxin Wu, Zhenyu Wang, Xianhui Chen

    Abstract: Kagome superconductor AV$_3$Sb$_5$ (A stands for K, Rb, and Cs) hosts a wealth of intertwined electronic orders driven by geometric frustration and electron correlations. Among them, the breaking of rotational and/or time-reversal symmetry, observed within the triple-$Q$ charge density wave (CDW) phase yet exhibiting a more complex temperature dependence, remains a central puzzle. Here, by using s…

    Submitted 7 April, 2026; originally announced April 2026.

    Comments: 14 pages, 5 figures

  16. arXiv:2604.04921  [pdf, ps, other]

    cs.CL cs.CV

    TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

    Authors: Weian Mao, Xi Lin, Wei Huang, Yuxin Xie, Tianfu Fu, Bohan Zhuang, Song Han, Yukang Chen

    Abstract: Extended reasoning in large language models (LLMs) creates severe KV cache memory bottlenecks. Leading KV cache compression methods estimate KV importance using attention scores from recent post-RoPE queries. However, queries rotate with position during RoPE, so very few queries are representative, which leads to poor top-key selection and unstable reasoning. To avoid this issue, we turn to the pre-Ro…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: Code is available at https://github.com/WeianMao/triattention

  17. arXiv:2604.04913  [pdf, ps, other]

    cs.CV

    A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

    Authors: Tommie Kerssies, Gabriele Berton, Ju He, Qihang Yu, Wufei Ma, Daan de Geus, Gijs Dubbelman, Liang-Chieh Chen

    Abstract: Anticipating diverse future states is a central challenge in video world modeling. Discriminative world models produce a deterministic prediction that implicitly averages over possible futures, while existing generative world models remain computationally expensive. Recent work demonstrates that predicting the future in the feature space of a vision foundation model (VFM), rather than a latent spa…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: CVPR 2026. Code and weights: https://deltatok.github.io

  18. arXiv:2604.03037  [pdf, ps, other]

    cs.RO cs.AI cs.CV

    ARM: Advantage Reward Modeling for Long-Horizon Manipulation

    Authors: Yiming Mao, Zixi Yu, Weixin Mao, Yinhao Li, Qirui Hu, Zihan Lan, Minzhao Zhu, Hua Chen

    Abstract: Long-horizon robotic manipulation remains challenging for reinforcement learning (RL) because sparse rewards provide limited guidance for credit assignment. Practical policy improvement thus relies on richer intermediate supervision, such as dense progress rewards, which are costly to obtain and ill-suited to non-monotonic behaviors such as backtracking and recovery. To address this, we propose Ad…

    Submitted 3 April, 2026; originally announced April 2026.

  19. arXiv:2604.01533  [pdf, ps, other]

    eess.AS

    Validating Computational Markers of Depressive Behavior: Cross-Linguistic Speech-Based Depression Detection with Neurophysiological Validation

    Authors: Fuxiang Tao, Dongwei Li, Shuning Tang, Xuri Ge, Wei Ma, Anna Esposito, Alessandro Vinciarelli

    Abstract: Speech-based depression detection has shown promise as an objective diagnostic tool, yet the cross-linguistic robustness of acoustic markers and their neurobiological underpinnings remain underexplored. This study extends the Cross-Data Multilevel Attention (CDMA) framework, initially validated on Italian, to investigate these dimensions using a Chinese Mandarin dataset with Electroencephalography (EE…

    Submitted 5 April, 2026; v1 submitted 1 April, 2026; originally announced April 2026.

    Comments: 12 pages, 6 figures

  20. arXiv:2604.00792  [pdf, ps, other]

    cs.CV

    HICT: High-precision 3D CBCT reconstruction from a single X-ray

    Authors: Wen Ma, Jiaxiang Liu, Zikai Xiao, Ziyang Wang, Feng Yang, Zuozhu Liu

    Abstract: Accurate 3D dental imaging is vital for diagnosis and treatment planning, yet CBCT's high radiation dose and cost limit its accessibility. Reconstructing 3D volumes from a single low-dose panoramic X-ray is a promising alternative but remains challenging due to geometric inconsistencies and limited accuracy. We propose HiCT, a two-stage framework that first generates geometrically consistent multi…

    Submitted 1 April, 2026; originally announced April 2026.

  21. arXiv:2603.29235  [pdf, ps, other]

    cs.PF

    SysOM-AI: Continuous Cross-Layer Performance Diagnosis for Production AI Training

    Authors: Yusheng Zheng, Wenan Mao, Shuyi Cheng, Fuqiu Feng, Guangshui Li, Zhaoyan Liao, Yongzhuo Huang, Zhenwei Xiao, Yuqing Li, Andi Quinn, Tao Ma

    Abstract: Performance diagnosis in production-scale AI training is challenging because subtle OS-level issues can trigger cascading GPU delays and network slowdowns, degrading training efficiency across thousands of GPUs. Existing profiling tools are limited to single system layers, incur prohibitive overhead (10–30%), or lack continuous deployment capabilities, resulting in manual analyses spanning days.…

    Submitted 31 March, 2026; originally announced March 2026.

    Comments: 9 pages, 8 figures. Equal contribution by Wenan Mao and Yusheng Zheng

  22. arXiv:2603.27460  [pdf, ps, other]

    cs.CV cs.AI

    Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

    Authors: Zhongying Deng, Cheng Tang, Ziyan Huang, Jiashi Lin, Ying Chen, Junzhi Ning, Chenglong Ma, Jiyao Liu, Wei Li, Yinghao Zhu, Shujian Gao, Yanyan Huang, Sibo Ju, Yanzhou Su, Pengcheng Chen, Wenhao Tang, Tianbin Li, Haoyu Wang, Yuanfeng Ji, Hui Sun, Shaobo Min, Liang Peng, Feilong Tang, Haochen Xue, Rulin Zhou, et al. (102 additional authors not shown)

    Abstract: Foundation models have demonstrated remarkable success across diverse domains and tasks, primarily due to the growth of large-scale, diverse, and high-quality datasets. However, in the field of medical imaging, curating and assembling such datasets is highly challenging due to the reliance on clinical expertise and strict ethical and privacy constraints, resulting in a scarcity of…

    Submitted 28 March, 2026; originally announced March 2026.

    Comments: 157 pages, 19 figures, 26 tables. Project repo: https://github.com/uni-medical/Project-Imaging-X

  23. arXiv:2603.26877  [pdf, ps, other]

    physics.ins-det hep-ex

    Pushing the Limits of Pulse Shape Discrimination in a Large Liquid Xenon Detector

    Authors: D. S. Akerib, A. K. Al Musalhi, F. Alder, B. J. Almquist, C. S. Amarasinghe, A. Ames, T. J. Anderson, N. Angelides, H. M. Araújo, J. E. Armstrong, M. Arthurs, A. Baker, S. Balashov, J. Bang, J. W. Bargemann, E. E. Barillier, K. Beattie, A. Bhatti, T. P. Biesiadzinski, H. J. Birch, E. Bishop, G. M. Blockinger, C. A. J. Brew, P. Brás, S. Burdin, et al. (186 additional authors not shown)

    Abstract: The LUX-ZEPLIN (LZ) experiment is a direct-detection dark matter experiment, optimized to search for weakly interacting massive particles (WIMPs) through WIMP-nucleon interactions. The main challenge in dark matter detection is differentiating between WIMP signals and background events. In LZ, the ratio of ionization to scintillation signals (charge-to-light) is the primary method for rejecting el…

    Submitted 27 March, 2026; originally announced March 2026.

    Comments: 16 pages, 14 figures

  24. arXiv:2603.26768  [pdf, ps, other]

    cs.CV cs.AI cs.CL

    Aesthetic Assessment of Chinese Handwritings Based on Vision Language Models

    Authors: Chen Zheng, Yuxuan Lai, Haoyang Lu, Wentao Ma, Jitao Yang, Jian Wang

    Abstract: The handwriting of Chinese characters is a fundamental aspect of learning the Chinese language. Previous automated assessment methods often framed scoring as a regression problem. However, this score-only feedback lacks actionable guidance, which limits its effectiveness in helping learners improve their handwriting skills. In this paper, we leverage vision-language models (VLMs) to analyze the qu…

    Submitted 24 March, 2026; originally announced March 2026.

    Comments: Accepted by CCL2025

  25. arXiv:2603.25727  [pdf, ps, other]

    cs.AI cs.MM

    Back to Basics: Revisiting ASR in the Age of Voice Agents

    Authors: Geeyang Tay, Wentao Ma, Jaewon Lee, Yuzhi Tang, Daniel Lee, Weisu Yin, Dongming Shen, Silin Meng, Yi Zhu, Mu Li, Alex Smola

    Abstract: Automatic speech recognition (ASR) systems have achieved near-human accuracy on curated benchmarks, yet still fail in real-world voice agents under conditions that current evaluations do not systematically cover. Without diagnostic tools that isolate specific failure factors, practitioners cannot anticipate which conditions, in which languages, will cause what degree of degradation. We introduce W…

    Submitted 26 March, 2026; originally announced March 2026.

    Comments: 10 pages, 5 figures

  26. arXiv:2603.22954  [pdf, ps, other]

    cs.CR cs.LG

    Privacy-Preserving EHR Data Transformation via Geometric Operators: A Human-AI Co-Design Technical Report

    Authors: Maolin Wang, Beining Bao, Gan Yuan, Hongyu Chen, Bingkun Zhao, Baoshuo Kan, Jiming Xu, Qi Shi, Yinggong Zhao, Yao Wang, Wei Ying Ma, Jun Yan

    Abstract: Electronic health records (EHRs) and other real-world clinical data are essential for clinical research, medical artificial intelligence, and life science, but their sharing is severely limited by privacy, governance, and interoperability constraints. These barriers create persistent data silos that hinder multi-center studies, large-scale model development, and broader biomedical discovery. Exist…

    Submitted 24 March, 2026; originally announced March 2026.

  27. arXiv:2603.19621  [pdf, ps, other]

    cs.LG cs.AI

    DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management

    Authors: Yaqi Xie, Xinru Hao, Jiaxi Liu, Will Ma, Linwei Xin, Lei Cao, Yidong Zhang

    Abstract: Deep Reinforcement Learning (DRL) provides a general-purpose methodology for training inventory policies that can leverage big data and compute. However, off-the-shelf implementations of DRL have seen mixed success, often plagued by high sensitivity to the hyperparameters used during training. In this paper, we show that by imposing policy regularizations, grounded in classical inventory concepts…

    Submitted 19 March, 2026; originally announced March 2026.

  28. arXiv:2603.18468  [pdf]

    cond-mat.mtrl-sci

    Multiscale simulations guided advances for all-optical phase-change waveguides

    Authors: Hanyi Zhang, Wanting Ma, Wen Zhou, Xueqi Xing, Junying Zhang, Tiankuo Huang, Ding Xu, Xiaozhe Wang, Riccardo Mazzarello, En Ma, Jiang-Jing Wang, Wei Zhang

    Abstract: Photonic computing using chalcogenide phase-change materials (PCMs) is under active development for energy-efficient artificial intelligence (AI) applications. A key requirement is to enable as many optically programmable levels per device as possible, while maintaining relatively low optical loss. In this work, we carry out multiscale simulations using density functional theory and finite-differe…

    Submitted 9 April, 2026; v1 submitted 19 March, 2026; originally announced March 2026.

    Comments: 21 pages, 8 figures

  29. arXiv:2603.17746  [pdf, ps, other]

    cs.CV

    Concept-to-Pixel: Prompt-Free Universal Medical Image Segmentation

    Authors: Haoyun Chen, Fenghe Tang, Wenxin Ma, Shaohua Kevin Zhou

    Abstract: Universal medical image segmentation seeks to use a single foundational model to handle diverse tasks across multiple imaging modalities. However, existing approaches often rely heavily on manual visual prompts or retrieved reference images, which limits their automation and robustness. In addition, naive joint training across modalities often fails to address large domain shifts. To address these…

    Submitted 18 March, 2026; originally announced March 2026.

    Comments: 32 pages, code is available at: https://github.com/Yundi218/Concept-to-Pixel

  30. arXiv:2603.17136  [pdf, ps, other]

    cond-mat.mtrl-sci astro-ph.EP

    Spin crossover in FeO under shock compression

    Authors: Lélia Libon, Alessandra Ravasio, Silvia Pandolfi, Yanyao Zhang, Xuehui Wei, Jean-Alexis Hernandez, Hong Yang, Amanda J. Chen, Tommaso Vinci, Alessandra Benuzzi-Mounaix, Clemens Prescher, François Soubiran, Hae Ja Lee, Eric Galtier, Nick Czapla, Wendy L. Mao, Arianna E. Gleason, Sang Heon Shim, Roberto Alonso-Mori, Guillaume Morard

    Abstract: FeO (wüstite), which exhibits complex electronic and structural properties with increasing pressure and temperature, is a key mineralogical phase for understanding deep planetary interiors. However, direct measurements of its spin state at high pressure and temperature remain challenging in static compression experiments. Here, we employ laser-driven shock compression to extend the FeO principal H…

    Submitted 17 March, 2026; originally announced March 2026.

    Comments: 35 pages, 6 figures, under review

  31. arXiv:2603.16966  [pdf, ps, other]

    cs.CV cs.AI cs.MM cs.SD eess.AS

    CineSRD: Leveraging Visual, Acoustic, and Linguistic Cues for Open-World Visual Media Speaker Diarization

    Authors: Liangbin Huang, Xiaohua Liao, Chaoqun Cui, Shijing Wang, Zhaolong Huang, Yanlong Du, Wenji Mao

    Abstract: Traditional speaker diarization systems have primarily focused on constrained scenarios such as meetings and interviews, where the number of speakers is limited and acoustic conditions are relatively clean. To explore open-world speaker diarization, we extend this task to the visual media domain, encompassing complex audiovisual programs such as films and TV series. This new setting introduces sev…

    Submitted 17 March, 2026; originally announced March 2026.

    Comments: Accepted to CVPR 2026

  32. arXiv:2603.16472  [pdf, ps, other]

    cs.IT

    Directivity Enhancement of Movable Antenna Arrays with Mutual Coupling

    Authors: Wei Xu, Lipeng Zhu, Wenyan Ma, An Liu, Rui Zhang

    Abstract: In conventional antenna arrays, mutual coupling between antenna elements is often regarded as detrimental. However, under specific conditions, it can be harnessed to enhance the far-field directivity (i.e., beamforming gain). Theoretically, the directivity of an $N$-antenna superdirective array over the endfire direction can reach $N^{2}$, significantly exceeding the directivity of a traditional uncou…

    Submitted 17 March, 2026; originally announced March 2026.

  33. arXiv:2603.15270  [pdf, ps, other]

    cs.CL cs.AI

    From Documents to Spans: Code-Centric Learning for LLM-based ICD Coding

    Authors: Xu Zhang, Wenxin Ma, Chenxu Wu, Rongsheng Wang, Kun Zhang, S. Kevin Zhou

    Abstract: ICD coding is a critical yet challenging task in healthcare. Recently, LLM-based methods have demonstrated stronger generalization than discriminative methods in ICD coding. However, fine-tuning LLMs for ICD coding faces three major challenges. First, existing public ICD coding datasets provide limited coverage of the ICD code space, restricting a model's ability to generalize to unseen codes. Second, n…

    Submitted 16 March, 2026; originally announced March 2026.

  34. arXiv:2603.15221  [pdf, ps, other]

    cs.LG cs.AI

    ADV-0: Closed-Loop Min-Max Adversarial Training for Long-Tail Robustness in Autonomous Driving

    Authors: Tong Nie, Yihong Tang, Junlin He, Yuewen Mei, Jie Sun, Lijun Sun, Wei Ma, Jian Sun

    Abstract: Deploying autonomous driving systems requires robustness against long-tail scenarios that are rare but safety-critical. While adversarial training offers a promising solution, existing methods typically decouple scenario generation from policy optimization and rely on heuristic surrogates. This leads to objective misalignment and fails to capture the shifting failure modes of evolving policies. Th…

    Submitted 16 March, 2026; originally announced March 2026.

  35. arXiv:2603.14473  [pdf, ps, other]

    cs.CL

    AI Can Learn Scientific Taste

    Authors: Jingqi Tong, Mingzhe Li, Hangcheng Li, Yongzhuo Yang, Yurong Mou, Weijie Ma, Zhiheng Xi, Hongji Chen, Xiaoran Liu, Qinyuan Cheng, Ming Zhang, Qiguang Chen, Weifeng Ge, Qipeng Guo, Tianlei Ying, Tianxiang Sun, Yining Zheng, Xinchi Chen, Jun Zhao, Ning Ding, Xuanjing Huang, Yugang Jiang, Xipeng Qiu

    Abstract: Great scientists have strong judgement and foresight, closely tied to what we call scientific taste. Here, we use the term to refer to the capacity to judge and propose research ideas with high potential impact. However, most related research focuses on improving an AI scientist's executive capability, while enhancing an AI's scientific taste remains underexplored. In this work, we propose Reinfo…

    Submitted 15 March, 2026; originally announced March 2026.

    Comments: 44 pages, 4 figures

    ACM Class: I.2.7

  36. arXiv:2603.14346  [pdf]

    physics.optics

    Robust and Active Visible-Light Integrated Photonics on Thin-Film Lithium Tantalate for Underwater Optical Wireless Communications

    Authors: Changjian Guo, Xingjie Li, Xiaofeng Wu, Jiajie Deng, Wenchang Yang, Weilong Ma, Ziliang Ruan, Kaixuan Chen, Sailing He, Liu Liu

    Abstract: Visible-light integrated photonics enables compact platforms for sensing, precision metrology, and free-space data links at visible wavelengths. However, many applications remain limited by the lack of high-speed and robust modulators in the blue-green band. Here we report, both operating at 532 nm, thin-film lithium tantalate waveguides with propagation losses on the dB/cm scale and modulators with a f…

    Submitted 15 March, 2026; originally announced March 2026.

  37. arXiv:2603.13853  [pdf, ps, other]

    cs.CL cs.AI

    APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution

    Authors: Kun Chen, Qingchao Kong, Zhao Feifei, Wenji Mao

    Abstract: Retrieval-augmented generation (RAG), based on large language models (LLMs), serves as a vital approach to retrieving and leveraging external knowledge in various domain applications. When confronted with complex multi-hop questions, single-round retrieval is often insufficient for accurate reasoning and problem solving. To enhance search capabilities for complex tasks, most existing works integra…

    Submitted 17 March, 2026; v1 submitted 14 March, 2026; originally announced March 2026.

  38. arXiv:2603.12631  [pdf, ps, other]

    cs.MA

    Collaborative Multi-Agent Optimization for Personalized Memory System

    Authors: Wenyu Mao, Haoyang Liu, Zhao Liu, Haosong Tan, Yaorui Shi, Jiancan Wu, An Zhang, Xiang Wang

    Abstract: Memory systems are crucial to personalized LLMs by mitigating the context window limitation in capturing long-term user-LLM conversations. Typically, such systems leverage multiple agents to handle multi-granular memory construction and personalized memory retrieval tasks. To optimize the system, existing methods focus on specializing agents on their local tasks independently via prompt engineerin…

    Submitted 13 March, 2026; originally announced March 2026.

  39. arXiv:2603.10871  [pdf, ps, other]

    cs.RO

    FG-CLTP: Fine-Grained Contrastive Language Tactile Pretraining for Robotic Manipulation

    Authors: Wenxuan Ma, Chaofan Zhang, Yinghao Cai, Guocai Yao, Shaowei Cui, Shuo Wang

    Abstract: Recent advancements in integrating tactile sensing into vision-language-action (VLA) models have demonstrated transformative potential for robotic perception. However, existing tactile representations predominantly rely on qualitative descriptors (e.g., texture), neglecting quantitative contact states such as force magnitude, contact geometry, and principal axis orientation, which are indispensabl…

    Submitted 11 March, 2026; originally announced March 2026.

    Comments: 9 pages, 6 figures

  40. arXiv:2603.10615  [pdf]

    cond-mat.other

    Topological Tunneling Magnetoresistance Driven by Type-II Weyl-Like States in the Room-Temperature Half-Metal Mn2PC Monolayer

    Authors: Wei Ma, Yu-Ting Wang, Wen-Bo Sun, Zhiheng Lv, Shuai Shi, Jian-Hong Rong, Tie-Lei Song, Zhi-Feng Liu

    Abstract: We predict the tetragonal Mn2PC monolayer to be a room-temperature ferromagnetic half-metal with a Curie temperature of 554 K. The spin-up channel hosts type-II Weyl-like crossings at the Fermi level with highly anisotropic band dispersion, whereas the spin-down channel is a wide-gap semiconductor. Topological edge states obtained from tight-binding calculations confirm the non-trivial bulk topolo…

    Submitted 11 March, 2026; originally announced March 2026.

    Comments: 16 pages, 4 figures

  41. arXiv:2603.10459  [pdf, ps, other

    cs.RO

    SUBTA: A Framework for Supported User-Guided Bimanual Teleoperation in Structured Assembly

    Authors: Xiao Liu, Prakash Baskaran, Songpo Li, Simon Manschitz, Wei Ma, Dirk Ruiken, Soshi Iba

    Abstract: In human-robot collaboration, shared autonomy enhances human performance through precise, intuitive support. Effective robotic assistance requires accurately inferring human intentions and understanding task structures to determine optimal support timing and methods. In this paper, we present SUBTA, a supported teleoperation system for bimanual assembly that couples learned intention estimation, s…

    Submitted 11 March, 2026; originally announced March 2026.

    Comments: 8 pages, 7 figures, accepted at ICRA 2026

  42. arXiv:2603.10426  [pdf, ps, other

    cs.IT eess.SP

    3-D Trajectory Optimization for Robust Direction Sensing in Movable Antenna Systems

    Authors: Wenyan Ma, Lipeng Zhu, Xiaodan Shao, Rui Zhang

    Abstract: This paper presents a novel wireless sensing system where a movable antenna (MA) continuously moves and receives sensing signals within a three-dimensional (3-D) region to enhance sensing performance compared with conventional fixed-position antenna (FPA)-based sensing. We show that the performance of direction vector estimation for a target is fundamentally related to the 3-D MA trajectory in ter…

    Submitted 11 March, 2026; originally announced March 2026.

  43. arXiv:2603.10000  [pdf, ps, other

    cs.CL cs.LG

    Beyond the Prompt in Large Language Models: Comprehension, In-Context Learning, and Chain-of-Thought

    Authors: Yuling Jiao, Yanming Lai, Huazhen Lin, Wensen Ma, Houduo Qi, Defeng Sun

    Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency across diverse tasks, exhibiting emergent properties such as semantic prompt comprehension, In-Context Learning (ICL), and Chain-of-Thought (CoT) reasoning. Despite their empirical success, the theoretical mechanisms driving these phenomena remain poorly understood. This study dives into the foundations of these observations by…

    Submitted 12 March, 2026; v1 submitted 16 February, 2026; originally announced March 2026.

  44. arXiv:2603.09656  [pdf, ps, other

    hep-th

    Kaluza-Klein mode mixing in braneworlds: constraints on scalar absorption and physical degrees of freedom

    Authors: Wen-Xuan Ma, Chun-E Fu

    Abstract: We investigate the mixing between Kaluza-Klein (KK) modes for a bulk U(1) gauge field within braneworld models. By demanding orthonormality and completeness for the KK basis functions, we demonstrate that the decoupling of mixed sectors, specifically of the vector-scalar and scalar-scalar types, imposes stringent constraints on the warp factors of codimension-d (d>1) backgrounds. We show that the…

    Submitted 10 March, 2026; originally announced March 2026.

    Comments: 15 pages, 2 tables

  45. arXiv:2603.09465  [pdf, ps, other

    cs.CV cs.AI

    EvoDriveVLA: Evolving Autonomous Driving Vision-Language-Action Model via Collaborative Perception-Planning Distillation

    Authors: Jiajun Cao, Xiaoan Zhang, Xiaobao Wei, Liyuqiu Huang, Wang Zijian, Hanzhen Zhang, Zhengyu Jia, Wei Mao, Hao Wang, Xianming Liu, Shuchang Zhou, Yang Wang, Shanghang Zhang

    Abstract: Vision-Language-Action models have shown great promise for autonomous driving, yet they suffer from degraded perception after unfreezing the visual encoder and struggle with accumulated instability in long-term planning. To address these challenges, we propose EvoDriveVLA, a novel collaborative perception-planning distillation framework that integrates self-anchored perceptual constraints and oracl…

    Submitted 13 March, 2026; v1 submitted 10 March, 2026; originally announced March 2026.

    Comments: 16 pages, 5 figures

  46. arXiv:2603.07055  [pdf, ps, other

    stat.ME econ.EM math.ST

    Integrating Heterogeneous Information in Randomized Experiments: A Unified Calibration Framework

    Authors: Wei Ma, Zeqi Wu, Zheng Zhang

    Abstract: In modern randomized experiments, large-scale data collection increasingly yields rich baseline covariates and auxiliary information from multiple sources. Such information offers opportunities for more precise treatment effect estimation, but it also raises the challenge of integrating heterogeneous information coherently without compromising validity. Covariate-adaptive randomization (CAR) is wi…

    Submitted 7 March, 2026; originally announced March 2026.

  47. arXiv:2603.05591  [pdf, ps, other

    cs.CV

    Thinking with Spatial Code for Physical-World Video Reasoning

    Authors: Jieneng Chen, Wenxin Ma, Ruisheng Yuan, Yunzhi Zhang, Jiajun Wu, Alan Yuille

    Abstract: We introduce Thinking with Spatial Code, a framework that transforms RGB video into explicit, temporally coherent 3D representations for physical-world visual question answering. We highlight the empirical finding that our proposed spatial encoder can parse videos into structured spatial code with explicit 3D oriented bounding boxes and semantic labels, enabling large language models (LLMs) to rea…

    Submitted 5 March, 2026; originally announced March 2026.

    Comments: Code at https://github.com/Beckschen/spatialcode

  48. arXiv:2603.03641  [pdf, ps, other

    physics.soc-ph eess.SY

    The Evolution of Eco-routing under Population Growth: Evidence from Six U.S. Cities

    Authors: Zhiheng Shi, Xiaohan Xu, Wei Ma, Kairui Feng, Bin He

    Abstract: Rapid urban population growth drives car travel demand, increasing transport carbon emissions and posing a critical challenge to sustainable development. Although existing studies have demonstrated that eco-routing can reduce individual emissions, research gaps remain. On the one hand, such personal reductions have a negligible impact on overall emissions, and cannot be simply aggregated to captur…

    Submitted 3 March, 2026; originally announced March 2026.

  49. arXiv:2603.02730  [pdf, ps, other

    cs.IR

    APAO: Adaptive Prefix-Aware Optimization for Generative Recommendation

    Authors: Yuanqing Yu, Yifan Wang, Weizhi Ma, Zhiqiang Guo, Min Zhang

    Abstract: Generative recommendation has recently emerged as a promising paradigm in sequential recommendation. It formulates the task as an autoregressive generation process, predicting discrete tokens of the next item conditioned on user interaction histories. Existing generative recommendation models are typically trained with token-level likelihood objectives, such as cross-entropy loss, while employing…

    Submitted 3 March, 2026; originally announced March 2026.

  50. arXiv:2603.02675  [pdf, ps, other

    cs.LG

    From Shallow to Deep: Pinning Semantic Intent via Causal GRPO

    Authors: Shuyi Zhou, Zeen Song, Wenwen Qiang, Jiyan Sun, Yao Zhou, Yinlong Liu, Wei Ma

    Abstract: Large Language Models remain vulnerable to adversarial prefix attacks (e.g., "Sure, here is") despite robust standard safety. We diagnose this vulnerability as Shallow Safety Alignment, stemming from a pathology we term semantic representation decay: as the model generates compliant prefixes, its internal malicious intent signal fades. To address this, we propose Two-Stage Causal-GRPO (TSC-GRPO)…

    Submitted 3 March, 2026; originally announced March 2026.