
Showing 1–50 of 635 results for author: Cai, R

  1. arXiv:2604.12775  [pdf, ps, other]

    astro-ph.HE gr-qc hep-ph

    Gravitational Gertsenshtein-Zeldovich mechanism for the Association between GW190425 and FRB 20190425A

    Authors: Shao-Qin Wu, Jing-Rui Zhang, Rong-Gen Cai, Bing Zhang, Yun-Long Zhang

    Abstract: The temporal and spatial coincidence between the gravitational wave (GW) event GW190425 and the fast radio burst (FRB) event FRB 20190425A raises the intriguing possibility of a physical connection between the two. The widely discussed possibility invoking the collapse of a supermassive neutron star as the merger product suffers from the inconsistency between the model prediction and the measured incli…

    Submitted 14 April, 2026; originally announced April 2026.

    Comments: 6 pages, 1 figure

  2. arXiv:2604.12374  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

    Authors: NVIDIA, :, Aakshita Chandiramani, Aaron Blakeman, Abdullahi Olaoye, Abhibha Gupta, Abhilash Somasamudramath, Abhinav Khattar, Adeola Adesoba, Adi Renduchintala, Adil Asif, Aditya Agrawal, Aditya Vavre, Ahmad Kiswani, Aishwarya Padmakumar, Ajay Hotchandani, Akanksha Shukla, Akhiad Bercovich, Aleksander Ficek, Aleksandr Shaposhnikov, Alex Gronskiy, Alex Kondratenko, Alex Neefus, Alex Steiner, Alex Yang , et al. (522 additional authors not shown)

    Abstract: We describe the pre-training, post-training, and quantization of Nemotron 3 Super, a 120 billion (active 12 billion) parameter hybrid Mamba-Attention Mixture-of-Experts model. Nemotron 3 Super is the first model in the Nemotron 3 family to 1) be pre-trained in NVFP4, 2) leverage LatentMoE, a new Mixture-of-Experts architecture that optimizes for both accuracy per FLOP and accuracy per parameter, a…

    Submitted 14 April, 2026; originally announced April 2026.

  3. arXiv:2604.08141  [pdf, ps, other]

    gr-qc astro-ph.CO hep-ph

    Detecting Chiral Gravitational Wave Background with a Dipole Pulsar Timing Array

    Authors: Baoyu Xu, Hanyu Jiang, Rong-Gen Cai, Misao Sasaki, Yun-Long Zhang

    Abstract: The pulsar timing array (PTA) is a powerful technique for detecting nanohertz gravitational wave backgrounds (GWBs). However, conventional PTAs lack sensitivity to parity violation in the GWB. In this work, we propose a dipole pulsar timing array system (dPTA). By deriving the overlap reduction functions (ORFs) from the cross-correlation of timing signals, we find that this system exhibits sensiti…

    Submitted 9 April, 2026; originally announced April 2026.

    Comments: 11 pages, 6 figures

  4. arXiv:2604.07343  [pdf, ps, other]

    cs.CL cs.LG

    Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization

    Authors: Qiyao Ma, Dechen Gao, Rui Cai, Boqi Zhao, Hanchu Zhou, Junshan Zhang, Zhe Zhao

    Abstract: Pluralistic alignment has emerged as a critical frontier in the development of Large Language Models (LLMs), with reward models (RMs) serving as a central mechanism for capturing diverse human values. While benchmarks for general response quality are prevalent, evaluating how well reward models account for individual user preferences remains an open challenge. To bridge this gap, we introduce Pers…

    Submitted 8 April, 2026; originally announced April 2026.

  5. arXiv:2604.06628  [pdf, ps, other]

    cs.AI

    Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

    Authors: Qihan Ren, Peng Wang, Ruikun Cai, Shuai Shao, Dadi Guo, Yuejin Xie, Yafu Li, Quanshi Zhang, Xia Hu, Jing Shao, Dongrui Liu

    Abstract: A prevailing narrative in LLM post-training holds that supervised finetuning (SFT) memorizes while reinforcement learning (RL) generalizes. We revisit this claim for reasoning SFT with long chain-of-thought (CoT) supervision and find that cross-domain generalization is not absent but conditional, jointly shaped by optimization dynamics, training data, and base-model capability. Some reported failu…

    Submitted 7 April, 2026; originally announced April 2026.

    Comments: Preprint. Under review

  6. arXiv:2604.03870  [pdf, ps, other]

    cs.CL

    Your Agent is More Brittle Than You Think: Uncovering Indirect Injection Vulnerabilities in Agentic LLMs

    Authors: Wenhui Zhu, Xuanzhao Dong, Xiwen Chen, Rui Cai, Peijie Qiu, Zhipeng Wang, Oana Frunza, Shao Tang, Jindong Gu, Yalin Wang

    Abstract: The rapid deployment of open-source frameworks has significantly advanced the development of modern multi-agent systems. However, expanded action spaces, including uncontrolled privilege exposure and hidden inter-system interactions, pose severe security challenges. Specifically, Indirect Prompt Injections (IPI), which conceal malicious instructions within third-party content, can trigger unauthor…

    Submitted 4 April, 2026; originally announced April 2026.

  7. arXiv:2604.02324  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Grounded Token Initialization for New Vocabulary in LMs for Generative Recommendation

    Authors: Daiwei Chen, Zhoutong Fu, Chengming Jiang, Haichao Zhang, Ran Zhou, Tan Wang, Chunnan Yao, Guoyao Li, Rui Cai, Yihan Cao, Ruijie Jiang, Fedor Borisyuk, Jianqiang Shen, Jingwei Wu, Ramya Korlakai Vinayak

    Abstract: Language models (LMs) are increasingly extended with new learnable vocabulary tokens for domain-specific tasks, such as Semantic-ID tokens in generative recommendation. The standard practice initializes these new tokens as the mean of existing vocabulary embeddings, then relies on supervised fine-tuning to learn their representations. We present a systematic analysis of this strategy: through spec…

    Submitted 2 April, 2026; originally announced April 2026.
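    The mean-of-vocabulary initialization that this abstract analyzes can be sketched in a few lines of NumPy (the array sizes and variable names below are hypothetical illustrations, not details from the paper):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, dim, n_new = 1000, 64, 8  # hypothetical sizes

    # Stand-in for a trained LM's vocabulary embedding table.
    embeddings = rng.normal(size=(vocab_size, dim)).astype(np.float32)

    # Standard practice: each new token (e.g., a Semantic-ID token)
    # starts at the mean of all existing vocabulary embeddings and is
    # then refined by supervised fine-tuning.
    mean_vec = embeddings.mean(axis=0)
    extended = np.vstack([embeddings, np.tile(mean_vec, (n_new, 1))])

    assert extended.shape == (vocab_size + n_new, dim)
    ```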

  8. arXiv:2604.02204  [pdf, ps, other]

    astro-ph.CO

    Non-minimally coupled quintessence with sign-switching interaction

    Authors: Jia-Qi Wang, Rong-Gen Cai, Zong-Kuan Guo, Yun-He Li, Shao-Jiang Wang, Xin Zhang

    Abstract: We propose a new non-minimally coupled quintessence model to account for the late-time dark energy dynamics indicated by recent DESI measurements. Within this framework, the quintessence density begins to decrease only when it starts to dominate the universe, which naturally accounts for the late-time onset of dark energy weakening. The coupling also induces a sign change in the effective energy t…

    Submitted 9 April, 2026; v1 submitted 2 April, 2026; originally announced April 2026.

    Comments: v2, 10 pages, 3 tables, 5 figures

  9. arXiv:2604.01516  [pdf, ps, other]

    hep-ph astro-ph.CO gr-qc

    Vacuum bubbles from cosmic ripples

    Authors: Zi-Yan Yuwen, Rong-Gen Cai, Shao-Jiang Wang

    Abstract: We investigate vacuum decays in the early Universe in the presence of curvature perturbations. For sufficiently large perturbations associated with over-densities, we find that the bounce solution develops an oscillating middle stage near the bubble wall. For small perturbations, we analytically show within the thin-wall approximation that an over- (under-) density would enhance (suppress) the vac…

    Submitted 1 April, 2026; originally announced April 2026.

    Comments: 20 pages, 4 figures

  10. arXiv:2603.28699  [pdf, ps, other]

    astro-ph.HE

    An Intertwined Short and Long GRB with 4-minute Separation

    Authors: Liang Li, Yu Wang, Bing Zhang, Ye Li, Shu-Rui Zhang, Jochen Greiner, Zhi-Ping Jin, Jin-Jun Geng, Hou-Jun Lv, Asaf Peer, Maria Dainotti, Tong Liu, Yi-Zhong Fan, Yong-Feng Huang, Zi-Gao Dai, Melin Kole, Wei-Hua Lei, Ye-Fei Yuan, Shuang-Nan Zhang, Felix Ryde, She-Sheng Xue, Rong-Gen Cai

    Abstract: Gamma-ray bursts (GRBs), the most energetic transients in the Universe, are traditionally classified into long-duration ($T_{90}>2$ s) and short-duration ($T_{90}<2$ s) events, associated with the core collapse of massive stars (Type II) and the merger of compact binary systems (Type I), respectively. The two classes exhibit distinct observational properties that serve as key diagnostic criteria f…

    Submitted 3 April, 2026; v1 submitted 30 March, 2026; originally announced March 2026.

    Comments: 56 pages, 10 figures (including 43 panels), 9 tables

  11. arXiv:2603.26017  [pdf, ps, other]

    cs.LG

    QuitoBench: A High-Quality Open Time Series Forecasting Benchmark

    Authors: Siqiao Xue, Zhaoyang Zhu, Wei Zhang, Rongyao Cai, Rui Wang, Yixiang Mu, Fan Zhou, Jianguo Li, Peng Di, Hang Yu

    Abstract: Time series forecasting is critical across finance, healthcare, and cloud computing, yet progress is constrained by a fundamental bottleneck: the scarcity of large-scale, high-quality benchmarks. To address this gap, we introduce \textsc{QuitoBench}, a regime-balanced benchmark for time series forecasting with coverage across eight trend$\times$seasonality$\times$forecastability (TSF) regimes, des…

    Submitted 26 March, 2026; originally announced March 2026.

    Comments: project site: https://hq-bench.github.io/quito/

  12. arXiv:2603.23961  [pdf, ps, other]

    cs.LG cs.CV

    GRMLR: Knowledge-Enhanced Small-Data Learning for Deep-Sea Cold Seep Stage Inference

    Authors: Chenxu Zhou, Zelin Liu, Rui Cai, Houlin Gong, Yikang Yu, Jia Zeng, Yanru Pei, Liang Zhang, Weishu Zhao, Xiaofeng Gao

    Abstract: Deep-sea cold seep stage assessment has traditionally relied on costly, high-risk manned submersible operations and visual surveys of macrofauna. Although microbial communities provide a promising and more cost-effective alternative, reliable inference remains challenging because the available deep-sea dataset is extremely small ($n = 13$) relative to the microbial feature dimension ($p = 26$), ma…

    Submitted 25 March, 2026; originally announced March 2026.

  13. arXiv:2603.18821  [pdf, ps, other]

    gr-qc

    Thermodynamics of Kerr-Bertotti-Robinson black hole

    Authors: Li Hu, Rong-Gen Cai, Shao-Jiang Wang

    Abstract: We investigate the thermodynamic properties of the Kerr-Bertotti-Robinson black hole, an exact Petrov type D solution of Einstein-Maxwell theory describing a rotating black hole immersed in an external electromagnetic field. While the conserved angular momentum and electric charge can be computed straightforwardly, the conserved mass cannot be obtained through standard integrability methods due to…

    Submitted 19 March, 2026; originally announced March 2026.

    Comments: 7 pages, no figure

  14. arXiv:2603.11067  [pdf, ps, other]

    cs.CL cs.AI

    Summarize Before You Speak with ARACH: A Training-Free Inference-Time Plug-In for Enhancing LLMs via Global Attention Reallocation

    Authors: Jingtao Wang, Yucong Wang, Jun Ding, Rui Cai, Xun Wang

    Abstract: Large language models (LLMs) achieve remarkable performance, yet further gains often require costly training. This has motivated growing interest in post-training techniques, especially training-free approaches that improve models at inference time without updating weights. Most training-free methods treat the model as a black box and improve outputs via input/output-level interventions, such as pr…

    Submitted 10 March, 2026; originally announced March 2026.

  15. arXiv:2603.08299  [pdf]

    cond-mat.mtrl-sci

    Atomic-resolution imaging of gold species at organic liquid-solid interfaces

    Authors: Sam Sullivan-Allsop, Nick Clark, Wendong Wang, Rongsheng Cai, William Thornley, David G. Hopkinson, James G. McHugh, Ben Davies, Samuel Pattisson, Nicholas F. Dummer, Rui Zhang, Matthew Lindley, Gareth Tainton, Jack Harrison, Hugo De Latour, Joseph Parker, Joshua Swindell, Eli G. Castanon, Amy Carl, David J. Lewis, Natalia Martsinovich, Christopher S. Allen, Mohsen Danaie, Andrew J. Logsdail, Vladimir Falko , et al. (4 additional authors not shown)

    Abstract: Understanding solid-liquid interfaces at the atomic scale is key to improved performance of heterogeneous catalysts, electrodes and membranes. Here we combine unique specimen design, record atomic resolution in situ electron microscopy, and artificial intelligence-enabled analysis to achieve a step change in quantitative understanding of interfacial atomic behaviour. We create the first graphene l…

    Submitted 9 March, 2026; originally announced March 2026.

    Comments: 13 pages, 5 figures

    Journal ref: Science 392, 6793, 77-82 (2026)

  16. Track-SQL: Enhancing Generative Language Models with Dual-Extractive Modules for Schema and Context Tracking in Multi-turn Text-to-SQL

    Authors: Bingfeng Chen, Shaobin Shi, Yongqi Luo, Boyan Xu, Ruichu Cai, Zhifeng Hao

    Abstract: Generative language models have shown significant potential in single-turn Text-to-SQL. However, their performance does not extend equivalently to multi-turn Text-to-SQL. This is primarily due to generative language models' inadequacy in handling the complexities of context information and dynamic schema linking in multi-turn interactions. In this paper, we propose a framework named Track-SQL, whi…

    Submitted 6 March, 2026; originally announced March 2026.

    Comments: Accepted at the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), Long Paper, 19 pages

    Journal ref: Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 10690-10708. Association for Computational Linguistics, 2025

  17. arXiv:2603.04948  [pdf, ps, other]

    cs.LG

    $\nabla$-Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space

    Authors: Peihao Wang, Ruisi Cai, Zhen Wang, Hongyuan Mei, Qiang Liu, Pan Li, Zhangyang Wang

    Abstract: Scaling inference-time compute for Large Language Models (LLMs) has unlocked unprecedented reasoning capabilities. However, existing inference-time scaling methods typically rely on inefficient and suboptimal discrete search algorithms or trial-and-error prompting to improve the online policy. In this paper, we propose $\nabla$-Reasoner, an iterative generation framework that integrates differenti…

    Submitted 5 March, 2026; originally announced March 2026.

    Comments: ICLR 2026

  18. arXiv:2603.03290  [pdf, ps, other]

    cs.CL cs.AI cs.IR cs.LG

    AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

    Authors: Wenhui Zhu, Xiwen Chen, Zhipeng Wang, Jingjing Wang, Xuanzhao Dong, Minzhou Huang, Rui Cai, Hejian Sang, Hao Wang, Peijie Qiu, Yueyue Deng, Prayag Tiwari, Brendan Hogan Rappazzo, Yalin Wang

    Abstract: Long-horizon LLM agents require memory systems that remain accurate under fixed context budgets. However, existing systems struggle with two persistent challenges in long-term dialogue: (i) \textbf{disconnected evidence}, where multi-hop answers require linking facts distributed across time, and (ii) \textbf{state updates}, where evolving information (e.g., schedule changes) creates conflicts with…

    Submitted 5 February, 2026; originally announced March 2026.

  19. arXiv:2602.24275  [pdf, ps, other]

    cs.CV

    Hierarchical Action Learning for Weakly-Supervised Action Segmentation

    Authors: Junxian Huang, Ruichu Cai, Hao Zhu, Juntao Fang, Boyan Xu, Weilin Chen, Zijian Li, Shenghua Gao

    Abstract: Humans perceive actions through key transitions that structure actions across multiple abstraction levels, whereas machines, relying on visual features, tend to over-segment. This highlights the difficulty of enabling hierarchical reasoning in video understanding. Interestingly, we observe that lower-level visual and high-level action latent variables evolve at different rates, with low-level visu…

    Submitted 27 February, 2026; originally announced February 2026.

    Journal ref: CVPR2026

  20. arXiv:2602.22926  [pdf, ps, other]

    astro-ph.HE

    Pulse-resolved Classification and Characteristics of Long-duration GRBs with \emph{Swift}-BAT Data. II. Main Burst versus Extended Emission

    Authors: Liang Li, Xiao Wang, Zhi-Li Cui, Cheng-Long Xiao, Wen Li, Yu Wang, Zi-Gao Dai, Rong-Gen Cai

    Abstract: Long gamma-ray bursts (GRBs) frequently exhibit complex prompt emission structures with multiple temporally distinct episodes, such as a main emission (ME) phase followed by a weak extended emission (EE) tail. Whether these subcomponents arise from a common physical origin with similar classification properties, or instead represent fundamentally different emission mechanisms within a single event, rema…

    Submitted 26 February, 2026; originally announced February 2026.

    Comments: 35 pages, 5 figures (including 71 panels), 7 tables

  21. arXiv:2602.17706  [pdf, ps, other]

    cs.LG

    Parallel Complex Diffusion for Scalable Time Series Generation

    Authors: Rongyao Cai, Yuxi Wan, Kexin Zhang, Ming Jin, Zhiqiang Ge, Qingsong Wen, Yong Liu

    Abstract: Modeling long-range dependencies in time series generation poses a fundamental trade-off between representational capacity and computational efficiency. Traditional temporal diffusion models suffer from local entanglement and the $\mathcal{O}(L^2)$ cost of attention mechanisms. We address these limitations by introducing PaCoDi (Parallel Complex Diffusion), a spectral-native architecture that deco…

    Submitted 10 February, 2026; originally announced February 2026.

  22. arXiv:2602.16323  [pdf, ps, other]

    cs.HC

    Wearable AR for Restorative Breaks: How Interactive Narrative Experiences Support Relaxation for Young Adults

    Authors: Jindu Wang, Runze Cai, Shuchang Xu, Tianrui Hu, Huamin Qu, Shengdong Zhao, Ling-Ping Yuan

    Abstract: Young adults often take breaks from screen-intensive work by consuming digital content on mobile phones, which undermines rest through visual fatigue and inactivity. We introduce a design framework that embeds light break activities into media content on AR smart glasses, balancing engagement and recovery. The framework employs three strategies: (1) seamlessly guiding users by embedding activity c…

    Submitted 18 February, 2026; originally announced February 2026.

  23. arXiv:2602.12684  [pdf, ps, other]

    cs.RO cs.LG

    Xiaomi-Robotics-0: An Open-Sourced Vision-Language-Action Model with Real-Time Execution

    Authors: Rui Cai, Jun Guo, Xinze He, Piaopiao Jin, Jie Li, Bingxuan Lin, Futeng Liu, Wei Liu, Fei Ma, Kun Ma, Feng Qiu, Heng Qu, Yifei Su, Qiao Sun, Dong Wang, Donghao Wang, Yunhong Wang, Rujie Wu, Diyun Xiang, Yu Yang, Hangjun Ye, Yuan Zhang, Quanyun Zhou

    Abstract: In this report, we introduce Xiaomi-Robotics-0, an advanced vision-language-action (VLA) model optimized for high performance and fast and smooth real-time execution. The key to our method lies in a carefully designed training recipe and deployment strategy. Xiaomi-Robotics-0 is first pre-trained on large-scale cross-embodiment robot trajectories and vision-language data, endowing it with broad an…

    Submitted 25 March, 2026; v1 submitted 13 February, 2026; originally announced February 2026.

    Comments: Project page: https://xiaomi-robotics-0.github.io

  24. arXiv:2602.11527  [pdf, ps, other]

    cs.AI

    CausalAgent: A Conversational Multi-Agent System for End-to-End Causal Inference

    Authors: Jiawei Zhu, Wei Chen, Ruichu Cai

    Abstract: Causal inference holds immense value in fields such as healthcare, economics, and social sciences. However, traditional causal analysis workflows impose significant technical barriers, requiring researchers to possess dual backgrounds in statistics and computer science, while manually selecting algorithms, handling data quality issues, and interpreting complex results. To address these challenges,…

    Submitted 11 February, 2026; originally announced February 2026.

    Comments: Accepted by IUI 2026

  25. arXiv:2602.04337  [pdf, ps, other]

    cs.CV cs.AI

    Fine-tuning Pre-trained Vision-Language Models in a Human-Annotation-Free Manner

    Authors: Qian-Wei Wang, Guanghao Meng, Ren Cai, Yaguang Song, Shu-Tao Xia

    Abstract: Large-scale vision-language models (VLMs) such as CLIP exhibit strong zero-shot generalization, but adapting them to downstream tasks typically requires costly labeled data. Existing unsupervised self-training methods rely on pseudo-labeling, yet often suffer from unreliable confidence filtering, confirmation bias, and underutilization of low-confidence samples. We propose Collaborative Fine-Tunin…

    Submitted 4 February, 2026; originally announced February 2026.

  26. arXiv:2602.00620  [pdf, ps, other]

    cs.LG cs.AI

    Rethinking Zero-Shot Time Series Classification: From Task-specific Classifiers to In-Context Inference

    Authors: Juntao Fang, Shifeng Xie, Shengbin Nie, Yuhui Ling, Yuming Liu, Zijian Li, Keli Zhang, Lujia Pan, Themis Palpanas, Ruichu Cai

    Abstract: The zero-shot evaluation of time series foundation models (TSFMs) for classification typically uses a frozen encoder followed by a task-specific classifier. However, this practice violates the training-free premise of zero-shot deployment and introduces evaluation bias due to classifier-dependent training choices. To address this issue, we propose TIC-FM, an in-context learning framework that trea…

    Submitted 31 January, 2026; originally announced February 2026.

  27. arXiv:2601.21878  [pdf, ps, other]

    gr-qc astro-ph.CO hep-ph

    Numerical simulations of primordial black hole formation via delayed first-order phase transitions

    Authors: Zhuan Ning, Xiang-Xi Zeng, Rong-Gen Cai, Shao-Jiang Wang

    Abstract: We perform fully nonlinear, spherically symmetric numerical simulations of superhorizon false-vacuum-domain (FVD) collapse in a coupled gravity-scalar-fluid system to study primordial black hole (PBH) formation during delayed first-order phase transitions (FOPTs). Using adaptive mesh refinement to resolve the bubble wall, we identify three dynamical outcomes: type B (supercritical) PBHs with an in…

    Submitted 29 January, 2026; originally announced January 2026.

    Comments: 31 pages, 8 figures

  28. arXiv:2601.21693  [pdf, ps, other]

    astro-ph.HE

    Pulse-resolved Classification and Characteristics of Long-duration GRBs with \emph{Swift}-BAT Data. I. Precursors versus Main Bursts

    Authors: Liang Li, Yu Wang, Jin-Jun Geng, Yong-Feng Huang, Rong-Gen Cai

    Abstract: We present a systematic pulse-by-pulse analysis of 22 long-duration GRBs observed by \emph{Swift}, each exhibiting a well-separated precursor before the main burst. We compare duration, spectral hardness ratio, minimum variability timescale (MVT), and spectral lag between these components. Both precursors and main bursts have durations and hardness broadly consistent with Type II GRBs. However, pr…

    Submitted 29 January, 2026; originally announced January 2026.

    Comments: 33 pages, 6 figures (including 73 panels), 6 tables

  29. arXiv:2601.13622  [pdf, ps, other]

    cs.CV cs.AI

    CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models

    Authors: Donghee Lee, Rui Cai, Zhe Zhao

    Abstract: Large vision-language models (LVLMs) are typically trained using autoregressive language modeling objectives, which align visual representations with linguistic space. While effective for multimodal reasoning, this alignment can weaken vision-centric capabilities, causing LVLMs to underperform their base vision encoders on tasks such as image classification. To address this limitation, we propose…

    Submitted 26 March, 2026; v1 submitted 20 January, 2026; originally announced January 2026.

  30. arXiv:2601.13540  [pdf, ps, other]

    quant-ph

    Confined non-Hermitian skin effect in a semi-infinite Fock-state lattice

    Authors: Zhi Jiao Deng, Xing Yao Mi, Ruo Kun Cai, Chun Wang Wu, Ping Xing Chen

    Abstract: In this paper, we investigate the non-Hermitian skin effect in a semi-infinite Fock-state lattice, where the inherent coupling scales as $\sqrt{n}$. By analytically solving a non-uniform, non-reciprocal SSH model, we demonstrate that the intrinsic inhomogeneous coupling, in combination with nonreciprocity, fundamentally modifies the conventional skin effect. Instead of accumulating at the physical b…

    Submitted 19 January, 2026; originally announced January 2026.

    Comments: 7 pages, 6 figures

  31. arXiv:2601.10770  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Unifying Speech Recognition, Synthesis and Conversion with Autoregressive Transformers

    Authors: Runyuan Cai, Yu Lin, Yiming Wang, Chunlin Fu, Xiaodong Zeng

    Abstract: Traditional speech systems typically rely on separate, task-specific models for text-to-speech (TTS), automatic speech recognition (ASR), and voice conversion (VC), resulting in fragmented pipelines that limit scalability, efficiency, and cross-task generalization. In this paper, we present General-Purpose Audio (GPA), a unified audio foundation model that integrates multiple core speech tasks wit…

    Submitted 15 January, 2026; originally announced January 2026.

  32. arXiv:2601.10159  [pdf, ps, other]

    cs.CL

    What Gets Activated: Uncovering Domain and Driver Experts in MoE Language Models

    Authors: Guimin Hu, Meng Li, Qiwei Peng, Lijie Hu, Boyan Xu, Ruichu Cai

    Abstract: Most interpretability work focuses on layer- or neuron-level mechanisms in Transformers, leaving expert-level behavior in MoE LLMs underexplored. Motivated by functional specialization in the human brain, we analyze expert activation by distinguishing domain and driver experts. In this work, we study expert activation in MoE models across three public domains and address two key questions: (1) whi…

    Submitted 20 January, 2026; v1 submitted 15 January, 2026; originally announced January 2026.

  33. arXiv:2601.07699  [pdf, ps, other]

    cond-mat.mtrl-sci physics.ins-det

    Noise2Void for Denoising Atomic Resolution Scanning Transmission Electron Microscopy Images

    Authors: William Thornley, Sam Sullivan-Allsop, Rongsheng Cai, Nick Clark, Roman Gorbachev, Sarah J. Haigh

    Abstract: The Noise2Void technique is demonstrated for successful denoising of atomic-resolution scanning transmission electron microscopy (STEM) images. The technique is applied to denoising atomic resolution images and videos of gold adatoms on a graphene surface within a graphene liquid cell, with the denoised experimental data qualitatively demonstrating improved visibility of both the Au adatoms and th…

    Submitted 12 January, 2026; originally announced January 2026.

    Comments: 51 pages, 17 figures
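    Noise2Void rests on blind-spot masking: a few pixels are replaced by randomly chosen neighbours, and the network is trained to predict the original values only at those positions, so no clean reference image is needed. A minimal NumPy sketch of that masking step (the function name and parameters are illustrative, not taken from the paper):

    ```python
    import numpy as np

    def blind_spot_batch(img, n_mask=16, rng=None):
        """Build one Noise2Void-style training pair: copy the image,
        replace n_mask random interior pixels with a random neighbouring
        pixel value, and record where. The denoiser is then trained to
        predict the original (target) values at the masked positions only."""
        rng = rng or np.random.default_rng(0)
        h, w = img.shape
        inp, target = img.copy(), img.copy()
        mask = np.zeros_like(img, dtype=bool)
        for y, x in zip(rng.integers(1, h - 1, n_mask),
                        rng.integers(1, w - 1, n_mask)):
            dy, dx = 0, 0
            while dy == 0 and dx == 0:  # never copy a pixel onto itself
                dy, dx = rng.integers(-1, 2, 2)
            inp[y, x] = img[y + dy, x + dx]
            mask[y, x] = True
        return inp, target, mask
    ```

    All pixels outside the mask are left untouched, which is what prevents the network from learning the identity mapping.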

  34. arXiv:2601.07123  [pdf, ps, other]

    cs.AI

    ENTRA: Entropy-Based Redundancy Avoidance in Large Language Model Reasoning

    Authors: Ruichu Cai, Haopeng Du, Qingwen Lin, Yutong Chen, Zijian Li, Boyan Xu

    Abstract: Large Reasoning Models (LRMs) often suffer from overthinking, generating unnecessarily long reasoning chains even for simple tasks. This leads to substantial computational overhead with limited performance gain, primarily due to redundant verification and repetitive generation. While prior work typically constrains output length or optimizes correctness, such coarse supervision fails to guide mode…

    Submitted 11 January, 2026; originally announced January 2026.

  35. arXiv:2601.06927  [pdf]

    cond-mat.mes-hall

    Cryogenic interface-state filling and tunneling mechanisms in strained Ge/SiGe heterostructures

    Authors: Jingrui Ma, Yuan Kang, Rui Wu, Zheng Liu, Zong-Hu Li, Tian-Yue Hao, Zhen-Zhen Kong, Gui-Lei Wang, Yong-Qiang Xu, Ran-Ran Cai, Bao-Chuan Wang, Hai-Ou Li, Gang Cao, Guo-Ping Guo

    Abstract: Traps at the semiconductor-oxide interface are considered a major source of instability in strained Ge/SiGe quantum devices, yet quantified study of their cryogenic behavior remains limited. In this work, we investigate interface-state trapping using Hall-bar field-effect transistors fabricated on strained Ge/SiGe heterostructures. Combining transport measurements with long-term stabilizati…

    Submitted 11 January, 2026; originally announced January 2026.

    Comments: 14 pages, 6 figures

  36. arXiv:2601.03479  [pdf, ps, other]

    cs.IR cs.AI

    Efficient Sequential Recommendation for Long Term User Interest Via Personalization

    Authors: Qiang Zhang, Hanchao Yu, Ivan Ji, Chen Yuan, Yi Zhang, Chihuang Liu, Xiaolong Wang, Christopher E. Lambert, Ren Chen, Chen Kovacs, Xinzhu Bei, Renqin Cai, Rui Li, Lizhu Zhang, Xiangjun Fan, Qunshu Zhang, Benyu Zhang

    Abstract: Recent years have witnessed the success of sequential modeling, generative recommenders, and large language models for recommendation. Though the scaling law has been validated for sequential models, it shows inefficiency in computational capacity when considering real-world applications like recommendation, due to the non-linear (quadratic) scaling of the transformer model. To improve the eff…

    Submitted 6 January, 2026; originally announced January 2026.

    Comments: ICDM 2025

  37. arXiv:2512.21151  [pdf, ps, other]

    gr-qc astro-ph.CO hep-ph

    Acoustic gravitational waves from primordial curvature perturbations

    Authors: Zhuan Ning, Zi-Yan Yuwen, Xiang-Xi Zeng, Rong-Gen Cai, Shao-Jiang Wang

    Abstract: Standard perturbative calculations of scalar-induced gravitational waves (SIGWs) have neglected nonperturbative effects in the large-amplitude regime. We develop a hybrid numerical framework to signify nonperturbative effects on the stochastic gravitational wave (GW) background sourced by primordial curvature perturbations, focusing on the acoustic channel (fluid motions). Fully general-relativist…

    Submitted 24 December, 2025; originally announced December 2025.

    Comments: 29 pages, 8 figures, a second companion paper (with direct numerical simulations for the acoustic gravitational waves) to the letter arXiv:2504.11275 and the long paper arXiv:2504.12243

  38. arXiv:2512.19379  [pdf, ps, other]

    cs.LG cs.AI cs.MM

    OmniMER: Auxiliary-Enhanced LLM Adaptation for Indonesian Multimodal Emotion Recognition

    Authors: Xueming Yan, Boyan Xu, Yaochu Jin, Lixian Xiao, Wenlong Ye, Runyang Cai, Zeqi Zheng, Jingfa Liu, Aimin Yang, Yongduan Song

    Abstract: Indonesian, spoken by over 200 million people, remains underserved in multimodal emotion recognition research despite its dominant presence on Southeast Asian social media platforms. We introduce IndoMER, the first multimodal emotion recognition benchmark for Indonesian, comprising 1,944 video segments from 203 speakers with temporally aligned text, audio, and visual annotations across seven emoti…

    Submitted 10 February, 2026; v1 submitted 22 December, 2025; originally announced December 2025.

  39. arXiv:2512.18610  [pdf, ps, other]

    cs.LG

    The Procrustean Bed of Time Series: The Optimization Bias of Point-wise Loss

    Authors: Rongyao Cai, Yuxi Wan, Kexin Zhang, Ming Jin, Hao Wang, Zhiqiang Ge, Daoyi Dong, Yong Liu, Qingsong Wen

    Abstract: Optimizing time series models via point-wise loss functions (e.g., MSE) relying on a heuristic point-wise i.i.d. assumption disregards the causal temporal structure. Focusing on the core independence issue under covariance stationarity, this paper aims to provide a first-principles analysis of the Expectation of Optimization Bias (EOB). Our analysis reveals a fundamental paradigm paradox: The more…

    Submitted 1 February, 2026; v1 submitted 21 December, 2025; originally announced December 2025.

    Comments: 54 pages

  40. Dark matter in ALFALFA galaxies: Investigating galaxy-halo connection

    Authors: Meng Yang, Ling Zhu, Niankun Yu, Yu Lei, Runsheng Cai, Jie Wang, Zheng Zheng

    Abstract: This paper aims to investigate the galaxy-halo connection using a large sample of individual galaxies with $\mathrm{H\,I}$ integrated spectra. We determine their dark matter content by applying a dynamical method based on $\mathrm{H\,I}$ line widths measured with the curve-of-growth technique, together with inclination corrections inferred from optical images. We build a sample of 2453 gas-rich pr… ▽ More

    Submitted 16 December, 2025; originally announced December 2025.

    Comments: 12 pages, 12 figures, accepted by A&A

    Journal ref: A&A 706, A64 (2026)

  41. arXiv:2512.12087  [pdf, ps, other

    cs.CL

    BLASST: Dynamic BLocked Attention Sparsity via Softmax Thresholding

    Authors: Jiayi Yuan, Cameron Shinn, Kai Xu, Jingze Cui, George Klimiashvili, Guangxuan Xiao, Perkz Zheng, Bo Li, Yuxin Zhou, Zhouhai Ye, Weijie You, Tian Zheng, Dominic Brown, Pengbo Wang, Markus Hoehnerbach, Richard Cai, Julien Demouth, John D. Owens, Xia Hu, Song Han, Timmy Liu, Huizi Mao

    Abstract: The growing demand for long-context inference capabilities in Large Language Models (LLMs) has intensified the computational and memory bottlenecks inherent to the self-attention mechanism. To address this challenge, we introduce BLASST, a drop-in, dynamic sparse attention mechanism that accelerates inference by using only a fixed scalar threshold to skip attention blocks. Our method targets pract… ▽ More

    Submitted 6 April, 2026; v1 submitted 12 December, 2025; originally announced December 2025.

  42. arXiv:2512.08403  [pdf, ps, other

    cs.SD

    DFALLM: Achieving Generalizable Multitask Deepfake Detection by Optimizing Audio LLM Components

    Authors: Yupei Li, Li Wang, Yuxiang Wang, Lei Wang, Rizhao Cai, Jie Shi, Björn W. Schuller, Zhizheng Wu

    Abstract: Audio deepfake detection has recently garnered public concern due to its implications for security and reliability. Traditional deep learning methods have been widely applied to this task but often lack generalisability when confronted with newly emerging spoofing techniques and with broader tasks such as spoof attribution recognition rather than simple binary classification. In principle, Large Language… ▽ More

    Submitted 15 December, 2025; v1 submitted 9 December, 2025; originally announced December 2025.

  43. arXiv:2512.04031  [pdf, ps, other

    astro-ph.IM astro-ph.HE cs.AI

    Large Language Models for Limited Noisy Data: A Gravitational Wave Identification Study

    Authors: Yixuan Li, Yuhao Lu, Yang Liu, Liang Li, R. Ruffini, Di Li, Rong-Gen Cai, Xiaoyan Zhu, Wenbin Lin, Yu Wang

    Abstract: This work investigates whether large language models (LLMs) offer advantages over traditional neural networks for astronomical data processing, in regimes with non-Gaussian, non-stationary noise and limited labeled samples. Gravitational wave observations provide a suitable test case: using only 90 LIGO events, fine-tuned LLMs achieve 97.4\% accuracy for identifying signals. Further experiments sh… ▽ More

    Submitted 11 January, 2026; v1 submitted 3 December, 2025; originally announced December 2025.

    Comments: 10 pages, 5 figures, submitted to ApJ

  44. arXiv:2511.22686  [pdf, ps, other

    cs.CV

    Emergent Extreme-View Geometry in 3D Foundation Models

    Authors: Yiwen Zhang, Joseph Tung, Ruojin Cai, David Fouhey, Hadar Averbuch-Elor

    Abstract: 3D foundation models (3DFMs) have recently transformed 3D vision, enabling joint prediction of depths, poses, and point maps directly from images. Yet their ability to reason under extreme, non-overlapping views remains largely unexplored. In this work, we study their internal representations and find that 3DFMs exhibit an emergent understanding of extreme-view geometry, despite never being traine… ▽ More

    Submitted 1 December, 2025; v1 submitted 27 November, 2025; originally announced November 2025.

    Comments: Project page is at https://ext-3dfms.github.io/

  45. arXiv:2511.21402  [pdf, ps, other

    cs.CL

    Text-to-SQL as Dual-State Reasoning: Integrating Adaptive Context and Progressive Generation

    Authors: Zhifeng Hao, Qibin Song, Ruichu Cai, Boyan Xu

    Abstract: Recent divide-and-conquer reasoning approaches, particularly those based on Chain-of-Thought (CoT), have substantially improved the Text-to-SQL capabilities of Large Language Models (LLMs). However, when applied to complex enterprise databases, such methods struggle to maintain coherent reasoning due to limited context capacity, unreliable schema linking, and weak grounding in database semantics.… ▽ More

    Submitted 26 November, 2025; originally announced November 2025.

  46. arXiv:2511.19606  [pdf, ps, other

    gr-qc astro-ph.CO astro-ph.GA hep-ph

    Periodic gravitational lensing by oscillating boson stars

    Authors: Xing-Yu Yang, Tan Chen, Rong-Gen Cai

    Abstract: We show that oscillating (real-scalar) boson stars can act as strictly periodic gravitational lenses and generically host an \emph{oscillating radial caustic}. Sources near this caustic cross it every half period, producing achromatic phase-locked photometric spikes synchronized with an astrometric wobble, providing a promising target for time-domain astronomy. Event-number estimation indicates a… ▽ More

    Submitted 24 November, 2025; originally announced November 2025.

    Comments: 8 pages, 9 figures

  47. arXiv:2511.16664  [pdf, ps, other

    cs.CL

    Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs

    Authors: Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan, Ruisi Cai, Marcin Chochowski, Ameya Sunil Mahabaleshwarkar, Yoshi Suhara, Oluwatobi Olabiyi, Daniel Korzekwa, Mostofa Patwary, Mohammad Shoeybi, Jan Kautz, Bryan Catanzaro, Ashwath Aithal, Nima Tajbakhsh, Pavlo Molchanov

    Abstract: Training a family of large language models targeting multiple scales and deployment objectives is prohibitively expensive, requiring separate training runs for each different size. Recent work on model compression through pruning and knowledge distillation has reduced this cost; however, this process still incurs hundreds of billions of tokens worth of training cost per compressed model. In this p… ▽ More

    Submitted 20 November, 2025; originally announced November 2025.

  48. arXiv:2511.16518  [pdf, ps, other

    cs.RO cs.CL cs.CV

    MiMo-Embodied: X-Embodied Foundation Model Technical Report

    Authors: Xiaoshuai Hao, Lei Zhou, Zhijian Huang, Zhiwen Hou, Yingbo Tang, Lingfeng Zhang, Guang Li, Zheng Lu, Shuhuai Ren, Xianhui Meng, Yuchen Zhang, Jing Wu, Jinghui Lu, Chenxu Dang, Jiayi Guan, Jianhua Wu, Zhiyi Hou, Hanbing Li, Shumeng Xia, Mingliang Zhou, Yinan Zheng, Zihao Yue, Shuhao Gu, Hao Tian, Yuannan Shen , et al. (19 additional authors not shown)

    Abstract: We open-source MiMo-Embodied, the first cross-embodied foundation model to successfully integrate and achieve state-of-the-art performance in both Autonomous Driving and Embodied AI. MiMo-Embodied sets new records across 17 embodied AI benchmarks in Task Planning, Affordance Prediction and Spatial Understanding, while also excelling in 12 autonomous driving benchmarks across Environmental Percepti… ▽ More

    Submitted 20 November, 2025; originally announced November 2025.

    Comments: Code: https://github.com/XiaomiMiMo/MiMo-Embodied Model: https://huggingface.co/XiaomiMiMo/MiMo-Embodied-7B

  49. arXiv:2511.16244  [pdf, ps, other

    astro-ph.CO gr-qc

    Constraining interacting dark energy models with black hole superradiance

    Authors: Zhen-Hong Lyu, Rong-Gen Cai, Shao-Jiang Wang, Xiang-Xi Zeng

    Abstract: The recent preference for a dynamical dark energy (DE) from the Dark Energy Spectroscopic Instrument seems to call for interactions between DE and dark matter (DM), either from direct DE-DM interaction or indirect interaction induced by modified gravity. Therefore, an independent probe for these kinds of DE-DM interactions would be appealing from observational aspects. In this paper, we propose th… ▽ More

    Submitted 7 April, 2026; v1 submitted 20 November, 2025; originally announced November 2025.

    Comments: 20 pages, 4 figures. Added minor corrections. Version accepted for publication in PRD

  50. arXiv:2511.11696  [pdf, ps, other

    cs.LG cs.CV cs.CY

    Toward Dignity-Aware AI: Next-Generation Elderly Monitoring from Fall Detection to ADL

    Authors: Xun Shao, Aoba Otani, Yuto Hirasuka, Runji Cai, Seng W. Loke

    Abstract: This position paper envisions a next-generation elderly monitoring system that moves beyond fall detection toward the broader goal of Activities of Daily Living (ADL) recognition. Our ultimate aim is to design privacy-preserving, edge-deployed, and federated AI systems that can robustly detect and understand daily routines, supporting independence and dignity in aging societies. At present, ADL-sp… ▽ More

    Submitted 12 February, 2026; v1 submitted 12 November, 2025; originally announced November 2025.

    Comments: This is the author's preprint version of a paper accepted for presentation at EAI MONAMI 2025 (to appear in Springer LNICST). The final authenticated version will be available online at Springer Link upon publication

    MSC Class: 68T07 ACM Class: I.2.11; C.2.4; K.4.1