
Showing 1–50 of 995 results for author: Xiao, T

  1. arXiv:2604.14889  [pdf, ps, other]

    cs.AI

    MemoSight: Unifying Context Compression and Multi Token Prediction for Reasoning Acceleration

    Authors: Xinyu Liu, Xin Liu, Bo Jin, Runsong Zhao, Pengcheng Huang, Junhao Ruan, Bei Li, Chunyang Xiao, Tong Xiao, Jingbo Zhu

    Abstract: While Chain-of-thought (CoT) reasoning enables LLMs to solve challenging reasoning problems, as KV cache grows linearly with the number of generated tokens, CoT reasoning faces scaling issues in terms of speed and memory usage. In this work, we propose MemoSight (Memory-Foresight-based reasoning), a unified framework that integrates both context compression and multi-token prediction to mitigate t…

    Submitted 16 April, 2026; originally announced April 2026.

  2. arXiv:2604.14683  [pdf, ps, other]

    cs.AI

    DR$^{3}$-Eval: Towards Realistic and Reproducible Deep Research Evaluation

    Authors: Qianqian Xie, Qingheng Xiong, He Zhu, Tiantian Xia, Xueming Han, Fanyu Meng, Jiakai Wang, Zhiqi Bai, Chengkang Jiang, Zhaohui Wang, Yubin Guo, Yuqing Wen, Jiayang Mao, Zijie Zhang, Shihao Li, Yanghai Wang, Yuxiang Ren, Junlan Feng, Jiaheng Liu

    Abstract: Deep Research Agents (DRAs) aim to solve complex, long-horizon research tasks involving planning, retrieval, multimodal understanding, and report generation, yet their evaluation remains challenging due to dynamic web environments and ambiguous task definitions. We propose DR$^{3}$-Eval, a realistic and reproducible benchmark for evaluating deep research agents on multimodal, multi-file report gen…

    Submitted 16 April, 2026; originally announced April 2026.

  3. arXiv:2604.13938  [pdf, ps, other]

    cs.CV

    ASTRA: Enhancing Multi-Subject Generation with Retrieval-Augmented Pose Guidance and Disentangled Position Embedding

    Authors: Tianze Xia, Zijian Ning, Zonglin Zhao, Mingjia Wang

    Abstract: Subject-driven image generation has shown great success in creating personalized content, but its capabilities are largely confined to single subjects in common poses. Current approaches face a fundamental conflict when handling multiple subjects with complex, distinct actions: preserving individual identities while enforcing precise pose structures. This challenge often leads to identity fusion a…

    Submitted 15 April, 2026; originally announced April 2026.

  4. arXiv:2604.10531  [pdf, ps, other]

    cs.LG cs.AI

    PepBenchmark: A Standardized Benchmark for Peptide Machine Learning

    Authors: Jiahui Zhang, Rouyi Wang, Kuangqi Zhou, Tianshu Xiao, Lingyan Zhu, Yaosen Min, Yang Wang

    Abstract: Peptide therapeutics are widely regarded as the "third generation" of drugs, yet progress in peptide Machine Learning (ML) is hindered by the absence of standardized benchmarks. Here we present PepBenchmark, which unifies datasets, preprocessing, and evaluation protocols for peptide drug discovery. PepBenchmark comprises three components: (1) PepBenchData, a well-curated collection comprising 29…

    Submitted 12 April, 2026; originally announced April 2026.

    Journal ref: ICLR 2026

  5. arXiv:2604.10496  [pdf, ps, other]

    cs.LG

    CodeQuant: Unified Clustering and Quantization for Enhanced Outlier Smoothing in Low-Precision Mixture-of-Experts

    Authors: Xiangyang Yin, Xingyu Liu, Tianhua Xia, Bo Bao, Vithursan Thangarasa, Valavan Manohararajah, Eric Sather, Sai Qian Zhang

    Abstract: Outliers have emerged as a fundamental bottleneck in preserving accuracy for low-precision large models, particularly within Mixture-of-Experts (MoE) architectures that are increasingly central to large-scale language modeling. Under post-training quantization (PTQ), these outliers induce substantial quantization errors, leading to severe accuracy degradation. While recent rotation-based smoothing…

    Submitted 12 April, 2026; originally announced April 2026.

  6. arXiv:2604.09617  [pdf, ps, other]

    cs.AI cs.IR

    AdaQE-CG: Adaptive Query Expansion for Web-Scale Generative AI Model and Data Card Generation

    Authors: Haoxuan Zhang, Ruochi Li, Zhenni Liang, Mehri Sattari, Phat Vo, Collin Qu, Ting Xiao, Junhua Ding, Yang Zhang, Haihua Chen

    Abstract: Transparent and standardized documentation is essential for building trustworthy generative AI (GAI) systems. However, existing automated methods for generating model and data cards still face three major challenges: (i) static templates, as most systems rely on fixed query templates that cannot adapt to diverse paper structures or evolving documentation requirements; (ii) information scarcity, si…

    Submitted 16 March, 2026; originally announced April 2026.

    Comments: This paper has been accepted to the main conference of WWW 2026

  7. arXiv:2604.05519  [pdf, ps, other]

    eess.AS cs.HC cs.LG cs.SD eess.SP

    Active noise cancellation on open-ear smart glasses

    Authors: Kuang Yuan, Freddy Yifei Liu, Tong Xiao, Yiwen Song, Chengyi Shen, Saksham Bhutani, Justin Chan, Swarun Kumar

    Abstract: Smart glasses are becoming an increasingly prevalent wearable platform, with audio as a key interaction modality. However, hearing in noisy environments remains challenging because smart glasses are equipped with open-ear speakers that do not seal the ear canal. Furthermore, the open-ear design is incompatible with conventional active noise cancellation (ANC) techniques, which rely on an error mic…

    Submitted 7 April, 2026; originally announced April 2026.

  8. arXiv:2603.28524  [pdf, ps, other]

    quant-ph physics.comp-ph

    SesQ: A Surface Electrostatic Simulator for Precise Energy Participation Ratio Simulation in Superconducting Qubits

    Authors: Ziang Wang, Shuyuan Guan, Feng Wu, Xiaohang Zhang, Qiong Li, Jianxin Chen, Xin Wan, Tian Xia, Hui-Hai Zhao

    Abstract: An accurate and efficient numerical electromagnetic model for superconducting qubits is essential for characterizing and minimizing design-dependent dielectric losses. The energy participation ratio (EPR) is the commonly adopted metric used to evaluate these losses, but its calculation presents a severe multiscale computational challenge. Conventional finite element method (FEM) requires 3D volume…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: 15 pages, 14 figures, 3 tables

  9. arXiv:2603.28367  [pdf, ps, other]

    cs.CV

    Rethinking Structure Preservation in Text-Guided Image Editing with Visual Autoregressive Models

    Authors: Tao Xia, Jiawei Liu, Yukun Zhang, Ting Liu, Wei Wang, Lei Zhang

    Abstract: Visual autoregressive (VAR) models have recently emerged as a promising family of generative models, enabling a wide range of downstream vision tasks such as text-guided image editing. By shifting the editing paradigm from noise manipulation in diffusion-based methods to token-level operations, VAR-based approaches achieve better background preservation and significantly faster inference. However,…

    Submitted 30 March, 2026; originally announced March 2026.

  10. arXiv:2603.27992  [pdf, ps, other]

    cond-mat.stat-mech

    Scaling of Long-Range Loop-Erased Random Walks

    Authors: Tianning Xiao, Xianzhi Pan, Zhijie Fan, Youjin Deng

    Abstract: We study the scaling properties of long-range loop-erased random walks (LR-LERW), where the underlying random walker performs Lévy-flight-like jumps with a power-law step-length distribution $P(\mathbf{r})\sim |\mathbf{r}|^{-(d+σ)}$. Using extensive Monte Carlo simulations, we measure the scaling relation $N \sim R^{d_N}$ between the loop-erased step number $N$ and the spatial extent $R$, and dete…

    Submitted 29 March, 2026; originally announced March 2026.

  11. arXiv:2603.25108  [pdf, ps, other]

    cs.CV

    MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning

    Authors: Chenglong Wang, Yifu Huo, Yang Gan, Qiaozhi He, Qi Meng, Bei Li, Yan Wang, Junfu Liu, Tianhua Zhou, Jingbo Zhu, Tong Xiao

    Abstract: Recent advances in multimodal reward modeling have been largely driven by a paradigm shift from discriminative to generative approaches. Building on this progress, recent studies have further employed reinforcement learning from verifiable rewards (RLVR) to enhance multimodal reward models (MRMs). Despite their success, RLVR-based training typically relies on labeled multimodal preference data, wh…

    Submitted 26 March, 2026; originally announced March 2026.

    Comments: Accepted by CVPR 2026

  12. arXiv:2603.21705  [pdf, ps, other]

    cs.LG

    Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs

    Authors: Tian Xia

    Abstract: Model merging has emerged as a practical approach to combine capabilities of specialized large language models (LLMs) without additional training. In the Long-to-Short (L2S) scenario, merging a base model with a long-chain-of-thought reasoning model aims to preserve reasoning accuracy while reducing output length. Existing methods rely on Task Arithmetic and its variants, which implicitly assume t…

    Submitted 23 March, 2026; originally announced March 2026.

    Comments: 14 pages, NeurIPS 2026 submission

  13. arXiv:2603.21659  [pdf, ps, other]

    cs.AR

    IMMSched: Interruptible Multi-DNN Scheduling via Parallel Multi-Particle Optimizing Subgraph Isomorphism

    Authors: Boran Zhao, Hetian Liu, Zihang Yuan, Yanbin Hu, Wenzhe Zhao, Tian Xia, Pengju Ren

    Abstract: The growing demand for multi-DNN workloads with unpredictable task arrival times has highlighted the need for interruptible scheduling on edge accelerators. However, existing preemptive frameworks typically assume known task arrival times and rely on CPU-based offline scheduling, which incurs heavy runtime overhead and struggles to handle unpredictable task arrivals. Even worse, prior studies have…

    Submitted 23 March, 2026; originally announced March 2026.

  14. arXiv:2603.21213  [pdf, ps, other]

    cs.CV cs.AI

    Positional Segmentor-Guided Counterfactual Fine-Tuning for Spatially Localized Image Synthesis

    Authors: Tian Xia, Matthew Sinclair, Andreas Schuh, Fabio De Sousa Ribeiro, Raghav Mehta, Rajat Rasal, Esther Puyol-Antón, Samuel Gerber, Kersten Petersen, Michiel Schaap, Ben Glocker

    Abstract: Counterfactual image generation enables controlled data augmentation, bias mitigation, and disease modeling. However, existing methods guided by external classifiers or regressors are limited to subject-level factors (e.g., age) and fail to produce localized structural changes, often resulting in global artifacts. Pixel-level guidance using segmentation masks has been explored, but requires user-d…

    Submitted 22 March, 2026; originally announced March 2026.

  15. arXiv:2603.20218  [pdf, ps, other]

    cs.CL cs.LG

    An experimental study of KV cache reuse strategies in chunk-level caching systems

    Authors: Samuel Cestola, Tianxiang Xia, Zheng Weiyan, Zheng Pengfei, Diego Didona

    Abstract: Retrieval-augmented generation improves large language models' accuracy by adding relevant retrieved text to the prompt. Chunk-level caching (CLC) accelerates inference by precomputing KV caches for these retrieved chunks and reusing them. However, these caches miss cross-attention dependencies between chunks, which can reduce output quality. Several methods try to improve CLC accuracy using diffe…

    Submitted 3 March, 2026; originally announced March 2026.

    ACM Class: I.2.7

  16. arXiv:2603.19733  [pdf, ps, other]

    cs.CL

    PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction

    Authors: Runsong Zhao, Shilei Liu, Jiwei Tang, Langming Liu, Haibin Chen, Weidong Zhang, Yujin Yuan, Tong Xiao, Jingbo Zhu, Wenbo Su, Bo Zheng

    Abstract: While context compression can mitigate the growing inference costs of Large Language Models (LLMs) by shortening contexts, existing methods that specify a target compression ratio or length suffer from unpredictable performance degradation, hindering their reliable deployment. We introduce a paradigm shift to Performance-oriented Context Compression (PoC), where developers specify an acceptable pe…

    Submitted 20 March, 2026; originally announced March 2026.

  17. arXiv:2603.19564  [pdf, ps, other]

    cs.LG

    Wearable Foundation Models Should Go Beyond Static Encoders

    Authors: Yu Yvonne Wu, Yuwei Zhang, Hyungjun Yoon, Ting Dang, Dimitris Spathis, Tong Xia, Qiang Yang, Jing Han, Dong Ma, Sung-Ju Lee, Cecilia Mascolo

    Abstract: Wearable foundation models (WFMs), trained on large volumes of data collected by affordable, always-on devices, have demonstrated strong performance on short-term, well-defined health monitoring tasks, including activity recognition, fitness tracking, and cardiovascular signal assessment. However, most existing WFMs primarily map short temporal windows to predefined labels via static encoders, emp…

    Submitted 19 March, 2026; originally announced March 2026.

    Comments: 13 pages

  18. arXiv:2603.19097  [pdf, ps, other]

    cs.CL cs.AI

    DaPT: A Dual-Path Framework for Multilingual Multi-hop Question Answering

    Authors: Yilin Wang, Yuchun Fan, Jiaoyang Li, Ziming Zhu, Yongyu Mu, Qiaozhi He, Tong Xiao, Jingbo Zhu

    Abstract: Retrieval-augmented generation (RAG) systems have made significant progress in solving complex multi-hop question answering (QA) tasks in the English scenario. However, RAG systems inevitably face the application scenario of retrieving across multilingual corpora and queries, leaving several open challenges. The first one involves the absence of benchmarks that assess RAG systems' capabilities und…

    Submitted 19 March, 2026; originally announced March 2026.

    Comments: Accepted by ICASSP 2026

  19. arXiv:2603.17556  [pdf, ps, other]

    physics.ins-det hep-ex

    Characterization of Deconvolution-Based PMT Waveform Reconstruction Under Large Charge Dynamic Range and Varying Scintillation Time Profiles

    Authors: Xingyi Lin, Jinghuan Xu, Yongbo Huang, Jingzhe Tang, Tianying Xiao, Yingke Li

    Abstract: Photomultiplier tubes (PMTs) are widely used as photon sensors for neutrino and dark matter detection. Accurate charge and time information extracted from PMT waveforms is crucial for event reconstruction. An algorithm based on deconvolution technology was proposed and applied to the reconstruction of PMT waveforms. This study further investigated the reliability of the deconvolution algorithm whe…

    Submitted 20 March, 2026; v1 submitted 18 March, 2026; originally announced March 2026.

  20. arXiv:2603.16483  [pdf, ps, other]

    cs.CL

    On the Emotion Understanding of Synthesized Speech

    Authors: Yuan Ge, Haishu Zhao, Aokai Hao, Junxiang Zhang, Bei Li, Xiaoqian Liu, Chenglong Wang, Jianjin Wang, Bingsen Zhou, Bingyu Liu, Jingbo Zhu, Zhengtao Yu, Tong Xiao

    Abstract: Emotion is a core paralinguistic feature in voice interaction. It is widely believed that emotion understanding models learn fundamental representations that transfer to synthesized speech, making emotion understanding results a plausible reward or evaluation metric for assessing emotional expressiveness in speech synthesis. In this work, we critically examine this assumption by systematically eva…

    Submitted 17 March, 2026; originally announced March 2026.

  21. arXiv:2603.16206  [pdf, ps, other]

    cs.LG cs.CL

    Offline Exploration-Aware Fine-Tuning for Long-Chain Mathematical Reasoning

    Authors: Yongyu Mu, Jiali Zeng, Fandong Meng, JingBo Zhu, Tong Xiao

    Abstract: Through encouraging self-exploration, reinforcement learning from verifiable rewards (RLVR) has significantly advanced the mathematical reasoning capabilities of large language models. As the starting point for RLVR, the capacity of supervised fine-tuning (SFT) to memorize new chain-of-thought trajectories provides a crucial initialization that shapes the subsequent exploration landscape. However,…

    Submitted 17 March, 2026; originally announced March 2026.

    Comments: Work in progress

  22. arXiv:2603.14682  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci

    Giant anomalous Hall conductivity in frustrated magnet EuCo2Al9

    Authors: Sheng Xu, Jian-Feng Zhang, Shu-Xiang Li, Junfa Lin, Xiaobai Ma, Wenyun Yang, Jun-Jian Mi, Zheng Li, Tian-Hao Li, Yue-Yang Wu, Jiang Ma, Qian Tao, Wen-He Jiao, Xiaofeng Xu, Zengwei Zhu, Yuanfeng Xu, Hanjie Guo, Tian-Long Xia, Zhu-An Xu

    Abstract: The interaction between conduction electrons and localized magnetic moments profoundly influences the electrical and magnetic properties of materials, giving rise to a variety of fascinating physical phenomena and quantum effects. Here, we discover a giant anomalous Hall effect (AHE) in a frustrated Eu-based magnet, exhibiting a giant anomalous Hall conductivity (AHC) of 31000 Ω$^{-1}$ cm$^{-1}$ and a remark…

    Submitted 15 March, 2026; originally announced March 2026.

    Comments: 15 pages, 5 figures. To appear in Materials Today

  23. arXiv:2603.13480  [pdf, ps, other]

    hep-ph astro-ph.HE

    Blazar Constraints on Axions through New Spectral Modulation Searches in 1ES 1959+650 & B2 1811+31

    Authors: Andrea Giovanni De Marchi, Orion Ning, Tianzhuo Xiao

    Abstract: Blazars are unique astrophysical environments whose high-energy $γ$-ray spectra are susceptible to modulations in the presence of ultralight axions. We search for these modulations, induced by axion-photon mixing, in Fermi-LAT spectral data of previously unexplored blazar targets, focusing in particular on blazars 1ES 1959+650 and B2 1811+31, whose flare states provide a clean testbed for axion ac…

    Submitted 13 March, 2026; originally announced March 2026.

    Comments: 14 pages + references, 11 figures, 1 table

  24. arXiv:2603.11327  [pdf, ps, other]

    cs.LG cs.CL

    Meta-Reinforcement Learning with Self-Reflection for Agentic Search

    Authors: Teng Xiao, Yige Yuan, Hamish Ivison, Huaisheng Zhu, Faeze Brahman, Nathan Lambert, Pradeep Dasigi, Noah A. Smith, Hannaneh Hajishirzi

    Abstract: This paper introduces MR-Search, an in-context meta reinforcement learning (RL) formulation for agentic search with self-reflection. Instead of optimizing a policy within a single independent episode with sparse rewards, MR-Search trains a policy that conditions on past episodes and adapts its search strategy across episodes. MR-Search learns to learn a search strategy with self-reflection, allowi…

    Submitted 18 March, 2026; v1 submitted 11 March, 2026; originally announced March 2026.

    Comments: 23 pages, Preprint

  25. arXiv:2603.09221  [pdf, ps, other]

    cs.LG

    Beyond Test-Time Training: Learning to Reason via Hardware-Efficient Optimal Control

    Authors: Peihao Wang, Shan Yang, Xijun Wang, Tesi Xiao, Xin Liu, Changlong Yu, Yu Lou, Pan Li, Zhangyang Wang, Ming Lin, René Vidal

    Abstract: Associative memory has long underpinned the design of sequential models. Beyond recall, humans reason by projecting future states and selecting goal-directed actions, a capability that modern language models increasingly require but do not natively encode. While prior work uses reinforcement learning or test-time training, planning remains external to the model architecture. We formulate reasoning…

    Submitted 10 March, 2026; originally announced March 2026.

  26. arXiv:2603.07599  [pdf, ps, other]

    cs.CL

    StyleBench: Evaluating Speech Language Models on Conversational Speaking Style Control

    Authors: Haishu Zhao, Aokai Hao, Yuan Ge, Zhenqiang Hong, Tong Xiao, Jingbo Zhu

    Abstract: Speech language models (SLMs) have significantly extended the interactive capability of text-based Large Language Models (LLMs) by incorporating paralinguistic information. For a more realistic interactive experience with customized styles, current SLMs have managed to interpret and control speaking style intensity from user prompts during the dialogue process. However, there remains a lack of syste…

    Submitted 8 March, 2026; originally announced March 2026.

  27. arXiv:2603.06542  [pdf, ps, other]

    cs.SD cs.AI

    RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering

    Authors: Gaia A. Bertolino, Yuwei Zhang, Tong Xia, Domenico Talia, Cecilia Mascolo

    Abstract: Conversational generative AI is rapidly entering healthcare, where general-purpose models must integrate heterogeneous patient signals and support diverse interaction styles while producing clinically meaningful outputs. In respiratory care, non-invasive audio, such as recordings captured via mobile microphones, enables scalable screening and longitudinal monitoring, but the heterogeneity challeng…

    Submitted 6 March, 2026; originally announced March 2026.

  28. arXiv:2603.04491  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci

    Giant Magnetocrystalline Anisotropy in Honeycomb Iridate NiIrO3 with Large Coercive Field Exceeding 17 T

    Authors: Chuanhui Zhu, Pengfei Tan, Xiao-Sheng Ni, Jingchun Gao, Yuting Chang, Mei-Huan Zhao, Zheng Deng, Shuang Zhao, Tao Xia, Jinjin Yang, Changqing Jin, Junfeng Wang, Chengliang Lu, Yisheng Chai, Dao-Xin Yao, Man-Rong Li

    Abstract: The realization of unconventional quantum phases in frustrated and spin-orbit coupled materials remains at the forefront of quantum materials research. Here we report the synthesis and discovery of NiIrO3, the first honeycomb iridate with coupled 3d-5d magnetic sublattices, through a soft topotactic reaction. Structural analysis reveals an ilmenite-type stacking of edge-sharing NiO6 and IrO6 octah…

    Submitted 4 March, 2026; originally announced March 2026.

  29. arXiv:2603.02932  [pdf]

    cond-mat.mes-hall

    A simple scheme to realize the Rice-Mele model in acoustic system

    Authors: Tianzhi Xia, Xiying Fan, Qi Chen, Yuanlei Zhang, Zhe Li

    Abstract: The Rice-Mele (RM) model, as a paradigmatic extension of the Su-Schrieffer-Heeger (SSH) chain, plays a pivotal role in understanding topological phases and quantized adiabatic transport in one-dimensional systems. Its realization in acoustic systems, however, has been hindered by the need for simultaneous precise modulation of on-site potentials and couplings. In this work, we demonstrate a method…

    Submitted 4 March, 2026; v1 submitted 3 March, 2026; originally announced March 2026.

    Comments: 10 pages, 4 figures, article in press (Chinese Physics B). https://doi.org/10.1088/1674-1056/ae3473. v2: Added references [17-19] to acknowledge the prior work on shift currents

  30. arXiv:2603.02266  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    When Scaling Fails: Mitigating Audio Perception Decay of LALMs via Multi-Step Perception-Aware Reasoning

    Authors: Ruixiang Mao, Xiangnan Ma, Dan Chen, Ziming Zhu, Yuan Ge, Aokai Hao, Haishu Zhao, Yifu Huo, Qing Yang, Kaiyan Chang, Xiaoqian Liu, Chenglong Wang, Qiaozhi He, Tong Xiao, Jingbo Zhu

    Abstract: Test-Time Scaling has shown notable efficacy in addressing complex problems through scaling inference compute. However, within Large Audio-Language Models (LALMs), an unintuitive phenomenon exists: post-training models for structured reasoning trajectories results in marginal or even negative gains compared to post-training for direct answering. To investigate it, we introduce CAFE, an evaluation…

    Submitted 28 February, 2026; originally announced March 2026.

    Comments: Under Review

  31. arXiv:2603.00155  [pdf, ps, other]

    cs.CV cs.AI cs.IR

    EfficientPosterGen: Semantic-aware Efficient Poster Generation via Token Compression and Accurate Violation Detection

    Authors: Wenxin Tang, Jingyu Xiao, Yanpei Gong, Fengyuan Ran, Tongchuan Xia, Junliang Liu, Man Ho Lam, Wenxuan Wang, Michael R. Lyu

    Abstract: Automated academic poster generation aims to distill lengthy research papers into concise, visually coherent presentations. Existing Multimodal Large Language Models (MLLMs) based approaches, however, suffer from three critical limitations: low information density in full-paper inputs, excessive token consumption, and unreliable layout verification. We present EfficientPosterGen, an end-to-end fra…

    Submitted 25 February, 2026; originally announced March 2026.

  32. arXiv:2603.00058  [pdf, ps, other]

    cs.CY cs.AI

    PaperRepro: Automated Computational Reproducibility Assessment for Social Science Papers

    Authors: Linhao Zhang, Tong Xia, Jinghua Piao, Lizhen Cui, Yong Li

    Abstract: Computational reproducibility is essential for the credibility of scientific findings, particularly in the social sciences, where findings often inform real-world decisions. Manual reproducibility assessment is costly and time-consuming, as it is nontrivial to reproduce the reported findings using the authors' released code and data. Recent advances in large models (LMs) have inspired agent-based…

    Submitted 10 February, 2026; originally announced March 2026.

  33. arXiv:2602.22584  [pdf, ps, other]

    cs.CL

    Towards Faithful Industrial RAG: A Reinforced Co-adaptation Framework for Advertising QA

    Authors: Wenwei Li, Ming Xu, Tianle Xia, Lingxiang Hu, Yiding Sun, Linfang Shang, Liqun Liu, Peng Shu, Huan Yu, Jie Jiang

    Abstract: Industrial advertising question answering (QA) is a high-stakes task in which hallucinated content, particularly fabricated URLs, can lead to financial loss, compliance violations, and legal risk. Although Retrieval-Augmented Generation (RAG) is widely adopted, deploying it in production remains challenging because industrial knowledge is inherently relational, frequently updated, and insufficient…

    Submitted 25 February, 2026; originally announced February 2026.

  34. arXiv:2602.22576  [pdf, ps, other]

    cs.CL cs.IR cs.LG

    Search-P1: Path-Centric Reward Shaping for Stable and Efficient Agentic RAG Training

    Authors: Tianle Xia, Ming Xu, Lingxiang Hu, Yiding Sun, Wenwei Li, Linfang Shang, Liqun Liu, Peng Shu, Huan Yu, Jie Jiang

    Abstract: Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by incorporating external knowledge, yet traditional single-round retrieval struggles with complex multi-step reasoning. Agentic RAG addresses this by enabling LLMs to dynamically decide when and what to retrieve, but current RL-based training methods suffer from sparse outcome rewards that discard intermediate signals and…

    Submitted 25 February, 2026; originally announced February 2026.

  35. arXiv:2602.18709  [pdf, ps, other]

    cs.CV cs.RO

    IRIS-SLAM: Unified Geo-Instance Representations for Robust Semantic Localization and Mapping

    Authors: Tingyang Xiao, Liu Liu, Wei Feng, Zhengyu Zou, Xiaolin Zhou, Wei Sui, Hao Li, Dingwen Zhang, Zhizhong Su

    Abstract: Geometry foundation models have significantly advanced dense geometric SLAM, yet existing systems often lack deep semantic understanding and robust loop closure capabilities. Meanwhile, contemporary semantic mapping approaches are frequently hindered by decoupled architectures and fragile data association. We propose IRIS-SLAM, a novel RGB semantic SLAM system that leverages unified geometric-inst…

    Submitted 27 March, 2026; v1 submitted 20 February, 2026; originally announced February 2026.

  36. arXiv:2602.18452  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    RA-QA: A Benchmarking System for Respiratory Audio Question Answering Under Real-World Heterogeneity

    Authors: Gaia A. Bertolino, Yuwei Zhang, Tong Xia, Domenico Talia, Cecilia Mascolo

    Abstract: As conversational multimodal AI tools are increasingly adopted to process patient data for health assessment, robust benchmarks are needed to measure progress and expose failure modes under realistic conditions. Despite the importance of respiratory audio for mobile health screening, respiratory audio question answering remains underexplored, with existing studies evaluated narrowly and lacking re…

    Submitted 5 March, 2026; v1 submitted 4 February, 2026; originally announced February 2026.

  37. arXiv:2602.15918  [pdf, ps, other]

    cs.CV cs.AI

    EarthSpatialBench: Benchmarking Spatial Reasoning Capabilities of Multimodal LLMs on Earth Imagery

    Authors: Zelin Xu, Yupu Zhang, Saugat Adhikari, Saiful Islam, Tingsong Xiao, Zibo Liu, Shigang Chen, Da Yan, Zhe Jiang

    Abstract: Benchmarking spatial reasoning in multimodal large language models (MLLMs) has attracted growing interest in computer vision due to its importance for embodied AI and other agentic systems that require precise interaction with the physical world. However, spatial reasoning on Earth imagery has lagged behind, as it uniquely involves grounding objects in georeferenced images and quantitatively reaso…

    Submitted 17 February, 2026; originally announced February 2026.

  38. arXiv:2602.14257  [pdf, ps, other]

    cs.CL cs.AI cs.IR cs.LG

    AD-Bench: A Real-World, Trajectory-Aware Advertising Analytics Benchmark for LLM Agents

    Authors: Lingxiang Hu, Yiding Sun, Tianle Xia, Wenwei Li, Ming Xu, Liqun Liu, Peng Shu, Huan Yu, Jie Jiang

    Abstract: While Large Language Model (LLM) agents have achieved remarkable progress in complex reasoning tasks, evaluating their performance in real-world environments has become a critical problem. Current benchmarks, however, are largely restricted to idealized simulations, failing to address the practical demands of specialized domains like advertising and marketing analytics. In these fields, tasks are…

    Submitted 15 February, 2026; originally announced February 2026.

    Comments: 15 pages, 11 figures

  39. arXiv:2602.13551  [pdf, ps, other]

    cs.CL

    Small Reward Models via Backward Inference

    Authors: Yike Wang, Faeze Brahman, Shangbin Feng, Teng Xiao, Hannaneh Hajishirzi, Yulia Tsvetkov

    Abstract: Reward models (RMs) play a central role throughout the language model (LM) pipeline, particularly in non-verifiable domains. However, the dominant LLM-as-a-Judge paradigm relies on the strong reasoning capabilities of large models, while alternative approaches require reference responses or explicit rubrics, limiting flexibility and broader accessibility. In this work, we propose FLIP (FLipped Inf…

    Submitted 25 February, 2026; v1 submitted 13 February, 2026; originally announced February 2026.

  40. arXiv:2602.11645  [pdf]

    cond-mat.mtrl-sci

    Epitaxial Growth and Anomalous Hall Effect in High-Quality Altermagnetic $α$-MnTe Thin Films

    Authors: Tian-Hao Shao, Xingze Dai, Wenyu Hu, Ming-Yuan Zhu, Yuanqiang He, Lin-He Yang, Jingjing Liu, Meng Yang, Xiang-Rui Liu, Jing-Jing Shi, Tian-Yi Xiao, Yu-Jie Hao, Xiao-Ming Ma, Yue Dai, Meng Zeng, Qinwu Gao, Gan Wang, Junxue Li, Chao Wang, Chang Liu

    Abstract: The recent identification of $α$-MnTe as a candidate altermagnet has attracted considerable interest, particularly for its potential application in magnetic random-access memory. However, the development of high-quality thin films - essential for practical implementation - has remained limited. Here, we report the epitaxial growth of centimeter-scale $α$-MnTe thin films on InP(111) substrates via…

    Submitted 12 February, 2026; originally announced February 2026.

    Comments: 27 pages, 5 figures. Submitted on Jan. 21, 2026

  41. arXiv:2602.09311  [pdf, ps, other]

    cs.SE

    Cross-Project Flakiness: A Case Study of the OpenStack Ecosystem

    Authors: Tao Xiao, Dong Wang, Shane McIntosh, Hideaki Hata, Yasutaka Kamei

    Abstract: Automated regression testing is a cornerstone of modern software development, often contributing directly to code review and Continuous Integration (CI). Yet some tests suffer from flakiness, where their outcomes vary non-deterministically. Flakiness erodes developer trust in test results, wastes computational resources, and undermines CI reliability. While prior research has examined test flakine…

    Submitted 9 February, 2026; originally announced February 2026.

  42. arXiv:2602.09082  [pdf, ps, other]

    cs.CV cs.AI cs.CL cs.LG

    UI-Venus-1.5 Technical Report

    Authors: Venus Team, Changlong Gao, Zhangxuan Gu, Yulin Liu, Xinyu Qiu, Shuheng Shen, Yue Wen, Tianyu Xia, Zhenyu Xu, Zhengwen Zeng, Beitong Zhou, Xingran Zhou, Weizhi Chen, Sunhao Dai, Jingya Dou, Yichen Gong, Yuan Guo, Zhenlin Guo, Feng Li, Qian Li, Jinzhen Lin, Yuqi Zhou, Linchao Zhu, Liang Chen, Zhenyu Guo, et al. (2 additional authors not shown)

    Abstract: GUI agents have emerged as a powerful paradigm for automating interactions in digital environments, yet achieving both broad generality and consistently strong task performance remains challenging. In this report, we present UI-Venus-1.5, a unified, end-to-end GUI Agent designed for robust real-world applications. The proposed model family comprises two dense variants (2B and 8B) and one mixture-o…

    Submitted 24 February, 2026; v1 submitted 9 February, 2026; originally announced February 2026.

  43. arXiv:2602.08559  [pdf, ps, other]

    cs.IR

    QARM V2: Quantitative Alignment Multi-Modal Recommendation for Reasoning User Sequence Modeling

    Authors: Tian Xia, Jiaqi Zhang, Yueyang Liu, Hongjian Dou, Tingya Yin, Jiangxia Cao, Xulei Liang, Tianlu Xie, Lihao Liu, Xiang Chen, Shen Wang, Changxin Lao, Haixiang Gan, Jinkai Yu, Keting Cen, Lu Hao, Xu Zhang, Qiqiang Zhong, Zhongbo Sun, Yiyu Wang, Shuang Yang, Mingxin Wen, Xiangyu Wu, Shaoguo Liu, Tingting Gao, et al. (3 additional authors not shown)

    Abstract: With the evolution of large language models (LLMs), there is growing interest in leveraging their rich semantic understanding to enhance industrial recommendation systems (RecSys). Traditional RecSys relies on ID-based embeddings for user sequence modeling in the General Search Unit (GSU) and Exact Search Unit (ESU) paradigm, which suffers from low information density, knowledge isolation, and wea…

    Submitted 9 February, 2026; originally announced February 2026.

    Comments: Work in progress

  44. arXiv:2602.07621  [pdf, ps, other]

    cs.CL

    SciClaimEval: Cross-modal Claim Verification in Scientific Papers

    Authors: Xanh Ho, Yun-Ang Wu, Sunisth Kumar, Tian Cheng Xia, Florian Boudin, Andre Greiner-Petter, Akiko Aizawa

    Abstract: We present SciClaimEval, a new scientific dataset for the claim verification task. Unlike existing resources, SciClaimEval features authentic claims, including refuted ones, directly extracted from published papers. To create refuted claims, we introduce a novel approach that modifies the supporting evidence (figures and tables), rather than altering the claims or relying on large language models…

    Submitted 13 February, 2026; v1 submitted 7 February, 2026; originally announced February 2026.

    Comments: Accepted at LREC 2026; 12 pages; data is available at https://sciclaimeval.github.io/

  45. arXiv:2602.07320  [pdf, ps, other]

    cs.LG

    Incorruptible Neural Networks: Training Models that can Generalize to Large Internal Perturbations

    Authors: Philip Jacobson, Ben Feinberg, Suhas Kumar, Sapan Agarwal, T. Patrick Xiao, Christopher Bennett

    Abstract: Flat regions of the neural network loss landscape have long been hypothesized to correlate with better generalization properties. A closely related but distinct problem is training models that are robust to internal perturbations to their weights, which may be an important need for future low-power hardware platforms. In this paper, we explore the usage of two methods, sharpness-aware minimization…

    Submitted 6 February, 2026; originally announced February 2026.

  46. arXiv:2602.02792  [pdf]

    quant-ph physics.chem-ph

    Experimental Quantification of Spin-Phonon Coupling in Molecular Qubits using Inelastic Neutron Scattering

    Authors: Stefan H. Lohaus, Kay T. Xia, Yongqiang Cheng, Ryan G. Hadt

    Abstract: Electronic spin superposition states enable nanoscale sensing through their sensitivity to the local environment, yet their sensitivity to vibrational motion also limits their coherence times. In molecular spin systems, chemical tunability and atomic-scale resolution are accompanied by a dense, thermally accessible phonon spectrum that introduces efficient spin relaxation pathways. Despite extensi…

    Submitted 2 February, 2026; originally announced February 2026.

    Comments: 21 pages, 5 figures, 1 table

  47. arXiv:2602.01766  [pdf, ps, other]

    cs.LG cs.AI

    CoMeT: Collaborative Memory Transformer for Efficient Long Context Modeling

    Authors: Runsong Zhao, Shilei Liu, Jiwei Tang, Langming Liu, Haibin Chen, Weidong Zhang, Yujin Yuan, Tong Xiao, Jingbo Zhu, Wenbo Su, Bo Zheng

    Abstract: The quadratic complexity and indefinitely growing key-value (KV) cache of standard Transformers pose a major barrier to long-context processing. To overcome this, we introduce the Collaborative Memory Transformer (CoMeT), a novel architecture that enables LLMs to handle arbitrarily long sequences with constant memory usage and linear time complexity. Designed as an efficient, plug-in module, CoMeT…

    Submitted 2 February, 2026; originally announced February 2026.

  48. arXiv:2602.01078  [pdf, ps, other]

    cs.AI

    AutoHealth: An Uncertainty-Aware Multi-Agent System for Autonomous Health Data Modeling

    Authors: Tong Xia, Weibin Li, Gang Liu, Yong Li

    Abstract: LLM-based agents have demonstrated strong potential for autonomous machine learning, yet their applicability to health data remains limited. Existing systems often struggle to generalize across heterogeneous health data modalities, rely heavily on predefined solution templates with insufficient adaptation to task-specific objectives, and largely overlook uncertainty estimation, which is essential…

    Submitted 1 February, 2026; originally announced February 2026.

  49. arXiv:2602.00760  [pdf, ps, other]

    cs.CL

    APR: Penalizing Structural Redundancy in Large Reasoning Models via Anchor-based Process Rewards

    Authors: Kaiyan Chang, Chenwei Zhu, Yingfeng Luo, Yifu Huo, Chenglong Wang, Xiaoqian Liu, Qiaozhi He, Tong Xiao, Zhengtao Yu, Jingbo Zhu

    Abstract: Test-Time Scaling (TTS) has significantly enhanced the capabilities of Large Reasoning Models (LRMs) but introduces a critical side-effect known as Overthinking. We conduct a preliminary study to rethink this phenomenon from a fine-grained perspective. We observe that LRMs frequently conduct repetitive self-verification without revision even after obtaining the final answer during the reasoning pr…

    Submitted 9 February, 2026; v1 submitted 31 January, 2026; originally announced February 2026.

    Comments: Under Review

  50. arXiv:2601.22580  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    SpanNorm: Reconciling Training Stability and Performance in Deep Transformers

    Authors: Chao Wang, Bei Li, Jiaqi Zhang, Xinyu Liu, Yuchun Fan, Linkun Lyu, Xin Chen, Jingang Wang, Tong Xiao, Peng Pei, Xunliang Cai

    Abstract: The success of Large Language Models (LLMs) hinges on the stable training of deep Transformer architectures. A critical design choice is the placement of normalization layers, leading to a fundamental trade-off: the "PreNorm" architecture ensures training stability at the cost of potential performance degradation in deep models, while the "PostNorm" architecture offers strong performance but s…

    Submitted 30 January, 2026; originally announced January 2026.