Showing 1–50 of 357 results for author: Du, L

Searching in archive cs.
  1. arXiv:2604.10912  [pdf, ps, other]

    cs.CV

    TAMISeg: Text-Aligned Multi-scale Medical Image Segmentation with Semantic Encoder Distillation

    Authors: Qiang Gao, Yi Wang, Yong Zhang, Yong Li, Yongbing Deng, Lan Du, Cunjian Chen

    Abstract: Medical image segmentation remains challenging due to limited fine-grained annotations, complex anatomical structures, and image degradation from noise, low contrast, or illumination variation. We propose TAMISeg, a text-guided segmentation framework that incorporates clinical language prompts and semantic distillation as auxiliary semantic cues to enhance visual understanding and reduce reliance… ▽ More

    Submitted 12 April, 2026; originally announced April 2026.

    Comments: Accepted by IEEE International Conference on Multimedia and Expo (ICME), 2026

  2. arXiv:2604.04236  [pdf, ps, other]

    cs.PL cs.AR

    NEURA: A Unified and Retargetable Compilation Framework for Coarse-Grained Reconfigurable Architectures

    Authors: Shangkun Li, Jinming Ge, Diyuan Tao, Zeyu Li, Jiawei Liang, Linfeng Du, Jiang Xu, Wei Zhang, Cheng Tan

    Abstract: Coarse-Grained Reconfigurable Architectures (CGRAs) are a promising and versatile accelerator platform, offering a balance between the performance and efficiency of specialized accelerators and the software programmability. However, their full potential is severely hindered by control flow in accelerated kernels, as the control flow (e.g., loops, branches) is fundamentally incompatible with the pa… ▽ More

    Submitted 5 April, 2026; originally announced April 2026.

    Comments: Accepted by PLDI 2026

  3. arXiv:2604.02369  [pdf, ps, other]

    cs.NI cs.AI

    Beyond Message Passing: A Semantic View of Agent Communication Protocols

    Authors: Dun Yuan, Fuyuan Lyu, Ye Yuan, Weixu Zhang, Bowei He, Jiayi Geng, Linfeng Du, Zipeng Sun, Yankai Chen, Changjiang Han, Jikun Kang, Xi Chen, Haolun Wu, Xue Liu

    Abstract: Agent communication protocols are becoming critical infrastructure for large language model (LLM) systems that must use tools, coordinate with other agents, and operate across heterogeneous environments. This work presents a human-inspired perspective on this emerging landscape by organizing agent communication into three layers: communication, syntactic, and semantic. Under this framework, we sys… ▽ More

    Submitted 13 April, 2026; v1 submitted 29 March, 2026; originally announced April 2026.

  4. arXiv:2603.28334  [pdf, ps, other]

    cs.LG cs.DC

    Key-Embedded Privacy for Decentralized AI in Biomedical Omics

    Authors: Rongyu Zhang, Hongyu Dong, Gaole Dai, Ziqi Qiao, Shenli Zheng, Yuan Zhang, Aosong Cheng, Xiaowei Chi, Jincai Luo, Pin Li, Li Du, Dan Wang, Yuan Du, Xudong Xing, Jianxu Chen, Shanghang Zhang

    Abstract: The rapid adoption of data-driven methods in biomedicine has intensified concerns over privacy, governance, and regulation, limiting raw data sharing and hindering the assembly of representative cohorts for clinically relevant AI. This landscape necessitates practical, efficient privacy solutions, as cryptographic defenses often impose heavy overhead and differential privacy can degrade performanc… ▽ More

    Submitted 30 March, 2026; originally announced March 2026.

  5. arXiv:2603.28253  [pdf, ps, other]

    cs.LG cs.AI

    MR-ImagenTime: Multi-Resolution Time Series Generation through Dual Image Representations

    Authors: Xianyong Xu, Yuanjun Zuo, Zhihong Huang, Yihan Qin, Haoxian Xu, Leilei Du, Haotian Wang

    Abstract: Time series forecasting is vital across many domains, yet existing models struggle with fixed-length inputs and inadequate multi-scale modeling. We propose MR-CDM, a framework combining hierarchical multi-resolution trend decomposition, an adaptive embedding mechanism for variable-length inputs, and a multi-scale conditional diffusion process. Evaluations on four real-world datasets demonstrate th… ▽ More

    Submitted 7 April, 2026; v1 submitted 30 March, 2026; originally announced March 2026.

  6. arXiv:2603.28239  [pdf, ps, other]

    cs.AR

    A Switch-Centric In-Network Architecture for Accelerating LLM Inference in Shared-Memory Network

    Authors: Aojie Jiang, Kang Zhu, Zhiheng Zhang, Zhengxu Su, Juntao Liu, Yuan Du, Li Du

    Abstract: In-network computing techniques, exemplified by NVLink SHARP (NVLS), offer a promising approach to addressing the communication bottlenecks in LLM inference by offloading collective operations such as All-Reduce to switches. However, the accelerator-centric architecture of NVLS suffers from two fundamental limitations: 1) it relies on GPU load instructions to trigger in-switch reduction, which mea… ▽ More

    Submitted 8 April, 2026; v1 submitted 30 March, 2026; originally announced March 2026.

  7. arXiv:2603.24506  [pdf, ps, other]

    cs.CV

    Toward Physically Consistent Driving Video World Models under Challenging Trajectories

    Authors: Jiawei Zhou, Zhenxin Zhu, Lingyi Du, Linye Lyu, Lijun Zhou, Zhanqian Wu, Hongcheng Luo, Zhuotao Tian, Bing Wang, Guang Chen, Hangjun Ye, Haiyang Sun, Yu Li

    Abstract: Video generation models have shown strong potential as world models for autonomous driving simulation. However, existing approaches are primarily trained on real-world driving datasets, which mostly contain natural and safe driving scenarios. As a result, current models often fail when conditioned on challenging or counterfactual trajectories-such as imperfect trajectories generated by simulators… ▽ More

    Submitted 1 April, 2026; v1 submitted 25 March, 2026; originally announced March 2026.

  8. arXiv:2603.19255  [pdf, ps, other]

    cs.CL cs.AI

    LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

    Authors: Wei Zhang, Lintong Du, Yuanhe Zhang, Zhenhong Zhou, Kun Wang, Li Sun, Sen Su

    Abstract: Despite the strong performance of Large Language Models (LLMs) on complex instruction-following tasks, precise control of output length remains a persistent challenge. Existing methods primarily attempt to enforce length constraints by externally imposing length signals or optimization objectives, while largely overlooking the underlying limitation: the model's intrinsic deficit in length cognitio… ▽ More

    Submitted 25 February, 2026; originally announced March 2026.

    Comments: 19 pages, 6 figures

  9. arXiv:2603.15432  [pdf, ps, other]

    cs.CV

    Gym-V: A Unified Vision Environment System for Agentic Vision Research

    Authors: Fanqing Meng, Lingxiao Du, Jiawei Gu, Jiaqi Liao, Linjie Li, Zijian Wu, Xiangyan Liu, Ziqi Zhao, Mengkang Hu, Zichen Liu, Jiaheng Zhang, Michael Qizhe Shieh

    Abstract: As agentic systems increasingly rely on reinforcement learning from verifiable rewards, standardized "gym" infrastructure has become essential for rapid iteration, reproducibility, and fair comparison. Vision agents lack such infrastructure, limiting systematic study of what drives their learning and where current models fall short. We introduce Gym-V, a unified platform of 179 procedur… ▽ More

    Submitted 8 April, 2026; v1 submitted 16 March, 2026; originally announced March 2026.

  10. arXiv:2603.01145  [pdf, ps, other]

    cs.AI

    AutoSkill: Experience-Driven Lifelong Learning via Skill Self-Evolution

    Authors: Yutao Yang, Junsong Li, Qianjun Pan, Bihao Zhan, Yuxuan Cai, Lin Du, Jie Zhou, Kai Chen, Qin Chen, Xin Li, Bo Zhang, Liang He

    Abstract: In practical LLM applications, users repeatedly express stable preferences and requirements, such as reducing hallucinations, following institutional writing conventions, or avoiding overly technical wording, yet such interaction experience is seldom consolidated into reusable knowledge. Consequently, LLM agents often fail to accumulate personalized capabilities across sessions. We present AutoSki… ▽ More

    Submitted 4 March, 2026; v1 submitted 1 March, 2026; originally announced March 2026.

  11. arXiv:2602.23787  [pdf, ps, other]

    cs.AR

    FPPS: An FPGA-Based Point Cloud Processing System

    Authors: Xiaofeng Zhou, Linfeng Du, Hanwei Fan, Wei Zhang

    Abstract: Point cloud processing is a computational bottleneck in autonomous driving systems, especially for real-time applications, while energy efficiency remains a critical system constraint. This work presents FPPS, an FPGA-accelerated point cloud processing system designed to optimize the iterative closest point (ICP) algorithm, a classic cornerstone of 3D localization and perception pipelines. Evaluat… ▽ More

    Submitted 27 February, 2026; originally announced February 2026.

  12. arXiv:2602.15312  [pdf]

    cs.CL econ.EM

    Extracting Consumer Insight from Text: A Large Language Model Approach to Emotion and Evaluation Measurement

    Authors: Stephan Ludwig, Peter J. Danaher, Xiaohao Yang, Yu-Ting Lin, Ehsan Abedin, Dhruv Grewal, Lan Du

    Abstract: Accurately measuring consumer emotions and evaluations from unstructured text remains a core challenge for marketing research and practice. This study introduces the Linguistic eXtractor (LX), a fine-tuned, large language model trained on consumer-authored text that also has been labeled with consumers' self-reported ratings of 16 consumption-related emotions and four evaluation constructs: trust,… ▽ More

    Submitted 16 February, 2026; originally announced February 2026.

  13. arXiv:2602.13541  [pdf, ps, other]

    cs.CR

    DWBench: Holistic Evaluation of Watermark for Dataset Copyright Auditing

    Authors: Xiao Ren, Xinyi Yu, Linkang Du, Min Chen, Yuanchao Shu, Zhou Su, Yunjun Gao, Zhikun Zhang

    Abstract: The surging demand for large-scale datasets in deep learning has heightened the need for effective copyright protection, given the risks of unauthorized use to data owners. Although the dataset watermark technique holds promise for auditing and verifying usage, existing methods are hindered by inconsistent evaluations, which impede fair comparisons and assessments of real-world viability. To addre… ▽ More

    Submitted 13 February, 2026; originally announced February 2026.

    Comments: 19 pages, 4 figures

  14. arXiv:2602.08676  [pdf, ps, other]

    cs.LG cs.AI

    LLaDA2.1: Speeding Up Text Diffusion via Token Editing

    Authors: Tiwei Bie, Maosong Cao, Xiang Cao, Bingsen Chen, Fuyuan Chen, Kun Chen, Lun Du, Daozhuo Feng, Haibo Feng, Mingliang Gong, Zhuocheng Gong, Yanmei Gu, Jian Guan, Kaiyuan Guan, Hongliang He, Zenan Huang, Juyong Jiang, Zhonghui Jiang, Zhenzhong Lan, Chengxi Li, Jianguo Li, Zehuan Li, Huabin Liu, Lin Liu, Guoshan Lu , et al. (25 additional authors not shown)

    Abstract: While LLaDA2.0 showcased the scaling potential of 100B-level block-diffusion models and their inherent parallelization, the delicate equilibrium between decoding speed and generation quality has remained an elusive frontier. Today, we unveil LLaDA2.1, a paradigm shift designed to transcend this trade-off. By seamlessly weaving Token-to-Token (T2T) editing into the conventional Mask-to-Token (M2T)… ▽ More

    Submitted 13 February, 2026; v1 submitted 9 February, 2026; originally announced February 2026.

    Comments: 11 pages, 3 figures

  15. arXiv:2602.07812  [pdf, ps, other]

    cs.CL

    LLMs Know More About Numbers than They Can Say

    Authors: Fengting Yuchi, Li Du, Jason Eisner

    Abstract: Although state-of-the-art LLMs can solve math problems, we find that they make errors on numerical comparisons with mixed notation: "Which is larger, $5.7 \times 10^2$ or $580$?" This raises a fundamental question: Do LLMs even know how big these numbers are? We probe the hidden states of several smaller open-source LLMs. A single linear projection of an appropriate hidden layer encodes the log-ma… ▽ More

    Submitted 17 February, 2026; v1 submitted 7 February, 2026; originally announced February 2026.

    Comments: EACL 2026 (Oral), camera-ready version with GitHub link

  16. arXiv:2602.05671  [pdf]

    cs.HC

    (Computer) Vision in Action: Comparing Remote Sighted Assistance and a Multimodal Voice Agent in Inspection Sequences

    Authors: Damien Rudaz, Barbara Nino Carreras, Sara Merlino, Brian L. Due, Barry Brown

    Abstract: Does human-AI assistance unfold in the same way as human-human assistance? This research explores what can be learned from the expertise of blind individuals and sighted volunteers to inform the design of multimodal voice agents and address the enduring challenge of proactivity. Drawing on granular analysis of two representative fragments from a larger corpus, we contrast the practices co-produced… ▽ More

    Submitted 5 February, 2026; originally announced February 2026.

    Comments: Conditionally accepted at CHI 2026, 32 pages, 8 figures

  17. arXiv:2602.03289  [pdf, ps, other]

    cs.SC

    On the Summability Problem of Multivariate Rational Functions in the Mixed Case

    Authors: Shaoshi Chen, Lixin Du, Hanqian Fang, Yisen Wang

    Abstract: Continuing previous work, this paper focuses on the summability problem of multivariate rational functions in the mixed case in which both shift and $q$-shift operators can appear. Our summability criteria rely on three ingredients including orbital decompositions, Sato's isotropy groups, and difference transformations. This work settles the rational case of the long-term project aimed at developi… ▽ More

    Submitted 3 February, 2026; originally announced February 2026.

  18. arXiv:2602.02276  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Kimi K2.5: Visual Agentic Intelligence

    Authors: Kimi Team, Tongtong Bai, Yifan Bai, Yiping Bao, S. H. Cai, Yuan Cao, Y. Charles, H. S. Che, Cheng Chen, Guanduo Chen, Huarong Chen, Jia Chen, Jiahao Chen, Jianlong Chen, Jun Chen, Kefan Chen, Liang Chen, Ruijue Chen, Xinhao Chen, Yanru Chen, Yanxu Chen, Yicun Chen, Yimin Chen, Yingjiang Chen, Yuankun Chen , et al. (301 additional authors not shown)

    Abstract: We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence. K2.5 emphasizes the joint optimization of text and vision so that two modalities enhance each other. This includes a series of techniques such as joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. Building on this multimodal foundation, K2.5… ▽ More

    Submitted 2 February, 2026; originally announced February 2026.

    Comments: Kimi K2.5 tech report

  19. arXiv:2602.02212  [pdf, ps, other]

    cs.CV

    MAIN-VLA: Modeling Abstraction of Intention and eNvironment for Vision-Language-Action Models

    Authors: Zheyuan Zhou, Liang Du, Zixun Sun, Xiaoyu Zhou, Ruimin Ye, Qihao Chen, Yinda Chen, Lemiao Qiu

    Abstract: Despite significant progress in Visual-Language-Action (VLA), in highly complex and dynamic environments that involve real-time unpredictable interactions (such as 3D open worlds and large-scale PvP games), existing approaches remain inefficient at extracting action-critical signals from redundant sensor streams. To tackle this, we introduce MAIN-VLA, a framework that explicitly Models the Abstrac… ▽ More

    Submitted 2 February, 2026; originally announced February 2026.

  20. arXiv:2602.01334  [pdf, ps, other]

    cs.CV

    What Does Vision Tool-Use Reinforcement Learning Really Learn? Disentangling Tool-Induced and Intrinsic Effects for Crop-and-Zoom

    Authors: Yan Ma, Weiyu Zhang, Tianle Li, Linge Du, Xuyang Shen, Pengfei Liu

    Abstract: Vision tool-use reinforcement learning (RL) can equip vision-language models with visual operators such as crop-and-zoom and achieves strong performance gains, yet it remains unclear whether these gains are driven by improvements in tool use or evolving intrinsic capabilities. We introduce MED (Measure-Explain-Diagnose), a coarse-to-fine framework that disentangles intrinsic capability changes from… ▽ More

    Submitted 1 February, 2026; originally announced February 2026.

    Comments: code: https://github.com/GAIR-NLP/Med

  21. arXiv:2602.01201  [pdf, ps, other]

    cs.HC

    Talk to Me, Not the Slides: A Real-Time Wearable Assistant for Improving Eye Contact in Presentations

    Authors: Lingyu Du, Xucong Zhang, Guohao Lan

    Abstract: Effective eye contact is a cornerstone of successful public speaking. It strengthens the speaker's credibility and fosters audience engagement. Yet, managing effective eye contact is a skill that demands extensive training and practice, often posing a significant challenge for novice speakers. In this paper, we present SpeakAssis, the first real-time, in-situ wearable system designed to actively a… ▽ More

    Submitted 1 February, 2026; originally announced February 2026.

  22. arXiv:2601.17842  [pdf]

    cs.CL

    EFT-CoT: A Multi-Agent Chain-of-Thought Framework for Emotion-Focused Therapy

    Authors: Lanqing Du, Yunong Li, YuJie Long, Shihong Chen

    Abstract: The use of large language models (LLMs) for Mental Health Question Answering (MHQA) offers a promising way to alleviate shortages in mental health resources. However, prior work has mainly relied on Cognitive Behavioral Therapy (CBT) and predominantly follows a top-down strategy centered on rational cognitive restructuring, providing limited support for embodied experience and primary emotion proc… ▽ More

    Submitted 8 March, 2026; v1 submitted 25 January, 2026; originally announced January 2026.

  23. arXiv:2601.15773  [pdf, ps, other]

    cs.LG

    Next Generation Active Learning: Mixture of LLMs in the Loop

    Authors: Yuanyuan Qi, Xiaohao Yang, Jueqing Lu, Guoxiang Guo, Joanne Enticott, Gang Liu, Lan Du

    Abstract: With the rapid advancement and strong generalization capabilities of large language models (LLMs), they have been increasingly incorporated into the active learning pipelines as annotators to reduce annotation costs. However, considering the annotation quality, labels generated by LLMs often fall short of real-world applicability. To address this, we propose a novel active learning framework, Mixt… ▽ More

    Submitted 22 January, 2026; originally announced January 2026.

  24. arXiv:2601.12298  [pdf, ps, other]

    cs.AR

    CD-PIM: A High-Bandwidth and Compute-Efficient LPDDR5-Based PIM for Low-Batch LLM Acceleration on Edge-Device

    Authors: Ye Lin, Chao Fang, Xiaoyong Song, Qi Wu, Anying Jiang, Yichuan Bai, Li Du

    Abstract: Edge deployment of low-batch large language models (LLMs) faces critical memory bandwidth bottlenecks when executing memory-intensive general matrix-vector multiplications (GEMV) operations. While digital processing-in-memory (PIM) architectures promise to accelerate GEMV operations, existing PIM-equipped edge devices still suffer from three key limitations: limited bandwidth improvement, componen… ▽ More

    Submitted 18 January, 2026; originally announced January 2026.

    Comments: To appear in 2026 Design, Automation and Test in Europe Conference (DATE 2026)

  25. arXiv:2601.12078  [pdf, ps, other]

    cs.CL cs.IR

    Optimizing User Profiles via Contextual Bandits for Retrieval-Augmented LLM Personalization

    Authors: Linfeng Du, Ye Yuan, Zichen Zhao, Fuyuan Lyu, Emiliano Penaloza, Xiuying Chen, Zipeng Sun, Jikun Kang, Laurent Charlin, Xue Liu, Haolun Wu

    Abstract: Large Language Models (LLMs) excel at general-purpose tasks, yet adapting their responses to individual users remains challenging. Retrieval augmentation provides a lightweight alternative to fine-tuning by conditioning LLMs on user history records, and existing approaches typically select these records based on semantic relevance. We argue that relevance serves as an unreliable proxy for utility:… ▽ More

    Submitted 17 January, 2026; originally announced January 2026.

  26. arXiv:2601.07224  [pdf, ps, other]

    cs.AI cs.LG

    Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration

    Authors: Yang Zhao, Yangou Ouyang, Xiao Ding, Hepeng Wang, Bibo Cai, Kai Xiong, Jinglong Gao, Zhouhao Sun, Li Du, Bing Qin, Ting Liu

    Abstract: While Hybrid Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) has become the standard paradigm for training LLM agents, effective mechanisms for data allocation between these stages remain largely underexplored. Current data arbitration strategies often rely on surface-level heuristics that fail to diagnose intrinsic learning needs. Since SFT targets pattern consolidation throu… ▽ More

    Submitted 12 April, 2026; v1 submitted 12 January, 2026; originally announced January 2026.

    Comments: ACL2026 Main Conference

  27. arXiv:2601.07208  [pdf, ps, other]

    cs.LG cs.CL

    MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization

    Authors: Yang Zhao, Hepeng Wang, Xiao Ding, Yangou Ouyang, Bibo Cai, Kai Xiong, Jinglong Gao, Zhouhao Sun, Li Du, Bing Qin, Ting Liu

    Abstract: Group-Relative Policy Optimization (GRPO) has emerged as an efficient paradigm for aligning Large Language Models (LLMs), yet its efficacy is primarily confined to domains with verifiable ground truths. Extending GRPO to open-domain settings remains a critical challenge, as unconstrained generation entails multi-faceted and often conflicting objectives - such as creativity versus factuality - wher… ▽ More

    Submitted 12 April, 2026; v1 submitted 12 January, 2026; originally announced January 2026.

    Comments: ACL 2026 Main Conference

  28. arXiv:2601.05930  [pdf, ps, other]

    cs.CL cs.AI cs.LG cs.MA

    Can We Predict Before Executing Machine Learning Agents?

    Authors: Jingsheng Zheng, Jintian Zhang, Yujie Luo, Yuren Mao, Yunjun Gao, Lun Du, Huajun Chen, Ningyu Zhang

    Abstract: Autonomous machine learning agents have revolutionized scientific discovery, yet they remain constrained by a Generate-Execute-Feedback paradigm. Previous approaches suffer from a severe Execution Bottleneck, as hypothesis evaluation relies strictly on expensive physical execution. To bypass these physical constraints, we internalize execution priors to substitute costly runtime checks with instan… ▽ More

    Submitted 7 April, 2026; v1 submitted 9 January, 2026; originally announced January 2026.

    Comments: ACL 2026

  29. arXiv:2601.03992  [pdf, ps, other]

    cs.DC cs.AI

    A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems

    Authors: Qi Wu, Chao Fang, Jiayuan Chen, Ye Lin, Yueqi Zhang, Yichuan Bai, Yuan Du, Li Du

    Abstract: Mixture-of-Experts (MoE) models facilitate edge deployment by decoupling model capacity from active computation, yet their large memory footprint drives the need for GPU systems with near-data processing (NDP) capabilities that offload experts to dedicated processing units. However, deploying MoE models on such edge-based GPU-NDP systems faces three critical challenges: 1) severe load imbalance ac… ▽ More

    Submitted 7 January, 2026; originally announced January 2026.

    Comments: To appear in 2026 Design, Automation and Test in Europe Conference (DATE 2026)

  30. arXiv:2601.03676  [pdf, ps, other]

    cs.CL cs.AI

    Towards Compositional Generalization of LLMs via Skill Taxonomy Guided Data Synthesis

    Authors: Yifan Wei, Li Du, Xiaoyan Yu, Yang Feng, Angsheng Li

    Abstract: Large Language Models (LLMs) and agent-based systems often struggle with compositional generalization due to a data bottleneck in which complex skill combinations follow a long-tailed, power-law distribution, limiting both instruction-following performance and generalization in agent-centric tasks. To address this challenge, we propose STEPS, a Skill Taxonomy guided Entropy-based Post-training dat… ▽ More

    Submitted 7 January, 2026; originally announced January 2026.

    Comments: The code and data for our methods and experiments are available at https://github.com/weiyifan1023/STEPS

  31. arXiv:2512.24873  [pdf, ps, other]

    cs.AI cs.CL

    Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

    Authors: Weixun Wang, XiaoXiao Xu, Wanhe An, Fangwen Dai, Wei Gao, Yancheng He, Ju Huang, Qiang Ji, Hanqi Jin, Xiaoyang Li, Yang Li, Zhongwen Li, Shirong Lin, Jiashun Liu, Zenan Liu, Tao Luo, Dilxat Muhtar, Yuanbin Qu, Jiaqiang Shi, Qinghui Sun, Yingshui Tan, Hao Tang, Runze Wang, Yi Wang, Zhaoguo Wang , et al. (65 additional authors not shown)

    Abstract: Agentic crafting requires LLMs to operate in real-world environments over multiple turns by taking actions, observing outcomes, and iteratively refining artifacts. Despite its importance, the open-source community lacks a principled, end-to-end ecosystem to streamline agent development. We introduce the Agentic Learning Ecosystem (ALE), a foundational infrastructure that optimizes the production p… ▽ More

    Submitted 11 March, 2026; v1 submitted 31 December, 2025; originally announced December 2025.

    Comments: 36 pages, 15 figures

  32. arXiv:2512.24867  [pdf, ps, other]

    cs.CL cs.AI

    Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements

    Authors: Yiming Liang, Yizhi Li, Yantao Du, Ge Zhang, Jiayi Zhou, Yuchen Wu, Yinzhu Piao, Denghui Cao, Tong Sun, Ziniu Li, Li Du, Bo Lei, Jiaheng Liu, Chenghua Lin, Zhaoxiang Zhang, Wenhao Huang, Jiajun Zhang

    Abstract: Benchmarks play a crucial role in tracking the rapid advancement of large language models (LLMs) and identifying their capability boundaries. However, existing benchmarks predominantly curate questions at the question level, suffering from three fundamental limitations: vulnerability to data contamination, restriction to single-knowledge-point assessment, and reliance on costly domain expert annot… ▽ More

    Submitted 6 January, 2026; v1 submitted 31 December, 2025; originally announced December 2025.

  33. arXiv:2512.23519  [pdf, ps, other]

    cs.CV

    IdentityStory: Taming Your Identity-Preserving Generator for Human-Centric Story Generation

    Authors: Donghao Zhou, Jingyu Lin, Guibao Shen, Quande Liu, Jialin Gao, Lihao Liu, Lan Du, Cunjian Chen, Chi-Wing Fu, Xiaowei Hu, Pheng-Ann Heng

    Abstract: Recent visual generative models enable story generation with consistent characters from text, but human-centric story generation faces additional challenges, such as maintaining detailed and diverse human face consistency and coordinating multiple characters across different images. This paper presents IdentityStory, a framework for human-centric story generation that ensures consistent character… ▽ More

    Submitted 29 December, 2025; originally announced December 2025.

    Comments: Accepted by AAAI2026 (Project page: https://correr-zhou.github.io/IdentityStory)

  34. arXiv:2512.18256  [pdf, ps, other]

    cs.AI cs.LO

    MSC-180: A Benchmark for Automated Formal Theorem Proving from Mathematical Subject Classification

    Authors: Sirui Li, Wangyue Lu, Xiaorui Shi, Ke Weng, Haozhe Sun, Minghe Yu, Tiancheng Zhang, Ge Yu, Hengyu Liu, Lun Du

    Abstract: Automated Theorem Proving (ATP) represents a core research direction in artificial intelligence for achieving formal reasoning and verification, playing a significant role in advancing machine intelligence. However, current large language model (LLM)-based theorem provers suffer from limitations such as restricted domain coverage and weak generalization in mathematical reasoning. To address these… ▽ More

    Submitted 20 December, 2025; originally announced December 2025.

  35. arXiv:2512.15745  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    LLaDA2.0: Scaling Up Diffusion Language Models to 100B

    Authors: Tiwei Bie, Maosong Cao, Kun Chen, Lun Du, Mingliang Gong, Zhuochen Gong, Yanmei Gu, Jiaqi Hu, Zenan Huang, Zhenzhong Lan, Chengxi Li, Chongxuan Li, Jianguo Li, Zehuan Li, Huabin Liu, Lin Liu, Guoshan Lu, Xiaocheng Lu, Yuxin Ma, Jianfeng Tan, Lanning Wei, Ji-Rong Wen, Yipeng Xing, Xiaolu Zhang, Junbo Zhao , et al. (6 additional authors not shown)

    Abstract: This paper presents LLaDA2.0 -- a tuple of discrete diffusion large language models (dLLM) scaling up to 100B total parameters through systematic conversion from auto-regressive (AR) models -- establishing a new paradigm for frontier-scale deployment. Instead of costly training from scratch, LLaDA2.0 upholds knowledge inheritance, progressive adaption and efficiency-aware design principle, and sea… ▽ More

    Submitted 23 December, 2025; v1 submitted 10 December, 2025; originally announced December 2025.

    Comments: 19 pages

  36. arXiv:2512.14557  [pdf, ps, other]

    cs.CR

    PrivATE: Differentially Private Average Treatment Effect Estimation for Observational Data

    Authors: Quan Yuan, Xiaochen Li, Linkang Du, Min Chen, Mingyang Sun, Yunjun Gao, Shibo He, Jiming Chen, Zhikun Zhang

    Abstract: Causal inference plays a crucial role in scientific research across multiple disciplines. Estimating causal effects, particularly the average treatment effect (ATE), from observational data has garnered significant attention. However, computing the ATE from real-world observational data poses substantial privacy risks to users. Differential privacy, which offers strict theoretical guarantees, has… ▽ More

    Submitted 16 December, 2025; originally announced December 2025.

    Comments: To appear in the NDSS 2026 Symposium, February 2026, San Diego, CA, USA

  37. arXiv:2512.14439  [pdf, ps, other]

    cs.CR cs.CV

    VICTOR: Dataset Copyright Auditing in Video Recognition Systems

    Authors: Quan Yuan, Zhikun Zhang, Linkang Du, Min Chen, Mingyang Sun, Yunjun Gao, Shibo He, Jiming Chen

    Abstract: Video recognition systems are increasingly being deployed in daily life, such as content recommendation and security monitoring. To enhance video recognition development, many institutions have released high-quality public datasets with open-source licenses for training advanced models. At the same time, these datasets are also susceptible to misuse and infringement. Dataset copyright auditing is… ▽ More

    Submitted 16 December, 2025; originally announced December 2025.

    Comments: To appear in the NDSS Symposium 2026, February 2026, San Diego, CA, USA

  38. arXiv:2512.11342  [pdf, ps, other]

    cs.LG

    DAPO: Design Structure-Aware Pass Ordering in High-Level Synthesis with Graph Contrastive and Reinforcement Learning

    Authors: Jinming Ge, Linfeng Du, Likith Anaparty, Shangkun Li, Tingyuan Liang, Afzal Ahmad, Vivek Chaturvedi, Sharad Sinha, Zhiyao Xie, Jiang Xu, Wei Zhang

    Abstract: High-Level Synthesis (HLS) tools are widely adopted in FPGA-based domain-specific accelerator design. However, existing tools rely on fixed optimization strategies inherited from software compilations, limiting their effectiveness. Tailoring optimization strategies to specific designs requires deep semantic understanding, accurate hardware metric estimation, and advanced search algorithms -- capab… ▽ More

    Submitted 12 December, 2025; originally announced December 2025.

    Comments: Accepted by DATE 2026

  39. arXiv:2512.10735  [pdf, ps, other]

    cs.LG cs.AI

    LGAN: An Efficient High-Order Graph Neural Network via the Line Graph Aggregation

    Authors: Lin Du, Lu Bai, Jincheng Li, Lixin Cui, Hangyuan Du, Lichi Zhang, Yuting Chen, Zhao Li

    Abstract: Graph Neural Networks (GNNs) have emerged as a dominant paradigm for graph classification. Specifically, most existing GNNs mainly rely on the message passing strategy between neighbor nodes, where the expressivity is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test. Although a number of k-WL-based GNNs have been proposed to overcome this limitation, their computational cost increases ra… ▽ More

    Submitted 11 December, 2025; originally announced December 2025.

  40. arXiv:2512.04527  [pdf, ps, other

    cs.AR cs.DC

    FLEX: Leveraging FPGA-CPU Synergy for Mixed-Cell-Height Legalization Acceleration

    Authors: Xingyu Liu, Jiawei Liang, Linfeng Du, Yipu Zhang, Chaofang Ma, Hanwei Fan, Jiang Xu, Wei Zhang

    Abstract: In this work, we present FLEX, an FPGA-CPU accelerator for mixed-cell-height legalization tasks. We address challenges from the following perspectives. First, we optimize the task assignment strategy and perform an efficient task partition between FPGA and CPU to exploit their complementary strengths. Second, a multi-granularity pipelining technique is employed to accelerate the most time-consumin…

    Submitted 4 December, 2025; originally announced December 2025.

  41. arXiv:2512.02409  [pdf, ps, other

    cs.LG cs.AI

    Data Curation Through the Lens of Spectral Dynamics: Static Limits, Dynamic Acceleration, and Practical Oracles

    Authors: Yizhou Zhang, Lun Du

    Abstract: Large-scale neural models are increasingly trained with data pruning, synthetic data generation, cross-model distillation, reinforcement learning from human feedback (RLHF), and difficulty-based sampling. While several of these data-centric strategies reliably improve training efficiency and downstream performance, others fail to provide meaningful gains -- most notably self-generated synthetic da…

    Submitted 1 December, 2025; originally announced December 2025.

  42. arXiv:2512.01822  [pdf, ps, other

    cs.CL cs.AI cs.CV cs.LG cs.MA

    InnoGym: Benchmarking the Innovation Potential of AI Agents

    Authors: Jintian Zhang, Kewei Xu, Jingsheng Zheng, Zhuoyun Yu, Yuqi Zhu, Yujie Luo, Lanning Wei, Shuofei Qiao, Lun Du, Da Zheng, Shumin Deng, Huajun Chen, Ningyu Zhang

    Abstract: LLMs and Agents have achieved impressive progress in code generation, mathematical reasoning, and scientific discovery. However, existing benchmarks primarily measure correctness, overlooking the diversity of methods behind solutions. True innovation depends not only on producing correct answers but also on the originality of the approach. We present InnoGym, the first benchmark and framework desi…

    Submitted 28 February, 2026; v1 submitted 1 December, 2025; originally announced December 2025.

    Comments: ICLR 2026

  43. arXiv:2511.12495  [pdf, ps, other

    cs.IR cs.SI

    Task-Aware Retrieval Augmentation for Dynamic Recommendation

    Authors: Zhen Tao, Xinke Jiang, Qingshuai Feng, Haoyu Zhang, Lun Du, Yuchen Fang, Hao Miao, Bangquan Xie, Qingqiang Sun

    Abstract: Dynamic recommendation systems aim to provide personalized suggestions by modeling temporal user-item interactions across time-series behavioral data. Recent studies have leveraged pre-trained dynamic graph neural networks (GNNs) to learn user-item representations over temporal snapshot graphs. However, fine-tuning GNNs on these graphs often results in generalization issues due to temporal discrep…

    Submitted 16 November, 2025; originally announced November 2025.

    Comments: AAAI 2026

  44. arXiv:2511.08395  [pdf, ps, other

    cs.AR

    DRACO: Co-design for DSP-Efficient Rigid Body Dynamics Accelerator

    Authors: Xingyu Liu, Jiawei Liang, Yipu Zhang, Linfeng Du, Chaofang Ma, Hui Yu, Jiang Xu, Wei Zhang

    Abstract: We propose a hardware-efficient RBD accelerator based on FPGA, introducing three key innovations. First, we propose a precision-aware quantization framework that reduces DSP demand while preserving motion accuracy. This is also the first study to systematically evaluate quantization impact on robot control and motion for hardware acceleration. Second, we leverage a division deferring optimization…

    Submitted 22 November, 2025; v1 submitted 11 November, 2025; originally announced November 2025.
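
    To illustrate the kind of precision/resource trade-off a precision-aware quantization framework navigates, here is a generic fixed-point rounding sketch showing how the representation error shrinks as fractional bits grow (illustrative only; DRACO's actual quantization scheme is not detailed in this listing):

```python
def quantize_fixed_point(x, frac_bits):
    """Round x onto a fixed-point grid with `frac_bits` fractional bits.
    Worst-case rounding error is half an ulp, i.e. 2**-(frac_bits + 1)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

x = 0.123456789
for bits in (8, 12, 16):
    q = quantize_fixed_point(x, bits)
    print(bits, q, abs(q - x))  # error roughly halves per extra bit
```

    Fewer fractional bits mean narrower multipliers (fewer DSP blocks on an FPGA) at the cost of larger rounding error; the abstract's point is choosing that trade-off without degrading motion accuracy.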

  45. arXiv:2511.01934  [pdf, ps, other

    cs.LG cs.AI

    Tool Zero: Training Tool-Augmented LLMs via Pure RL from Scratch

    Authors: Yirong Zeng, Xiao Ding, Yutai Hou, Yuxian Wang, Li Du, Juyi Dai, Qiuyang Ding, Duyu Tang, Dandan Tu, Weiwen Liu, Bing Qin, Ting Liu

    Abstract: Training tool-augmented LLMs has emerged as a promising approach to enhancing language models' capabilities for complex tasks. The current supervised fine-tuning paradigm relies on constructing extensive domain-specific datasets to train models. However, this approach often struggles to generalize effectively to unfamiliar or intricate tool-use scenarios. Recently, reinforcement learning (RL) para…

    Submitted 10 November, 2025; v1 submitted 2 November, 2025; originally announced November 2025.

    Comments: EMNLP 2025 Findings

  46. arXiv:2510.17795  [pdf, ps, other

    cs.CL cs.AI cs.LG cs.MA cs.SE

    What Makes AI Research Replicable? Executable Knowledge Graphs as Scientific Knowledge Representations

    Authors: Yujie Luo, Zhuoyun Yu, Xuehai Wang, Yuqi Zhu, Ningyu Zhang, Lanning Wei, Lun Du, Da Zheng, Huajun Chen

    Abstract: Replicating AI research is a crucial yet challenging task for large language model (LLM) agents. Existing approaches often struggle to generate executable code, primarily due to insufficient background knowledge and the limitations of retrieval-augmented generation (RAG) methods, which fail to capture latent technical details hidden in referenced papers. Furthermore, previous approaches tend to ov…

    Submitted 21 January, 2026; v1 submitted 20 October, 2025; originally announced October 2025.

    Comments: Work in progress

  47. arXiv:2510.13257  [pdf, ps, other

    cs.CR

    GRIDAI: Generating and Repairing Intrusion Detection Rules via Collaboration among Multiple LLM-based Agents

    Authors: Jiarui Li, Yuhan Chai, Lei Du, Chenyun Duan, Hao Yan, Zhaoquan Gu

    Abstract: Rule-based network intrusion detection systems play a crucial role in the real-time detection of Web attacks. However, most existing works primarily focus on automatically generating detection rules for new attacks, often overlooking the relationships between new attacks and existing rules, which leads to significant redundancy within the ever-expanding ruleset. To address this issue, we propose G…

    Submitted 15 October, 2025; originally announced October 2025.

  48. arXiv:2510.08666  [pdf, ps, other

    cs.CL cs.AI

    dInfer: An Efficient Inference Framework for Diffusion Language Models

    Authors: Yuxin Ma, Lun Du, Lanning Wei, Kun Chen, Qian Xu, Kangyu Wang, Guofeng Feng, Guoshan Lu, Lin Liu, Xiaojing Qi, Xinyuan Zhang, Zhen Tao, Haibo Feng, Ziyun Jiang, Ying Xu, Zenan Huang, Yihong Zhuang, Haokai Xu, Jiaqi Hu, Zhenzhong Lan, Junbo Zhao, Jianguo Li, Da Zheng

    Abstract: Diffusion-based large language models (dLLMs) have emerged as a promising alternative to autoregressive (AR) LLMs, leveraging denoising-based generation to enable inherent parallelism. More and more open-source dLLMs are emerging, yet their widespread adoption remains constrained by the lack of a standardized and efficient inference framework. We present dInfer, an efficient and extensible f…

    Submitted 22 October, 2025; v1 submitted 9 October, 2025; originally announced October 2025.

  49. arXiv:2510.01436  [pdf, ps, other

    cs.SC

    Symmetric Division of Linear Ordinary Differential Operators

    Authors: Lixin Du, Manuel Kauers

    Abstract: The symmetric product of two ordinary linear differential operators $L_1,L_2$ is an operator whose solution set contains the product $f_1f_2$ of any solution $f_1$ of $L_1$ and any solution $f_2$ of $L_2$. It is well known how to compute the symmetric product of two given operators $L_1,L_2$. In this paper we consider the corresponding division problem: given a symmetric product $L$ and one of its…

    Submitted 1 October, 2025; originally announced October 2025.
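
    The defining property in the abstract is easy to check in a small first-order case. A numerical sketch (the operators below are hypothetical examples, not taken from the paper): f1(x) = e^x solves L1 = D - 1 and f2(x) = e^(2x) solves L2 = D - 2, so their product e^(3x) should solve the symmetric product L = D - 3.

```python
import math

# g = f1 * f2 = exp(x) * exp(2x) = exp(3x), the product of the two solutions.
def g(x):
    return math.exp(x) * math.exp(2 * x)

# Check (D - 3) g = 0 numerically with a central finite difference.
x0, h = 0.7, 1e-6
dg = (g(x0 + h) - g(x0 - h)) / (2 * h)  # approximates g'(x0)
print(abs(dg - 3 * g(x0)) < 1e-4)  # True: g' - 3g vanishes
```

    The division problem the paper studies runs the other way: recover $L_2$ given the symmetric product $L$ and the factor $L_1$.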

  50. arXiv:2510.00732  [pdf, ps, other

    cs.AI

    EvolProver: Advancing Automated Theorem Proving by Evolving Formalized Problems via Symmetry and Difficulty

    Authors: Yuchen Tian, Ruiyuan Huang, Xuanwu Wang, Jing Ma, Zengfeng Huang, Ziyang Luo, Hongzhan Lin, Da Zheng, Lun Du

    Abstract: Large Language Models (LLMs) for formal theorem proving have shown significant promise, yet they often lack generalizability and are fragile to even minor transformations of problem statements. To address this limitation, we introduce a novel data augmentation pipeline designed to enhance model robustness from two perspectives: symmetry and difficulty. From the symmetry perspective, we propose two…

    Submitted 1 October, 2025; originally announced October 2025.