Showing 1–50 of 84 results for author: Yue, L

Searching in archive cs.
  1. arXiv:2604.04074  [pdf, ps, other]

    cs.AI cs.LG

    FactReview: Evidence-Grounded Reviews with Literature Positioning and Execution-Based Claim Verification

    Authors: Hang Xu, Ling Yue, Chaoqian Ouyang, Yuchen Liu, Libin Zheng, Shaowu Pan, Shimin Di, Min-Ling Zhang

    Abstract: Peer review in machine learning is under growing pressure from rising submission volume and limited reviewer time. Most LLM-based reviewing systems read only the manuscript and generate comments from the paper's own narrative. This makes their outputs sensitive to presentation quality and leaves them weak when the evidence needed for review lies in related work or released code. We present FactRev…

    Submitted 7 April, 2026; v1 submitted 5 April, 2026; originally announced April 2026.

  2. arXiv:2603.22386  [pdf, ps, other]

    cs.AI cs.CL

    From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

    Authors: Ling Yue, Kushal Raj Bhandari, Ching-Yun Ko, Dhaval Patel, Shuxin Lin, Nianjun Zhou, Jianxi Gao, Pin-Yu Chen, Shaowu Pan

    Abstract: Large language model (LLM)-based systems are becoming increasingly popular for solving tasks by constructing executable workflows that interleave LLM calls, information retrieval, tool use, code execution, memory updates, and verification. This survey reviews recent methods for designing and optimizing such workflows, which we treat as agentic computation graphs (ACGs). We organize the literature…

    Submitted 23 March, 2026; originally announced March 2026.

  3. arXiv:2603.17368  [pdf, ps, other]

    cs.AI

    Towards Safer Large Reasoning Models by Promoting Safety Decision-Making before Chain-of-Thought Generation

    Authors: Jianan Chen, Zhifang Zhang, Shuo He, Linan Yue, Lei Feng, Minling Zhang

    Abstract: Large reasoning models (LRMs) have achieved remarkable performance via chain-of-thought (CoT), but recent studies have shown that such enhanced reasoning capabilities come at the expense of significantly degraded safety capabilities. In this paper, we reveal that LRMs' safety degradation occurs only after CoT is enabled, and this degradation is not observed when CoT is disabled. This observation motivates u…

    Submitted 18 March, 2026; originally announced March 2026.

  4. arXiv:2603.09290  [pdf, ps, other]

    cs.SE cs.CE cs.MA

    ToolRosetta: Bridging Open-Source Repositories and Large Language Model Agents through Automated Tool Standardization

    Authors: Shimin Di, Xujie Yuan, Hanghui Guo, Chaoqian Ouyang, Zhangze Chen, Ling Yue, Libin Zheng, Jia Zhu, Shaowu Pan, Jian Yin, Min-Ling Zhang, Yong Rui

    Abstract: Reusing and invoking existing code remains costly and unreliable, as most practical tools are embedded in heterogeneous code repositories and lack standardized, executable interfaces. Although large language models (LLMs) and Model Context Protocol (MCP)-based tool invocation frameworks enable natural language task execution, current approaches rely heavily on manual tool curation and standardizat…

    Submitted 10 March, 2026; originally announced March 2026.

    Comments: 20 pages

  5. arXiv:2603.09163  [pdf, ps, other]

    cs.RO

    SPAN-Nav: Generalized Spatial Awareness for Versatile Vision-Language Navigation

    Authors: Jiahang Liu, Tianyu Xu, Jiawei Chen, Lu Yue, Jiazhao Zhang, Zhiyong Wang, Minghan Li, Qisheng Zhao, Anqi Li, Qi Su, Zhizheng Zhang, He Wang

    Abstract: Recent embodied navigation approaches leveraging Vision-Language Models (VLMs) demonstrate strong generalization in versatile Vision-Language Navigation (VLN). However, reliable path planning in complex environments remains challenging due to insufficient spatial awareness. In this work, we introduce SPAN-Nav, an end-to-end foundation model designed to infuse embodied navigation with universal 3D…

    Submitted 9 March, 2026; originally announced March 2026.

  6. arXiv:2602.21612  [pdf, ps, other]

    cs.RO

    Jumping Control for a Quadrupedal Wheeled-Legged Robot via NMPC and DE Optimization

    Authors: Xuanqi Zeng, Lingwei Zhang, Linzhu Yue, Zhitao Song, Hongbo Zhang, Tianlin Zhang, Yun-Hui Liu

    Abstract: Quadrupedal wheeled-legged robots combine the advantages of legged and wheeled locomotion to achieve superior mobility, but executing dynamic jumps remains a significant challenge due to the additional degrees of freedom introduced by wheeled legs. This paper develops a mini-sized wheeled-legged robot for agile motion and presents a novel motion control framework that integrates the Nonlinear Mode…

    Submitted 25 February, 2026; originally announced February 2026.

    Comments: 8 pages, 12 figures

  7. arXiv:2602.09485  [pdf, ps, other]

    cs.AI

    Bridging Efficiency and Transparency: Explainable CoT Compression in Multimodal Large Reasoning Models

    Authors: Yizhi Wang, Linan Yue, Min-Ling Zhang

    Abstract: Long chains of thought (Long CoTs) are widely employed in multimodal reasoning models to tackle complex tasks by capturing detailed visual information. However, these Long CoTs are often excessively lengthy and contain redundant reasoning steps, which can hinder inference efficiency. Compressing these long CoTs is a natural solution, yet existing approaches face two major challenges: (1) they may…

    Submitted 10 February, 2026; originally announced February 2026.

  8. arXiv:2601.23032  [pdf, ps, other]

    cs.AI

    Guided by Trajectories: Repairing and Rewarding Tool-Use Trajectories for Tool-Integrated Reasoning

    Authors: Siyu Gong, Linan Yue, Weibo Gao, Fangzhou Yao, Shimin Di, Lei Feng, Min-Ling Zhang

    Abstract: Tool-Integrated Reasoning (TIR) enables large language models (LLMs) to solve complex tasks by interacting with external tools, yet existing approaches depend on high-quality synthesized trajectories selected by scoring functions and sparse outcome-based rewards, providing limited and biased supervision for learning TIR. To address these challenges, in this paper, we propose AutoTraj, a two-stage…

    Submitted 30 January, 2026; originally announced January 2026.

  9. arXiv:2601.19259  [pdf, ps, other]

    cs.CE

    Learning Collective Medication Effects via Multi-level Abstraction for Medication Recommendation

    Authors: Yanda Wang, Weitong Chen, Chao Tan, Ian Nabney, Lin Yue, Genlin Ji

    Abstract: Historical prescriptions and selected candidate drugs relevant to the current visit serve as important references for medication recommendation. However, in the absence of explicit intrinsic principles for semantic composition, existing methods treat synergistic drugs as independent entities and fail to capture their collective therapeutic effects, resulting in a mismatch between medication-level…

    Submitted 27 January, 2026; originally announced January 2026.

  10. arXiv:2601.12766  [pdf, ps, other]

    cs.CV eess.SY

    Spatial-VLN: Zero-Shot Vision-and-Language Navigation With Explicit Spatial Perception and Exploration

    Authors: Lu Yue, Yue Fan, Shiwei Lian, Yu Zhao, Jiaxin Yu, Liang Xie, Feitian Zhang

    Abstract: Zero-shot Vision-and-Language Navigation (VLN) agents leveraging Large Language Models (LLMs) excel in generalization but suffer from insufficient spatial perception. Focusing on complex continuous environments, we categorize key perceptual bottlenecks into three spatial challenges: door interaction, multi-room navigation, and ambiguous instruction execution, where existing methods consistently suf…

    Submitted 19 January, 2026; originally announced January 2026.

  11. arXiv:2512.24609  [pdf]

    cs.AI

    Reinforcement Learning-Augmented LLM Agents for Collaborative Decision Making and Performance Optimization

    Authors: Dong Qiu, Duo Xu, Limengxi Yue

    Abstract: Large Language Models (LLMs) perform well in language tasks but often lack collaborative awareness and struggle to optimize global performance in multi-agent settings. We present a reinforcement learning-augmented LLM agent framework that formulates cooperation as a decentralized partially observable Markov decision process (Dec-POMDP) and adopts centralized training with decentralized execution (…

    Submitted 30 December, 2025; originally announced December 2025.

    Comments: Accepted by IEEE ICFTIC 2025

  12. arXiv:2512.18956  [pdf, ps, other]

    cs.AI cs.LG

    Training Multimodal Large Reasoning Models Needs Better Thoughts: A Three-Stage Framework for Long Chain-of-Thought Synthesis and Selection

    Authors: Yizhi Wang, Linan Yue, Min-Ling Zhang

    Abstract: Large Reasoning Models (LRMs) have demonstrated remarkable performance on complex reasoning tasks through long Chain-of-Thought (CoT) reasoning. Extending these successes to multimodal reasoning remains challenging due to the increased complexity of integrating diverse input modalities and the scarcity of high-quality long CoT training data. Existing multimodal datasets and CoT synthesis methods s…

    Submitted 14 February, 2026; v1 submitted 21 December, 2025; originally announced December 2025.

  13. arXiv:2512.18571  [pdf, ps, other]

    cs.AI cs.CV

    ESearch-R1: Learning Cost-Aware MLLM Agents for Interactive Embodied Search via Reinforcement Learning

    Authors: Weijie Zhou, Xuangtang Xiong, Ye Tian, Lijun Yue, Xinyu Wu, Wei Li, Chaoyang Zhao, Honghui Dong, Ming Tang, Jinqiao Wang, Zhengyou Zhang

    Abstract: Multimodal Large Language Models (MLLMs) have empowered embodied agents with remarkable capabilities in planning and reasoning. However, when facing ambiguous natural language instructions (e.g., "fetch the tool" in a cluttered room), current agents often fail to balance the high cost of physical exploration against the cognitive cost of human interaction. They typically treat disambiguation as a…

    Submitted 20 December, 2025; originally announced December 2025.

  14. arXiv:2511.17532  [pdf, ps, other]

    cs.NI cs.AI

    Denoising Refinement Diffusion Models for Simultaneous Generation of Multi-scale Mobile Network Traffic

    Authors: Xiaoqian Qi, Haoye Chai, Sichang Liu, Lei Yue, Raoyuan Pan, Yue Wang, Yong Li

    Abstract: The planning, management, and resource scheduling of cellular mobile networks require joint estimation of mobile traffic across different layers and nodes. Mobile traffic generation can proactively anticipate user demands and capture the dynamics of network load. However, existing methods mainly focus on generating traffic at a single spatiotemporal resolution, making it difficult to jointly model…

    Submitted 24 November, 2025; v1 submitted 29 October, 2025; originally announced November 2025.

  15. arXiv:2511.17092  [pdf, ps, other]

    cs.CV

    SPAGS: Sparse-View Articulated Object Reconstruction from Single State via Planar Gaussian Splatting

    Authors: Di Wu, Liu Liu, Xueyu Yuan, Wenxiao Chen, Lijun Yue, Liuzhu Chen, Yiming Tang, Meng Wang

    Abstract: Articulated objects are ubiquitous in daily environments, and their 3D reconstruction holds great significance across various fields. However, existing articulated object reconstruction methods typically require costly inputs such as multi-stage and multi-view observations. To address these limitations, we propose a category-agnostic articulated object reconstruction framework via planar Gaussian Sp…

    Submitted 2 April, 2026; v1 submitted 21 November, 2025; originally announced November 2025.

    Comments: 10 pages, 7 figures

  16. arXiv:2511.16140  [pdf, ps, other]

    cs.CV

    Real-Time 3D Object Detection with Inference-Aligned Learning

    Authors: Chenyu Zhao, Xianwei Zheng, Zimin Xia, Linwei Yue, Nan Xue

    Abstract: Real-time 3D object detection from point clouds is essential for dynamic scene understanding in applications such as augmented reality, robotics and navigation. We introduce a novel Spatial-prioritized and Rank-aware 3D object detection (SR3D) framework for indoor point clouds, to bridge the gap between how detectors are trained and how they are evaluated. This gap stems from the lack of spatial r…

    Submitted 20 November, 2025; originally announced November 2025.

    Comments: Accepted by AAAI 2026

  17. arXiv:2511.14446  [pdf, ps, other]

    cs.CV cs.AI

    Agentic Video Intelligence: A Flexible Framework for Advanced Video Exploration and Understanding

    Authors: Hong Gao, Yiming Bao, Xuezhen Tu, Yutong Xu, Yue Jin, Yiyang Mu, Bin Zhong, Linan Yue, Min-Ling Zhang

    Abstract: Video understanding requires not only visual recognition but also complex reasoning. While Vision-Language Models (VLMs) demonstrate impressive capabilities, they typically process videos largely in a single-pass manner with limited support for evidence revisit and iterative refinement. Although recently emerging agent-based methods enable long-horizon reasoning, they either depend heavily on expensi…

    Submitted 18 November, 2025; originally announced November 2025.

  18. arXiv:2510.24120  [pdf, ps, other]

    cs.LG

    Graph-Guided Concept Selection for Efficient Retrieval-Augmented Generation

    Authors: Ziyu Liu, Yijing Liu, Jianfei Yuan, Minzhi Yan, Le Yue, Honghui Xiong, Yi Yang

    Abstract: Graph-based RAG constructs a knowledge graph (KG) from text chunks to enhance retrieval in Large Language Model (LLM)-based question answering. It is especially beneficial in domains such as biomedicine, law, and political science, where effective retrieval often involves multi-hop reasoning over proprietary documents. However, these methods demand numerous LLM calls to extract entities and relati…

    Submitted 28 October, 2025; originally announced October 2025.

  19. arXiv:2510.17491  [pdf, ps, other]

    cs.CL

    Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents

    Authors: Yihong Tang, Kehai Chen, Liang Yue, Jinxin Fan, Caishen Zhou, Xiaoguang Li, Yuyang Zhang, Mingming Zhao, Shixiong Kai, Kaiyang Guo, Xingshan Zeng, Wenjing Cun, Lifeng Shang, Min Zhang

    Abstract: With the rise of large language models (LLMs), LLM agents capable of autonomous reasoning, planning, and executing complex tasks have become a frontier in artificial intelligence. However, how to translate the research on general agents into productivity that drives industry transformations remains a significant challenge. To address this challenge, this paper systematically reviews the technologies, applic…

    Submitted 20 October, 2025; originally announced October 2025.

  20. arXiv:2509.20374  [pdf, ps, other]

    cs.CL cs.AI

    CFDLLMBench: A Benchmark Suite for Evaluating Large Language Models in Computational Fluid Dynamics

    Authors: Nithin Somasekharan, Ling Yue, Yadi Cao, Weichao Li, Patrick Emami, Pochinapeddi Sai Bhargav, Anurag Acharya, Xingyu Xie, Shaowu Pan

    Abstract: Large Language Models (LLMs) have demonstrated strong performance across general NLP tasks, but their utility in automating numerical experiments of complex physical systems -- a critical and labor-intensive component -- remains underexplored. As the major workhorse of computational science over the past decades, Computational Fluid Dynamics (CFD) offers a uniquely challenging testbed for evaluatin…

    Submitted 10 October, 2025; v1 submitted 19 September, 2025; originally announced September 2025.

  21. arXiv:2509.18178  [pdf, ps, other]

    cs.AI cs.CE cs.LG

    Foam-Agent 2.0: An End-to-End Composable Multi-Agent Framework for Automating CFD Simulation in OpenFOAM

    Authors: Ling Yue, Nithin Somasekharan, Tingwen Zhang, Yadi Cao, Shaowu Pan

    Abstract: Computational Fluid Dynamics (CFD) is an essential simulation tool in engineering, yet its steep learning curve and complex manual setup create significant barriers. To address these challenges, we introduce Foam-Agent, a multi-agent framework that automates the entire end-to-end OpenFOAM workflow from a single natural language prompt. Our key innovations address critical gaps in existing systems:…

    Submitted 30 September, 2025; v1 submitted 17 September, 2025; originally announced September 2025.

  22. arXiv:2509.05941  [pdf, ps, other]

    cs.SE cs.LG cs.MA

    Code2MCP: Transforming Code Repositories into MCP Services

    Authors: Chaoqian Ouyang, Ling Yue, Shimin Di, Libin Zheng, Linan Yue, Shaowu Pan, Jian Yin, Min-Ling Zhang

    Abstract: The Model Context Protocol (MCP) aims to create a standard for how Large Language Models use tools. However, most current research focuses on selecting tools from an existing pool. A more fundamental, yet largely overlooked, problem is how to populate this pool by converting the vast number of existing software projects into MCP-compatible services. To bridge this gap, we introduce Code2MCP, an ag…

    Submitted 10 February, 2026; v1 submitted 7 September, 2025; originally announced September 2025.

  23. arXiv:2508.08547  [pdf, ps, other]

    cs.CV

    Calibration Attention: Learning Reliability-Aware Representations for Vision Transformers

    Authors: Wenhao Liang, Wei Emma Zhang, Lin Yue, Miao Xu, Mingyu Guo, Olaf Maennel, Weitong Chen

    Abstract: Most calibration methods operate at the logit level, implicitly assuming that miscalibration can be corrected without changing the underlying representation. We challenge this assumption and propose Calibration Attention (CalAttn), a representation-aware calibration module for vision transformers that couples instance-wise temperature scaling to transformer token geometry under a p…

    Submitted 19 January, 2026; v1 submitted 11 August, 2025; originally announced August 2025.

    Comments: Under review

  24. arXiv:2508.02120  [pdf, ps, other]

    cs.AI

    Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models

    Authors: Linan Yue, Yichao Du, Yizhi Wang, Weibo Gao, Fangzhou Yao, Li Wang, Ye Liu, Ziyu Xu, Qi Liu, Shimin Di, Min-Ling Zhang

    Abstract: Recently, Large Reasoning Models (LRMs) have gradually become a research hotspot due to their outstanding performance in handling complex tasks. Among them, DeepSeek R1 has garnered significant attention for its exceptional performance and open-source nature, driving advancements in the research of R1-style LRMs. Unlike traditional Large Language Models (LLMs), these models enhance logical deducti…

    Submitted 4 August, 2025; originally announced August 2025.

  25. arXiv:2506.14278  [pdf, ps, other]

    cs.RO

    Whole-Body Control Framework for Humanoid Robots with Heavy Limbs: A Model-Based Approach

    Authors: Tianlin Zhang, Linzhu Yue, Hongbo Zhang, Lingwei Zhang, Xuanqi Zeng, Zhitao Song, Yun-Hui Liu

    Abstract: Humanoid robots often face significant balance issues due to the motion of their heavy limbs. These challenges are particularly pronounced when attempting dynamic motion or operating in environments with irregular terrain. To address this challenge, this manuscript proposes a whole-body control framework for humanoid robots with heavy limbs, using a model-based approach that combines a kino-dynami…

    Submitted 15 November, 2025; v1 submitted 17 June, 2025; originally announced June 2025.

  26. arXiv:2506.04953  [pdf, ps, other]

    cs.CV

    APVR: Hour-Level Long Video Understanding with Adaptive Pivot Visual Information Retrieval

    Authors: Hong Gao, Yiming Bao, Xuezhen Tu, Bin Zhong, Linan Yue, Minling Zhang

    Abstract: Current multimodal large language models (MLLMs) struggle with hour-level video understanding, facing significant challenges not only in modeling the substantial information volume of long videos but also in overcoming the memory wall and resource constraints during both training and inference. Although recent training-free approaches have alleviated resource demands by compressing visual features…

    Submitted 15 November, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

    Comments: Accepted by AAAI 2026

  27. arXiv:2506.02689  [pdf, ps, other]

    cs.CL

    MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching

    Authors: Liang Yue, Yihong Tang, Kehai Chen, Jie Liu, Min Zhang

    Abstract: Instruction fine-tuning is crucial in NLP tasks, enhancing pretrained models' instruction-following capabilities and task-specific performance. However, obtaining high-quality fine-tuning data for large models is challenging due to data collection difficulties and high production costs. To address this, we propose MASTER, a novel data augmentation method that enriches original data through interac…

    Submitted 3 June, 2025; v1 submitted 3 June, 2025; originally announced June 2025.

  28. arXiv:2505.04997  [pdf, ps, other]

    cs.AI cs.MA

    Foam-Agent: Towards Automated Intelligent CFD Workflows

    Authors: Ling Yue, Nithin Somasekharan, Tingwen Zhang, Yadi Cao, Zhangze Chen, Shimin Di, Shaowu Pan

    Abstract: Computational fluid dynamics (CFD) has been the main workhorse of computational physics. Yet its steep learning curve and fragmented, multi-stage workflow create significant barriers. To address these challenges, we present Foam-Agent, a multi-agent framework leveraging large language models (LLMs) to automate the end-to-end CFD workflow from a single natural language prompt. Foam-Agent orchestrat…

    Submitted 4 March, 2026; v1 submitted 8 May, 2025; originally announced May 2025.

  29. arXiv:2505.02027  [pdf, ps, other]

    cs.LG cs.AI cs.SI

    GraphPrompter: Multi-stage Adaptive Prompt Optimization for Graph In-Context Learning

    Authors: Rui Lv, Zaixi Zhang, Kai Zhang, Qi Liu, Weibo Gao, Jiawei Liu, Jiaxia Yan, Linan Yue, Fangzhou Yao

    Abstract: Graph In-Context Learning, with the ability to adapt pre-trained graph models to novel and diverse downstream graphs without updating any parameters, has gained much attention in the community. The key to graph in-context learning is to perform downstream graphs conditioned on chosen prompt examples. Existing methods randomly select subgraphs or edges as prompts, leading to noisy graph prompts and…

    Submitted 4 May, 2025; originally announced May 2025.

    Comments: 14 pages. Accepted by IEEE International Conference on Data Engineering (ICDE 2025)

  30. arXiv:2504.09843  [pdf, ps, other]

    cs.CV cs.RO

    ST-Booster: An Iterative SpatioTemporal Perception Booster for Vision-and-Language Navigation in Continuous Environments

    Authors: Lu Yue, Dongliang Zhou, Liang Xie, Erwei Yin, Feitian Zhang

    Abstract: Vision-and-Language Navigation in Continuous Environments (VLN-CE) requires agents to navigate unknown, continuous spaces based on natural language instructions. Compared to discrete settings, VLN-CE poses two core perception challenges. First, the absence of predefined observation points leads to heterogeneous visual memories and weakened global spatial correlations. Second, cumulative reconstruc…

    Submitted 2 December, 2025; v1 submitted 13 April, 2025; originally announced April 2025.

    Comments: 11 pages, 7 figures

  31. arXiv:2504.01872  [pdf, ps, other]

    cs.CV

    CoMatcher: Multi-View Collaborative Feature Matching

    Authors: Jintao Zhang, Zimin Xia, Mingyue Dong, Shuhan Shen, Linwei Yue, Xianwei Zheng

    Abstract: This paper proposes a multi-view collaborative matching strategy for reliable track construction in complex scenarios. We observe that the pairwise matching paradigms applied to image set matching often result in ambiguous estimation when the selected independent pairs exhibit significant occlusions or extreme viewpoint changes. This challenge primarily stems from the inherent uncertainty in inter…

    Submitted 20 August, 2025; v1 submitted 2 April, 2025; originally announced April 2025.

    Comments: 15 pages, 7 figures, to be published in CVPR 2025

    ACM Class: I.4.8; I.2.10; I.5.4

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

  32. arXiv:2503.05107  [pdf, ps, other]

    eess.IV cs.CV

    We Care Each Pixel: Calibrating on Medical Segmentation Model

    Authors: Wenhao Liang, Wei Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen

    Abstract: Medical image segmentation is fundamental for computer-aided diagnostics, providing accurate delineation of anatomical structures and pathological regions. While common metrics such as Accuracy, DSC, IoU, and HD primarily quantify spatial agreement between predictions and ground-truth labels, they do not assess the calibration quality of segmentation models, which is crucial for clinical reliabili…

    Submitted 13 June, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

    Comments: Under review

  33. Enhancing Autonomous Vehicle-Pedestrian Interaction in Shared Spaces: The Impact of Intended Path-Projection

    Authors: Le Yue, Tram Thi Minh Tran, Xinyan Yu, Marius Hoggenmueller

    Abstract: External Human-Machine Interfaces (eHMIs) are critical for seamless interactions between autonomous vehicles (AVs) and pedestrians in shared spaces. However, they often struggle to adapt to these environments, where pedestrian movement is fluid and right-of-way is ambiguous. To address these challenges, we propose PaveFlow, an eHMI that projects the AV's intended path onto the ground in real time,…

    Submitted 6 March, 2025; originally announced March 2025.

  34. arXiv:2503.04184  [pdf]

    cs.NI cs.AI cs.CL

    Large-Scale AI in Telecom: Charting the Roadmap for Innovation, Scalability, and Enhanced Digital Experiences

    Authors: Adnan Shahid, Adrian Kliks, Ahmed Al-Tahmeesschi, Ahmed Elbakary, Alexandros Nikou, Ali Maatouk, Ali Mokh, Amirreza Kazemi, Antonio De Domenico, Athanasios Karapantelakis, Bo Cheng, Bo Yang, Bohao Wang, Carlo Fischione, Chao Zhang, Chaouki Ben Issaid, Chau Yuen, Chenghui Peng, Chongwen Huang, Christina Chaccour, Christo Kurisummoottil Thomas, Dheeraj Sharma, Dimitris Kalogiros, Dusit Niyato, Eli De Poorter , et al. (110 additional authors not shown)

    Abstract: This white paper discusses the role of large-scale AI in the telecommunications industry, with a specific focus on the potential of generative AI to revolutionize network functions and user experiences, especially in the context of 6G systems. It highlights the development and deployment of Large Telecom Models (LTMs), which are tailored AI models designed to address the complex challenges faced b…

    Submitted 6 March, 2025; originally announced March 2025.

  35. arXiv:2503.01195  [pdf, other]

    cs.LG cs.CV

    PostHoc FREE Calibrating on Kolmogorov Arnold Networks

    Authors: Wenhao Liang, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen

    Abstract: Kolmogorov-Arnold Networks (KANs) are neural architectures inspired by the Kolmogorov-Arnold representation theorem that leverage B-spline parameterizations for flexible, locally adaptive function approximation. Although KANs can capture complex nonlinearities beyond those modeled by standard Multi-Layer Perceptrons (MLPs), they frequently exhibit miscalibrated confidence estimates manifesting as o…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: Under review

  36. arXiv:2501.11945  [pdf, other]

    cs.RO

    Learning to Hop for a Single-Legged Robot with Parallel Mechanism

    Authors: Hongbo Zhang, Xiangyu Chu, Yanlin Chen, Yunxi Tang, Linzhu Yue, Yun-Hui Liu, Kwok Wai Samuel Au

    Abstract: This work presents the application of reinforcement learning to improve the performance of a highly dynamic hopping system with a parallel mechanism. Unlike serial mechanisms, parallel mechanisms cannot be accurately simulated due to the complexity of their kinematic constraints and closed-loop structures. Besides, learning to hop suffers from a prolonged aerial phase and the sparse nature of the r…

    Submitted 21 January, 2025; originally announced January 2025.

  37. arXiv:2501.10332  [pdf, other]

    cs.CY cs.AI

    Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems

    Authors: Weibo Gao, Qi Liu, Linan Yue, Fangzhou Yao, Rui Lv, Zheng Zhang, Hao Wang, Zhenya Huang

    Abstract: Personalized learning represents a promising educational strategy within intelligent educational systems, aiming to enhance learners' practice efficiency. However, the discrepancy between offline metrics and online performance significantly impedes their progress. To address this challenge, we introduce Agent4Edu, a novel personalized learning simulator leveraging recent advancements in human inte…

    Submitted 17 January, 2025; originally announced January 2025.

    Comments: Accepted by AAAI 2025

  38. arXiv:2501.09283  [pdf, other]

    cs.LG

    Free-Knots Kolmogorov-Arnold Network: On the Analysis of Spline Knots and Advancing Stability

    Authors: Liangwei Nathan Zheng, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen

    Abstract: Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community. However, their implementation often suffers from poor training stability and heavy trainable parameter counts. Furthermore, there is limited understanding of the behavior of the learned activation functions derived from B-splines. In this work, we analyze the behavior of KANs through the lens of…

    Submitted 15 January, 2025; originally announced January 2025.

  39. arXiv:2501.04733  [pdf]

    cs.AI cs.ET cs.LG physics.ao-ph

    AI-Driven Reinvention of Hydrological Modeling for Accurate Predictions and Interpretation to Transform Earth System Modeling

    Authors: Cuihui Xia, Lei Yue, Deliang Chen, Yuyang Li, Hongqiang Yang, Ancheng Xue, Zhiqiang Li, Qing He, Guoqing Zhang, Dambaru Ballab Kattel, Lei Lei, Ming Zhou

    Abstract: Traditional equation-driven hydrological models often struggle to accurately predict streamflow in challenging regional Earth systems like the Tibetan Plateau, while hybrid and existing algorithm-driven models face difficulties in interpreting hydrological behaviors. This work introduces HydroTrace, an algorithm-driven, data-agnostic model that substantially outperforms these approaches, achieving…

    Submitted 7 January, 2025; originally announced January 2025.

  40. arXiv:2411.04494  [pdf, other]

    cs.RO eess.SY

    Online Omnidirectional Jumping Trajectory Planning for Quadrupedal Robots on Uneven Terrains

    Authors: Linzhu Yue, Zhitao Song, Jinhu Dong, Zhongyu Li, Hongbo Zhang, Lingwei Zhang, Xuanqi Zeng, Koushil Sreenath, Yun-hui Liu

    Abstract: Natural terrain complexity often necessitates agile movements like jumping in animals to improve traversal efficiency. To enable similar capabilities in quadruped robots, complex real-time jumping maneuvers are required. Current research does not adequately address the problem of online omnidirectional jumping and neglects the robot's kinodynamic constraints during trajectory generation. This pape…

    Submitted 9 November, 2024; v1 submitted 7 November, 2024; originally announced November 2024.

    Comments: Submitted to IJRR

  41. arXiv:2411.02066  [pdf, other]

    cs.LG cs.AI

    Collaborative Cognitive Diagnosis with Disentangled Representation Learning for Learner Modeling

    Authors: Weibo Gao, Qi Liu, Linan Yue, Fangzhou Yao, Hao Wang, Yin Gu, Zheng Zhang

    Abstract: Learners sharing similar implicit cognitive states often display comparable observable problem-solving performances. Leveraging collaborative connections among such similar learners proves valuable in comprehending human learning. Motivated by the success of collaborative modeling in various domains, such as recommender systems, we aim to investigate how collaborative signals among learners contri…

    Submitted 10 November, 2024; v1 submitted 4 November, 2024; originally announced November 2024.

    Comments: Accepted by NeurIPS 2024

  42. arXiv:2410.19733  [pdf, ps, other

    cs.AI

    ReMe: Scaffolding Personalized Cognitive Training via Controllable LLM-Mediated Conversations

    Authors: Zilong Wang, Nan Chen, Luna K. Qiu, Ling Yue, Geli Guo, Yang Ou, Shiqi Jiang, Yuqing Yang, Lili Qiu

    Abstract: Global aging calls for scalable and engaging cognitive interventions. Computerized cognitive training (CCT) is a promising non-pharmacological approach, yet many unsupervised programs rely on rigid, hand-authored puzzles that are difficult to personalize and can hinder adherence. Large language models (LLMs) offer more natural interaction, but their open-ended generation complicates the controlled…

    Submitted 27 March, 2026; v1 submitted 25 October, 2024; originally announced October 2024.

  43. arXiv:2410.12326  [pdf, other

    cs.LG

    Understanding Why Large Language Models Can Be Ineffective in Time Series Analysis: The Impact of Modality Alignment

    Authors: Liangwei Nathan Zheng, Chang George Dong, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen

    Abstract: Large Language Models (LLMs) have demonstrated impressive performance in time series analysis and seem to understand temporal relationships better than traditional transformer-based approaches. However, since LLMs are not designed for time series tasks, simpler models like linear regressions can often achieve comparable performance with far less complexity. In this study, we perform extensi…

    Submitted 26 May, 2025; v1 submitted 16 October, 2024; originally announced October 2024.

  44. arXiv:2410.12257  [pdf, other

    cs.LG

    Irregularity-Informed Time Series Analysis: Adaptive Modelling of Spatial and Temporal Dynamics

    Authors: Liangwei Nathan Zheng, Zhengyang Li, Chang George Dong, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen

    Abstract: Irregular Time Series Data (IRTS) has shown increasing prevalence in real-world applications. We observed that IRTS can be divided into two specialized types: Natural Irregular Time Series (NIRTS) and Accidental Irregular Time Series (AIRTS). Various existing methods either ignore the impacts of irregular patterns or statically learn the irregular dynamics of NIRTS and AIRTS data and suffer from l…

    Submitted 16 October, 2024; originally announced October 2024.

  45. arXiv:2410.12249  [pdf, other

    cs.LG

    Devil in the Tail: A Multi-Modal Framework for Drug-Drug Interaction Prediction in Long Tail Distinction

    Authors: Liangwei Nathan Zheng, Chang George Dong, Wei Emma Zhang, Xin Chen, Lin Yue, Weitong Chen

    Abstract: Drug-drug interaction (DDI) identification is a crucial aspect of pharmacology research. There are hundreds of DDI types, and they do not occur with equal frequency. Some of the rarely occurring DDI types are high-risk and could be life-critical if overlooked, exemplifying the long-tailed distribution problem. Existing models falter against this distribution challenge…

    Submitted 16 October, 2024; originally announced October 2024.

  46. arXiv:2410.10621  [pdf, other

    cs.RO

    Traversability-Aware Legged Navigation by Learning from Real-World Visual Data

    Authors: Hongbo Zhang, Zhongyu Li, Xuanqi Zeng, Laura Smith, Kyle Stachowicz, Dhruv Shah, Linzhu Yue, Zhitao Song, Weipeng Xia, Sergey Levine, Koushil Sreenath, Yun-hui Liu

    Abstract: The enhanced mobility brought by legged locomotion empowers quadrupedal robots to navigate through complex and unstructured environments. However, optimizing agile locomotion while accounting for the varying energy costs of traversing different terrains remains an open challenge. Most previous work focuses on planning trajectories with traversability cost estimation based on human-labeled environm…

    Submitted 11 November, 2024; v1 submitted 14 October, 2024; originally announced October 2024.

  47. arXiv:2409.08516  [pdf, ps, other

    cs.CV

    AWF: Adaptive Weight Fusion for Enhanced Class Incremental Semantic Segmentation

    Authors: Zechao Sun, Shuying Piao, Haolin Jin, Chang Dong, Lin Yue, Weitong Chen, Luping Zhou

    Abstract: Class Incremental Semantic Segmentation (CISS) aims to mitigate catastrophic forgetting by maintaining a balance between previously learned and newly introduced knowledge. Existing methods, primarily based on regularization techniques like knowledge distillation, help preserve old knowledge but often face challenges in effectively integrating new knowledge, resulting in limited overall improvement…

    Submitted 30 June, 2025; v1 submitted 12 September, 2024; originally announced September 2024.

    Comments: 10 pages, 6 figures

  48. arXiv:2408.07340  [pdf, other

    cs.LG cs.AI

    Towards Few-shot Self-explaining Graph Neural Networks

    Authors: Jingyu Peng, Qi Liu, Linan Yue, Zaixi Zhang, Kai Zhang, Yunhao Sha

    Abstract: Recent advancements in Graph Neural Networks (GNNs) have spurred an upsurge of research dedicated to enhancing the explainability of GNNs, particularly in critical domains such as medicine. A promising approach is the self-explaining method, which outputs explanations along with predictions. However, existing self-explaining models require a large amount of training data, rendering them unavailabl…

    Submitted 14 August, 2024; originally announced August 2024.

  49. arXiv:2408.05696  [pdf, ps, other

    cs.LG q-bio.QM

    SMILES-Mamba: Chemical Mamba Foundation Models for Drug ADMET Prediction

    Authors: Bohao Xu, Yingzhou Lu, Chenhao Li, Ling Yue, Xiao Wang, Tianfan Fu, Minjie Shen, Lulu Chen

    Abstract: In drug discovery, predicting the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of small-molecule drugs is critical for ensuring safety and efficacy. However, the process of accurately predicting these properties is often resource-intensive and requires extensive experimental data. To address this challenge, we propose SMILES-Mamba, a two-stage model that leverag…

    Submitted 26 March, 2026; v1 submitted 11 August, 2024; originally announced August 2024.

  50. arXiv:2408.02600  [pdf, ps, other

    cs.CL

    BioMamba: Domain-Adaptive Biomedical Language Models

    Authors: Ling Yue, Mingzhi Zhu, Sixue Xing, Shaowu Pan, Vijil Chenthamarakshan, Yanbo Wang, Yunning Cao, Payel Das, Tianfan Fu

    Abstract: Background: Biomedical language models should improve performance on biomedical text while retaining general-domain language ability. For Mamba-based models, this trade-off has not been clearly studied across biomedical literature and clinical text. Methods: We developed BioMamba, a family of biomedical models obtained by continued pretraining of public Mamba2 checkpoints on PubMed, with small amo…

    Submitted 17 March, 2026; v1 submitted 5 August, 2024; originally announced August 2024.