Showing 1–50 of 121 results for author: Ren, F

Searching in archive cs.
  1. arXiv:2604.07375

    cs.CY cs.HC

    Assessing the Feasibility of a Video-Based Conversational Chatbot Survey for Measuring Perceived Cycling Safety: A Pilot Study in New York City

    Authors: Feiyang Ren, Zhaoxi Zhang, Tamir Mendel, Takahiro Yabe

    Abstract: Bicycle safety is important for bikeability and transportation efficiency. However, conventional surveys often fall short in capturing how people actually perceive cycling environments because they rely heavily on respondents' recall rather than in-the-moment experience. By leveraging large language models (LLMs), this study proposes a new method of combining video-based surveys with a conversatio…

    Submitted 7 April, 2026; originally announced April 2026.

  2. arXiv:2604.05887

    cs.AI

    HybridKV: Hybrid KV Cache Compression for Efficient Multimodal Large Language Model Inference

    Authors: Bowen Zeng, Feiyang Ren, Jun Zhang, Xiaoling Gu, Ke Chen, Lidan Shou, Huan Li

    Abstract: Multimodal Large Language Models (MLLMs) have advanced unified reasoning over text, images, and videos, but their inference is hindered by the rapid growth of key-value (KV) caches. Each visual input expands into thousands of tokens, causing caches to scale linearly with context length and remain resident in GPU memory throughout decoding, which leads to prohibitive memory overhead and latency eve…

    Submitted 7 April, 2026; originally announced April 2026.

  3. arXiv:2604.05546

    cs.CL

    Efficient Inference for Large Vision-Language Models: Bottlenecks, Techniques, and Prospects

    Authors: Jun Zhang, Yicheng Ji, Feiyang Ren, Yihang Li, Bowen Zeng, Zonghao Chen, Ke Chen, Lidan Shou, Gang Chen, Huan Li

    Abstract: Large Vision-Language Models (LVLMs) enable sophisticated reasoning over images and videos, yet their inference is hindered by a systemic efficiency barrier known as visual token dominance. This overhead is driven by a multi-regime interplay between high-resolution feature extraction, quadratic attention scaling, and memory bandwidth constraints. We present a systematic taxonomy of efficiency tech…

    Submitted 7 April, 2026; originally announced April 2026.

    Comments: Accepted to ACL 2026 Findings

  4. arXiv:2604.05363

    cs.CV

    Rethinking IRSTD: Single-Point Supervision Guided Encoder-only Framework is Enough for Infrared Small Target Detection

    Authors: Rixiang Ni, Boyang Li, Jun Chen, Yonghao Li, Feiyu Ren, Yuji Wang, Haoyang Yuan, Wujiao He, Wei An

    Abstract: Infrared small target detection (IRSTD) aims to separate small targets from cluttered backgrounds. Extensive research is dedicated to the pixel-level supervision-guided "encoder-decoder" segmentation paradigm. Although these methods achieve promising performance, they neglect the fact that small targets occupy only a few pixels and are usually accompanied by blurred boundaries caused by clutter background…

    Submitted 6 April, 2026; originally announced April 2026.

  5. arXiv:2604.02713

    cs.CL

    Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts

    Authors: Jiawen Deng, Wentao Zhang, Ziyun Jiao, Fuji Ren

    Abstract: Conversational AI is increasingly deployed in emotionally charged and ethically sensitive interactions. Previous research has primarily concentrated on emotional benchmarks or static safety checks, overlooking how alignment unfolds in evolving conversation. We explore the research question: what breakdowns arise when conversational agents confront emotionally and ethically sensitive behaviors, and…

    Submitted 3 April, 2026; originally announced April 2026.

    Comments: 22 pages, ACM CHI 2026

  6. arXiv:2604.00368

    cs.DC

    TENT: A Declarative Slice Spraying Engine for Performant and Resilient Data Movement in Disaggregated LLM Serving

    Authors: Feng Ren, Ruoyu Qin, Teng Ma, Shangming Cai, Zheng Liu, Chao Lei, Dejiang Zhu, Ke Yang, Zheming Li, Jialei Cui, Weixiao Huang, Yikai Zhao, Yineng Zhang, Hao Wu, Xiang Gao, Yuhao Fu, Jinlei Jiang, Yongwei Wu, Mingxing Zhang

    Abstract: Modern GPU clusters are built upon a complex hierarchy of heterogeneous interconnects, ranging from multi-rail RDMA to proprietary fabrics such as Multi-Node NVLink and Ascend UB. Orchestrating these diverse links effectively remains a critical challenge in disaggregated LLM serving. Operating Mooncake TE on thousands of GPUs exposed a critical limitation shared by existing frameworks: imperative,…

    Submitted 31 March, 2026; originally announced April 2026.

  7. arXiv:2603.16008

    cs.HC cs.CY

    CoDesignAI: An AI-Enabled Multi-Agent, Multi-User System for Collaborative Urban Design at the Conceptual Stage

    Authors: Zhaoxi Zhang, Ruolin Wu, Feiyang Ren, Sridevi Turaga, Tamir Mendel

    Abstract: Public participation has become increasingly important in collaborative urban design; yet, existing processes often face challenges in achieving efficient and scalable citizen engagement. To address this gap, this study explores how large language models (LLMs) can support cooperation among community members in participatory design. We introduce CoDesignAI, a collaborative urban design tool that c…

    Submitted 16 March, 2026; originally announced March 2026.

  8. arXiv:2603.10637

    cs.NI

    Q-StaR: A Quasi-Static Routing Scheme for NoCs

    Authors: Yang Zhang, Yiren Zhao, Xu Wang, Fengyuan Ren

    Abstract: In networks-on-chip, static routing schemes are favored for their simplicity and predictability, but they cannot effectively balance network load due to the unawareness of runtime load distribution. Q-StaR discovers two factors (topology and traffic distribution) that determine the long-term trend of load distribution, and proposes N-Rank to extract this trend. The obtained information is used to…

    Submitted 11 March, 2026; originally announced March 2026.

    Comments: 7 pages, 9 figures

  9. arXiv:2603.10088

    cs.LG cs.AI

    ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping

    Authors: Zijian Zhu, Fei Ren, Zhanhong Tan, Kaisheng Ma

    Abstract: Diffusion large language models (dLLMs) are emerging as a promising alternative to autoregressive models (ARMs) due to their ability to capture bidirectional context and the potential for parallel generation. Despite the advantages, dLLM inference remains computationally expensive as the full input context is processed at every iteration. In this work, we analyze the generation dynamics of dLLMs a…

    Submitted 10 March, 2026; originally announced March 2026.

    Comments: Accepted at ICLR 2026

  10. arXiv:2603.00632

    cs.IR cs.LG

    Stop Treating Collisions Equally: Qualification-Aware Semantic ID Learning for Recommendation at Industrial Scale

    Authors: Zheng Hu, Yuxin Chen, Yongsen Pan, Xu Yuan, Yuting Yin, Daoyuan Wang, Boyang Xia, Zefei Luo, Hongyang Wang, Songhao Ni, Dongxu Liang, Jun Wang, Shimin Cai, Tao Zhou, Fuji Ren, Wenwu Ou

    Abstract: Semantic IDs (SIDs) are compact discrete representations derived from multimodal item features, serving as a unified abstraction for ID-based and generative recommendation. However, learning high-quality SIDs remains challenging due to two issues. (1) Collision problem: the quantized token space is prone to collisions, in which semantically distinct items are assigned identical or overly similar S…

    Submitted 28 February, 2026; originally announced March 2026.

  11. arXiv:2602.03786

    cs.AI cs.CL

    AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration

    Authors: Jianhao Ruan, Zhihao Xu, Yiran Peng, Fashen Ren, Zhaoyang Yu, Xinbing Liang, Jinyu Xiang, Yongru Chen, Bang Liu, Chenglin Wu, Yuyu Luo, Jiayi Zhang

    Abstract: Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction view of sub-agents, thereby hurting adaptability. We address this challenge with a unified, framework-agnostic agent abstraction…

    Submitted 7 February, 2026; v1 submitted 3 February, 2026; originally announced February 2026.

  12. arXiv:2601.16527

    cs.LG cs.AI cs.CL cs.CV

    Beyond Superficial Unlearning: Sharpness-Aware Robust Erasure of Hallucinations in Multimodal LLMs

    Authors: Xianya Fang, Feiyang Ren, Xiang Chen, Yu Tian, Zhen Bi, Haiyang Yu, Sheng-Jun Huang

    Abstract: Multimodal LLMs are powerful but prone to object hallucinations, which describe non-existent entities and harm reliability. While recent unlearning methods attempt to mitigate this, we identify a critical flaw: structural fragility. We empirically demonstrate that standard erasure achieves only superficial suppression, trapping the model in sharp minima where hallucinations catastrophically resurg…

    Submitted 23 January, 2026; originally announced January 2026.

  13. arXiv:2601.07507

    cs.CL

    High-Rank Structured Modulation for Parameter-Efficient Fine-Tuning

    Authors: Yongkang Liu, Xing Li, Mengjie Zhao, Shanru Zhang, Zijing Wang, Qian Li, Shi Feng, Feiliang Ren, Daling Wang, Hinrich Schütze

    Abstract: As the number of model parameters increases, parameter-efficient fine-tuning (PEFT) has become the go-to choice for tailoring pre-trained large language models. Low-rank Adaptation (LoRA) uses a low-rank update method to simulate full parameter fine-tuning, which is widely used to reduce resource requirements. However, decreasing the rank encounters challenges with limited representational capacit…

    Submitted 12 January, 2026; originally announced January 2026.

    Comments: under review

  14. arXiv:2512.12530

    cs.OS

    Principled Performance Tunability in Operating System Kernels

    Authors: Zhongjie Chen, Wentao Zhang, Yulong Tang, Ran Shu, Fengyuan Ren, Tianyin Xu, Jing Liu

    Abstract: The Linux kernel source code contains numerous constant values that critically influence system performance. Many of these constants, which we term perf-consts, are magic numbers that encode brittle assumptions about hardware and workloads. As systems and workloads evolve, such constants often become suboptimal. Unfortunately, deployed kernels lack support for safe and efficient in-situ tuning of…

    Submitted 13 December, 2025; originally announced December 2025.

    Comments: 12 pages

  15. arXiv:2511.10325

    cs.MM

    TMDC: A Two-Stage Modality Denoising and Complementation Framework for Multimodal Sentiment Analysis with Missing and Noisy Modalities

    Authors: Yan Zhuang, Minhao Liu, Yanru Zhang, Jiawen Deng, Fuji Ren

    Abstract: Multimodal Sentiment Analysis (MSA) aims to infer human sentiment by integrating information from multiple modalities such as text, audio, and video. In real-world scenarios, however, the presence of missing modalities and noisy signals significantly hinders the robustness and accuracy of existing models. While prior works have made progress on these issues, they are typically addressed in isolati…

    Submitted 13 November, 2025; originally announced November 2025.

    Comments: Accepted by AAAI 2026

  16. arXiv:2510.24668

    cs.CL cs.AI

    InteractComp: Evaluating Search Agents With Ambiguous Queries

    Authors: Mingyi Deng, Lijun Huang, Yani Fan, Jiayi Zhang, Fashen Ren, Jinyi Bai, Fuzhen Yang, Dayi Miao, Zhaoyang Yu, Yifan Wu, Yanfei Zhang, Fengwei Teng, Yingjia Wan, Song Hu, Yude Li, Xin Jin, Conghao Hu, Haoyu Li, Qirui Fu, Tai Zhong, Xinyu Wang, Xiangru Tang, Nan Tang, Chenglin Wu, Yuyu Luo

    Abstract: Language agents have demonstrated remarkable potential in web search and information retrieval. However, these search agents assume user queries are complete and unambiguous, an assumption that diverges from reality where users begin with incomplete queries requiring clarification through interaction. Yet most agents lack interactive mechanisms during the search process, and existing benchmarks ca…

    Submitted 28 October, 2025; originally announced October 2025.

  17. arXiv:2510.24577

    cs.LG

    Physics-Informed Extreme Learning Machine (PIELM): Opportunities and Challenges

    Authors: He Yang, Fei Ren, Francesco Calabro, Hai-Sui Yu, Xiaohui Chen, Pei-Zhi Zhuang

    Abstract: We are delighted to see the recent development of physics-informed extreme learning machine (PIELM) for its higher computational efficiency and accuracy compared to other physics-informed machine learning (PIML) paradigms. Since a comprehensive summary or review of PIELM is currently unavailable, we would like to take this opportunity to share our perspectives and experiences on this promising res…

    Submitted 2 November, 2025; v1 submitted 28 October, 2025; originally announced October 2025.

  18. arXiv:2510.21426

    cs.LG

    A Rapid Physics-Informed Machine Learning Framework Based on Extreme Learning Machine for Inverse Stefan Problems

    Authors: Pei-Zhi Zhuang, Ming-Yue Yang, Fei Ren, Hong-Ya Yue, He Yang

    Abstract: The inverse Stefan problem, as a typical phase-change problem with moving boundaries, finds extensive applications in science and engineering. Recent years have seen the applications of physics-informed neural networks (PINNs) to solving Stefan problems, yet they still exhibit shortcomings in hyperparameter dependency, training efficiency, and prediction accuracy. To address this, this paper devel…

    Submitted 24 October, 2025; originally announced October 2025.

  19. arXiv:2510.19980

    cs.LG cs.IT

    Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency

    Authors: Renzhao Liang, Sizhe Xu, Chenggang Xie, Jingru Chen, Feiyang Ren, Shu Yang, Takahiro Yabe

    Abstract: Time series forecasting plays a pivotal role in critical domains such as energy management and financial markets. Although deep learning-based approaches (e.g., MLP, RNN, Transformer) have achieved remarkable progress, the prevailing "long-sequence information gain hypothesis" exhibits inherent limitations. Through systematic experimentation, this study reveals a counterintuitive phenomenon: appro…

    Submitted 22 October, 2025; originally announced October 2025.

    Comments: 20 pages, 4 figures. Accepted as Spotlight poster in NeurIPS 2025

  20. arXiv:2510.12293

    cs.LG cs.NE physics.comp-ph

    General Fourier Feature Physics-Informed Extreme Learning Machine (GFF-PIELM) for High-Frequency PDEs

    Authors: Fei Ren, Sifan Wang, Pei-Zhi Zhuang, Hai-Sui Yu, He Yang

    Abstract: Conventional physics-informed extreme learning machine (PIELM) often faces challenges in solving partial differential equations (PDEs) involving high-frequency and variable-frequency behaviors. To address these challenges, we propose a general Fourier feature physics-informed extreme learning machine (GFF-PIELM). We demonstrate that directly concatenating multiple Fourier feature mappings (FFMs) a…

    Submitted 14 October, 2025; originally announced October 2025.

  21. arXiv:2510.00698

    cs.LG cs.NE physics.comp-ph physics.geo-ph

    Physics-Informed Extreme Learning Machine (PIELM) for Tunnelling-Induced Soil-Pile Interactions

    Authors: Fu-Chen Guo, Pei-Zhi Zhuang, Fei Ren, Hong-Ya Yue, He Yang

    Abstract: Physics-informed machine learning has been a promising data-driven and physics-informed approach in geotechnical engineering. This study proposes a physics-informed extreme learning machine (PIELM) framework for analyzing tunneling-induced soil-pile interactions. The pile foundation is modeled as an Euler-Bernoulli beam, and the surrounding soil is modeled as a Pasternak foundation. The soil-pile…

    Submitted 1 October, 2025; originally announced October 2025.

  22. arXiv:2509.21151

    cs.CL cs.IR

    Retrieval over Classification: Integrating Relation Semantics for Multimodal Relation Extraction

    Authors: Lei Hei, Tingjing Liao, Yingxin Pei, Yiyang Qi, Jiaqi Wang, Ruiting Li, Feiliang Ren

    Abstract: Relation extraction (RE) aims to identify semantic relations between entities in unstructured text. Although recent work extends traditional RE to multimodal scenarios, most approaches still adopt classification-based paradigms with fused multimodal features, representing relations as discrete labels. This paradigm has two significant limitations: (1) it overlooks structural constraints like entit…

    Submitted 25 September, 2025; originally announced September 2025.

    Comments: Accepted by EMNLP 2025 Main Conference

  23. arXiv:2509.12875

    cs.AI

    LTA-thinker: Latent Thought-Augmented Training Framework for Large Language Models on Complex Reasoning

    Authors: Jiaqi Wang, Binquan Ji, Haibo Luo, Yiyang Qi, Ruiting Li, Huiyan Wang, Yuantao Han, Cangyi Yang, Jiaxu Zhang, Feiliang Ren

    Abstract: Complex Reasoning in Large Language Models can be dynamically optimized using Test-Time Scaling (TTS) to mitigate Overthinking. Although methods such as Coconut, SoftCoT and its variant are effective in continuous latent space inference, the core bottleneck still lies in the efficient generation and utilization of high-quality Latent Thought. Drawing from the theory of SoftCoT++ that a larger variance in t…

    Submitted 16 December, 2025; v1 submitted 16 September, 2025; originally announced September 2025.

  24. arXiv:2509.12811

    cs.CL

    ConvergeWriter: Data-Driven Bottom-Up Article Construction

    Authors: Binquan Ji, Jiaqi Wang, Ruiting Li, Xingchen Han, Yiyang Qi, Shichao Wang, Yifei Lu, Yuantao Han, Feiliang Ren

    Abstract: Large Language Models (LLMs) have shown remarkable prowess in text generation, yet producing long-form, factual documents grounded in extensive external knowledge bases remains a significant challenge. Existing "top-down" methods, which first generate a hypothesis or outline and then retrieve evidence, often suffer from a disconnect between the model's plan and the available knowledge, leading to…

    Submitted 16 September, 2025; originally announced September 2025.

  25. arXiv:2509.11628

    cs.LG cs.AI cs.CV

    SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching

    Authors: Jiacheng Liu, Chang Zou, Yuanhuiyi Lyu, Fei Ren, Shaobo Wang, Kaixin Li, Linfeng Zhang

    Abstract: Diffusion models have revolutionized high-fidelity image and video synthesis, yet their computational demands remain prohibitive for real-time applications. These models face two fundamental challenges: strict temporal dependencies preventing parallelization, and computationally intensive forward passes required at each denoising step. Drawing inspiration from speculative decoding in large languag…

    Submitted 15 September, 2025; originally announced September 2025.

    Comments: 15 pages, 9 figures, ACM Multimedia 2025

  26. arXiv:2508.17062

    cs.CV cs.AI

    SSG-Dit: A Spatial Signal Guided Framework for Controllable Video Generation

    Authors: Peng Hu, Yu Gu, Liang Luo, Fuji Ren

    Abstract: Controllable video generation aims to synthesize video content that aligns precisely with user-provided conditions, such as text descriptions and initial images. However, a significant challenge persists in this domain: existing models often struggle to maintain strong semantic consistency, frequently generating videos that deviate from the nuanced details specified in the prompts. To address this…

    Submitted 23 August, 2025; originally announced August 2025.

  27. EchoLadder: Progressive AI-Assisted Design of Immersive VR Scenes

    Authors: Zhuangze Hou, Jingze Tian, Nianlong Li, Farong Ren, Can Liu

    Abstract: Mixed reality platforms allow users to create virtual environments, yet novice users struggle with both ideation and execution in spatial design. While existing AI models can automatically generate scenes based on user prompts, the lack of interactive control limits users' ability to iteratively steer the output. In this paper, we present EchoLadder, a novel human-AI collaboration pipeline that le…

    Submitted 4 August, 2025; originally announced August 2025.

    Comments: To appear at UIST 2025

  28. arXiv:2507.17147

    cs.CL

    CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards

    Authors: Cheng Liu, Yifei Lu, Fanghua Ye, Jian Li, Xingyu Chen, Feiliang Ren, Zhaopeng Tu, Xiaolong Li

    Abstract: Role-Playing Language Agents (RPLAs) have emerged as a significant application direction for Large Language Models (LLMs). Existing approaches typically rely on prompt engineering or supervised fine-tuning to enable models to imitate character behaviors in specific scenarios, but often neglect the underlying cognitive mechanisms driving these behaviors. Inspired by cognitive psychology, we…

    Submitted 22 July, 2025; originally announced July 2025.

  29. arXiv:2506.17692

    cs.CL

    Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering

    Authors: Binquan Ji, Haibo Luo, Yifei Lu, Lei Hei, Jiaqi Wang, Tingjing Liao, Lingyu Wang, Shichao Wang, Feiliang Ren

    Abstract: Knowledge-intensive multi-hop question answering (QA) tasks, which require integrating evidence from multiple sources to address complex queries, often necessitate multiple rounds of retrieval and iterative generation by large language models (LLMs). However, incorporating many documents and extended contexts poses challenges, such as hallucinations and semantic drift, for lightweight LLMs with few…

    Submitted 21 June, 2025; originally announced June 2025.

  30. arXiv:2506.08381

    physics.geo-ph cs.LG

    Physics-informed extreme learning machine for Terzaghi consolidation problems and interpretation of coefficient of consolidation based on CPTu data

    Authors: He Yang, Pin-Qiang Mo, Fei Ren, Hai-Sui Yu, Xueyu Geng, Pei-Zhi Zhuang

    Abstract: This paper conducts a preliminary study to investigate the feasibility of a physics-informed extreme learning machine (PIELM) for solving the Terzaghi consolidation equation and interpreting the coefficient of consolidation of soil from piezocone penetration tests (CPTu). In the PIELM framework, the target solution is approximated by a single-layer feed-forward extreme learning machine (ELM) netwo…

    Submitted 6 February, 2026; v1 submitted 9 June, 2025; originally announced June 2025.

  31. arXiv:2505.20310

    cs.AI cs.MA

    Manalyzer: End-to-end Automated Meta-analysis with Multi-agent System

    Authors: Wanghan Xu, Wenlong Zhang, Fenghua Ling, Ben Fei, Yusong Hu, Runmin Ma, Bo Zhang, Fangxuan Ren, Jintai Lin, Wanli Ouyang, Lei Bai

    Abstract: Meta-analysis is a systematic research methodology that synthesizes data from multiple existing studies to derive comprehensive conclusions. This approach not only mitigates limitations inherent in individual studies but also facilitates novel discoveries through integrated data analysis. Traditional meta-analysis involves a complex multi-stage pipeline including literature retrieval, paper screen…

    Submitted 20 January, 2026; v1 submitted 22 May, 2025; originally announced May 2025.

  32. arXiv:2505.15715

    cs.CL

    Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling

    Authors: He Hu, Yucheng Zhou, Juzheng Si, Qianning Wang, Hengheng Zhang, Fuji Ren, Fei Ma, Laizhong Cui, Qi Tian

    Abstract: Large language models (LLMs) hold significant potential for mental health support, capable of generating empathetic responses and simulating therapeutic conversations. However, existing LLM-based approaches often lack the clinical grounding necessary for real-world psychological counseling, particularly in explicit diagnostic reasoning aligned with standards like the DSM/ICD and incorporating dive…

    Submitted 3 November, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

  33. arXiv:2505.09261

    cs.CR

    Instantiating Standards: Enabling Standard-Driven Text TTP Extraction with Evolvable Memory

    Authors: Cheng Meng, ZhengWei Jiang, QiuYun Wang, XinYi Li, ChunYan Ma, FangMing Dong, FangLi Ren, BaoXu Liu

    Abstract: Extracting MITRE ATT&CK Tactics, Techniques, and Procedures (TTPs) from natural language threat reports is crucial yet challenging. Existing methods primarily focus on performance metrics using data-driven approaches, often neglecting mechanisms to ensure faithful adherence to the official standard. This deficiency compromises reliability and consistency of TTP assignments, creating intelligence…

    Submitted 14 May, 2025; originally announced May 2025.

  34. arXiv:2504.17307

    cs.NI

    An Extensible Software Transport Layer for GPU Networking

    Authors: Yang Zhou, Zhongjie Chen, Ziming Mao, ChonLam Lao, Shuo Yang, Pravein Govindan Kannan, Jiaqi Gao, Yilong Zhao, Yongji Wu, Kaichao You, Fengyuan Ren, Zhiying Xu, Costin Raiciu, Ion Stoica

    Abstract: Fast-evolving machine learning (ML) workloads have increasing requirements for networking. However, host network transport on RDMA NICs is hard to evolve, causing problems for ML workloads. For example, single-path RDMA traffic is prone to flow collisions that severely degrade collective communication performance. We present UCCL, an extensible software transport layer to evolve GPU networking. UC…

    Submitted 4 August, 2025; v1 submitted 24 April, 2025; originally announced April 2025.

  35. arXiv:2503.20840

    cs.AI cs.CL cs.LG cs.SE

    CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision

    Authors: Yifei Lu, Fanghua Ye, Jian Li, Qiang Gao, Cheng Liu, Haibo Luo, Nan Du, Xiaolong Li, Feiliang Ren

    Abstract: Tool invocation significantly enhances the capabilities of Large Language Models (LLMs), yet challenges persist, particularly in complex task scenarios. Current methods, such as instruction-enhanced reasoning and supervised fine-tuning, often result in unnecessarily long reasoning paths and face difficulties in verifying the correctness of intermediate steps. In this paper, we propose CodeTool, a…

    Submitted 12 June, 2025; v1 submitted 26 March, 2025; originally announced March 2025.

    Comments: ACL 2025

  36. arXiv:2503.11342

    cs.CV

    Road Rage Reasoning with Vision-language Models (VLMs): Task Definition and Evaluation Dataset

    Authors: Yibing Weng, Yu Gu, Fuji Ren

    Abstract: Road rage, triggered by driving-related stimuli such as traffic congestion and aggressive driving, poses a significant threat to road safety. Previous research on road rage regulation has primarily focused on response suppression, lacking proactive prevention capabilities. With the advent of Vision-Language Models (VLMs), it has become possible to reason about trigger events visually and then enga…

    Submitted 14 March, 2025; originally announced March 2025.

  37. arXiv:2503.00974

    cs.DC

    SAF: Scalable Acceleration Framework for dynamic and flexible scaling of FPGAs

    Authors: Masudul Hassan Quraishi, Michael Riera, Fengbo Ren, Aman Arora, Aviral Shrivastava

    Abstract: FPGAs are increasingly gaining traction in cloud and edge computing environments due to their hardware flexibility, low latency, and low energy consumption. However, the existing hardware stack of FPGA and the host-FPGA connectivity does not allow flexible scaling and simultaneous reconfiguration of multiple devices, which limits the adoption of FPGA at scale. In this paper, we present SAF -- an E…

    Submitted 2 March, 2025; originally announced March 2025.

  38. arXiv:2502.18013

    cs.HC

    Exploring the Effects of Traditional Chinese Medicine Scents on Mitigating Driving Fatigue

    Authors: Nengyue Su, Liang Luo, Yu Gu, Fuji Ren

    Abstract: The rise of autonomous driving technology has led to concerns about inactivity-induced fatigue. This paper explores Traditional Chinese Medicine (TCM) scents for mitigating such fatigue. Two human-involved studies have been conducted in a high-fidelity driving simulator. Study 1 maps six prevalent TCM scents onto the arousal/valence circumplex to select proper candidates, i.e., argy wormwood (with the highest…

    Submitted 25 February, 2025; originally announced February 2025.

  39. arXiv:2502.06855

    cs.CL cs.AI cs.LG

    Self-Supervised Prompt Optimization

    Authors: Jinyu Xiang, Jiayi Zhang, Zhaoyang Yu, Xinbing Liang, Fengwei Teng, Jinhao Tu, Fashen Ren, Xiangru Tang, Sirui Hong, Chenglin Wu, Yuyu Luo

    Abstract: Well-designed prompts are crucial for enhancing Large language models' (LLMs) reasoning capabilities while aligning their outputs with task requirements across diverse domains. However, manually designed prompts require expertise and iterative experimentation. While existing prompt optimization methods aim to automate this process, they rely heavily on external references such as ground truth or b…

    Submitted 21 August, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

  40. arXiv:2502.00405

    math.CO cs.DM

    Spectral Sufficient Conditions for Graph Factors

    Authors: Fengyun Ren, Shumin Zhang, Ke Wang

    Abstract: A $\{K_{1,1}, K_{1,2},C_m: m\geq3\}$-factor of a graph is a spanning subgraph each of whose components is an element of $\{K_{1,1}, K_{1,2},C_m: m\geq3\}$. In this paper, using graph spectral methods, we establish the lower bound of the signless Laplacian spectral radius and the upper bound of the distance spectral radius to determine whether a graph admits a $\{K_2\}$-factor. We get a lower b…

    Submitted 1 February, 2025; originally announced February 2025.

  41. arXiv:2501.15000

    cs.CL cs.IR

    MDEval: Evaluating and Enhancing Markdown Awareness in Large Language Models

    Authors: Zhongpu Chen, Yinfeng Liu, Long Shi, Xingyan Chen, Yu Zhao, Fuji Ren

    Abstract: Large language models (LLMs) are expected to offer structured Markdown responses for the sake of readability in web chatbots (e.g., ChatGPT). Although there are a myriad of metrics to evaluate LLMs, they fail to evaluate the readability from the view of output content structure. To this end, we focus on an overlooked yet important metric -- Markdown Awareness, which directly impacts the readabilit…

    Submitted 26 August, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: WWW 2025

  42. arXiv:2501.13570

    cs.NI

    Occamy: A Preemptive Buffer Management for On-chip Shared-memory Switches

    Authors: Danfeng Shan, Yunguang Li, Jinchao Ma, Zhenxing Zhang, Zeyu Liang, Xinyu Wen, Hao Li, Wanchun Jiang, Nan Li, Fengyuan Ren

    Abstract: Today's high-speed switches employ an on-chip shared packet buffer. The buffer is becoming increasingly insufficient as it cannot scale with the growing switching capacity. Nonetheless, the buffer needs to face highly intense bursts and meet stringent performance requirements for datacenter applications. This imposes rigorous demand on the Buffer Management (BM) scheme, which dynamically allocates…

    Submitted 23 January, 2025; originally announced January 2025.

    Comments: 18 pages, 23 figures

  43. arXiv:2412.13544  [pdf, other]

    cs.IR cs.AI

    Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models

    Authors: Zheng Hu, Zhe Li, Ziyun Jiao, Satoshi Nakagawa, Jiawen Deng, Shimin Cai, Tao Zhou, Fuji Ren

    Abstract: In recent years, knowledge graphs have been integrated into recommender systems as item-side auxiliary information, enhancing recommendation accuracy. However, constructing and integrating structural user-side knowledge remains a significant challenge due to the improper granularity and inherent scarcity of user-side features. Recent advancements in Large Language Models (LLMs) offer the potential…

    Submitted 18 December, 2024; originally announced December 2024.

    Comments: Accepted at AAAI 2025

  44. arXiv:2412.07116  [pdf, other]

    cs.LG cs.AI cs.CL

    A Review of Human Emotion Synthesis Based on Generative Technology

    Authors: Fei Ma, Yukan Li, Yifan Xie, Ying He, Yi Zhang, Hongwei Ren, Zhou Liu, Wei Yao, Fuji Ren, Fei Richard Yu, Shiguang Ni

    Abstract: Human emotion synthesis is a crucial aspect of affective computing. It involves using computational methods to mimic and convey human emotions through various modalities, with the goal of enabling more natural and effective human-computer interactions. Recent advancements in generative models, such as Autoencoders, Generative Adversarial Networks, Diffusion Models, Large Language Models, and Seque…

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: 25 pages, 10 figures

  45. arXiv:2411.15180  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Multi-layer matrix factorization for cancer subtyping using full and partial multi-omics dataset

    Authors: Yingxuan Ren, Fengtao Ren, Bo Yang

    Abstract: Cancer, with its inherent heterogeneity, is commonly categorized into distinct subtypes based on unique traits, cellular origins, and molecular markers specific to each type. However, current studies primarily rely on complete multi-omics datasets for predicting cancer subtypes, often overlooking predictive performance in cases where some omics data may be missing and neglecting implicit relations…

    Submitted 18 November, 2024; originally announced November 2024.

  46. arXiv:2411.12676  [pdf, other]

    cs.CV cs.LG

    IoT-Based 3D Pose Estimation and Motion Optimization for Athletes: Application of C3D and OpenPose

    Authors: Fei Ren, Chao Ren, Tianyi Lyu

    Abstract: This study proposes the IoT-Enhanced Pose Optimization Network (IE-PONet) for high-precision 3D pose estimation and motion optimization of track and field athletes. IE-PONet integrates C3D for spatiotemporal feature extraction, OpenPose for real-time keypoint detection, and Bayesian optimization for hyperparameter tuning. Experimental results on NTURGB+D and FineGYM datasets demonstrate superior p…

    Submitted 19 November, 2024; originally announced November 2024.

    Comments: 17 pages

  47. arXiv:2408.10841  [pdf, other]

    cs.AI cs.CL

    DELIA: Diversity-Enhanced Learning for Instruction Adaptation in Large Language Models

    Authors: Yuanhao Zeng, Fei Ren, Xinpeng Zhou, Yihang Wang, Yingxia Shao

    Abstract: Although instruction tuning is widely used to adjust behavior in Large Language Models (LLMs), extensive empirical evidence and research indicate that it is primarily a process where the model fits to specific task formats, rather than acquiring new knowledge or capabilities. We propose that this limitation stems from biased features learned during instruction tuning, which differ from ideal task…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 8 pages, 5 figures

  48. arXiv:2408.08709  [pdf, other]

    cs.IR

    Multimodal Relational Triple Extraction with Query-based Entity Object Transformer

    Authors: Lei Hei, Ning An, Tingjing Liao, Qi Ma, Jiaqi Wang, Feiliang Ren

    Abstract: Multimodal Relation Extraction is crucial for constructing flexible and realistic knowledge graphs. Recent studies focus on extracting the relation type with entity pairs present in different modalities, such as one entity in the text and another in the image. However, existing approaches require entities and objects given beforehand, which is costly and impractical. To address the limitation, we…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 15 pages, 7 figures, preprint

  49. arXiv:2407.03640  [pdf, other]

    cs.LG cs.CL cs.CV

    Generative Technology for Human Emotion Recognition: A Scope Review

    Authors: Fei Ma, Yucheng Yuan, Yifan Xie, Hongwei Ren, Ivan Liu, Ying He, Fuji Ren, Fei Richard Yu, Shiguang Ni

    Abstract: Affective computing stands at the forefront of artificial intelligence (AI), seeking to imbue machines with the ability to comprehend and respond to human emotions. Central to this field is emotion recognition, which endeavors to identify and interpret human emotional states from different modalities, such as speech, facial images, text, and physiological signals. In recent years, important progre…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Under Review

  50. arXiv:2404.19154  [pdf, other]

    cs.CL

    RTF: Region-based Table Filling Method for Relational Triple Extraction

    Authors: Ning An, Lei Hei, Yong Jiang, Weiping Meng, Jingjing Hu, Boran Huang, Feiliang Ren

    Abstract: Relational triple extraction is crucial work for the automatic construction of knowledge graphs. Existing methods only construct shallow representations from a token or token pair-level. However, previous works ignore local spatial dependencies of relational triples, resulting in a weakness of entity pair boundary detection. To tackle this problem, we propose a novel Region-based Table Filling met…

    Submitted 13 June, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: Rejected by EMNLP 2023