
Showing 1–50 of 270 results for author: Luo, R

Searching in archive cs.
  1. arXiv:2604.08569  [pdf, ps, other]

    cs.LG

    Memory-Guided Trust-Region Bayesian Optimization (MG-TuRBO) for High Dimensions

    Authors: Abhilasha Saroj, Shaked Regev, Guanhao Xu, Jinghui Yuan, Roy Luo, Ross Wang

    Abstract: Traffic simulation and digital-twin calibration is a challenging optimization problem with a limited simulation budget. Each trial requires an expensive simulation run, and the relationship between calibration inputs and model error is often nonconvex and noisy. The problem becomes more difficult as the number of calibration parameters increases. We compare a commonly used automatic calibration m…

    Submitted 25 March, 2026; originally announced April 2026.

  2. arXiv:2603.30038  [pdf, ps, other]

    cs.CV

    Benchmarking PhD-Level Coding in 3D Geometric Computer Vision

    Authors: Wenyi Li, Renkai Luo, Yue Yu, Huan-ang Gao, Mingju Gao, Li Yuan, Chaoyou Fu, Hao Zhao

    Abstract: AI-assisted coding has rapidly reshaped software practice and research workflows, yet today's models still struggle to produce correct code for complex 3D geometric vision. If models could reliably write such code, the research of our community would change substantially. To measure progress toward that goal, we introduce GeoCodeBench, a PhD-level benchmark that evaluates coding for 3D vision. Eac…

    Submitted 31 March, 2026; originally announced March 2026.

    Comments: Accepted by CVPR 2026; Project page: https://geocodebench.github.io/

  3. arXiv:2603.28321  [pdf, ps, other]

    cs.LG

    FairGC: Fairness-aware Graph Condensation

    Authors: Yihan Gao, Chenxi Huang, Wen Shi, Ke Sun, Ziqi Xu, Xikun Zhang, Mingliang Hou, Renqiang Luo

    Abstract: Graph condensation (GC) has become a vital strategy for scaling Graph Neural Networks by compressing massive datasets into small, synthetic node sets. While current GC methods effectively maintain predictive accuracy, they are primarily designed for utility and often ignore fairness constraints. Because these techniques are bias-blind, they frequently capture and even amplify demographic dispariti…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: 6 pages, IJCNN 2026 accepted

  4. arXiv:2603.28315  [pdf, ps, other]

    cs.CV cs.LG

    Prototype-Enhanced Multi-View Learning for Thyroid Nodule Ultrasound Classification

    Authors: Yangmei Chen, Zhongyuan Zhang, Xikun Zhang, Xinyu Hao, Mingliang Hou, Renqiang Luo, Ziqi Xu

    Abstract: Thyroid nodule classification using ultrasound imaging is essential for early diagnosis and clinical decision-making; however, despite promising performance on in-distribution data, existing deep learning methods often exhibit limited robustness and generalisation when deployed across different ultrasound devices or clinical environments. This limitation is mainly attributed to the pronounced hete…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: 6 pages, IWCMC 2026 accepted

  5. arXiv:2603.28300  [pdf, ps, other]

    cs.LG cs.AI

    NeiGAD: Augmenting Graph Anomaly Detection via Spectral Neighbor Information

    Authors: Qing Qing, Huafei Huang, Mingliang Hou, Renqiang Luo, Mohsen Guizani

    Abstract: Graph anomaly detection (GAD) aims to identify irregular nodes or structures in attributed graphs. Neighbor information, which reflects both structural connectivity and attribute consistency with surrounding nodes, is essential for distinguishing anomalies from normal patterns. Although recent graph neural network (GNN)-based methods incorporate such information through message passing, they often…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: 6 pages, IWCMC 2026 accepted

  6. arXiv:2603.24578  [pdf, ps, other]

    cs.CV eess.IV

    Vision-Language Models vs Human: Perceptual Image Quality Assessment

    Authors: Imran Mehmood, Imad Ali Shah, Ming Ronnier Luo, Brian Deegan

    Abstract: Psychophysical experiments remain the most reliable approach for perceptual image quality assessment (IQA), yet their cost and limited scalability encourage automated approaches. We investigate whether Vision Language Models (VLMs) can approximate human perceptual judgments across three image quality scales: contrast, colorfulness and overall preference. Six VLMs, four proprietary and two openweigh…

    Submitted 25 March, 2026; originally announced March 2026.

  7. arXiv:2603.17693  [pdf, ps, other]

    cs.CV

    Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos

    Authors: Songtao Jiang, Sibo Song, Chenyi Zhou, Yuan Wang, Ruizhe Chen, Tongkun Guan, Ruilin Luo, Yan Zhang, Zhihang Tang, Yuchong Sun, Hang Zhang, Zhibo Yang, Shuai Bai, Junyang Lin, Zuozhu Liu

    Abstract: The transition from image to video understanding requires vision-language models (VLMs) to shift from recognizing static patterns to reasoning over temporal dynamics such as motion trajectories, speed changes, and state transitions. Yet current post-training methods fall short due to two critical limitations: (1) existing datasets often lack temporal-centricity, where answers can be inferred from…

    Submitted 18 March, 2026; originally announced March 2026.

  8. arXiv:2603.16859  [pdf, ps, other]

    cs.AI

    SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

    Authors: Tianyu Xie, Jinfa Huang, Yuexiao Ma, Rongfang Luo, Yan Yang, Wang Chen, Yuhui Zeng, Ruize Fang, Yixuan Zou, Xiawu Zheng, Jiebo Luo, Rongrong Ji

    Abstract: Omni-modal large language models (OLMs) redefine human-machine interaction by natively integrating audio, vision, and text. However, existing OLM benchmarks remain anchored to static, accuracy-centric tasks, leaving a critical gap in assessing social interactivity, the fundamental capacity to navigate dynamic cues in natural dialogues. To this end, we propose SocialOmni, a comprehensive benchmark…

    Submitted 17 March, 2026; originally announced March 2026.

    Comments: Code is available at https://github.com/MAC-AutoML/SocialOmni and dataset is available at https://huggingface.co/datasets/alexisty/SocialOmni

  9. arXiv:2603.10757  [pdf, ps, other]

    cs.CV

    CodePercept: Code-Grounded Visual STEM Perception for MLLMs

    Authors: Tongkun Guan, Zhibo Yang, Jianqiang Wan, Mingkun Yang, Zhengtao Guo, Zijian Hu, Ruilin Luo, Ruize Chen, Songtao Jiang, Peng Wang, Wei Shen, Junyang Lin, Xiaokang Yang

    Abstract: When MLLMs fail at Science, Technology, Engineering, and Mathematics (STEM) visual reasoning, a fundamental question arises: is it due to perceptual deficiencies or reasoning limitations? Through systematic scaling analysis that independently scales perception and reasoning components, we uncover a critical insight: scaling perception consistently outperforms scaling reasoning. This reveals percep…

    Submitted 11 March, 2026; originally announced March 2026.

    Comments: Accepted by CVPR2026

  10. arXiv:2603.03825  [pdf, ps, other]

    cs.CV cs.AI

    From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning

    Authors: Ruilin Luo, Chufan Shi, Yizhen Zhang, Cheng Yang, Songtao Jiang, Tongkun Guan, Ruizhe Chen, Ruihang Chu, Peng Wang, Mingkun Yang, Yujiu Yang, Junyang Lin, Zhibo Yang

    Abstract: The cold-start initialization stage plays a pivotal role in training Multimodal Large Reasoning Models (MLRMs), yet its mechanisms remain insufficiently understood. To analyze this stage, we introduce the Visual Attention Score (VAS), an attention-based metric that quantifies how much a model attends to visual tokens. We find that reasoning performance is strongly correlated with VAS (r=0.9616): m…

    Submitted 4 March, 2026; originally announced March 2026.

    Comments: ICLR 2026 Poster

  11. arXiv:2603.01719  [pdf, ps, other]

    stat.ML cs.LG

    Co-optimization for Adaptive Conformal Prediction

    Authors: Xiaoyi Su, Zhixin Zhou, Rui Luo

    Abstract: Conformal prediction (CP) provides finite-sample, distribution-free marginal coverage, but standard conformal regression intervals can be inefficient under heteroscedasticity and skewness. In particular, popular constructions such as conformalized quantile regression (CQR) often inherit a fixed notion of center and enforce equal-tailed errors, which can displace the interval away from high-density…

    Submitted 2 March, 2026; originally announced March 2026.

  12. arXiv:2602.13964  [pdf, ps, other]

    cs.CL

    HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam

    Authors: Weiqi Zhai, Zhihai Wang, Jinghang Wang, Boyu Yang, Xiaogang Li, Xander Xu, Bohan Wang, Peng Wang, Xingzhe Wu, Anfeng Li, Qiyuan Feng, Yuhao Zhou, Shoulin Han, Wenjie Luo, Yiyuan Li, Yaxuan Wang, Ruixian Luo, Guojie Lin, Peiyao Xiao, Chengliang Xu, Ben Wang, Zeyu Wang, Zichao Chen, Jianan Ye, Yijie Hu, et al. (10 additional authors not shown)

    Abstract: Humanity's Last Exam (HLE) has become a widely used benchmark for evaluating frontier large language models on challenging, multi-domain questions. However, community-led analyses have raised concerns that HLE contains a non-trivial number of noisy items, which can bias evaluation results and distort cross-model comparisons. To address this challenge, we introduce HLE-Verified, a verified and revi…

    Submitted 27 February, 2026; v1 submitted 14 February, 2026; originally announced February 2026.

    Comments: 14 pages, 10 figures

  13. arXiv:2602.12660  [pdf, ps, other]

    cs.CL

    Learning Ordinal Probabilistic Reward from Preferences

    Authors: Longze Chen, Lu Wang, Renke Shan, Ze Gong, Run Luo, Jiaming Li, Jing Luo, Qiyao Wang, Min Yang

    Abstract: Reward models are crucial for aligning large language models (LLMs) with human values and intentions. Existing approaches follow either Generative (GRMs) or Discriminative (DRMs) paradigms, yet both suffer from limitations: GRMs typically demand costly point-wise supervision, while DRMs produce uncalibrated relative scores that lack probabilistic interpretation. To address these challenges, we int…

    Submitted 2 March, 2026; v1 submitted 13 February, 2026; originally announced February 2026.

    Comments: 28 pages, 5 figures, ICLR 2026

  14. arXiv:2602.06718  [pdf, ps, other]

    cs.CR cs.AI

    GhostCite: A Large-Scale Analysis of Citation Validity in the Age of Large Language Models

    Authors: Zuyao Xu, Yuqi Qiu, Lu Sun, FaSheng Miao, Fubin Wu, Xinyi Wang, Xiang Li, Haozhe Lu, ZhengZe Zhang, Yuxin Hu, Jialu Li, Jin Luo, Feng Zhang, Rui Luo, Xinran Liu, Yingxian Li, Jiaji Liu

    Abstract: Citations provide the basis for trusting scientific claims; when they are invalid or fabricated, this trust collapses. With the advent of Large Language Models (LLMs), this risk has intensified: LLMs are increasingly used for academic writing, yet their tendency to fabricate citations ("ghost citations") poses a systemic threat to citation validity. To quantify this threat and inform mitigatio…

    Submitted 6 February, 2026; originally announced February 2026.

  15. arXiv:2602.04078  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Principles of Lipschitz continuity in neural networks

    Authors: Róisín Luo

    Abstract: Deep learning has achieved remarkable success across a wide range of domains, significantly expanding the frontiers of what is achievable in artificial intelligence. Yet, despite these advances, critical challenges remain -- most notably, ensuring robustness to small input perturbations and generalization to out-of-distribution data. These critical challenges underscore the need to understand the…

    Submitted 3 February, 2026; originally announced February 2026.

    Comments: Ph.D. Thesis

  16. arXiv:2602.02536  [pdf, ps, other]

    cs.LG cs.AI cs.CL cs.CV

    From Sparse Decisions to Dense Reasoning: A Multi-attribute Trajectory Paradigm for Multimodal Moderation

    Authors: Tianle Gu, Kexin Huang, Lingyu Li, Ruilin Luo, Shiyang Huang, Zongqi Wang, Yujiu Yang, Yan Teng, Yingchun Wang

    Abstract: Safety moderation is pivotal for identifying harmful content. Despite the success of textual safety moderation, its multimodal counterparts remain hindered by a dual sparsity of data and supervision. Conventional reliance on binary labels leads to shortcut learning, which obscures the intrinsic classification boundaries necessary for effective multimodal discrimination. Hence, we propose a novel le…

    Submitted 28 January, 2026; originally announced February 2026.

  17. WinFLoRA: Incentivizing Client-Adaptive Aggregation in Federated LoRA under Privacy Heterogeneity

    Authors: Mengsha Kou, Xiaoyu Xia, Ziqi Wang, Ibrahim Khalil, Runkun Luo, Jingwen Zhou, Minhui Xue

    Abstract: Large Language Models (LLMs) increasingly underpin intelligent web applications, from chatbots to search and recommendation, where efficient specialization is essential. Low-Rank Adaptation (LoRA) enables such adaptation with minimal overhead, while federated LoRA allows web service providers to fine-tune shared models without data sharing. However, in privacy-sensitive deployments, clients inject…

    Submitted 1 February, 2026; originally announced February 2026.

    Comments: 12 pages

  18. arXiv:2602.00564  [pdf, ps, other]

    cs.AI cs.CL

    Unmasking Reasoning Processes: A Process-aware Benchmark for Evaluating Structural Mathematical Reasoning in LLMs

    Authors: Xiang Zheng, Weiqi Zhai, Wei Wang, Boyu Yang, Wenbo Li, Ruixiang Luo, Haoxiang Sun, Yucheng Wang, Zhengze Li, Meng Wang, Yuetian Du, Guojie Lin, Yaxuan Wang, Xiaoxiao Xu, Yanhu Mo, Xuan Ren, Hu Wei, Bing Zhao

    Abstract: Recent large language models (LLMs) achieve near-saturation accuracy on many established mathematical reasoning benchmarks, raising concerns about their ability to diagnose genuine reasoning competence. This saturation largely stems from the dominance of template-based computation and shallow arithmetic decomposition in existing datasets, which underrepresent reasoning skills such as multi-constra…

    Submitted 26 February, 2026; v1 submitted 31 January, 2026; originally announced February 2026.

    Comments: 8 pages, and 3 figures

  19. arXiv:2601.11393  [pdf, ps, other]

    cs.CV

    Heterogeneous Uncertainty-Guided Composed Image Retrieval with Fine-Grained Probabilistic Learning

    Authors: Haomiao Tang, Jinpeng Wang, Minyi Zhao, Guanghao Meng, Ruisheng Luo, Long Chen, Shu-Tao Xia

    Abstract: Composed Image Retrieval (CIR) enables image search by combining a reference image with modification text. Intrinsic noise in CIR triplets incurs intrinsic uncertainty and threatens the model's robustness. Probabilistic learning approaches have shown promise in addressing such issues; however, they fall short for CIR due to their instance-level holistic modeling and homogeneous treatment of querie…

    Submitted 22 January, 2026; v1 submitted 16 January, 2026; originally announced January 2026.

    Comments: Accepted for publication and oral presentation at AAAI 2026

  20. arXiv:2601.09478  [pdf, ps, other]

    cs.IR cs.AI

    Bridging Semantic Understanding and Popularity Bias with LLMs

    Authors: Renqiang Luo, Dong Zhang, Yupeng Gao, Wen Shi, Mingliang Hou, Jiaying Liu, Zhe Wang, Shuo Yu

    Abstract: Semantic understanding of popularity bias is a crucial yet underexplored challenge in recommender systems, where popular items are often favored at the expense of niche content. Most existing debiasing methods treat the semantic understanding of popularity bias as a matter of diversity enhancement or long-tail coverage, neglecting the deeper semantic layer that embodies the causal origins of the b…

    Submitted 18 January, 2026; v1 submitted 14 January, 2026; originally announced January 2026.

    Comments: 10 pages, 4 figs, WWW 2026 accepted

  21. arXiv:2601.09469  [pdf, ps, other]

    cs.LG cs.AI

    FairGU: Fairness-aware Graph Unlearning in Social Networks

    Authors: Renqiang Luo, Yongshuai Yang, Huafei Huang, Qing Qing, Mingliang Hou, Ziqi Xu, Yi Yu, Jingjing Zhou, Feng Xia

    Abstract: Graph unlearning has emerged as a critical mechanism for supporting sustainable and privacy-preserving social networks, enabling models to remove the influence of deleted nodes and thereby better safeguard user information. However, we observe that existing graph unlearning techniques insufficiently protect sensitive attributes, often leading to degraded algorithmic fairness compared with traditio…

    Submitted 18 January, 2026; v1 submitted 14 January, 2026; originally announced January 2026.

    Comments: 9 pages, 2 figs, WWW 2026 accepted

  22. arXiv:2601.09394  [pdf, ps, other]

    cs.SI cs.AI

    FairGE: Fairness-Aware Graph Encoding in Incomplete Social Networks

    Authors: Renqiang Luo, Huafei Huang, Tao Tang, Jing Ren, Ziqi Xu, Mingliang Hou, Enyan Dai, Feng Xia

    Abstract: Graph Transformers (GTs) are increasingly applied to social network analysis, yet their deployment is often constrained by fairness concerns. This issue is particularly critical in incomplete social networks, where sensitive attributes are frequently missing due to privacy and ethical restrictions. Existing solutions commonly generate these incomplete attributes, which may introduce additional bia…

    Submitted 18 January, 2026; v1 submitted 14 January, 2026; originally announced January 2026.

    Comments: 12 pages, WWW 2026

  23. arXiv:2601.09250  [pdf, ps, other]

    cs.CL

    When to Invoke: Refining LLM Fairness with Toxicity Assessment

    Authors: Jing Ren, Bowen Li, Ziqi Xu, Renqiang Luo, Shuo Yu, Xin Ye, Haytham Fayek, Xiaodong Li, Feng Xia

    Abstract: Large Language Models (LLMs) are increasingly used for toxicity assessment in online moderation systems, where fairness across demographic groups is essential for equitable treatment. However, LLMs often produce inconsistent toxicity judgements for subtle expressions, particularly those involving implicit hate speech, revealing underlying biases that are difficult to correct through standard train…

    Submitted 14 January, 2026; originally announced January 2026.

    Comments: Accepted by Findings of WWW 2026

  24. arXiv:2601.08108  [pdf, ps, other]

    cs.CL cs.AI

    Debiasing Large Language Models via Adaptive Causal Prompting with Sketch-of-Thought

    Authors: Bowen Li, Ziqi Xu, Jing Ren, Renqiang Luo, Xikun Zhang, Xiuzhen Zhang, Yongli Ren, Feng Xia

    Abstract: Despite notable advancements in prompting methods for Large Language Models (LLMs), such as Chain-of-Thought (CoT), existing strategies still suffer from excessive token usage and limited generalisability across diverse reasoning tasks. To address these limitations, we propose an Adaptive Causal Prompting with Sketch-of-Thought (ACPS) framework, which leverages structural causal models to infer th…

    Submitted 12 January, 2026; originally announced January 2026.

    Comments: Accepted by Findings of EACL 2026

  25. arXiv:2601.03100  [pdf, ps, other]

    cs.CV cs.AI

    Text-Guided Layer Fusion Mitigates Hallucination in Multimodal LLMs

    Authors: Chenchen Lin, Sanbao Su, Rachel Luo, Yuxiao Chen, Yan Wang, Marco Pavone, Fei Miao

    Abstract: Multimodal large language models (MLLMs) typically rely on a single late-layer feature from a frozen vision encoder, leaving the encoder's rich hierarchy of visual cues under-utilized. MLLMs still suffer from visually ungrounded hallucinations, often relying on language priors rather than image evidence. While many prior mitigation strategies operate on the text side, they leave the visual represe…

    Submitted 17 February, 2026; v1 submitted 6 January, 2026; originally announced January 2026.

  26. arXiv:2601.02769  [pdf, ps, other]

    stat.ML cs.LG

    Fast Conformal Prediction using Conditional Interquantile Intervals

    Authors: Naixin Guo, Rui Luo, Zhixin Zhou

    Abstract: We introduce Conformal Interquantile Regression (CIR), a conformal regression method that efficiently constructs near-minimal prediction intervals with guaranteed coverage. CIR leverages black-box machine learning models to estimate outcome distributions through interquantile ranges, transforming these estimates into compact prediction intervals while achieving approximate conditional coverage. We…

    Submitted 6 January, 2026; originally announced January 2026.

  27. arXiv:2512.05110  [pdf, ps, other]

    cs.CV cs.AI cs.GR

    ShadowDraw: From Any Object to Shadow-Drawing Compositional Art

    Authors: Rundong Luo, Noah Snavely, Wei-Chiu Ma

    Abstract: We introduce ShadowDraw, a framework that transforms ordinary 3D objects into shadow-drawing compositional art. Given a 3D object, our system predicts scene parameters, including object pose and lighting, together with a partial line drawing, such that the cast shadow completes the drawing into a recognizable image. To this end, we optimize scene configurations to reveal meaningful shadows, employ…

    Submitted 4 December, 2025; originally announced December 2025.

    Comments: Project page: https://red-fairy.github.io/ShadowDraw/

  28. arXiv:2512.02690  [pdf, ps, other]

    cs.GT math.OC

    Monotone Near-Zero-Sum Games: A Generalization of Convex-Concave Minimax

    Authors: Ruichen Luo, Sebastian U. Stich, Krishnendu Chatterjee

    Abstract: Zero-sum and non-zero-sum (aka general-sum) games are relevant in a wide range of applications. While general non-zero-sum games are computationally hard, researchers focus on the special class of monotone games for gradient-based algorithms. However, there is a substantial gap between the gradient complexity of monotone zero-sum and monotone general-sum games. Moreover, in many practical scenario…

    Submitted 2 December, 2025; originally announced December 2025.

  29. arXiv:2511.21631  [pdf, ps, other]

    cs.CV cs.AI

    Qwen3-VL Technical Report

    Authors: Shuai Bai, Yuxuan Cai, Ruizhe Chen, Keqin Chen, Xionghui Chen, Zesen Cheng, Lianghao Deng, Wei Ding, Chang Gao, Chunjiang Ge, Wenbin Ge, Zhifang Guo, Qidong Huang, Jie Huang, Fei Huang, Binyuan Hui, Shutong Jiang, Zhaohai Li, Mingsheng Li, Mei Li, Kaixin Li, Zicheng Lin, Junyang Lin, Xuejing Liu, Jiawei Liu, et al. (39 additional authors not shown)

    Abstract: We introduce Qwen3-VL, the most capable vision-language model in the Qwen series to date, achieving superior performance across a broad range of multimodal benchmarks. It natively supports interleaved contexts of up to 256K tokens, seamlessly integrating text, images, and video. The model family includes both dense (2B/4B/8B/32B) and mixture-of-experts (30B-A3B/235B-A22B) variants to accommodate d…

    Submitted 27 November, 2025; v1 submitted 26 November, 2025; originally announced November 2025.

    Comments: 42 pages

  30. arXiv:2511.11793  [pdf, ps, other]

    cs.CL

    MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

    Authors: MiroMind Team, Song Bai, Lidong Bing, Carson Chen, Guanzheng Chen, Yuntao Chen, Zhe Chen, Ziyi Chen, Jifeng Dai, Xuan Dong, Wenhan Dou, Yue Deng, Yunjie Fu, Junqi Ge, Chenxia Han, Tammy Huang, Zhenhang Huang, Jerry Jiao, Shilei Jiang, Tianyu Jiao, Xiaoqi Jian, Lei Lei, Ruilin Li, Ryan Luo, Tiantong Li, et al. (30 additional authors not shown)

    Abstract: We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that only scale up model size or context length, MiroThinker explores interaction scaling at the model level, systematically training the model to handle deeper and more frequent agent-environment interactions as a third dimension of p…

    Submitted 18 November, 2025; v1 submitted 14 November, 2025; originally announced November 2025.

    Comments: Technical Report

  31. arXiv:2511.04716  [pdf, ps, other]

    cs.CR cs.AI

    P-MIA: A Profiled-Based Membership Inference Attack on Cognitive Diagnosis Models

    Authors: Mingliang Hou, Yinuo Wang, Teng Guo, Zitao Liu, Wenzhou Dou, Jiaqi Zheng, Renqiang Luo, Mi Tian, Weiqi Luo

    Abstract: Cognitive diagnosis models (CDMs) are pivotal for creating fine-grained learner profiles in modern intelligent education platforms. However, these models are trained on sensitive student data, raising significant privacy concerns. While membership inference attacks (MIA) have been studied in various domains, their application to CDMs remains a critical research gap, leaving their privacy risks unq…

    Submitted 5 November, 2025; originally announced November 2025.

  32. arXiv:2511.03966  [pdf, ps, other]

    cs.LG

    PrivacyCD: Hierarchical Unlearning for Protecting Student Privacy in Cognitive Diagnosis

    Authors: Mingliang Hou, Yinuo Wang, Teng Guo, Zitao Liu, Wenzhou Dou, Jiaqi Zheng, Renqiang Luo, Mi Tian, Weiqi Luo

    Abstract: The need to remove specific student data from cognitive diagnosis (CD) models has become a pressing requirement, driven by users' growing assertion of their "right to be forgotten". However, existing CD models are largely designed without privacy considerations and lack effective data unlearning mechanisms. Directly applying general purpose unlearning algorithms is suboptimal, as they struggle to…

    Submitted 5 November, 2025; originally announced November 2025.

  33. arXiv:2511.00835  [pdf, ps, other]

    cs.GT

    Optimal Allocations under Strongly Pigou-Dalton Criteria: Hidden Layer Structure & Efficient Combinatorial Approach

    Authors: Taikun Zhu, Kai Jin, Ruixi Luo, Song Cao

    Abstract: We investigate optimal social welfare allocations of $m$ items to $n$ agents with binary additive or submodular valuations. For binary additive valuations, we prove that the set of optimal allocations coincides with the set of so-called stable allocations, as long as the employed criterion for evaluating social welfare is strongly Pigou-Dalton (SPD) and symmetric. Many common criteria are S…

    Submitted 6 January, 2026; v1 submitted 2 November, 2025; originally announced November 2025.

  34. arXiv:2510.26160  [pdf, ps, other]

    cs.CV

    CRAG-MM: Multi-modal Multi-turn Comprehensive RAG Benchmark

    Authors: Jiaqi Wang, Xiao Yang, Kai Sun, Parth Suresh, Sanat Sharma, Adam Czyzewski, Derek Andersen, Surya Appini, Arkav Banerjee, Sajal Choudhary, Shervin Ghasemlou, Ziqiang Guan, Akil Iyer, Haidar Khan, Lingkun Kong, Roy Luo, Tiffany Ma, Zhen Qiao, David Tran, Wenfang Xu, Skyler Yeatman, Chen Zhou, Gunveer Gujral, Yinglong Xia, Shane Moon, et al. (16 additional authors not shown)

    Abstract: Wearable devices such as smart glasses are transforming the way people interact with their surroundings, enabling users to seek information regarding entities in their view. Multi-Modal Retrieval-Augmented Generation (MM-RAG) plays a key role in supporting such questions, yet there is still no comprehensive benchmark for this task, especially regarding wearables scenarios. To fill this gap, we pre…

    Submitted 30 October, 2025; originally announced October 2025.

  35. arXiv:2510.26020  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    PORTool: Tool-Use LLM Training with Rewarded Tree

    Authors: Feijie Wu, Weiwu Zhu, Yuxiang Zhang, Soumya Chatterjee, Jiarong Zhu, Fan Mo, Rodin Luo, Jing Gao

    Abstract: Current tool-use large language models (LLMs) are trained on static datasets, enabling them to interact with external tools and perform multi-step, tool-integrated reasoning, which produces tool-call trajectories. However, these models imitate how a query is resolved in a generic tool-call routine, thereby failing to explore possible solutions and demonstrating limited performance in an evolved, d…

    Submitted 29 October, 2025; originally announced October 2025.

  36. arXiv:2510.13721  [pdf, ps, other]

    cs.CL cs.AI cs.CV cs.MM

    NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching

    Authors: Run Luo, Xiaobo Xia, Lu Wang, Longze Chen, Renke Shan, Jing Luo, Min Yang, Tat-Seng Chua

    Abstract: Next-generation multimodal foundation models capable of any-to-any cross-modal generation and multi-turn interaction will serve as core components of artificial general intelligence systems, playing a pivotal role in human-machine interaction. However, most existing multimodal models remain constrained by autoregressive architectures, whose inherent limitations prevent a balanced integration of un…

    Submitted 15 October, 2025; v1 submitted 15 October, 2025; originally announced October 2025.

  37. arXiv:2510.08964  [pdf, ps, other]

    cs.CV cs.CL

    Unleashing Perception-Time Scaling to Multimodal Reasoning Models

    Authors: Yifan Li, Zhenghao Chen, Ziheng Wu, Kun Zhou, Ruipu Luo, Can Zhang, Zhentao He, Yufei Zhan, Wayne Xin Zhao, Minghui Qiu

    Abstract: Recent advances in inference-time scaling, particularly those leveraging reinforcement learning with verifiable rewards, have substantially enhanced the reasoning capabilities of Large Vision-Language Models (LVLMs). Inspired by this success, similar strategies have been applied to multimodal reasoning, yet their impact on visual perception remains unclear. To investigate this gap, we introduce Di…

    Submitted 9 October, 2025; originally announced October 2025.

  38. arXiv:2509.25755  [pdf, ps, other]

    cs.IR cs.SI

    HiFIRec Towards High-Frequency yet Low-Intention Behaviors for Multi-Behavior Recommendation

    Authors: Ruiqi Luo, Ran Jin, Kaixi Hu, Xiaohui Tao, Lin Li

    Abstract: Multi-behavior recommendation leverages multiple types of user-item interactions to address data sparsity and cold-start issues, providing personalized services in domains such as healthcare and e-commerce. Most existing methods utilize graph neural networks to model user intention in a unified manner, which inadequately considers the heterogeneity across different behaviors. Especially, high frequency…

    Submitted 1 December, 2025; v1 submitted 30 September, 2025; originally announced September 2025.

  39. arXiv:2509.22638  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Language Models Can Learn from Verbal Feedback Without Scalar Rewards

    Authors: Renjie Luo, Zichen Liu, Xiangyan Liu, Chao Du, Min Lin, Wenhu Chen, Wei Lu, Tianyu Pang

    Abstract: LLMs are often trained with RL from human or AI feedback, yet such methods typically compress nuanced feedback into scalar rewards, discarding much of their richness and inducing scale imbalance. We propose treating verbal feedback as a conditioning signal. Inspired by language priors in text-to-image generation, which enable novel outputs from unseen prompts, we introduce the feedback-conditional…

    Submitted 26 September, 2025; originally announced September 2025.

  40. arXiv:2509.18776  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    AECBench: A Hierarchical Benchmark for Knowledge Evaluation of Large Language Models in the AEC Field

    Authors: Chen Liang, Zhaoqi Huang, Haofen Wang, Fu Chai, Chunying Yu, Huanhuan Wei, Zhengjie Liu, Yanpeng Li, Hongjun Wang, Ruifeng Luo, Xianzhong Zhao

    Abstract: Large language models (LLMs), as a novel information technology, are seeing increasing adoption in the Architecture, Engineering, and Construction (AEC) field. They have shown their potential to streamline processes throughout the building lifecycle. However, the robustness and reliability of LLMs in such a specialized and safety-critical domain remain to be evaluated. To address this challenge, t…

    Submitted 13 February, 2026; v1 submitted 23 September, 2025; originally announced September 2025.

    Comments: Accepted by Advanced Engineering Informatics. Code and data available at: https://github.com/ArchiAI-LAB/AECBench

    Journal ref: Advanced Engineering Informatics, Vol. 71, Article 104314 (2026)

  41. arXiv:2509.17361  [pdf

    cs.IR cs.AI

    SeqUDA-Rec: Sequential User Behavior Enhanced Recommendation via Global Unsupervised Data Augmentation for Personalized Content Marketing

    Authors: Ruihan Luo, Xuanjing Chen, Ziyang Ding

    Abstract: Personalized content marketing has become a crucial strategy for digital platforms, aiming to deliver tailored advertisements and recommendations that match user preferences. Traditional recommendation systems often suffer from two limitations: (1) reliance on limited supervised signals derived from explicit user feedback, and (2) vulnerability to noisy or unintentional interactions. To address th…

    Submitted 22 September, 2025; originally announced September 2025.

  42. arXiv:2509.09928  [pdf

    cs.CE

    Fraud detection and risk assessment of online payment transactions on e-commerce platforms based on LLM and GCN frameworks

    Authors: RuiHan Luo, Nanxi Wang, Xiaotong Zhu

    Abstract: With the rapid growth of e-commerce, online payment fraud has become increasingly complex, posing serious threats to financial security and consumer trust. Traditional detection methods often struggle to capture the intricate relational structures inherent in transactional data. This study presents a novel fraud detection framework that combines Large Language Models (LLM) with Graph Convolutional…

    Submitted 11 September, 2025; originally announced September 2025.

  43. arXiv:2509.08697  [pdf, ps, other

    cs.LG cs.AI

    Reshaping the Forward-Forward Algorithm with a Similarity-Based Objective

    Authors: James Gong, Raymond Luo, Emma Wang, Leon Ge, Bruce Li, Felix Marattukalam, Waleed Abdulla

    Abstract: Backpropagation is the pivotal algorithm underpinning the success of artificial neural networks, yet it has critical limitations such as biologically implausible backward locking and global error propagation. To circumvent these constraints, the Forward-Forward algorithm was proposed as a more biologically plausible method that replaces the backward pass with an additional forward pass. Despite th…

    Submitted 29 August, 2025; originally announced September 2025.

    Comments: 6 pages

  44. arXiv:2509.02522  [pdf, ps, other

    cs.CL cs.LG

    Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

    Authors: Jiaming Li, Longze Chen, Ze Gong, Yukun Chen, Lu Wang, Wanwei He, Run Luo, Min Yang

    Abstract: Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) have empowered large language models (LLMs) to tackle challenging reasoning tasks such as mathematics and programming. Despite its promise, the RLVR paradigm poses significant challenges, as existing methods often suffer from sparse reward signals and unstable policy gradient updates, inherent to RL-based approaches. To addre…

    Submitted 16 February, 2026; v1 submitted 2 September, 2025; originally announced September 2025.

  45. arXiv:2508.16910  [pdf, ps, other

    cs.CL

    Unbiased Reasoning for Knowledge-Intensive Tasks in Large Language Models via Conditional Front-Door Adjustment

    Authors: Bo Zhao, Yinghao Zhang, Ziqi Xu, Yongli Ren, Xiuzhen Zhang, Renqiang Luo, Zaiwen Feng, Feng Xia

    Abstract: Large Language Models (LLMs) have shown impressive capabilities in natural language processing but still struggle to perform well on knowledge-intensive tasks that require deep reasoning and the integration of external knowledge. Although methods such as Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) have been proposed to enhance LLMs with external knowledge, they still suffer fro…

    Submitted 23 August, 2025; originally announced August 2025.

    Comments: This paper has been accepted to the 34th ACM International Conference on Information and Knowledge Management (CIKM 2025), Full Research Paper

  46. arXiv:2508.15499  [pdf, ps, other

    cs.LG

    Let's Grow an Unbiased Community: Guiding the Fairness of Graphs via New Links

    Authors: Jiahua Lu, Huaxiao Liu, Shuotong Bai, Junjie Xu, Renqiang Luo, Enyan Dai

    Abstract: Graph Neural Networks (GNNs) have achieved remarkable success across diverse applications. However, due to the biases in the graph structures, graph neural networks face significant challenges in fairness. Although the original user graph structure is generally biased, it is promising to guide these existing structures toward unbiased ones by introducing new links. The fairness guidance via new li…

    Submitted 2 November, 2025; v1 submitted 21 August, 2025; originally announced August 2025.

  47. arXiv:2508.14879  [pdf, ps, other

    cs.GR cs.CV

    MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

    Authors: Bingquan Dai, Li Ray Luo, Qihong Tang, Jie Wang, Xinyu Lian, Hao Xu, Minghan Qin, Xudong Xu, Bo Dai, Haoqian Wang, Zhaoyang Lyu, Jiangmiao Pang

    Abstract: Reconstructing 3D objects into editable programs is pivotal for applications like reverse engineering and shape editing. However, existing methods often rely on limited domain-specific languages (DSLs) and small-scale datasets, restricting their ability to model complex geometries and structures. To address these challenges, we introduce MeshCoder, a novel framework that reconstructs complex 3D ob…

    Submitted 22 August, 2025; v1 submitted 20 August, 2025; originally announced August 2025.

  48. arXiv:2508.06069  [pdf, ps, other

    stat.ML cs.LG

    Lightweight Auto-bidding based on Traffic Prediction in Live Advertising

    Authors: Bo Yang, Ruixuan Luo, Junqi Jin, Han Zhu

    Abstract: Internet live streaming is widely used in online entertainment and e-commerce, where live advertising is an important marketing tool for anchors. An advertising campaign hopes to maximize the effect (such as conversions) under constraints (such as budget and cost-per-click). The mainstream control of campaigns is auto-bidding, where the performance depends on the decision of the bidding algorithm…

    Submitted 8 August, 2025; originally announced August 2025.

  49. arXiv:2507.20511  [pdf, ps, other

    cs.CV

    Beyond Class Tokens: LLM-guided Dominant Property Mining for Few-shot Classification

    Authors: Wei Zhuo, Runjie Luo, Wufeng Xue, Linlin Shen

    Abstract: Few-shot Learning (FSL), which endeavors to develop the generalization ability for recognizing novel classes using only a few images, faces significant challenges due to data scarcity. Recent CLIP-like methods based on contrastive language-image pre-training mitigate the issue by leveraging textual representation of the class name for unseen image discovery. Despite the achieved success, simply alig…

    Submitted 29 July, 2025; v1 submitted 28 July, 2025; originally announced July 2025.

    Comments: 11 pages, 7 figures

  50. Graph Learning

    Authors: Feng Xia, Ciyuan Peng, Jing Ren, Falih Gozi Febrinanto, Renqiang Luo, Vidya Saikrishna, Shuo Yu, Xiangjie Kong

    Abstract: Graph learning has rapidly evolved into a critical subfield of machine learning and artificial intelligence (AI). Its development began with early graph-theoretic methods, gaining significant momentum with the advent of graph neural networks (GNNs). Over the past decade, progress in scalable architectures, dynamic graph modeling, multimodal learning, generative AI, explainable AI (XAI), and respon…

    Submitted 7 November, 2025; v1 submitted 7 July, 2025; originally announced July 2025.

    Comments: 185 pages

    MSC Class: 68T09; 68R10

    ACM Class: I.2.6; G.2.2; E.1

    Journal ref: Foundations and Trends in Signal Processing, Vol. 19, No. 4, pp 371-551. 2025