Showing 1–50 of 87 results for author: Nan, Y

  1. arXiv:2604.13611  [pdf, ps, other]

    cs.SE

    V2E: Validating Smart Contract Vulnerabilities through Profit-driven Exploit Generation and Execution

    Authors: Jingwen Zhang, Yuhong Nan, Kaiwen Ning, Mingxi Ye, Wei Li, Yuming Xiao, Yuming Feng, Weizhe Zhang, Zibin Zheng

    Abstract: Smart contracts are a critical component of blockchain systems. Due to the large amount of digital assets carried by smart contracts, their security is of critical importance. Although numerous tools have been developed for detecting smart contract vulnerability, their effectiveness remains limited, particularly due to the high false positives included in the reported results. Therefore, developer…

    Submitted 15 April, 2026; originally announced April 2026.

    Comments: Accepted by FSE 2026

  2. arXiv:2604.11686  [pdf, ps, other]

    cs.IR

    EA-Agent: A Structured Multi-Step Reasoning Agent for Entity Alignment

    Authors: Yixuan Nan, Xixun Lin, Yanmin Shang, Ge Zhang, Zheng Fang, Fang Fang, Yanan Cao

    Abstract: Entity alignment (EA) aims to identify entities across different knowledge graphs (KGs) that refer to the same real-world object and plays a critical role in knowledge fusion and integration. Traditional EA methods mainly rely on knowledge representation learning, but their performance is often limited under noisy or sparsely supervised scenarios. Recently, large language models (LLMs) have been i…

    Submitted 13 April, 2026; originally announced April 2026.

    Comments: ACL 2026, Main Conference

  3. arXiv:2603.29640  [pdf, ps, other]

    cs.AI

    ASI-Evolve: AI Accelerates AI

    Authors: Weixian Xu, Tiantian Mi, Yixiu Liu, Yang Nan, Zhimeng Zhou, Lyumanshan Ye, Lin Zhang, Yu Qiao, Pengfei Liu

    Abstract: Can AI accelerate the development of AI itself? While recent agentic systems have shown strong performance on well-scoped tasks with rapid feedback, it remains unclear whether they can tackle the costly, long-horizon, and weakly supervised research loops that drive real AI progress. We present ASI-Evolve, an agentic framework for AI-for-AI research that closes this loop through a learn-design-expe…

    Submitted 31 March, 2026; originally announced March 2026.

    Comments: 19 pages, 6 figures, 6 tables. Code available at https://github.com/GAIR-NLP/ASI-Evolve

  4. arXiv:2603.12614  [pdf, ps, other]

    cs.SE cs.CR

    ChainFuzzer: Greybox Fuzzing for Workflow-Level Multi-Tool Vulnerabilities in LLM Agents

    Authors: Jiangrong Wu, Zitong Yao, Yuhong Nan, Zibin Zheng

    Abstract: Tool-augmented LLM agents increasingly rely on multi-step, multi-tool workflows to complete real tasks. This design expands the attack surface, because data produced by one tool can be persisted and later reused as input to another tool, enabling exploitable source-to-sink dataflows that only emerge through tool composition. We study this risk as multi-tool vulnerabilities in LLM agents, and show…

    Submitted 12 March, 2026; originally announced March 2026.

  5. arXiv:2603.07557  [pdf, ps, other]

    cs.SE

    AgentRaft: Automated Detection of Data Over-Exposure in LLM Agents

    Authors: Yixi Lin, Jiangrong Wu, Yuhong Nan, Xueqiang Wang, Xinyuan Zhang, Zibin Zheng

    Abstract: The rapid integration of Large Language Model (LLM) agents into autonomous task execution has introduced significant privacy concerns within cross-tool data flows. In this paper, we systematically investigate and define a novel risk termed Data Over-Exposure (DOE) in LLM Agent, where an Agent inadvertently transmits sensitive data beyond the scope of user intent and functional necessity. We identi…

    Submitted 8 March, 2026; originally announced March 2026.

    Comments: 26 pages, 7 figures

  6. arXiv:2602.16763  [pdf, ps, other]

    cs.AI

    When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation

    Authors: Mubashara Akhtar, Anka Reuel, Prajna Soni, Sanchit Ahuja, Pawan Sasanka Ammanamanchi, Ruchit Rawal, Vilém Zouhar, Srishti Yadav, Chenxi Whitehouse, Dayeon Ki, Jennifer Mickel, Leshem Choshen, Marek Šuppa, Jan Batzner, Jenny Chim, Jeba Sania, Yanan Long, Hossein A. Rahmani, Christina Knight, Yiyang Nan, Jyoutir Raj, Yu Fan, Shubham Singh, Subramanyam Sahoo, Eliya Habba, et al. (12 additional authors not shown)

    Abstract: Artificial Intelligence (AI) benchmarks play a central role in measuring progress in model development and guiding deployment decisions. However, many benchmarks quickly become saturated, meaning that they can no longer differentiate between the best-performing models, diminishing their long-term value. In this study, we analyze benchmark saturation across 60 Large Language Model (LLM) benchmarks…

    Submitted 18 February, 2026; originally announced February 2026.

  7. DIVER: A Robust Text-to-SQL System with Dynamic Interactive Value Linking and Evidence Reasoning

    Authors: Yafeng Nan, Haifeng Sun, Zirui Zhuang, Qi Qi, Guojun Chu, Jianxin Liao, Dan Pei, Jingyu Wang

    Abstract: In the era of large language models, Text-to-SQL, as a natural language interface for databases, is playing an increasingly important role. The SOTA Text-to-SQL models have achieved impressive accuracy, but their performance critically relies on expert-written evidence, which typically clarifies schema and value linking that existing models struggle to identify. Such limitations stem from the ambi…

    Submitted 12 February, 2026; originally announced February 2026.

    Comments: Accepted by SIGMOD 2026

  8. arXiv:2602.02427  [pdf, ps, other]

    cs.LG

    Embedding Perturbation may Better Reflect the Uncertainty in LLM Reasoning

    Authors: Qihao Wen, Jiahao Wang, Yang Nan, Pengfei He, Ravi Tandon, Han Xu

    Abstract: Large Language Models (LLMs) have achieved significant breakthroughs across diverse domains; however, they can still produce unreliable or misleading outputs. For responsible LLM application, Uncertainty Quantification (UQ) techniques are used to estimate a model's uncertainty about its outputs, indicating the likelihood that those outputs may be problematic. For LLM reasoning tasks, it is essenti…

    Submitted 2 February, 2026; originally announced February 2026.

  9. Is My RPC Response Reliable? Detecting RPC Bugs in Ethereum Blockchain Client under Context

    Authors: Zhijie Zhong, Yuhong Nan, Mingxi Ye, Qing Xue, Jiashui Wang, Xinlei Ying, Long Liu, Zibin Zheng

    Abstract: Blockchain clients are fundamental software for running blockchain nodes. They provide users with various RPC (Remote Procedure Call) interfaces to interact with the blockchain. These RPC methods are expected to follow the same specification across different blockchain nodes, providing users with seamless interaction. However, there have been continuous reports on various RPC bugs that can cause u…

    Submitted 29 January, 2026; originally announced January 2026.

    Comments: The paper is accepted by ICSE 2026

  10. arXiv:2601.09151  [pdf, ps, other]

    cs.LG

    Interpretable Probability Estimation with LLMs via Shapley Reconstruction

    Authors: Yang Nan, Qihao Wen, Jiahao Wang, Pengfei He, Ravi Tandon, Yong Ge, Han Xu

    Abstract: Large Language Models (LLMs) demonstrate potential to estimate the probability of uncertain events, by leveraging their extensive knowledge and reasoning capabilities. This ability can be applied to support intelligent decision-making across diverse fields, such as financial forecasting and preventive healthcare. However, directly prompting LLMs for probability estimation faces significant challen…

    Submitted 13 January, 2026; originally announced January 2026.

  11. arXiv:2510.18927  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

    Authors: Zhiheng Xi, Xin Guo, Yang Nan, Enyu Zhou, Junrui Shen, Wenxiang Chen, Jiaqi Liu, Jixuan Huang, Zhihao Zhang, Honglin Guo, Xun Deng, Zhikai Lei, Miao Zheng, Guoteng Wang, Shuo Zhang, Peng Sun, Rui Zheng, Hang Yan, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Reinforcement learning (RL) has recently become the core paradigm for aligning and strengthening large language models (LLMs). Yet, applying RL in off-policy settings--where stale data from past policies are used for training--improves sample efficiency, but remains challenging: policy entropy declines sharply, optimization often becomes unstable and may even collapse. Through theoretical and empi…

    Submitted 21 October, 2025; originally announced October 2025.

    Comments: Preprint

  12. arXiv:2510.18342  [pdf, ps, other]

    cs.AI

    ShortcutBreaker: Low-Rank Noisy Bottleneck and Frequency Filtering Block for Multi-Class Unsupervised Anomaly Detection

    Authors: Peng Tang, Xiaobin Hu, Tingcheng Li, Yang Nan, Tobias Lasser, Hongwei Bran Li

    Abstract: Multi-class unsupervised anomaly detection (MUAD) has garnered growing research interest, as it seeks to develop a unified model for anomaly detection across multiple classes, i.e., eliminating the need to train separate models for distinct objects and thereby saving substantial computational resources. Under the MUAD setting, while advanced Transformer-based architectures have brought significant…

    Submitted 27 March, 2026; v1 submitted 21 October, 2025; originally announced October 2025.

    Comments: Under Review

  13. arXiv:2509.23679  [pdf, ps, other]

    cs.SE

    Satellite: Detecting and Analyzing Smart Contract Vulnerabilities caused by Subcontract Misuse

    Authors: Zeqin Liao, Yuhong Nan, Zixu Gao, Henglong Liang, Sicheng Hao, Jiajing Wu, Zibin Zheng

    Abstract: Developers of smart contracts pervasively reuse subcontracts to improve development efficiency. Like any program language, such subcontract reuse may unexpectedly include, or introduce vulnerabilities to the end-point smart contract. Unfortunately, automatically detecting such issues poses several unique challenges. Particularly, in most cases, smart contracts are compiled as bytecode, whose class…

    Submitted 28 September, 2025; originally announced September 2025.

    Comments: This is the author version of the article accepted for publication in IEEE Transactions on Software Engineering. The final version is available at 10.1109/TSE.2025.3613470

  14. Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

    Authors: Junjie Ye, Yuming Yang, Yang Nan, Shuo Li, Qi Zhang, Tao Gui, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan

    Abstract: Large language models (LLMs) acquire substantial world knowledge during pre-training, which is further shaped by post-training techniques such as supervised fine-tuning (SFT). However, the impact of SFT on a model's knowledge remains underexplored, limiting our ability to control knowledge change behavior in fine-tuned models. To address this gap, we evaluate closed-book question answering (CBQA)…

    Submitted 10 February, 2026; v1 submitted 20 September, 2025; originally announced September 2025.

    Comments: Accepted by EMNLP 2025 Main Conference. Codes for parameter restoration are available at https://github.com/UmeanNever/ParamRestore

  15. arXiv:2509.09730  [pdf, ps, other]

    cs.CV cs.AI

    MITS: A Large-Scale Multimodal Benchmark Dataset for Intelligent Traffic Surveillance

    Authors: Kaikai Zhao, Zhaoxiang Liu, Peng Wang, Xin Wang, Zhicheng Ma, Yajun Xu, Wenjing Zhang, Yibing Nan, Kai Wang, Shiguo Lian

    Abstract: General-domain large multimodal models (LMMs) have achieved significant advances in various image-text tasks. However, their performance in the Intelligent Traffic Surveillance (ITS) domain remains limited due to the absence of dedicated multimodal datasets. To address this gap, we introduce MITS (Multimodal Intelligent Traffic Surveillance), the first large-scale multimodal benchmark dataset spec…

    Submitted 10 September, 2025; originally announced September 2025.

    Comments: Accepted by Image and Vision Computing

  16. arXiv:2509.04464  [pdf, ps, other]

    cs.CL cs.AI

    Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?

    Authors: Yang Nan, Pengfei He, Ravi Tandon, Han Xu

    Abstract: Large language models (LLMs) have delivered significant breakthroughs across diverse domains but can still produce unreliable or misleading outputs, posing critical challenges for real-world applications. While many recent studies focus on quantifying model uncertainty, relatively little work has been devoted to diagnosing the source of uncertainty. In this study, we show that, when an LL…

    Submitted 28 August, 2025; originally announced September 2025.

    Comments: Proceedings of The 2025 Conference on Empirical Methods in Natural Language Processing (Findings)

  17. arXiv:2508.20559  [pdf, ps, other]

    cs.CL cs.IR

    Leveraging Generative Models for Real-Time Query-Driven Text Summarization in Large-Scale Web Search

    Authors: Zeyu Xiong, Yixuan Nan, Li Gao, Hengzhu Tang, Shuaiqiang Wang, Junfeng Wang, Dawei Yin

    Abstract: In the dynamic landscape of large-scale web search, Query-Driven Text Summarization (QDTS) aims to generate concise and informative summaries from textual documents based on a given query, which is essential for improving user engagement and facilitating rapid decision-making. Traditional extractive summarization models, based primarily on ranking candidate summary segments, have been the dominant…

    Submitted 28 August, 2025; originally announced August 2025.

    Comments: CIKM'25

  18. arXiv:2508.11464  [pdf, ps, other]

    cs.CV

    Data-Driven Deepfake Image Detection Method -- The 2024 Global Deepfake Image Detection Challenge

    Authors: Xiaoya Zhu, Yibing Nan, Shiguo Lian

    Abstract: With the rapid development of technology in the field of AI, deepfake technology has emerged as a double-edged sword. It has not only created a large amount of AI-generated content but also posed unprecedented challenges to digital security. The task of the competition is to determine whether a face image is a Deepfake image and output its probability score of being a Deepfake image. In the image…

    Submitted 15 August, 2025; originally announced August 2025.

  19. arXiv:2507.22434  [pdf, ps, other]

    cs.LG

    RANA: Robust Active Learning for Noisy Network Alignment

    Authors: Yixuan Nan, Xixun Lin, Yanmin Shang, Zhuofan Li, Can Zhao, Yanan Cao

    Abstract: Network alignment has attracted widespread attention in various fields. However, most existing works mainly focus on the problem of label sparsity, while overlooking the issue of noise in network alignment, which can substantially undermine model performance. Such noise mainly includes structural noise from noisy edges and labeling noise caused by human-induced and process-driven errors. To addres…

    Submitted 7 August, 2025; v1 submitted 30 July, 2025; originally announced July 2025.

    Comments: Accepted by ECAI 2025

  20. arXiv:2507.18267  [pdf, ps, other]

    cs.SE

    An Empirical Study on Embodied Artificial Intelligence Robot (EAIR) Software Bugs

    Authors: Zeqin Liao, Zibin Zheng, Peifan Reng, Henglong Liang, Zixu Gao, Zhixiang Chen, Wei Li, Yuhong Nan

    Abstract: Embodied Artificial Intelligence Robots (EAIR) is an emerging and rapidly evolving technological domain. Ensuring their program correctness is fundamental to their successful deployment. However, a general and in-depth understanding of EAIR system bugs remains lacking, which hinders the development of practices and techniques to tackle EAIR system bugs. To bridge this gap, we conducted the first…

    Submitted 24 July, 2025; originally announced July 2025.

  21. arXiv:2507.18074  [pdf, ps, other]

    cs.AI

    AlphaGo Moment for Model Architecture Discovery

    Authors: Yixiu Liu, Yang Nan, Weixian Xu, Xiangkun Hu, Lyumanshan Ye, Zhen Qin, Pengfei Liu

    Abstract: While AI systems demonstrate exponentially improving capabilities, the pace of AI research itself remains linearly bounded by human cognitive capacity, creating an increasingly severe development bottleneck. We present ASI-Arch, the first demonstration of Artificial Superintelligence for AI research (ASI4AI) in the critical domain of neural architecture discovery--a fully autonomous system that sh…

    Submitted 23 July, 2025; originally announced July 2025.

  22. arXiv:2507.15759  [pdf, ps, other]

    cs.CL

    Interaction as Intelligence: Deep Research With Human-AI Partnership

    Authors: Lyumanshan Ye, Xiaojie Cai, Xinkai Wang, Junfei Wang, Xiangkun Hu, Jiadi Su, Yang Nan, Sihan Wang, Bohan Zhang, Xiaoze Fan, Jinbin Luo, Yuxiang Zheng, Tianze Xu, Dayuan Fu, Yunze Wu, Pengrui Lu, Zengzhi Wang, Yiwei Qin, Zhen Huang, Yan Ma, Zhulin Hu, Haoyang Zou, Tiantian Mi, Yixin Ye, Ethan Chern, et al. (1 additional author not shown)

    Abstract: This paper introduces "Interaction as Intelligence" research series, presenting a reconceptualization of human-AI relationships in deep research tasks. Traditional approaches treat interaction merely as an interface for accessing AI capabilities-a conduit between human intent and machine output. We propose that interaction itself constitutes a fundamental dimension of intelligence. As AI systems e…

    Submitted 21 July, 2025; originally announced July 2025.

    Comments: 30 pages, 10 figures

  23. arXiv:2507.02699  [pdf, ps, other]

    cs.CR

    Control at Stake: Evaluating the Security Landscape of LLM-Driven Email Agents

    Authors: Jiangrong Wu, Yuhong Nan, Jianliang Wu, Zitong Yao, Zibin Zheng

    Abstract: The increasing capabilities of LLMs have led to the rapid proliferation of LLM agent apps, where developers enhance LLMs with access to external resources to support complex task execution. Among these, LLM email agent apps represent one of the widely used categories, as email remains a critical communication medium for users. LLM email agents are capable of managing and responding to email using…

    Submitted 3 July, 2025; originally announced July 2025.

  24. arXiv:2505.12056  [pdf, ps, other]

    cs.SE

    Understanding the Sneaky Patterns of Pop-up Windows in the Mobile Ecosystem

    Authors: Dongpeng Wu, Yuhong Nan, Shaojiang Wang, Jiawei Wang, Luwa Li, Xueqiang Wang

    Abstract: In mobile applications, Pop-up window (PoW) plays a crucial role in improving user experience, guiding user actions, and delivering key information. Unfortunately, the excessive use of PoWs severely degrades the user experience. These PoWs often sneakily mislead users in their choices, employing tactics that subtly manipulate decision-making processes. In this paper, we provide the first in-depth…

    Submitted 17 May, 2025; originally announced May 2025.

  25. arXiv:2505.08751  [pdf, other]

    cs.CL cs.CV cs.LG

    Aya Vision: Advancing the Frontier of Multilingual Multimodality

    Authors: Saurabh Dash, Yiyang Nan, John Dang, Arash Ahmadian, Shivalika Singh, Madeline Smith, Bharat Venkitesh, Vlad Shmyhlo, Viraat Aryabumi, Walter Beller-Morales, Jeremy Pekmez, Jason Ozuzu, Pierre Richemond, Acyr Locatelli, Nick Frosst, Phil Blunsom, Aidan Gomez, Ivan Zhang, Marzieh Fadaee, Manoj Govindassamy, Sudip Roy, Matthias Gallé, Beyza Ermis, Ahmet Üstün, Sara Hooker

    Abstract: Building multimodal language models is fundamentally challenging: it requires aligning vision and language modalities, curating high-quality instruction data, and avoiding the degradation of existing text-only capabilities once vision is introduced. These difficulties are further magnified in the multilingual setting, where the need for multimodal data in different languages exacerbates existing d…

    Submitted 13 May, 2025; originally announced May 2025.

  26. arXiv:2504.20879  [pdf, other]

    cs.AI cs.CL cs.LG stat.ME

    The Leaderboard Illusion

    Authors: Shivalika Singh, Yiyang Nan, Alex Wang, Daniel D'Souza, Sayash Kapoor, Ahmet Üstün, Sanmi Koyejo, Yuntian Deng, Shayne Longpre, Noah A. Smith, Beyza Ermis, Marzieh Fadaee, Sara Hooker

    Abstract: Measuring progress is fundamental to the advancement of any scientific field. As benchmarks play an increasingly central role, they also grow more susceptible to distortion. Chatbot Arena has emerged as the go-to leaderboard for ranking the most capable AI systems. Yet, in this work we identify systematic issues that have resulted in a distorted playing field. We find that undisclosed private test…

    Submitted 12 May, 2025; v1 submitted 29 April, 2025; originally announced April 2025.

    Comments: 68 pages, 18 figures, 9 tables

  27. arXiv:2503.19860  [pdf, ps, other]

    eess.IV cs.CV

    Unpaired Translation of Chest X-ray Images for Lung Opacity Diagnosis via Adaptive Activation Masks and Cross-Domain Alignment

    Authors: Junzhi Ning, Dominic Marshall, Yijian Gao, Xiaodan Xing, Yang Nan, Yingying Fang, Sheng Zhang, Matthieu Komorowski, Guang Yang

    Abstract: Chest X-ray radiographs (CXRs) play a pivotal role in diagnosing and monitoring cardiopulmonary diseases. However, lung opacities in CXRs frequently obscure anatomical structures, impeding clear identification of lung borders and complicating the localization of pathology. This challenge significantly hampers segmentation accuracy and precise lesion identification, which are crucial for diagnosis.…

    Submitted 25 March, 2025; originally announced March 2025.

  28. Revisiting Medical Image Retrieval via Knowledge Consolidation

    Authors: Yang Nan, Huichi Zhou, Xiaodan Xing, Giorgos Papanastasiou, Lei Zhu, Zhifan Gao, Alejandro F Fangi, Guang Yang

    Abstract: As artificial intelligence and digital medicine increasingly permeate healthcare systems, robust governance frameworks are essential to ensure ethical, secure, and effective implementation. In this context, medical image retrieval becomes a critical component of clinical data management, playing a vital role in decision-making and safeguarding patient information. Existing methods usually learn ha…

    Submitted 12 March, 2025; originally announced March 2025.

  29. arXiv:2502.17184  [pdf, ps, other]

    cs.CL

    Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric

    Authors: Yuming Yang, Yang Nan, Junjie Ye, Shihan Dou, Xiao Wang, Shuo Li, Huijie Lv, Mingqi Wu, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Data diversity is crucial for the instruction tuning of large language models. Existing studies have explored various diversity-aware data selection methods to construct high-quality datasets and enhance model performance. However, the fundamental problem of precisely defining and measuring data diversity remains underexplored, limiting clear guidance for data engineering. To address this, we syst…

    Submitted 2 June, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

    Comments: Accepted at ACL 2025 Main. Camera-ready version updated (20 pages). Project page: https://github.com/UmeanNever/NovelSum

  30. arXiv:2501.08670  [pdf, ps, other]

    cs.SE

    Augmenting Smart Contract Decompiler Output through Fine-grained Dependency Analysis and LLM-facilitated Semantic Recovery

    Authors: Zeqin Liao, Yuhong Nan, Zixu Gao, Henglong Liang, Sicheng Hao, Peifan Reng, Zibin Zheng

    Abstract: Decompiler is a specialized type of reverse engineering tool extensively employed in program analysis tasks, particularly in program comprehension and vulnerability detection. However, current Solidity smart contract decompilers face significant limitations in reconstructing the original source code. In particular, the bottleneck of SOTA decompilers lies in inaccurate method identification, incorr…

    Submitted 16 October, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

    Comments: This is the author version of the article accepted for publication in IEEE Transactions on Software Engineering

  31. arXiv:2411.19799  [pdf, other]

    cs.CL

    INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge

    Authors: Angelika Romanou, Negar Foroutan, Anna Sotnikova, Zeming Chen, Sree Harsha Nelaturu, Shivalika Singh, Rishabh Maheshwary, Micol Altomare, Mohamed A. Haggag, Snegha A, Alfonso Amayuelas, Azril Hafizi Amirudin, Viraat Aryabumi, Danylo Boiko, Michael Chang, Jenny Chim, Gal Cohen, Aditya Kumar Dalmia, Abraham Diress, Sharad Duwal, Daniil Dzenhaliou, Daniel Fernando Erazo Florez, Fabian Farestam, Joseph Marvin Imperial, Shayekh Bin Islam, et al. (34 additional authors not shown)

    Abstract: The performance differential of large language models (LLM) between languages hinders their effective deployment in many regions, inhibiting the potential economic and societal value of generative AI tools in many communities. However, the development of functional LLMs in many languages (i.e., multilingual LLMs) is bottlenecked by the lack of high-quality evaluation resources in languages other th…

    Submitted 29 November, 2024; originally announced November 2024.

  32. arXiv:2411.15247  [pdf, other]

    cs.LG

    Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

    Authors: Zhiwei Jia, Yuesong Nan, Huixi Zhao, Gengdai Liu

    Abstract: Recent research has shown that fine-tuning diffusion models (DMs) with arbitrary rewards, including non-differentiable ones, is feasible with reinforcement learning (RL) techniques, enabling flexible model alignment. However, applying existing RL methods to step-distilled DMs is challenging for ultra-fast ($\le2$-step) image generation. Our analysis suggests several limitations of policy-based RL…

    Submitted 11 March, 2025; v1 submitted 22 November, 2024; originally announced November 2024.

    Comments: CVPR 2025

  33. arXiv:2411.04933  [pdf, other]

    cs.CV

    SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering

    Authors: Tianyu Yang, Yiyang Nan, Lisen Dai, Zhenwen Liang, Yapeng Tian, Xiangliang Zhang

    Abstract: Audio-Visual Question Answering (AVQA) is a challenging task that involves answering questions based on both auditory and visual information in videos. A significant challenge is interpreting complex multi-modal scenes, which include both visual objects and sound sources, and connecting them to the given question. In this paper, we introduce the Source-aware Semantic Representation Network (SaSR-N…

    Submitted 10 November, 2024; v1 submitted 7 November, 2024; originally announced November 2024.

    Comments: EMNLP 2024

  34. arXiv:2411.02860  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    Continual Audio-Visual Sound Separation

    Authors: Weiguo Pian, Yiyang Nan, Shijian Deng, Shentong Mo, Yunhui Guo, Yapeng Tian

    Abstract: In this paper, we introduce a novel continual audio-visual sound separation task, aiming to continuously separate sound sources for new classes while preserving performance on previously learned classes, with the aid of visual guidance. This problem is crucial for practical visually guided auditory perception as it can significantly enhance the adaptability and robustness of audio-visual sound sep…

    Submitted 5 November, 2024; originally announced November 2024.

    Comments: NeurIPS 2024

  35. arXiv:2410.13823  [pdf, other]

    cs.CV

    Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning

    Authors: Xiaodan Xing, Junzhi Ning, Yang Nan, Guang Yang

    Abstract: Deep generative models have significantly advanced medical imaging analysis by enhancing dataset size and quality. Beyond mere data augmentation, our research in this paper highlights an additional, significant capacity of deep generative models: their ability to reveal and demonstrate patterns in medical images. We employ a generative structure with hybrid conditions, combining clinical data and…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: Accepted by AIM-FM Workshop of NeurIPS 2024

  36. arXiv:2410.02010  [pdf, other]

    eess.IV cs.CV

    MONICA: Benchmarking on Long-tailed Medical Image Classification

    Authors: Lie Ju, Siyuan Yan, Yukun Zhou, Yang Nan, Xiaodan Xing, Peibo Duan, Zongyuan Ge

    Abstract: Long-tailed learning is considered to be an extremely challenging problem in data imbalance learning. It aims to train well-generalized models from a large number of images that follow a long-tailed class distribution. In the medical field, many diagnostic imaging exams such as dermoscopy and chest radiography yield a long-tailed distribution of complex clinical findings. Recently, long-tailed lea…

    Submitted 2 October, 2024; originally announced October 2024.

  37. arXiv:2409.18468  [pdf, other]

    cs.SE

    SmartReco: Detecting Read-Only Reentrancy via Fine-Grained Cross-DApp Analysis

    Authors: Jingwen Zhang, Zibin Zheng, Yuhong Nan, Mingxi Ye, Kaiwen Ning, Yu Zhang, Weizhe Zhang

    Abstract: Despite the increasing popularity of Decentralized Applications (DApps), they are suffering from various vulnerabilities that can be exploited by adversaries for profits. Among such vulnerabilities, Read-Only Reentrancy (called ROR in this paper), is an emerging type of vulnerability that arises from the complex interactions between DApps. In the recent three years, attack incidents of ROR have al…

    Submitted 9 December, 2024; v1 submitted 27 September, 2024; originally announced September 2024.

    Comments: Accepted by ICSE 2025

  38. arXiv:2409.13701  [pdf, ps, other]

    cs.CL cs.AI

    CA-BERT: Leveraging Context Awareness for Enhanced Multi-Turn Chat Interaction

    Authors: Minghao Liu, Mingxiu Sui, Yi Nan, Cangqing Wang, Zhijie Zhou

    Abstract: Effective communication in automated chat systems hinges on the ability to understand and respond to context. Traditional models often struggle with determining when additional context is necessary for generating appropriate responses. This paper introduces Context-Aware BERT (CA-BERT), a transformer-based model specifically fine-tuned to address this challenge. CA-BERT innovatively applies deep l…

    Submitted 1 October, 2024; v1 submitted 5 September, 2024; originally announced September 2024.

    Comments: This paper has been accepted by ICBASE 2024

  39. arXiv:2409.04937  [pdf, other]

    cs.SE

    CONNECTOR: Enhancing the Traceability of Decentralized Bridge Applications via Automatic Cross-chain Transaction Association

    Authors: Dan Lin, Jiajing Wu, Yuxin Su, Ziye Zheng, Yuhong Nan, Qinnan Zhang, Bowen Song, Zibin Zheng

    Abstract: Decentralized bridge applications are important software that connects various blockchains and facilitates cross-chain asset transfer in the decentralized finance (DeFi) ecosystem which currently operates in a multi-chain environment. Cross-chain transaction association identifies and matches unique transactions executed by bridge DApps, which is important research to enhance the traceability of c…

    Submitted 19 December, 2024; v1 submitted 7 September, 2024; originally announced September 2024.

  40. arXiv:2409.03087  [pdf, ps, other]

    eess.IV cs.CV

    Coupling AI and Citizen Science in Creation of Enhanced Training Dataset for Medical Image Segmentation

    Authors: Amir Syahmi, Xiangrong Lu, Yinxuan Li, Haoxuan Yao, Hanjun Jiang, Ishita Acharya, Shiyi Wang, Yang Nan, Xiaodan Xing, Guang Yang

    Abstract: Recent advancements in medical imaging and artificial intelligence (AI) have greatly enhanced diagnostic capabilities, but the development of effective deep learning (DL) models is still constrained by the lack of high-quality annotated datasets. The traditional manual annotation process by medical experts is time- and resource-intensive, limiting the scalability of these datasets. In this work, w…

    Submitted 20 July, 2025; v1 submitted 4 September, 2024; originally announced September 2024.

  41. Beyond the Hype: A dispassionate look at vision-language models in medical scenario

    Authors: Yang Nan, Huichi Zhou, Xiaodan Xing, Guang Yang

    Abstract: Recent advancements in Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across diverse tasks, garnering significant attention in AI communities. However, their performance and reliability in specialized domains such as medicine remain insufficiently assessed. In particular, most assessments over-concentrate on evaluating VLMs based on simple Visual Question Answering…

    Submitted 9 April, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

    Comments: 10 pages

  42. arXiv:2407.03542  [pdf]

    eess.IV cs.CV cs.LG

    Probing Perfection: The Relentless Art of Meddling for Pulmonary Airway Segmentation from HRCT via a Human-AI Collaboration Based Active Learning Method

    Authors: Shiyi Wang, Yang Nan, Sheng Zhang, Federico Felder, Xiaodan Xing, Yingying Fang, Javier Del Ser, Simon L F Walsh, Guang Yang

    Abstract: In pulmonary tracheal segmentation, the scarcity of annotated data is a prevalent issue in medical segmentation. Additionally, Deep Learning (DL) methods face challenges: the opacity of 'black box' models and the need for performance enhancement. Our Human-Computer Interaction (HCI) based models (RS_UNet, LC_UNet, UUNet, and WD_UNet) address these challenges by combining diverse query strategies w…

    Submitted 23 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

  43. arXiv:2406.18924  [pdf, other]

    cs.AI cs.LG cs.RO

    Learning Pareto Set for Multi-Objective Continuous Robot Control

    Authors: Tianye Shu, Ke Shang, Cheng Gong, Yang Nan, Hisao Ishibuchi

    Abstract: For a control problem with multiple conflicting objectives, there exists a set of Pareto-optimal policies called the Pareto set instead of a single optimal policy. When a multi-objective control problem is continuous and complex, traditional multi-objective reinforcement learning (MORL) algorithms search for many Pareto-optimal deep policies to approximate the Pareto set, which is quite resource-c…

    Submitted 27 June, 2024; originally announced June 2024.

  44. arXiv:2406.16189  [pdf, other]

    eess.IV cs.CV

    Fuzzy Attention-based Border Rendering Network for Lung Organ Segmentation

    Authors: Sheng Zhang, Yang Nan, Yingying Fang, Shiyi Wang, Xiaodan Xing, Zhifan Gao, Guang Yang

    Abstract: Automatic lung organ segmentation on CT images is crucial for lung disease diagnosis. However, the unlimited voxel values and class imbalance of lung organs can lead to false-negative/positive and leakage issues in advanced methods. Additionally, some slender lung organs are easily lost during the recycled down/up-sample procedure, e.g., bronchioles & arterioles, causing severe discontinuity issue…

    Submitted 1 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  45. SmartAxe: Detecting Cross-Chain Vulnerabilities in Bridge Smart Contracts via Fine-Grained Static Analysis

    Authors: Zeqin Liao, Yuhong Nan, Henglong Liang, Sicheng Hao, Juan Zhai, Jiajing Wu, Zibin Zheng

    Abstract: With the increasing popularity of blockchain, different blockchain platforms coexist in the ecosystem (e.g., Ethereum, BNB, EOSIO, etc.), which prompts the high demand for cross-chain communication. Cross-chain bridge is a specific type of decentralized application for asset exchange across different blockchain platforms. Securing the smart contracts of cross-chain bridges is in urgent need, as th…

    Submitted 22 June, 2024; originally announced June 2024.

    Journal ref: The ACM International Conference on the Foundations of Software Engineering 2024

  46. SmartState: Detecting State-Reverting Vulnerabilities in Smart Contracts via Fine-Grained State-Dependency Analysis

    Authors: Zeqin Liao, Sicheng Hao, Yuhong Nan, Zibin Zheng

    Abstract: Smart contracts written in Solidity are widely used in different blockchain platforms such as Ethereum, TRON and BNB Chain. One of the unique designs in Solidity smart contracts is its state-reverting mechanism for error handling and access control. Unfortunately, a number of recent security incidents showed that adversaries also utilize this mechanism to manipulate critical states of smart contra…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 12 pages, 10 figures

    Journal ref: ISSTA 2023

  47. arXiv:2406.11192  [pdf, other]

    cs.CL

    Beyond Boundaries: Learning a Universal Entity Taxonomy across Datasets and Languages for Open Named Entity Recognition

    Authors: Yuming Yang, Wantong Zhao, Caishuang Huang, Junjie Ye, Xiao Wang, Huiyuan Zheng, Yang Nan, Yuran Wang, Xueying Xu, Kaixin Huang, Yunke Zhang, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Open Named Entity Recognition (NER), which involves identifying arbitrary types of entities from arbitrary domains, remains challenging for Large Language Models (LLMs). Recent studies suggest that fine-tuning LLMs on extensive NER data can boost their performance. However, training directly on existing datasets neglects their inconsistent entity definitions and redundant data, limiting LLMs to da…

    Submitted 21 April, 2025; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: Accepted at COLING 2025. Camera-ready version updated. Project page: https://github.com/UmeanNever/B2NER

    Journal ref: Proceedings of the 31st International Conference on Computational Linguistics (2025) 10902-10923

  48. arXiv:2406.02554  [pdf, other]

    eess.AS cs.AI cs.CL cs.CV cs.LG cs.MM

    Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition

    Authors: Shijian Deng, Erin E. Kosloski, Siddhi Patel, Zeke A. Barnett, Yiyang Nan, Alexander Kaplan, Sisira Aarukapalli, William T. Doan, Matthew Wang, Harsh Singh, Pamela R. Rollins, Yapeng Tian

    Abstract: In this article, we introduce a novel problem of audio-visual autism behavior recognition, which includes social behavior recognition, an essential aspect previously omitted in AI-assisted autism screening research. We define the task at hand as one that is audio-visual autism behavior recognition, which uses audio and visual cues, including any speech present in the audio, to recognize autism-rel…

    Submitted 22 March, 2024; originally announced June 2024.

  49. arXiv:2406.01815  [pdf]

    cs.CV

    Deep asymmetric mixture model for unsupervised cell segmentation

    Authors: Yang Nan, Guang Yang

    Abstract: Automated cell segmentation has become increasingly crucial for disease diagnosis and drug discovery, as manual delineation is excessively laborious and subjective. To address this issue with limited manual annotation, researchers have developed semi/unsupervised segmentation approaches. Among these approaches, the Deep Gaussian mixture model plays a vital role due to its capacity to facilitate co…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures

  50. arXiv:2405.17659  [pdf, other]

    eess.IV cs.CV

    Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba

    Authors: Jiahao Huang, Liutao Yang, Fanwen Wang, Yang Nan, Weiwen Wu, Chengyan Wang, Kuangyu Shi, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang

    Abstract: Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has sh…

    Submitted 25 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.