Showing 1–50 of 130 results for author: Cao, N

  1. arXiv:2604.08491  [pdf, ps, other]

    cs.HC

    Figures as Interfaces: Toward LLM-Native Artifacts for Scientific Discovery

    Authors: Yifang Wang, Rui Sheng, Erzhuo Shao, Yifan Qian, Haotian Li, Nan Cao, Dashun Wang

    Abstract: Large language models (LLMs) are transforming scientific workflows, not only through their generative capabilities but also through their emerging ability to use tools, reason about data, and coordinate complex analytical tasks. Yet in most human-AI collaborations, the primary outputs, figures, are still treated as static visual summaries: once rendered, they are handled by both humans and multimo…

    Submitted 9 April, 2026; originally announced April 2026.

  2. arXiv:2604.05115  [pdf, ps, other]

    cs.ET cs.LG

    Probabilistic Tree Inference Enabled by FDSOI Ferroelectric FETs

    Authors: Pengyu Ren, Xingtian Wang, Boyang Cheng, Jiahui Duan, Giuk Kim, Xuezhong Niu, Halid Mulaosmanovic, Stefan Duenkel, Sven Beyer, X. Sharon Hu, Ningyuan Cao, Kai Ni

    Abstract: Artificial intelligence applications in autonomous driving, medical diagnostics, and financial systems increasingly demand machine learning models that can provide robust uncertainty quantification, interpretability, and noise resilience. Bayesian decision trees (BDTs) are attractive for these tasks because they combine probabilistic reasoning, interpretable decision-making, and robustness to nois…

    Submitted 6 April, 2026; originally announced April 2026.

  3. arXiv:2603.25692  [pdf, ps, other]

    cs.LG cs.AI cs.AR cs.ET

    A Unified Memory Perspective for Probabilistic Trustworthy AI

    Authors: Xueji Zhao, Likai Pei, Jianbo Liu, Kai Ni, Ningyuan Cao

    Abstract: Trustworthy artificial intelligence increasingly relies on probabilistic computation to achieve robustness, interpretability, security and privacy. In practical systems, such workloads interleave deterministic data access with repeated stochastic sampling across models, data paths and system functions, shifting performance bottlenecks from arithmetic units to memory systems that must deliver both…

    Submitted 26 March, 2026; originally announced March 2026.

  4. arXiv:2603.12211  [pdf, ps, other]

    cs.DS cs.DB

    Bounding the Fragmentation of B-Trees Subject to Batched Insertions

    Authors: Michael A. Bender, Aaron Bernstein, Nairen Cao, Alex Conway, Martín Farach-Colton, Hanna Komlós, Yarin Shechter, Nicole Wein

    Abstract: The issue of internal fragmentation in data structures is a fundamental challenge in database design. A seminal result of Yao in this field shows that evenly splitting the leaves of a B-tree against a workload of uniformly random insertions achieves space utilization of around 69%. However, many database applications perform batched insertions, where a small run of consecutive keys is inserted at…

    Submitted 12 March, 2026; originally announced March 2026.

    Comments: To appear at PODS 2026, 30 pages, 5 figures

  5. arXiv:2603.09286  [pdf, ps, other]

    cs.CV

    CogBlender: Towards Continuous Cognitive Intervention in Text-to-Image Generation

    Authors: Shengqi Dang, Jiaying Lei, Yi He, Ziqing Qian, Nan Cao

    Abstract: Beyond conveying semantic information, an image can also manifest cognitive attributes that elicit specific cognitive processes from the viewer, such as memory encoding or emotional response. While modern text-to-image models excel at generating semantically coherent content, they remain limited in their ability to control such cognitive properties of images (e.g., valence, memorability), often fa…

    Submitted 10 March, 2026; originally announced March 2026.

  6. arXiv:2602.13071  [pdf, ps, other]

    cs.LG cs.AI

    Bus-Conditioned Zero-Shot Trajectory Generation via Task Arithmetic

    Authors: Shuai Liu, Ning Cao, Yile Chen, Yue Jiang, Gao Cong

    Abstract: Mobility trajectory data provide essential support for smart city applications. However, such data are often difficult to obtain. Meanwhile, most existing trajectory generation methods implicitly assume that at least a subset of real mobility data from target city is available, which limits their applicability in data-inaccessible scenarios. In this work, we propose a new problem setting, called b…

    Submitted 13 February, 2026; originally announced February 2026.

  7. Beyond Input-Output: Rethinking Creativity through Design-by-Analogy in Human-AI Collaboration

    Authors: Xuechen Li, Shuai Zhang, Nan Cao, Qing Chen

    Abstract: While the proliferation of foundation models has significantly boosted individual productivity, it also introduces a potential challenge: the homogenization of creative content. In response, we revisit Design-by-Analogy (DbA), a cognitively grounded approach that fosters novel solutions by mapping inspiration across domains. However, prevailing perspectives often restrict DbA to early ideation or…

    Submitted 10 February, 2026; originally announced February 2026.

    Comments: 20 pages, 9 figures. Accepted to the 2026 CHI Conference on Human Factors in Computing Systems

    ACM Class: H.5.m

    Journal ref: Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26)

  8. arXiv:2602.04356  [pdf, ps, other]

    cs.CV

    When and Where to Attack? Stage-wise Attention-Guided Adversarial Attack on Large Vision Language Models

    Authors: Jaehyun Kwak, Nam Cao, Boryeong Cho, Segyu Lee, Sumyeong Ahn, Se-Young Yun

    Abstract: Adversarial attacks against Large Vision-Language Models (LVLMs) are crucial for exposing safety vulnerabilities in modern multimodal systems. Recent attacks based on input transformations, such as random cropping, suggest that spatially localized perturbations can be more effective than global image manipulation. However, randomly cropping the entire image is inherently stochastic and fails to us…

    Submitted 4 February, 2026; originally announced February 2026.

    Comments: Pre-print

  9. arXiv:2602.02000  [pdf, ps, other]

    cs.CV cs.AI

    SurfSplat: Conquering Feedforward 2D Gaussian Splatting with Surface Continuity Priors

    Authors: Bing He, Jingnan Gao, Yunuo Chen, Ning Cao, Gang Chen, Zhengxue Cheng, Li Song, Wenjun Zhang

    Abstract: Reconstructing 3D scenes from sparse images remains a challenging task due to the difficulty of recovering accurate geometry and texture without optimization. Recent approaches leverage generalizable models to generate 3D scenes using 3D Gaussian Splatting (3DGS) primitive. However, they often fail to produce continuous surfaces and instead yield discrete, color-biased point clouds that appear pla…

    Submitted 3 February, 2026; v1 submitted 2 February, 2026; originally announced February 2026.

    Comments: ICLR 2026; Project Page: https://hebing-sjtu.github.io/SurfSplat-website/

  10. arXiv:2512.06982  [pdf, ps, other]

    cs.LG eess.SY

    LLM-Driven Composite Neural Architecture Search for Multi-Source RL State Encoding

    Authors: Yu Yu, Qian Xie, Nairen Cao, Li Jin

    Abstract: Designing state encoders for reinforcement learning (RL) with multiple information sources -- such as sensor measurements, time-series signals, image observations, and textual instructions -- remains underexplored and often requires manual design. We formalize this challenge as a problem of composite neural architecture search (NAS), where multiple source-specific modules and a fusion module are j…

    Submitted 11 December, 2025; v1 submitted 7 December, 2025; originally announced December 2025.

    Comments: NeurIPS 2025 Workshop on Bridging Language, Agent, and World Models for Reasoning and Planning

  11. arXiv:2511.12516  [pdf, ps, other]

    cs.SI

    Designed to Spread: A Generative Approach to Enhance Information Diffusion

    Authors: Ziqing Qian, Jiaying Lei, Shengqi Dang, Nan Cao

    Abstract: Social media has fundamentally transformed how people access information and form social connections, with content expression playing a critical role in driving information diffusion. While prior research has focused largely on network structures and tipping point identification, it provides limited tools for automatically generating content tailored for virality within a specific audience. To fil…

    Submitted 12 March, 2026; v1 submitted 16 November, 2025; originally announced November 2025.

    Comments: Accepted by AAAI26

  12. arXiv:2511.09298  [pdf, ps, other]

    cs.CV cs.AI

    DensiCrafter: Physically-Constrained Generation and Fabrication of Self-Supporting Hollow Structures

    Authors: Shengqi Dang, Fu Chai, Jiaxin Li, Chao Yuan, Wei Ye, Nan Cao

    Abstract: The rise of 3D generative models has enabled automatic 3D geometry and texture synthesis from multimodal inputs (e.g., text or images). However, these methods often ignore physical constraints and manufacturability considerations. In this work, we address the challenge of producing 3D designs that are both lightweight and self-supporting. We present DensiCrafter, a framework for generating lightwe…

    Submitted 26 November, 2025; v1 submitted 12 November, 2025; originally announced November 2025.

  13. arXiv:2510.08022  [pdf, ps, other]

    cs.RO cs.AI

    FastUMI-100K: Advancing Data-driven Robotic Manipulation with a Large-scale UMI-style Dataset

    Authors: Kehui Liu, Zhongjie Jia, Yang Li, Zhaxizhuoma, Pengan Chen, Song Liu, Xin Liu, Pingrui Zhang, Haoming Song, Xinyi Ye, Nieqing Cao, Zhigang Wang, Jia Zeng, Dong Wang, Yan Ding, Bin Zhao, Xuelong Li

    Abstract: Data-driven robotic manipulation learning depends on large-scale, high-quality expert demonstration datasets. However, existing datasets, which primarily rely on human teleoperated robot collection, are limited in terms of scalability, trajectory smoothness, and applicability across different robotic embodiments in real-world environments. In this paper, we present FastUMI-100K, a large-scale UMI-…

    Submitted 9 October, 2025; originally announced October 2025.

  14. arXiv:2509.17513  [pdf, ps, other]

    cs.CV

    4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming

    Authors: Zihan Zheng, Zhenlong Wu, Houqiang Zhong, Yuan Tian, Ning Cao, Lan Xu, Jiangchao Yao, Xiaoyun Zhang, Qiang Hu, Wenjun Zhang

    Abstract: Achieving seamless viewing of high-fidelity volumetric video, comparable to 2D video experiences, remains an open challenge. Existing volumetric video compression methods either lack the flexibility to adjust quality and bitrate within a single model for efficient streaming across diverse networks and devices, or struggle with real-time decoding and rendering on lightweight mobile platforms. To ad…

    Submitted 26 September, 2025; v1 submitted 22 September, 2025; originally announced September 2025.

    Comments: NeurIPS 2025

  15. arXiv:2509.17506  [pdf, ps, other]

    cs.CV

    4D-MoDe: Towards Editable and Scalable Volumetric Streaming via Motion-Decoupled 4D Gaussian Compression

    Authors: Houqiang Zhong, Zihan Zheng, Qiang Hu, Yuan Tian, Ning Cao, Lan Xu, Xiaoyun Zhang, Zhengxue Cheng, Li Song, Wenjun Zhang

    Abstract: Volumetric video has emerged as a key medium for immersive telepresence and augmented/virtual reality, enabling six-degrees-of-freedom (6DoF) navigation and realistic spatial interactions. However, delivering high-quality dynamic volumetric content at scale remains challenging due to massive data volume, complex motion, and limited editability of existing representations. In this paper, we present…

    Submitted 22 September, 2025; originally announced September 2025.

  16. arXiv:2509.16323  [pdf, ps, other]

    cs.HC

    Funding the Frontier: Visualizing the Broad Impact of Science and Science Funding

    Authors: Yifang Wang, Yifan Qian, Xiaoyu Qi, Yian Yin, Shengqi Dang, Ziqing Qian, Benjamin F. Jones, Nan Cao, Dashun Wang

    Abstract: Understanding the broad impact of science and science funding is critical to ensuring that science investments and policies align with societal needs. Existing research links science funding to the output of scientific publications but largely leaves out the downstream uses of science and the myriad ways in which investing in science may impact human society. As funders seek to allocate scarce fun…

    Submitted 19 September, 2025; originally announced September 2025.

  17. arXiv:2508.10868  [pdf, ps, other]

    cs.CV

    TexVerse: A Universe of 3D Objects with High-Resolution Textures

    Authors: Yibo Zhang, Li Zhang, Rui Ma, Nan Cao

    Abstract: We introduce TexVerse, a large-scale 3D dataset featuring high-resolution textures. While recent advances in large-scale 3D datasets have enhanced high-resolution geometry generation, creating high-resolution textures end-to-end remains underexplored due to the lack of suitable datasets. TexVerse fills this gap with a curated collection of over 858K unique high-resolution 3D models sourced from Sk…

    Submitted 3 September, 2025; v1 submitted 14 August, 2025; originally announced August 2025.

    Comments: https://github.com/yiboz2001/TexVerse

  18. arXiv:2508.03747  [pdf, ps, other]

    cs.SI cs.AI cs.LG

    Data-Driven Discovery of Mobility Periodicity for Understanding Urban Systems

    Authors: Xinyu Chen, Qi Wang, Yunhan Zheng, Nina Cao, HanQin Cai, Jinhua Zhao

    Abstract: Human mobility regularity is crucial for understanding urban dynamics and informing decision-making processes. This study first quantifies the periodicity in complex human mobility data as a sparse identification of dominant positive auto-correlations in time series autoregression and then discovers periodic patterns. We apply the framework to large-scale metro passenger flow data in Hangzhou, Chi…

    Submitted 12 September, 2025; v1 submitted 2 August, 2025; originally announced August 2025.

  19. arXiv:2507.10943  [pdf, ps, other]

    cs.CV

    Robust ID-Specific Face Restoration via Alignment Learning

    Authors: Yushun Fang, Lu Liu, Xiang Gao, Qiang Hu, Ning Cao, Jianghe Cui, Gang Chen, Xiaoyun Zhang

    Abstract: The latest developments in Face Restoration have yielded significant advancements in visual quality through the utilization of diverse diffusion priors. Nevertheless, the uncertainty of face identity introduced by identity-obscure inputs and stochastic generative processes remains unresolved. To address this challenge, we present Robust ID-Specific Face Restoration (RIDFR), a novel ID-specific fac…

    Submitted 28 August, 2025; v1 submitted 14 July, 2025; originally announced July 2025.

    Comments: PRCV2025 Accepted

  20. arXiv:2507.06261  [pdf, ps, other]

    cs.CL cs.AI

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Authors: Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, Krishna Haridasan, Ahmed Omran, Nikunj Saunshi, Dara Bahri, Gaurav Mishra, Eric Chu, et al. (3410 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal unde…

    Submitted 19 December, 2025; v1 submitted 7 July, 2025; originally announced July 2025.

    Comments: 72 pages, 17 figures

  21. arXiv:2506.18455  [pdf, ps, other]

    cs.HC

    CODS: A Theoretical Model for Computational Design Based on Design Space

    Authors: Nan Cao, Xiaoyu Qi, Chuer Chen, Xiaoke Yan

    Abstract: We introduce CODS (Computational Optimization in Design Space), a theoretical model that frames computational design as a constrained optimization problem over a structured, multi-dimensional design space. Unlike existing methods that rely on handcrafted heuristics or domain-specific rules, CODS provides a generalizable and interpretable framework that supports diverse design tasks. Given a user r…

    Submitted 23 June, 2025; originally announced June 2025.

  22. arXiv:2506.13129  [pdf, ps, other]

    cs.HC

    ChartBlender: An Interactive System for Authoring and Synchronizing Visualization Charts in Video

    Authors: Yi He, Yuqi Liu, Chenpu Li, Ruoyan Chen, Chuer Chen, Shengqi Dang, Nan Cao

    Abstract: Embedding data visualizations in video can enhance the communication of complex information. However, this process is often labor-intensive, requiring designers to adjust visualizations frame by frame manually. In this work, we present ChartBlender, a novel system that streamlines this process by enabling users to create data visualizations, embed them seamlessly into video scenes, and automatical…

    Submitted 30 December, 2025; v1 submitted 16 June, 2025; originally announced June 2025.

    Comments: 12 pages, 8 figures

  23. arXiv:2506.10587  [pdf, ps, other]

    cs.HC

    IDEA: Augmenting Design Intelligence through Design Space Exploration

    Authors: Chuer Chen, Xiaoke Yan, Xiaoyu Qi, Nan Cao

    Abstract: Design spaces serve as a conceptual framework that enables designers to explore feasible solutions through the selection and combination of design elements. However, effective decision-making remains heavily dependent on the designer's experience, and the absence of mathematical formalization prevents computational support for automated design processes. To bridge this gap, we introduce a structur…

    Submitted 12 June, 2025; originally announced June 2025.

  24. Person Re-Identification System at Semantic Level based on Pedestrian Attributes Ontology

    Authors: Ngoc Q. Ly, Hieu N. M. Cao, Thi T. Nguyen

    Abstract: Person Re-Identification (Re-ID) is a very important task in video surveillance systems such as tracking people, finding people in public places, or analysing customer behavior in supermarkets. Although there have been many works to solve this problem, there are still remaining challenges such as large-scale datasets, imbalanced data, viewpoint, fine grained data (attributes), the Local Features a…

    Submitted 4 June, 2025; originally announced June 2025.

    Journal ref: International Journal of Advanced Computer Science and Applications (IJACSA), 11(2), 2020

  25. arXiv:2505.24597  [pdf, other]

    cs.AI

    Mixture-of-Experts for Personalized and Semantic-Aware Next Location Prediction

    Authors: Shuai Liu, Ning Cao, Yile Chen, Yue Jiang, Gao Cong

    Abstract: Next location prediction plays a critical role in understanding human mobility patterns. However, existing approaches face two core limitations: (1) they fall short in capturing the complex, multi-functional semantics of real-world locations; and (2) they lack the capacity to model heterogeneous behavioral dynamics across diverse user groups. To tackle these challenges, we introduce NextLocMoE, a…

    Submitted 30 May, 2025; originally announced May 2025.

  26. arXiv:2505.21954  [pdf, ps, other]

    cs.CV cs.AI

    UniTalk: Towards Universal Active Speaker Detection in Real World Scenarios

    Authors: Le Thien Phuc Nguyen, Zhuoran Yu, Khoa Quang Nhat Cao, Yuwei Guo, Tu Ho Manh Pham, Tuan Tai Nguyen, Toan Ngo Duc Vo, Lucas Poon, Soochahn Lee, Yong Jae Lee

    Abstract: We present UniTalk, a novel dataset specifically designed for the task of active speaker detection, emphasizing challenging scenarios to enhance model generalization. Unlike previously established benchmarks such as AVA, which predominantly features old movies and thus exhibits significant domain gaps, UniTalk focuses explicitly on diverse and difficult real-world conditions. These include underre…

    Submitted 28 May, 2025; originally announced May 2025.

  27. arXiv:2504.17334  [pdf, other]

    cs.HC cs.IR

    DataScout: Automatic Data Fact Retrieval for Statement Augmentation with an LLM-Based Agent

    Authors: Chuer Chen, Yuqi Liu, Danqing Shi, Shixiong Cao, Nan Cao

    Abstract: A data story typically integrates data facts from multiple perspectives and stances to construct a comprehensive and objective narrative. However, retrieving these facts demands time for data search and challenges the creator's analytical skills. In this work, we introduce DataScout, an interactive system that automatically performs reasoning and stance-based data facts retrieval to augment the us…

    Submitted 24 April, 2025; originally announced April 2025.

  28. arXiv:2504.17267  [pdf, other]

    cs.HC cs.MM

    MV-Crafter: An Intelligent System for Music-guided Video Generation

    Authors: Chuer Chen, Shengqi Dang, Yuqi Liu, Nanxuan Zhao, Yang Shi, Nan Cao

    Abstract: Music videos, as a prevalent form of multimedia entertainment, deliver engaging audio-visual experiences to audiences and have gained immense popularity among singers and fans. Creators can express their interpretations of music naturally through visual elements. However, the creation process of music video demands proficiency in script design, video shooting, and music-video synchronization, posi…

    Submitted 24 April, 2025; originally announced April 2025.

  29. arXiv:2504.12060  [pdf, ps, other]

    cs.DS

    Static to Dynamic Correlation Clustering

    Authors: Nairen Cao, Vincent Cohen-Addad, Euiwoong Lee, Shi Li, David Rasmussen Lolck, Alantha Newman, Mikkel Thorup, Lukas Vogl, Shuyi Yan, Hanwen Zhang

    Abstract: Correlation clustering is a well-studied problem, first proposed by Bansal, Blum, and Chawla [BBC04]. The input is an unweighted, undirected graph. The problem is to cluster the vertices so as to minimize the number of edges between vertices in different clusters and missing edges between vertices inside the same cluster. This problem has a wide application in data mining and machine learning. W…

    Submitted 22 April, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

  30. arXiv:2503.23685  [pdf, other]

    cs.ET

    An In-Situ Spatial-Temporal Sequence Detector for Neuromorphic Vision Sensor Empowered by High Density Vertical NAND Storage

    Authors: Zijian Zhao, Varun Darshana Parekh, Po-Kai Hsu, Yixin Qin, Yiming Song, A N M Nafiul Islam, Ningyuan Cao, Siddharth Joshi, Thomas Kämpfe, Moonyoung Jung, Kwangyou Seo, Kwangsoo Kim, Wanki Kim, Daewon Ha, Sourav Dutta, Abhronil Sengupta, Xiao Gong, Shimeng Yu, Vijaykrishnan Narayanan, Kai Ni

    Abstract: Neuromorphic vision sensors require efficient real-time pattern recognition, yet conventional architectures struggle with energy and latency constraints. Here, we present a novel in-situ spatiotemporal sequence detector that leverages vertical NAND storage to achieve massively parallel pattern detection. By encoding each cell with two single-transistor-based multi-level cell (MLC) memory elements,…

    Submitted 30 March, 2025; originally announced March 2025.

    Comments: 26 pages, 7 figures

  31. Solving the Correlation Cluster LP in Sublinear Time

    Authors: Nairen Cao, Vincent Cohen-Addad, Shi Li, Euiwoong Lee, David Rasmussen Lolck, Alantha Newman, Mikkel Thorup, Lukas Vogl, Shuyi Yan, Hanwen Zhang

    Abstract: Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of missing intra-cluster edges. CCL+24 introduced the cluster LP for Correlation Clustering, which they argued captures the problem much more succinctly than previou…

    Submitted 4 November, 2025; v1 submitted 26 March, 2025; originally announced March 2025.

  32. arXiv:2502.16163  [pdf, other]

    eess.IV cs.CV

    Large Language Model for Lossless Image Compression with Visual Prompts

    Authors: Junhao Du, Chuqin Zhou, Ning Cao, Gang Chen, Yunuo Chen, Zhengxue Cheng, Li Song, Guo Lu, Wenjun Zhang

    Abstract: Recent advancements in deep learning have driven significant progress in lossless image compression. With the emergence of Large Language Models (LLMs), preliminary attempts have been made to leverage the extensive prior knowledge embedded in these pretrained models to enhance lossless image compression, particularly by improving the entropy model. However, a significant challenge remains in bridg…

    Submitted 22 February, 2025; originally announced February 2025.

  33. arXiv:2502.16038  [pdf, ps, other]

    cs.SI

    Emotion-Aware Design: Modulating Valence, Arousal, and Dominance in Communication via Design

    Authors: Shixiong Cao, Nan Cao

    Abstract: In an era of emotionally saturated digital media and information overload, effective communication demands more than clarity and accuracy; it requires emotional awareness. This review introduces the paradigm of emotion-aware design, a framework grounded in the valence-arousal-dominance (VAD) model of affect, which systematically examines how emotional modulation shapes comprehension, memory, and be…

    Submitted 25 June, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

    Comments: 17 pages, 6 figures

  34. arXiv:2502.12519  [pdf, other]

    cs.DS cs.DC

    Min-Max Correlation Clustering via Neighborhood Similarity

    Authors: Nairen Cao, Steven Roche, Hsin-Hao Su

    Abstract: We present an efficient algorithm for the min-max correlation clustering problem. The input is a complete graph where edges are labeled as either positive $(+)$ or negative $(-)$, and the objective is to find a clustering that minimizes the $\ell_{\infty}$-norm of the disagreement vector over all vertices. We resolve this problem with an efficient $(3 + ε)$-approximation algorithm that runs in n…

    Submitted 17 February, 2025; originally announced February 2025.

  35. arXiv:2501.05710  [pdf, ps, other]

    cs.CV

    EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model

    Authors: Shengqi Dang, Yi He, Long Ling, Ziqing Qian, Nanxuan Zhao, Nan Cao

    Abstract: Recent research shows that emotions can enhance users' cognition and influence information communication. While research on visual emotion analysis is extensive, limited work has been done on helping users generate emotionally rich image content. Existing work on emotional image generation relies on discrete emotion categories, making it challenging to capture complex and subtle emotional nuances…

    Submitted 30 December, 2025; v1 submitted 9 January, 2025; originally announced January 2025.

    Comments: 11 pages, 10 figures

  36. arXiv:2501.04577  [pdf, other]

    cs.AR cs.AI cs.LG cs.RO

    A 65 nm Bayesian Neural Network Accelerator with 360 fJ/Sample In-Word GRNG for AI Uncertainty Estimation

    Authors: Zephan M. Enciso, Boyang Cheng, Likai Pei, Jianbo Liu, Steven Davis, Michael Niemier, Ningyuan Cao

    Abstract: Uncertainty estimation is an indispensable capability for AI-enabled, safety-critical applications, e.g. autonomous vehicles or medical diagnosis. Bayesian neural networks (BNNs) use Bayesian statistics to provide both classification predictions and uncertainty estimation, but they suffer from high computational overhead associated with random number generation and repeated sample iterations. Furt…

    Submitted 22 January, 2025; v1 submitted 8 January, 2025; originally announced January 2025.

    Comments: 7 pages, 12 figures

    ACM Class: B.7.1; B.3.1; I.2.10; I.2.9

  37. arXiv:2411.19064  [pdf, other]

    cs.CL cs.AI

    Way to Specialist: Closing Loop Between Specialized LLM and Evolving Domain Knowledge Graph

    Authors: Yutong Zhang, Lixing Chen, Shenghong Li, Nan Cao, Yang Shi, Jiaxin Ding, Zhe Qu, Pan Zhou, Yang Bai

    Abstract: Large language models (LLMs) have demonstrated exceptional performance across a wide variety of domains. Nonetheless, generalist LLMs continue to fall short in reasoning tasks necessitating specialized knowledge. Prior investigations into specialized LLMs focused on domain-specific training, which entails substantial efforts in domain data acquisition and model parameter fine-tuning. To address th…

    Submitted 28 November, 2024; originally announced November 2024.

    Comments: Accepted by KDD 2025

  38. arXiv:2410.22710  [pdf, other]

    cs.CV

    LoFLAT: Local Feature Matching using Focused Linear Attention Transformer

    Authors: Naijian Cao, Renjie He, Yuchao Dai, Mingyi He

    Abstract: Local feature matching is an essential technique in image matching and plays a critical role in a wide range of vision-based applications. However, existing Transformer-based detector-free local feature matching methods encounter challenges due to the quadratic computational complexity of attention mechanisms, especially at high resolutions. However, while existing Transformer-based detector-free…

    Submitted 30 October, 2024; originally announced October 2024.

  39. arXiv:2410.13407  [pdf, other]

    cs.RO

    BestMan: A Modular Mobile Manipulator Platform for Embodied AI with Unified Simulation-Hardware APIs

    Authors: Kui Yang, Nieqing Cao, Yan Ding, Chao Chen

    Abstract: Embodied Artificial Intelligence (Embodied AI) emphasizes agents' ability to perceive, understand, and act in physical environments. Simulation platforms play a crucial role in advancing this field by enabling the validation and optimization of algorithms. However, existing platforms face challenges such as multilevel technical integration complexity, insufficient modularity, interface heterogenei…

    Submitted 17 October, 2024; originally announced October 2024.

  40. arXiv:2410.09321  [pdf, ps, other]

    cs.DS

    Simultaneously Approximating All Norms for Massively Parallel Correlation Clustering

    Authors: Nairen Cao, Shi Li, Jia Ye

    Abstract: We revisit the simultaneous approximation model for the correlation clustering problem introduced by Davies, Moseley, and Newman [DMN24]. The objective is to find a clustering that minimizes given norms of the disagreement vector over all vertices. We present an efficient algorithm that produces a clustering that is simultaneously a $63.3$-approximation for all monotone symmetric norms. This sign…

    Submitted 21 October, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

  41. arXiv:2410.09129  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    NextLocLLM: Location Semantics Modeling and Coordinate-Based Next Location Prediction with LLMs

    Authors: Shuai Liu, Ning Cao, Yile Chen, Yue Jiang, George Rosario Jagadeesh, Gao Cong

    Abstract: Next location prediction is a critical task in human mobility analysis. Existing methods typically formulate it as a classification task based on discrete location IDs, which hinders spatial continuity modeling and limits generalization to new cities. In this paper, we propose NextLocLLM, a novel framework that reformulates next-location prediction as coordinate regression and integrates LLMs for b…

    Submitted 28 September, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: STIntelligence in CIKM 2025

  42. arXiv:2409.19499  [pdf, other]

    cs.RO

    FastUMI: A Scalable and Hardware-Independent Universal Manipulation Interface with Dataset

    Authors: Zhaxizhuoma, Kehui Liu, Chuyue Guan, Zhongjie Jia, Ziniu Wu, Xin Liu, Tianyu Wang, Shuai Liang, Pengan Chen, Pingrui Zhang, Haoming Song, Delin Qu, Dong Wang, Zhigang Wang, Nieqing Cao, Yan Ding, Bin Zhao, Xuelong Li

    Abstract: Real-world manipulation data involving robotic arms is crucial for developing generalist action policies, yet such data remains scarce since existing data collection methods are hindered by high costs, hardware dependencies, and complex setup requirements. In this work, we introduce FastUMI, a substantial redesign of the Universal Manipulation Interface (UMI) system that addresses these challenges…

    Submitted 1 February, 2025; v1 submitted 28 September, 2024; originally announced September 2024.

  43. arXiv:2409.11905  [pdf, other]

    cs.RO cs.AI cs.IR

    AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots

    Authors: Zhaxizhuoma Zhaxizhuoma, Pengan Chen, Ziniu Wu, Jiawei Sun, Dong Wang, Peng Zhou, Nieqing Cao, Yan Ding, Bin Zhao, Xuelong Li

    Abstract: This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. In domestic settings, aligning task planning with user reminders poses significant challenges due to the limited quantity, diversity, and multimodal nature of the reminders. To address these challenges, AlignBot employs a fine-t…

    Submitted 21 March, 2025; v1 submitted 18 September, 2024; originally announced September 2024.

  44. arXiv:2408.14520   

    cs.LG cs.AI cs.SI

    Towards Graph Prompt Learning: A Survey and Beyond

    Authors: Qingqing Long, Yuchen Yan, Peiyan Zhang, Chen Fang, Wentao Cui, Zhiyuan Ning, Meng Xiao, Ning Cao, Xiao Luo, Lingjun Xu, Shiyue Jiang, Zheng Fang, Chong Chen, Xian-Sheng Hua, Yuanchun Zhou

    Abstract: Large-scale "pre-train and prompt learning" paradigms have demonstrated remarkable adaptability, enabling broad applications across diverse domains such as question answering, image recognition, and multimodal retrieval. This approach fully leverages the potential of large-scale pre-trained models, reducing downstream data requirements and computational costs while enhancing model applicability ac…

    Submitted 24 September, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: I have decided to temporarily withdraw this draft as I am in the process of making further revisions to improve its content

  45. arXiv:2408.02679  [pdf, other]

    cs.LG cs.GR cs.HC stat.ME

    Visual Analysis of Multi-outcome Causal Graphs

    Authors: Mengjie Fan, Jinlu Yu, Daniel Weiskopf, Nan Cao, Huai-Yu Wang, Liang Zhou

    Abstract: We introduce a visual analysis method for multiple causal graphs with different outcome variables, namely, multi-outcome causal graphs. Multi-outcome causal graphs are important in healthcare for understanding multimorbidity and comorbidity. To support the visual analysis, we collaborated with medical experts to devise two comparative visualization techniques at different stages of the analysis pr…

    Submitted 25 August, 2024; v1 submitted 31 July, 2024; originally announced August 2024.

  46. arXiv:2407.19533  [pdf, other]

    cs.GR

    FreeShell: A Context-Free 4D Printing Technique for Fabricating Complex 3D Triangle Mesh Shells

    Authors: Chao Yuan, Nan Cao, Xuejiao Ma, Shengqi Dang

    Abstract: Freeform thin-shell surfaces are critical in various fields, but their fabrication is complex and costly. Traditional methods are wasteful and require custom molds, while 3D printing needs extensive support structures and post-processing. Thermoshrinkage actuated 4D printing is an effective method through flat structures fabricating 3D shell. However, existing research faces issues related to prec…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: This paper includes 12 pages and 19 figures

  47. arXiv:2407.18269  [pdf, other]

    cs.AR cs.AI cs.LG

    LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits

    Authors: Chen-Chia Chang, Yikang Shen, Shaoze Fan, Jing Li, Shun Zhang, Ningyuan Cao, Yiran Chen, Xin Zhang

    Abstract: In the realm of electronic and electrical engineering, automation of analog circuit is increasingly vital given the complexity and customized requirements of modern applications. However, existing methods only develop search-based algorithms that require many simulation iterations to design a custom circuit topology, which is usually a time-consuming process. To this end, we introduce LaMAGIC, a p…

    Submitted 29 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:6253-6262 https://proceedings.mlr.press/v235/chang24c.html

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:6253-6262, 2024

  48. arXiv:2405.04700  [pdf, other]

    cs.LG cs.AI cs.DC cs.IR

    Robust Implementation of Retrieval-Augmented Generation on Edge-based Computing-in-Memory Architectures

    Authors: Ruiyang Qin, Zheyu Yan, Dewen Zeng, Zhenge Jia, Dancheng Liu, Jianbo Liu, Zhi Zheng, Ningyuan Cao, Kai Ni, Jinjun Xiong, Yiyu Shi

    Abstract: Large Language Models (LLMs) deployed on edge devices learn through fine-tuning and updating a certain portion of their parameters. Although such learning methods can be optimized to reduce resource utilization, the overall required resources remain a heavy burden on edge devices. Instead, Retrieval-Augmented Generation (RAG), a resource-efficient LLM learning method, can improve the quality of th…

    Submitted 7 May, 2024; originally announced May 2024.

  49. Understanding the Cluster LP for Correlation Clustering

    Authors: Nairen Cao, Vincent Cohen-Addad, Euiwoong Lee, Shi Li, Alantha Newman, Lukas Vogl

    Abstract: In the classic Correlation Clustering problem introduced by Bansal, Blum, and Chawla (FOCS 2002), the input is a complete graph where edges are labeled either $+$ or $-$, and the goal is to find a partition of the vertices that minimizes the sum of the +edges across parts plus the sum of the -edges within parts. In recent years, Chawla, Makarychev, Schramm and Yaroslavtsev (STOC 2015) gave a 2.06-…

    Submitted 31 October, 2025; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: The conference version of this paper claimed an approximation ratio of 1.437 whose proof currently has a gap. This version fixes the gap with a slightly worse approximation ratio of 1.485

  50. arXiv:2404.03381  [pdf, other]

    cs.CL

    Learning to Plan and Generate Text with Citations

    Authors: Constanza Fierro, Reinald Kim Amplayo, Fantine Huot, Nicola De Cao, Joshua Maynez, Shashi Narayan, Mirella Lapata

    Abstract: The increasing demand for the deployment of LLMs in information-seeking scenarios has spurred efforts in creating verifiable systems, which generate responses to queries along with supporting evidence. In this paper, we explore the attribution capabilities of plan-based models which have been recently shown to improve the faithfulness, grounding, and controllability of generated text. We conceptua…

    Submitted 23 July, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ACL 2024