Showing 1–50 of 1,514 results for author: Cao, Z

  1. arXiv:2512.21094  [pdf, ps, other]

    cs.CV

    T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation

    Authors: Zhe Cao, Tao Wang, Jiaming Wang, Yanghai Wang, Yuanxing Zhang, Jialu Chen, Miao Deng, Jiahao Wang, Yubin Guo, Chenxi Liao, Yize Zhang, Zhaoxiang Zhang, Jiaheng Liu

    Abstract: Text-to-Audio-Video (T2AV) generation aims to synthesize temporally coherent video and semantically synchronized audio from natural language, yet its evaluation remains fragmented, often relying on unimodal metrics or narrowly scoped benchmarks that fail to capture cross-modal alignment, instruction following, and perceptual realism under complex prompts. To address this limitation, we present T2A…

    Submitted 24 December, 2025; originally announced December 2025.

  2. arXiv:2512.19179  [pdf, ps, other]

    cs.DC

    L4: Low-Latency and Load-Balanced LLM Serving via Length-Aware Scheduling

    Authors: Yitao Yuan, Chenqi Zhao, Bohan Zhao, Zane Cao, Yongchao He, Wenfei Wu

    Abstract: Efficiently harnessing GPU compute is critical to improving user experience and reducing operational costs in large language model (LLM) services. However, current inference engine schedulers overlook the attention backend's sensitivity to request-length heterogeneity within a batch. As state-of-the-art models now support context windows exceeding 128K tokens, this once-tolerable inefficiency has…

    Submitted 22 December, 2025; originally announced December 2025.

    Comments: 15 pages, 16 figures

  3. arXiv:2512.18437  [pdf, ps, other]

    cs.CV cs.AI

    MeniMV: A Multi-view Benchmark for Meniscus Injury Severity Grading

    Authors: Shurui Xu, Siqi Yang, Jiapin Ren, Zhong Cao, Hongwei Yang, Mengzhen Fan, Yuyu Sun, Shuyan Li

    Abstract: Precise grading of meniscal horn tears is critical in knee injury diagnosis but remains underexplored in automated MRI analysis. Existing methods often rely on coarse study-level labels or binary classification, lacking localization and severity information. In this paper, we introduce MeniMV, a multi-view benchmark dataset specifically designed for horn-specific meniscus injury grading. MeniMV co…

    Submitted 20 December, 2025; originally announced December 2025.

    Comments: 5 pages, 2 figures

    ACM Class: I.4.9; J.3

  4. arXiv:2512.18251  [pdf, ps, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    CrystalFormer-CSP: Thinking Fast and Slow for Crystal Structure Prediction

    Authors: Zhendong Cao, Shigang Ou, Lei Wang

    Abstract: Crystal structure prediction is a fundamental problem in materials science. We present CrystalFormer-CSP, an efficient framework that unifies data-driven heuristic and physics-driven optimization approaches to predict stable crystal structures for given chemical compositions. The approach combines pretrained generative models for space-group-informed structure generation and a universal machine le…

    Submitted 20 December, 2025; originally announced December 2025.

    Comments: 11 pages, 4 figures

  5. arXiv:2512.16453  [pdf, ps, other]

    cs.AI

    TimeSeries2Report prompting enables adaptive large language model management of lithium-ion batteries

    Authors: Jiayang Yang, Chunhui Zhao, Martin Guay, Zhixing Cao

    Abstract: Large language models (LLMs) offer promising capabilities for interpreting multivariate time-series data, yet their application to real-world battery energy storage system (BESS) operation and maintenance remains largely unexplored. Here, we present TimeSeries2Report (TS2R), a prompting framework that converts raw lithium-ion battery operational time-series into structured, semantically enriched r…

    Submitted 18 December, 2025; originally announced December 2025.

  6. arXiv:2512.16339  [pdf]

    cs.DL cs.DB

    Beyond openness: Inclusiveness and usability of Chinese scholarly data in OpenAlex

    Authors: Lin Zhang, Zhe Cao, Jianhua Liu, Nees Jan van Eck

    Abstract: OpenAlex, launched in 2022 as a fully open scholarly data source, promises greater inclusiveness compared to traditional proprietary databases. This study evaluates whether OpenAlex delivers on that promise by examining its coverage and metadata quality for Chinese-language journals and their articles. Using the 2023 edition of A Guide to the Core Journals of China (GCJC) and Wanfang Data as a ben…

    Submitted 18 December, 2025; originally announced December 2025.

  7. arXiv:2512.14689  [pdf, ps, other]

    cs.RO cs.LG

    CHIP: Adaptive Compliance for Humanoid Control through Hindsight Perturbation

    Authors: Sirui Chen, Zi-ang Cao, Zhengyi Luo, Fernando Castañeda, Chenran Li, Tingwu Wang, Ye Yuan, Linxi "Jim" Fan, C. Karen Liu, Yuke Zhu

    Abstract: Recent progress in humanoid robots has unlocked agile locomotion skills, including backflipping, running, and crawling. Yet it remains challenging for a humanoid robot to perform forceful manipulation tasks such as moving objects, wiping, and pushing a cart. We propose adaptive Compliance Humanoid control through hIndsight Perturbation (CHIP), a plug-and-play module that enables controllable end-eff…

    Submitted 16 December, 2025; originally announced December 2025.

    Comments: The first two authors contributed equally. Project page: https://nvlabs.github.io/CHIP/

  8. arXiv:2512.14082  [pdf, ps, other]

    cs.CL

    A Unified Sparse Attention via Multi-Granularity Compression

    Authors: Siran Liu, Zane Cao, Yongchao He

    Abstract: Efficient long-context understanding and reasoning are increasingly vital for large language model (LLM) applications such as multi-turn dialogue and program analysis. However, the core self-attention mechanism scales quadratically with sequence length, creating a fundamental computational bottleneck. Existing sparse attention methods alleviate this issue but face trade-offs: training-based method…

    Submitted 15 December, 2025; originally announced December 2025.

  9. arXiv:2512.13151  [pdf, ps, other]

    astro-ph.HE

    Multiband gravitational wave observations of eccentric escaping binary black holes from globular clusters

    Authors: Yuetong Zhao, Abbas Askar, Youjun Lu, Zhoujian Cao, Mirek Giersz, Grzegorz Wiktorowicz, Arkadiusz Hypki, Lucas Hellstrom, Sohaib Ali, Wei-Tou Ni

    Abstract: Stellar-mass binary black holes (sBBHs) formed in globular clusters (GCs) are promising sources for multiband gravitational wave (GW) observations, particularly with low- and middle-frequency detectors. These sBBHs can retain detectable eccentricities when they enter the sensitivity bands of low-frequency GW observatories. We study multiband GW observations of eccentric sBBHs that escape from GC m…

    Submitted 15 December, 2025; originally announced December 2025.

    Comments: 19 pages, 3 tables, and 10 figures, submitted to Astrophysical Journal (ApJ)

  10. arXiv:2512.13094  [pdf, ps, other]

    cs.RO cs.AI

    Sequence of Expert: Boosting Imitation Planners for Autonomous Driving through Temporal Alternation

    Authors: Xiang Li, Gang Liu, Weitao Zhou, Hongyi Zhu, Zhong Cao

    Abstract: Imitation learning (IL) has emerged as a central paradigm in autonomous driving. While IL excels in matching expert behavior in open-loop settings by minimizing per-step prediction errors, its performance degrades unexpectedly in closed-loop due to the gradual accumulation of small, often imperceptible errors over time. Over successive planning cycles, these errors compound, potentially resulting i…

    Submitted 15 December, 2025; originally announced December 2025.

  11. arXiv:2512.12682  [pdf, ps, other]

    hep-th

    $\mathcal{N} = (0, 2)$ higher-spin supergravity in AdS$_3$

    Authors: Zisong Cao

    Abstract: In this paper we generalize Vasiliev's higher-spin gravity theory in 3d to the $\mathcal{N} = (0, 2)$ case, by which we mean that the asymptotic symmetry of such a gravity theory has the structure of the 2d $\mathcal{N} = (0, 2)$ superconformal algebra. While the construction is limited to the linearized level, the asymptotic symmetry and possible matter content of such theories are discussed. Also, the 1-loop…

    Submitted 14 December, 2025; originally announced December 2025.

    Comments: 28 pages, 3 tables

  12. arXiv:2512.11286  [pdf, ps, other]

    quant-ph cs.CR

    A Survey of OAM-Encoded High-Dimensional Quantum Key Distribution: Foundations, Experiments, and Recent Trends

    Authors: Huan Zhang, Zhenyu Cao, Yu Sun, Hu Jin

    Abstract: High-dimensional quantum key distribution (HD-QKD) enhances information efficiency and noise tolerance by encoding data in large Hilbert spaces. The orbital angular momentum (OAM) of light provides a scalable basis for such encoding and supports high-dimensional photonic communication. Practical OAM-based implementations remain constrained by challenges in state generation, transmission, and detec…

    Submitted 12 December, 2025; originally announced December 2025.

    Comments: 20 pages, 5 figures, submitted to ICT Express

    MSC Class: 81P94

  13. arXiv:2512.11262  [pdf, ps, other]

    gr-qc hep-th

    Gravitational Wave Detection Based on Gravitomagnetic Effects

    Authors: Yu-Qi Dong, Zhoujian Cao, Yu-Xiao Liu

    Abstract: In this paper, we explore the feasibility of detecting gravitomagnetic effects generated by gravitational waves, by monitoring the relative orientation of the angular momentum vectors of test particles. We analyze the response of the relative angular momentum direction to all six polarization modes of gravitational waves and estimate the magnitude of its variation during gravitational wave events.…

    Submitted 11 December, 2025; originally announced December 2025.

    Comments: 15 pages, 1 figure

  14. arXiv:2512.10878  [pdf, ps, other]

    cs.LG

    Classifier Reconstruction Through Counterfactual-Aware Wasserstein Prototypes

    Authors: Xuan Zhao, Zhuo Cao, Arya Bangun, Hanno Scharr, Ira Assent

    Abstract: Counterfactual explanations provide actionable insights by identifying minimal input changes required to achieve a desired model prediction. Beyond their interpretability benefits, counterfactuals can also be leveraged for model reconstruction, where a surrogate model is trained to replicate the behavior of a target model. In this work, we demonstrate that model reconstruction can be significantly…

    Submitted 11 December, 2025; originally announced December 2025.

    Comments: Accepted by Actionable Interpretability Workshop at ICML 2025

  15. arXiv:2512.10696  [pdf, ps, other]

    cs.AI cs.CL

    Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution

    Authors: Zouying Cao, Jiaji Deng, Li Yu, Weikang Zhou, Zhaoyang Liu, Bolin Ding, Hai Zhao

    Abstract: Procedural memory enables large language model (LLM) agents to internalize "how-to" knowledge, theoretically reducing redundant trial-and-error. However, existing frameworks predominantly suffer from a "passive accumulation" paradigm, treating memory as a static append-only archive. To bridge the gap between static storage and dynamic reasoning, we propose ReMe (…

    Submitted 11 December, 2025; originally announced December 2025.

    Comments: 16 pages, 9 figures, 9 tables

  16. arXiv:2512.09537  [pdf, ps, other]

    cs.RO

    REASAN: Learning Reactive Safe Navigation for Legged Robots

    Authors: Qihao Yuan, Ziyu Cao, Ming Cao, Kailai Li

    Abstract: We present a novel modularized end-to-end framework for legged reactive navigation in complex dynamic environments using a single light detection and ranging (LiDAR) sensor. The system comprises four simulation-trained modules: three reinforcement-learning (RL) policies for locomotion, safety shielding, and navigation, and a transformer-based exteroceptive estimator that processes raw point-cloud…

    Submitted 10 December, 2025; originally announced December 2025.

    Comments: 8 pages

  17. arXiv:2512.06977  [pdf, ps, other]

    eess.IV cs.LG

    Physics-Guided Diffusion Priors for Multi-Slice Reconstruction in Scientific Imaging

    Authors: Laurentius Valdy, Richard D. Paul, Alessio Quercia, Zhuo Cao, Xuan Zhao, Hanno Scharr, Arya Bangun

    Abstract: Accurate multi-slice reconstruction from limited measurement data is crucial to speed up the acquisition process in medical and scientific imaging. However, it remains challenging due to the ill-posed nature of the problem and the high computational and memory demands. We propose a framework that addresses these challenges by integrating partitioned diffusion priors with physics-based constraints.…

    Submitted 7 December, 2025; originally announced December 2025.

    Comments: 8 pages, 5 figures, AAAI AI2ASE 2026

  18. arXiv:2512.05760  [pdf, ps, other]

    cs.AI

    Evolutionary System 2 Reasoning: An Empirical Proof

    Authors: Zeyuan Ma, Wenqi Huang, Guo-Huan Song, Hongshu Guo, Sijie Ma, Zhiguang Cao, Yue-Jiao Gong

    Abstract: Machine intelligence marks the ultimate dream of making machines' intelligence comparable to human beings. While recent progress in Large Language Models (LLMs) shows substantial specific skills for a wide array of downstream tasks, they more or less fall short in general intelligence. Following the correlation between intelligence and system 2 reasoning (slow thinking), in this paper, we aim to answe…

    Submitted 5 December, 2025; originally announced December 2025.

  19. arXiv:2512.05348  [pdf, ps, other]

    eess.SY

    Comparative Analysis of Barrier-like Function Methods for Reach-Avoid Verification in Stochastic Discrete-Time Systems

    Authors: Zhipeng Cao, Peixin Wang, Luke Ong, Đorđe Žikelić, Dominik Wagner, Bai Xue

    Abstract: In this paper, we compare several representative barrier-like conditions from the literature for infinite-horizon reach-avoid verification of stochastic discrete-time systems. Our comparison examines both their theoretical properties and computational tractability, highlighting each condition's strengths and limitations that affect applicability and conservativeness. Finally, we illustrate their p…

    Submitted 4 December, 2025; originally announced December 2025.

    Comments: 23 pages, 5 tables

  20. arXiv:2512.05115  [pdf, ps, other]

    cs.CV

    Light-X: Generative 4D Video Rendering with Camera and Illumination Control

    Authors: Tianqi Liu, Zhaoxi Chen, Zihao Huang, Shaocong Xu, Saining Zhang, Chongjie Ye, Bohan Li, Zhiguo Cao, Wei Li, Hao Zhao, Ziwei Liu

    Abstract: Recent advances in illumination control extend image-based methods to video, yet still face a trade-off between lighting fidelity and temporal consistency. Moving beyond relighting, a key step toward generative modeling of real-world scenes is the joint control of camera trajectory and illumination, since visual dynamics are inherently shaped by both geometry and lighting. To this end, we presen…

    Submitted 15 December, 2025; v1 submitted 4 December, 2025; originally announced December 2025.

    Comments: Project Page: https://lightx-ai.github.io/ , Code: https://github.com/TQTQliu/Light-X

  21. arXiv:2512.04556  [pdf, ps, other]

    cs.GR cs.CV

    Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex

    Authors: Zhizhen Wu, Zhe Cao, Yuchi Huo

    Abstract: Image convolution with complex kernels is a fundamental operation in photography, scientific imaging, and animation effects, yet direct dense convolution is computationally prohibitive on resource-limited devices. Existing approximations, such as simulated annealing or low-rank decompositions, either lack efficiency or fail to capture non-convex kernels. We introduce a differentiable kernel decomp…

    Submitted 4 December, 2025; originally announced December 2025.

    Comments: 10 pages, 7 figures

  22. arXiv:2512.04358  [pdf]

    cs.CV

    MAFNet: Multi-frequency Adaptive Fusion Network for Real-time Stereo Matching

    Authors: Ao Xu, Rujin Zhao, Xiong Xu, Boceng Huang, Yujia Jia, Hongfeng Long, Fuxuan Chen, Zilong Cao, Fangyuan Chen

    Abstract: Existing stereo matching networks typically rely on either cost-volume construction based on 3D convolutions or deformation methods based on iterative optimization. The former incurs significant computational overhead during cost aggregation, whereas the latter often lacks the ability to model non-local contextual information. These methods exhibit poor compatibility on resource-constrained mobile…

    Submitted 3 December, 2025; originally announced December 2025.

  23. arXiv:2512.01976  [pdf, ps, other]

    q-bio.BM

    Consistent Synthetic Sequences Unlock Structural Diversity in Fully Atomistic De Novo Protein Design

    Authors: Danny Reidenbach, Zhonglin Cao, Zuobai Zhang, Kieran Didi, Tomas Geffner, Guoqing Zhou, Jian Tang, Christian Dallago, Arash Vahdat, Emine Kucukbenli, Karsten Kreis

    Abstract: High-quality training datasets are crucial for the development of effective protein design models, but existing synthetic datasets often include unfavorable sequence-structure pairs, impairing generative model performance. We leverage ProteinMPNN, whose sequences are experimentally favorable as well as amenable to folding, together with structure prediction models to align high-quality synthetic s…

    Submitted 10 December, 2025; v1 submitted 1 December, 2025; originally announced December 2025.

  24. arXiv:2512.01643  [pdf, ps, other]

    cs.CV

    ViT$^3$: Unlocking Test-Time Training in Vision

    Authors: Dongchen Han, Yining Li, Tianyu Li, Zixuan Cao, Ziming Wang, Jun Song, Yu Cheng, Bo Zheng, Gao Huang

    Abstract: Test-Time Training (TTT) has recently emerged as a promising direction for efficient sequence modeling. TTT reformulates attention operation as an online learning problem, constructing a compact inner model from key-value pairs at test time. This reformulation opens a rich and flexible design space while achieving linear computational complexity. However, crafting a powerful visual TTT design rema…

    Submitted 1 December, 2025; originally announced December 2025.

  25. arXiv:2512.00719  [pdf, ps, other]

    cs.DC

    SIMPLE: Disaggregating Sampling from GPU Inference into a Decision Plane for Faster Distributed LLM Serving

    Authors: Bohan Zhao, Zane Cao, Yongchao He

    Abstract: As large language models (LLMs) scale out with tensor parallelism (TP) and pipeline parallelism (PP) and production stacks have aggressively optimized the data plane (attention/GEMM and KV cache), sampling, the decision plane that turns logits into tokens, becomes a new bottleneck. This creates a structural holdout: sampling neither expands with TP nor balances across PP stages, so its share of it…

    Submitted 29 November, 2025; originally announced December 2025.

  26. arXiv:2512.00601  [pdf, ps, other]

    cs.AI

    Clinical-R1: Empowering Large Language Models for Faithful and Comprehensive Reasoning with Clinical Objective Relative Policy Optimization

    Authors: Boyang Gu, Hongjian Zhou, Bradley Max Segal, Jinge Wu, Zeyu Cao, Hantao Zhong, Lei Clifton, Fenglin Liu, David A. Clifton

    Abstract: Recent advances in large language models (LLMs) have shown strong reasoning capabilities through large-scale pretraining and post-training reinforcement learning, demonstrated by DeepSeek-R1. However, current post-training methods, such as Grouped Relative Policy Optimization (GRPO), mainly reward correctness, which is not aligned with the multi-dimensional objectives required in high-stakes field…

    Submitted 3 December, 2025; v1 submitted 29 November, 2025; originally announced December 2025.

  27. arXiv:2512.00412  [pdf, ps, other]

    cs.CR cs.AI

    Red Teaming Large Reasoning Models

    Authors: Jiawei Chen, Yang Yang, Chao Yu, Yu Tian, Zhi Cao, Linghao Li, Hang Su, Zhaoxia Yin

    Abstract: Large Reasoning Models (LRMs) have emerged as a powerful advancement in multi-step reasoning tasks, offering enhanced transparency and logical consistency through explicit chains of thought (CoT). However, these models introduce novel safety and reliability risks, such as CoT-hijacking and prompt-induced inefficiencies, which are not fully captured by existing evaluation methods. To address this g…

    Submitted 29 November, 2025; originally announced December 2025.

    Comments: 30 pages, 9 figures

  28. arXiv:2511.22869  [pdf, ps, other]

    cs.CL

    JBE-QA: Japanese Bar Exam QA Dataset for Assessing Legal Domain Knowledge

    Authors: Zhihan Cao, Fumihito Nishino, Hiroaki Yamada, Nguyen Ha Thanh, Yusuke Miyao, Ken Satoh

    Abstract: We introduce JBE-QA, a Japanese Bar Exam Question-Answering dataset to evaluate large language models' legal knowledge. Derived from the multiple-choice (tanto-shiki) section of the Japanese bar exam (2015-2024), JBE-QA provides the first comprehensive benchmark for Japanese legal-domain evaluation of LLMs. It covers the Civil Code, the Penal Code, and the Constitution, extending beyond the Civil…

    Submitted 27 November, 2025; originally announced November 2025.

    Comments: Three tables and one figure

  29. arXiv:2511.22466  [pdf, ps, other]

    cs.CV

    RoadSceneBench: A Lightweight Benchmark for Mid-Level Road Scene Understanding

    Authors: Xiyan Liu, Han Wang, Yuhu Wang, Junjie Cai, Zhe Cao, Jianzhong Yang, Zhen Lu

    Abstract: Understanding mid-level road semantics, which capture the structural and contextual cues that link low-level perception to high-level planning, is essential for reliable autonomous driving and digital map construction. However, existing benchmarks primarily target perception tasks such as detection or segmentation, overlooking the reasoning capabilities required to infer road topology and dynamic…

    Submitted 27 November, 2025; originally announced November 2025.

  30. arXiv:2511.21328  [pdf, ps, other]

    astro-ph.EP

    Planet Migration in Protoplanetary Disks with Rims

    Authors: Zhuoya Cao, Ya-Ping Li, Douglas N. C. Lin, Shude Mao

    Abstract: Complex structures, including sharp edges, rings and gaps, have been commonly observed in protoplanetary disks with or without planetary candidates. Here we consider the possibility that they are the intrinsic consequences of angular momentum transfer mechanisms, and investigate how they may influence the dynamical evolution of embedded planets. With the aid of numerical hydrodynamic simulations,…

    Submitted 26 November, 2025; originally announced November 2025.

    Comments: 13 pages, 7 figures, submitted to ApJ

  31. arXiv:2511.21212  [pdf, ps, other]

    physics.optics

    Flexible mm-Wave Frequency and High-Speed Arbitrary IQ Signal Synthesis by a Photonic System on Chip

    Authors: Bowen Zhu, Tao Zhu, Yazhi Pi, Chunyang Ma, Xiaochuan Xu, Zizheng Cao, Lei Wang, Shaohua Yu

    Abstract: Photonics-assisted millimeter-wave bands and terahertz signal generation offer significant advantages over traditional electronic methods by leveraging the inherent benefits of optical components, including broad bandwidth, low power consumption, and minimal insertion loss. This work utilizes a silicon photonic chip in conjunction with a reconfigurable optical frequency comb to demonstrate the syn…

    Submitted 26 November, 2025; originally announced November 2025.

    Comments: 11 pages, 8 figures

  32. arXiv:2511.17190  [pdf, ps, other]

    cs.CL cs.DB

    AutoLink: Autonomous Schema Exploration and Expansion for Scalable Schema Linking in Text-to-SQL at Scale

    Authors: Ziyang Wang, Yuanlei Zheng, Zhenbiao Cao, Xiaojin Zhang, Zhongyu Wei, Pei Fu, Zhenbo Luo, Wei Chen, Xiang Bai

    Abstract: For industrial-scale text-to-SQL, supplying the entire database schema to Large Language Models (LLMs) is impractical due to context window limits and irrelevant noise. Schema linking, which filters the schema to a relevant subset, is therefore critical. However, existing methods incur prohibitive costs, struggle to trade off recall and noise, and scale poorly to large databases. We present \textb…

    Submitted 21 November, 2025; originally announced November 2025.

  33. arXiv:2511.16128  [pdf, ps, other]

    math.DG

    Rigidity of five-dimensional quasi-Einstein manifolds with constant scalar curvature

    Authors: Zhongxian Cao

    Abstract: Let $(M^5,g)$ be a five-dimensional non-trivial simply-connected compact quasi-Einstein manifold with boundary. If $M$ has constant scalar $R$, Johnatan Costa, Ernani Ribeiro Jr, and Detang Zhou show that $R = ((m-5)k+20)/(m-k+4)\lambda$ for some $k\in\{0,2,3,4\}$. Both cases of $k=0$ and $k=4$ are already classified. In this paper we will prove that the case $k=3$ is rigid.

    Submitted 20 November, 2025; originally announced November 2025.

    Comments: 23 pages. Comments are welcome

  34. arXiv:2511.15066  [pdf, ps, other]

    cs.CV

    BokehFlow: Depth-Free Controllable Bokeh Rendering via Flow Matching

    Authors: Yachuan Huang, Xianrui Luo, Qiwen Wang, Liao Shen, Jiaqi Li, Huiqiang Sun, Zihao Huang, Wei Jiang, Zhiguo Cao

    Abstract: Bokeh rendering simulates the shallow depth-of-field effect in photography, enhancing visual aesthetics and guiding viewer attention to regions of interest. Although recent approaches perform well, rendering controllable bokeh without additional depth inputs remains a significant challenge. Existing classical and neural controllable methods rely on accurate depth maps, while generative approaches…

    Submitted 18 November, 2025; originally announced November 2025.

  35. arXiv:2511.13744  [pdf, ps, other]

    cs.CV cs.AI cs.RO

    nuCarla: A nuScenes-Style Bird's-Eye View Perception Dataset for CARLA Simulation

    Authors: Zhijie Qiao, Zhong Cao, Henry X. Liu

    Abstract: End-to-end (E2E) autonomous driving heavily relies on closed-loop simulation, where perception, planning, and control are jointly trained and evaluated in interactive environments. Yet, most existing datasets are collected from the real world under non-interactive conditions, primarily supporting open-loop learning while offering limited value for closed-loop testing. Due to the lack of standardiz…

    Submitted 12 November, 2025; originally announced November 2025.

  36. arXiv:2511.13648  [pdf, ps, other]

    cs.CV cs.RO

    PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image

    Authors: Ziang Cao, Fangzhou Hong, Zhaoxi Chen, Liang Pan, Ziwei Liu

    Abstract: 3D modeling is shifting from static visual representations toward physical, articulated assets that can be directly used in simulation and interaction. However, most existing 3D generation methods overlook key physical and articulation properties, thereby limiting their utility in embodied AI. To bridge this gap, we introduce PhysX-Anything, the first simulation-ready physical 3D generative framew…

    Submitted 17 November, 2025; originally announced November 2025.

    Comments: Project page: https://physx-anything.github.io/

  37. arXiv:2511.13598  [pdf, ps, other]

    cs.CR cs.AI

    Robust Client-Server Watermarking for Split Federated Learning

    Authors: Jiaxiong Tang, Zhengchunmin Dai, Liantao Wu, Peng Sun, Honglong Chen, Zhenfu Cao

    Abstract: Split Federated Learning (SFL) is renowned for its privacy-preserving nature and low computational overhead among decentralized machine learning paradigms. In this framework, clients employ lightweight models to process private data locally and transmit intermediate outputs to a powerful server for further computation. However, SFL is a double-edged sword: while it enables edge computing and enhan…

    Submitted 17 November, 2025; originally announced November 2025.

  38. arXiv:2511.12939  [pdf, ps, other]

    cs.CV

    Semi-Supervised High Dynamic Range Image Reconstructing via Bi-Level Uncertain Area Masking

    Authors: Wei Jiang, Jiahao Cui, Yizheng Wu, Zhan Peng, Zhiyu Pan, Zhiguo Cao

    Abstract: Reconstructing high dynamic range (HDR) images from low dynamic range (LDR) bursts plays an essential role in computational photography. Impressive progress has been achieved by learning-based algorithms which require LDR-HDR image pairs. However, these pairs are hard to obtain, which motivates researchers to delve into the problem of annotation-efficient HDR image reconstructing: how to achie…

    Submitted 16 November, 2025; originally announced November 2025.

    Comments: 9 pages, 5 figures, accepted to AAAI 2026 (poster)

  39. arXiv:2511.12921  [pdf, ps, other]

    cs.CV

    Generative Photographic Control for Scene-Consistent Video Cinematic Editing

    Authors: Huiqiang Sun, Liao Shen, Zhan Peng, Kun Wang, Size Wu, Yuhang Zang, Tianqi Liu, Zihao Huang, Xingyu Zeng, Zhiguo Cao, Wei Li, Chen Change Loy

    Abstract: Cinematic storytelling is profoundly shaped by the artful manipulation of photographic elements such as depth of field and exposure. These effects are crucial in conveying mood and creating aesthetic appeal. However, controlling these effects in generative video models remains highly challenging, as most existing methods are restricted to camera motion control. In this paper, we propose CineCtrl,…

    Submitted 16 November, 2025; originally announced November 2025.

  40. arXiv:2511.12792  [pdf, ps, other]

    cs.AI

    Multi-Agent Reinforcement Learning for Heterogeneous Satellite Cluster Resources Optimization

    Authors: Mohamad A. Hady, Siyi Hu, Mahardhika Pratama, Zehong Cao, Ryszard Kowalczyk

    Abstract: This work investigates resource optimization in heterogeneous satellite clusters performing autonomous Earth Observation (EO) missions using Reinforcement Learning (RL). In the proposed setting, two optical satellites and one Synthetic Aperture Radar (SAR) satellite operate cooperatively in low Earth orbit to capture ground targets and manage their limited onboard resources efficiently. Traditiona…

    Submitted 16 November, 2025; originally announced November 2025.

  41. arXiv:2511.12481  [pdf, ps, other]

    astro-ph.IM astro-ph.GA

    Mock Observations for the CSST Mission: Multi-Channel Imager--Instrument Simulation

    Authors: Zhao-Jun Yan, Huan-Yuan Shan, Zhen-Ya Zheng, Xi-Yan Peng, Zhao-Xiang Qi, Chun Xu, Lin Lin, Xin-Rong Wen, Chun-Yan Jiang, Li-Xin Zheng, Jing Zhong, Fang-Ting Yuan, Zhen-Lei Chen, Wei Chen, Mao-Chun Wu, Zhen-Sen Fu, Ke-Xin Li, Lin Nie, Chao Liu, Nan Li, Qiao Wang, Zi-Huang Cao, Shuai Feng, Guo-Liang Li, Lei Wang, et al. (18 additional authors not shown)

    Abstract: The Chinese Space Station Survey Telescope (CSST), a two-meter aperture astronomical space telescope under China's manned space program, is equipped with multiple back-end scientific instruments. As an astronomical precision measurement module of the CSST, the Multi-Channel Imager (MCI) can cover a wide wavelength range from ultraviolet to near-infrared with three-color simultaneous high-precision…

    Submitted 16 November, 2025; originally announced November 2025.

    Comments: 28 pages, 21 figures, accepted by RAA

  42. arXiv:2511.11814  [pdf, ps, other]

    astro-ph.EP

    On the coincidence between the close passage of HD7977 and the Pliocene-Pleistocene transition

    Authors: Zhuoya Cao, Abraham Loeb, Morgan MacLeod

    Abstract: The Oort Cloud's dynamical evolution is significantly influenced by both the galactic tide and stellar flybys. This study investigates the particular case of HD7977's close encounter 2.47 Myr ago, which likely repopulated the Inner Oort Cloud and potentially triggered a significant comet shower on Earth. Our results demonstrate that the shower's intensity strongly depends on HD7977's impact parame…

    Submitted 14 November, 2025; originally announced November 2025.

    Comments: 15 pages, 4 figures, accepted by Scientific Reports

  43. arXiv:2511.10784

    cond-mat.supr-con cond-mat.mtrl-sci

    Raman fingerprint of high-temperature superconductivity in compressed hydrides

    Authors: Philip Dalladay-Simpson, Guglielmo Marchese, Zi-Yu Cao, Paolo Barone, Lara Benfatto, Gaston Garbarino, Francesco Mauri, Federico Aiace Gorelli

    Abstract: The discovery of high-temperature superconductivity in hydrogen-rich compounds under extreme pressures has prompted great excitement, intense research, but also debate over the past decade. Electrical transport has been the primary diagnostic tool for identifying superconductivity in these systems, whereas complementary probes, including magnetic, spectroscopic, tunnelling and ultrafast methods, r…

    Submitted 13 November, 2025; originally announced November 2025.

  44. arXiv:2511.10395

    cs.LG cs.AI cs.CL

    AgentEvolver: Towards Efficient Self-Evolving Agent System

    Authors: Yunpeng Zhai, Shuchang Tao, Cheng Chen, Anni Zou, Ziqian Chen, Qingxu Fu, Shinji Mai, Li Yu, Jiaji Deng, Zouying Cao, Zhaoyang Liu, Bolin Ding, Jingren Zhou

    Abstract: Autonomous agents powered by large language models (LLMs) have the potential to significantly enhance human productivity by reasoning, using tools, and executing complex tasks in diverse environments. However, current approaches to developing such agents remain costly and inefficient, as they typically require manually constructed task datasets and reinforcement learning (RL) pipelines with extens…

    Submitted 13 November, 2025; originally announced November 2025.

  45. arXiv:2511.10233

    cs.AI cs.LG cs.NE

    Bridging Synthetic and Real Routing Problems via LLM-Guided Instance Generation and Progressive Adaptation

    Authors: Jianghan Zhu, Yaoxin Wu, Zhuoyi Lin, Zhengyuan Zhang, Haiyan Yin, Zhiguang Cao, Senthilnath Jayavelu, Xiaoli Li

    Abstract: Recent advances in Neural Combinatorial Optimization (NCO) methods have significantly improved the capability of neural solvers to handle synthetic routing instances. Nonetheless, existing neural solvers typically struggle to generalize effectively from synthetic, uniformly-distributed training data to real-world VRP scenarios, including widely recognized benchmark instances from TSPLib and CVRPLi…

    Submitted 13 November, 2025; originally announced November 2025.

    Comments: 21 pages; To be published in AAAI-26

  46. arXiv:2511.09915

    cs.CL cs.MM cs.SD

    HI-TransPA: Hearing Impairments Translation Personal Assistant

    Authors: Zhiming Ma, Shiyu Gan, Junhao Zhao, Xianming Li, Qingyun Pan, Peidong Wang, Mingjun Pan, Yuhao Mo, Jiajie Cheng, Chengxin Chen, Zhonglun Cao, Chonghan Liu, Shi Cheng

    Abstract: Hearing-impaired individuals often face significant barriers in daily communication due to the inherent challenges of producing clear speech. To address this, we introduce the Omni-Model paradigm into assistive technology and present HI-TransPA, an instruction-driven audio-visual personal assistant. The model fuses indistinct speech with lip dynamics, enabling both translation and dialogue within…

    Submitted 14 November, 2025; v1 submitted 12 November, 2025; originally announced November 2025.

  47. arXiv:2511.09149

    cs.LG cs.AI cs.MA

    Enabling Agents to Communicate Entirely in Latent Space

    Authors: Zhuoyun Du, Runze Wang, Huiyu Bai, Zouying Cao, Xiaoyong Zhu, Bo Zheng, Wei Chen, Haochao Ying

    Abstract: While natural language is the de facto communication medium for LLM-based agents, it presents a fundamental constraint. The process of downsampling rich, internal latent states into discrete tokens inherently limits the depth and nuance of information that can be transmitted, thereby hindering collaborative problem-solving. Inspired by human mind-reading, we propose Interlat (Inter-agent Latent Sp…

    Submitted 12 November, 2025; originally announced November 2025.

    Comments: Work in progress

  48. arXiv:2511.07820

    cs.RO cs.AI cs.CV cs.GR eess.SY

    SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control

    Authors: Zhengyi Luo, Ye Yuan, Tingwu Wang, Chenran Li, Sirui Chen, Fernando Castañeda, Zi-Ang Cao, Jiefeng Li, David Minor, Qingwei Ben, Xingye Da, Runyu Ding, Cyrus Hogg, Lina Song, Edy Lim, Eugene Jeong, Tairan He, Haoru Xue, Wenli Xiao, Zi Wang, Simon Yuen, Jan Kautz, Yan Chang, Umar Iqbal, Linxi "Jim" Fan, et al. (1 additional author not shown)

    Abstract: Despite the rise of billion-parameter foundation models trained across thousands of GPUs, similar scaling gains have not been shown for humanoid control. Current neural controllers for humanoids remain modest in size, target a limited set of behaviors, and are trained on a handful of GPUs over several days. We show that scaling up model capacity, data, and compute yields a generalist humanoid cont…

    Submitted 4 December, 2025; v1 submitted 10 November, 2025; originally announced November 2025.

    Comments: Project page: https://nvlabs.github.io/SONIC/

  49. Multipartite steering verification with imprecise measurements

    Authors: Zeyang Lu, Chan Li, Gang Wang, Zhu Cao

    Abstract: Quantum steering is a fundamental quantum correlation that plays a pivotal role in quantum technologies, but its verification crucially relies on precise measurements -- an assumption often undermined by practical imperfections. Here, we investigate multipartite steering verification under imprecise measurements and develop a quantitative method that effectively eliminates false positives induced…

    Submitted 10 November, 2025; originally announced November 2025.

    Comments: 9 pages, 5 figures

    Journal ref: Physical Review A 112, 042435 (2025)

  50. arXiv:2511.06970

    astro-ph.IM astro-ph.CO astro-ph.GA astro-ph.SR

    Mock Observations for the CSST Mission: Main Surveys--An Overview of Framework and Simulation Suite

    Authors: Cheng-Liang Wei, Guo-Liang Li, Yue-Dong Fang, Xin Zhang, Yu Luo, Hao Tian, De-Zi Liu, Xian-Ming Meng, Zhang Ban, Xiao-Bo Li, Zun Luo, Jing-Tian Xian, Wei Wang, Xi-Yan Peng, Nan Li, Ran Li, Li Shao, Tian-Meng Zhang, Jing Tang, Yang Chen, Zhao-Xiang Qi, Zi-Huang Cao, Huan-Yuan Shan, Lin Nie, Lei Wang, et al. (4 additional authors not shown)

    Abstract: The Chinese Space Station Survey Telescope (CSST) is a flagship space-based observatory. Its main survey camera is designed to conduct high spatial resolution near-ultraviolet to near-infrared imaging and low-resolution spectroscopic surveys. To maximize the scientific output of CSST, we have developed a comprehensive, high-fidelity simulation pipeline for reproducing both imaging and spectroscopi…

    Submitted 10 November, 2025; originally announced November 2025.

    Comments: 38 pages, 22 figures, accepted for publication in RAA. The image simulation code is now publicly accessible at https://csst-tb.bao.ac.cn/code/csst-sims/csst_msc_sim