Showing 1–50 of 1,418 results for author: Fu, H

  1. arXiv:2604.12315  [pdf, ps, other]

    cs.CV cs.MM

    GTPBD-MM: A Global Terraced Parcel and Boundary Dataset with Multi-Modality

    Authors: Zhiwei Zhang, Xingyuan Zeng, Xinkai Kong, Kunquan Zhang, Haoyuan Liang, Bohan Shi, Juepeng Zheng, Jianxi Huang, Yutong Lu, Haohuan Fu

    Abstract: Agricultural parcel extraction plays an important role in remote sensing-based agricultural monitoring, supporting parcel surveying, precision management, and ecological assessment. However, existing public benchmarks mainly focus on regular and relatively flat farmland scenes. In contrast, terraced parcels in mountainous regions exhibit stepped terrain, pronounced elevation variation, irregular b…

    Submitted 14 April, 2026; originally announced April 2026.

    Comments: 15 pages, 11 figures. Submitted to ACM Multimedia 2026 Dataset Track

  2. arXiv:2604.10551  [pdf, ps, other]

    cs.CV

    NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results

    Authors: Xin Li, Jiachao Gong, Xijun Wang, Shiyao Xiong, Bingchen Li, Suhang Yao, Chao Zhou, Zhibo Chen, Radu Timofte, Yuxiang Chen, Shibo Yin, Yilian Zhong, Yushun Fang, Xilei Zhu, Yahui Wang, Chen Lu, Meisong Zheng, Xiaoxu Chen, Jing Yang, Zhaokun Hu, Jiahui Liu, Ying Chen, Haoran Bai, Sibin Deng, Shengxi Li, et al. (53 additional authors not shown)

    Abstract: This paper presents an overview of the NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models. This challenge utilizes a new short-form UGC (S-UGC) video restoration benchmark, termed KwaiVIR, which is contributed by USTC and Kuaishou Technology. It contains both synthetically distorted videos and real-world short-form UGC videos in the wild. For this edition,…

    Submitted 12 April, 2026; originally announced April 2026.

    Comments: Accepted by CVPR 2026 workshop; NTIRE 2026

  3. arXiv:2604.08884  [pdf, ps, other]

    cs.CV cs.AI

    HM-Bench: A Comprehensive Benchmark for Multimodal Large Language Models in Hyperspectral Remote Sensing

    Authors: Xinyu Zhang, Zurong Mai, Qingmei Li, Zjin Liao, Yibin Wen, Yuhang Chen, Xiaoya Fan, Chan Tsz Ho, Bi Tianyuan, Haoyuan Liang, Ruifeng Su, Zihao Qian, Juepeng Zheng, Jianxi Huang, Yutong Lu, Haohuan Fu

    Abstract: While multimodal large language models (MLLMs) have made significant strides in natural image understanding, their ability to perceive and reason over hyperspectral images (HSI), a vital modality in remote sensing, remains underexplored. The high dimensionality and intricate spectral-spatial properties of HSI pose unique challenges for models primarily trained on RGB data. To address this ga…

    Submitted 9 April, 2026; originally announced April 2026.

  4. arXiv:2604.07937  [pdf, ps, other]

    cs.CL

    HCRE: LLM-based Hierarchical Classification for Cross-Document Relation Extraction with a Prediction-then-Verification Strategy

    Authors: Guoqi Ma, Liang Zhang, Hongyao Tu, Hao Fu, Hui Li, Yujie Lin, Longyue Wang, Weihua Luo, Jinsong Su

    Abstract: Cross-document relation extraction (RE) aims to identify relations between the head and tail entities located in different documents. Existing approaches typically adopt the paradigm of "Small Language Model (SLM) + Classifier". However, the limited language understanding ability of SLMs hinders further improvement of their performance. In this paper, we conduct a preliminary study to e…

    Submitted 9 April, 2026; originally announced April 2026.

    Comments: ACL 2026 Findings

  5. Predicting Alzheimer's disease progression using rs-fMRI and a history-aware graph neural network

    Authors: Mahdi Moghaddami, Mohammad-Reza Siadat, Austin Toma, Connor Laming, Huirong Fu

    Abstract: Alzheimer's disease (AD) is a neurodegenerative disorder that affects more than seven million people in the United States alone. AD currently has no cure, but there are ways to potentially slow its progression if caught early enough. In this study, we propose a graph neural network (GNN)-based model for predicting whether a subject will transition to a more severe stage of cognitive impairment at…

    Submitted 7 April, 2026; originally announced April 2026.

    Comments: Proc. SPIE 13926, Medical Imaging 2026: Computer-Aided Diagnosis, 1392604

    Journal ref: Proceedings Volume 13926, Medical Imaging 2026: Computer-Aided Diagnosis; 1392604 (2026)

  6. arXiv:2604.05418  [pdf, ps, other]

    cs.CV cs.AI

    VideoStir: Understanding Long Videos via Spatio-Temporally Structured and Intent-Aware RAG

    Authors: Honghao Fu, Miao Xu, Yiwei Wang, Dailing Zhang, Liu Jun, Yujun Cai

    Abstract: Scaling multimodal large language models (MLLMs) to long videos is constrained by limited context windows. While retrieval-augmented generation (RAG) is a promising remedy by organizing query-relevant visual evidence into a compact context, most existing methods (i) flatten videos into independent segments, breaking their inherent spatio-temporal structure, and (ii) depend on explicit semantic mat…

    Submitted 12 April, 2026; v1 submitted 7 April, 2026; originally announced April 2026.

    Comments: Accepted by ACL 2026

  7. arXiv:2604.05295  [pdf, ps, other]

    astro-ph.SR physics.space-ph

    Statistics of blob properties in two types of coronal streamers

    Authors: Haiyi Li, Zhenghua Huang, Maria S. Madjarska, Youqian Qi, Hui Fu, Ming Xiong, Lidong Xia

    Abstract: Previous studies have shown that a streamer blob might originate in the lower corona and thus be affected by activity in that region. While the base of one streamer might differ from that of another, it can be cataloged into two distinct types: active region streamers (ARSs) that have active regions at their base, and quiet equatorial streamers (QESs) that do not have an active region underneath. T…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: A&A in press

  8. arXiv:2604.04999  [pdf, ps, other]

    cs.LG cs.AI

    PRIME: Prototype-Driven Multimodal Pretraining for Cancer Prognosis with Missing Modalities

    Authors: Kai Yu, Shuang Zhou, Yiran Song, Zaifu Zhan, Jie Peng, Kaixiong Zhou, Tianlong Chen, Feng Xie, Meng Wang, Huazhu Fu, Mingquan Lin, Rui Zhang

    Abstract: Multimodal self-supervised pretraining offers a promising route to cancer prognosis by integrating histopathology whole-slide images, gene expression, and pathology reports, yet most existing approaches require fully paired and complete inputs. In practice, clinical cohorts are fragmented and often miss one or more modalities, limiting both supervised fusion and scalable multimodal pretraining. We…

    Submitted 5 April, 2026; originally announced April 2026.

  9. arXiv:2604.01915  [pdf, ps, other]

    cs.CV

    Enhancing Medical Visual Grounding via Knowledge-guided Spatial Prompts

    Authors: Yifan Gao, Tao Zhou, Yi Zhou, Ke Zou, Yizhe Zhang, Huazhu Fu

    Abstract: Medical Visual Grounding (MVG) aims to identify diagnostically relevant phrases from free-text radiology reports and localize their corresponding regions in medical images, providing interpretable visual evidence to support clinical decision-making. Although recent Vision-Language Models (VLMs) exhibit promising multimodal reasoning ability, their grounding still lacks sufficient spatial precision,…

    Submitted 2 April, 2026; originally announced April 2026.

    Comments: 10 pages, 6 figures

  10. arXiv:2603.28353  [pdf, ps, other]

    cs.CV

    VistaGEN: Consistent Driving Video Generation with Fine-Grained Control Using Multiview Visual-Language Reasoning

    Authors: Li-Heng Chen, Ke Cheng, Yahui Liu, Lei Shi, Shi-Sheng Huang, Hongbo Fu

    Abstract: Driving video generation has achieved much progress in controllability, video resolution, and length, but fails to support fine-grained object-level controllability for diverse driving videos, while preserving the spatiotemporal consistency, especially in long video generation. In this paper, we present a new driving video generation technique, called VistaGEN, which enables fine-grained control o…

    Submitted 30 March, 2026; originally announced March 2026.

  11. arXiv:2603.28207  [pdf, ps, other]

    astro-ph.SR

    Statistics of transition-region loop brightenings and their heating implication

    Authors: Xiuhui Zuo, Zhenghua Huang, Maria S. Madjarska, Hui Fu, Hengyuan Wei, Xinzheng Shi, Lidong Xia

    Abstract: Transition-region loops are a type of critical magnetic structure in the solar atmosphere, yet their physical properties and evolutionary characteristics remain statistically poorly constrained. We aim to statistically characterize the physical properties of propagating brightening events in transition-region loops and to explore the underlying heating mechanism responsible for these brightenings.…

    Submitted 30 March, 2026; originally announced March 2026.

    Comments: 11 pages, 7 figures

  12. arXiv:2603.26058  [pdf, ps, other]

    math.RT

    Derived Weil Representation and Relative Langlands Duality

    Authors: Haoshuo Fu

    Abstract: The Weil representation is a particularly significant linear representation of the metaplectic group, used in the study of theta correspondence. In this paper, I introduce a derived category version of the Weil representation in the local field case. For the dual pair $ (\mathrm{GL}_n,\mathrm{GL}_m) $, I give a coherent description of this category, in the philosophy of relative Langlands duality.

    Submitted 27 March, 2026; originally announced March 2026.

  13. arXiv:2603.25754  [pdf, ps, other]

    cs.IT

    DUGC-VRNet: Joint VR Recognition and Channel Estimation for Spatially Non-Stationary XL-MIMO

    Authors: Jinhao Nie, Guangchi Zhang, Miao Cui, Hao Fu, Xiaoli Chu

    Abstract: In this letter, we address spatially non-stationary near-field channel estimation for extremely large-scale multiple-input multiple-output (XL-MIMO) systems with a hybrid combining architecture. One key challenge in the considered problem lies in that conventional channel estimation algorithms typically struggle to effectively identify and adapt to the partial antenna visibility caused by varying…

    Submitted 24 March, 2026; originally announced March 2026.

  14. arXiv:2603.24892  [pdf, ps, other]

    astro-ph.SR

    Propagating Kink Waves in Chromospheric Jet-like Structures and Coronal Plumelets

    Authors: Youqian Qi, Mingzhe Guo, Zhenghua Huang, Tom Van Doorsselaere, Bo Li, Lidong Xia, Hengyuan Wei, Hui Fu

    Abstract: Coronal plumes and chromospheric jet-like structures are believed to be highly dynamic. We report the first direct observations of a propagating kink wave in a chromospheric jet-like structure and its associated plumelet structure in the upper corona of the solar polar region, using data from the High Resolution Imager (HRI) of the Extreme Ultraviolet Imager (EUI) on board Solar Orbiter (SO). The…

    Submitted 25 March, 2026; originally announced March 2026.

    Comments: 22 pages, 9 figures, accepted for publication in ApJ

  15. arXiv:2603.24513  [pdf, ps, other]

    cond-mat.mtrl-sci

    Multiple Topological States in LaAgAs2, a Failed Square-Net Semimetal

    Authors: Yang Liu, Tongrui Li, Xixi Yuan, Nour Maraytta, Alexei V. Fedorov, Asish K. Kundu, Turgut Yilmaz, Elio Vescovo, Xueliang Wu, Long Zhang, Mingquan He, Yisheng Chai, Xiaoyuan Zhou, Michael Merz, Zhe Sun, Huixia Fu, Tonica Valla, Aifeng Wang

    Abstract: The rational design of new materials emerges as an important direction to explore new topological materials, which is based on the understanding of the correlation between crystal and electronic structures. In this paper, we perform a comprehensive study on the crystal and electronic structures in LaAgAs2 through a combination of single-crystal x-ray diffraction (XRD), quantum oscillation, and ang…

    Submitted 25 March, 2026; originally announced March 2026.

    Comments: 33 pages, 7 figures. Accepted by npj Quantum Materials

  16. arXiv:2603.23337  [pdf, ps, other]

    hep-ph astro-ph.CO hep-ex physics.atom-ph quant-ph

    Dark Matter Detection through Rydberg Atom Transducer

    Authors: J. F. Chen, Haokun Fu, Christina Gao, Jing Shu, Geng-Bo Wu, Peiran Yin, Yi-Ming Zhong, Ying Zuo

    Abstract: Ultralight bosonic dark matter with masses in the meV range, corresponding to terahertz (THz) Compton frequencies, remains largely unexplored due to the difficulty of achieving both efficient signal conversion and single-photon-sensitive detection at THz frequencies. We propose a hybrid detection architecture that integrates a dielectric haloscope, Rydberg-atom transducer, and superconducting nano…

    Submitted 24 March, 2026; originally announced March 2026.

    Comments: 12 pages, 5 figures, 3 tables

  17. arXiv:2603.20231  [pdf, ps, other]

    cs.CY cs.AI cs.CL

    Moral Mazes in the Era of LLMs

    Authors: Dang Nguyen, Harvey Yiyun Fu, Peter West, Ari Holtzman, Chenhao Tan

    Abstract: Navigating complex social situations is an integral part of corporate life, ranging from giving critical feedback without hurting morale to rejecting requests without alienating teammates. Although large language models (LLMs) are permeating the workplace, it is unclear how well they can navigate these norms. To investigate this question, we created HR Simulator, a game where users roleplay as an…

    Submitted 6 April, 2026; v1 submitted 6 March, 2026; originally announced March 2026.

    Comments: 47 pages (including appendix), 7 figures, 2 tables in the main body. v2: updated title and abstract

  18. arXiv:2603.19731  [pdf, ps, other]

    cs.CV

    PerformRecast: Expression and Head Pose Disentanglement for Portrait Video Editing

    Authors: Jiadong Liang, Bojun Xiong, Jie Tian, Hua Li, Xiao Long, Yong Zheng, Huan Fu

    Abstract: This paper primarily investigates the task of expression-only portrait video performance editing based on a driving video, which plays a crucial role in animation and film industries. Most existing research mainly focuses on portrait animation, which aims to animate a static portrait image according to the facial motion from the driving video. As a consequence, it remains challenging for them to d…

    Submitted 20 March, 2026; originally announced March 2026.

    Comments: Accepted to CVPR 2026. Project Page: https://youku-aigc.github.io/PerformRecast

  19. arXiv:2603.17705  [pdf, ps, other]

    cs.CV

    Parameter-Efficient Modality-Balanced Symmetric Fusion for Multimodal Remote Sensing Semantic Segmentation

    Authors: Haocheng Li, Juepeng Zheng, Shuangxi Miao, Ruibo Lu, Guosheng Cai, Haohuan Fu, Jianxi Huang

    Abstract: Multimodal remote sensing semantic segmentation enhances scene interpretation by exploiting complementary physical cues from heterogeneous data. Although pretrained Vision Foundation Models (VFMs) provide strong general-purpose representations, adapting them to multimodal tasks often incurs substantial computational overhead and is prone to modality imbalance, where the contribution of auxiliary m…

    Submitted 18 March, 2026; originally announced March 2026.

    Comments: 14 pages, 6 figures

  20. arXiv:2603.16038  [pdf, ps, other]

    hep-ph

    Probing the equivalence of chiral LCSRs in $D \to \pi e \nu_e$ decays and extraction of $|V_{cd}|$

    Authors: Xiu-Fen Wang, Hai-Jiang Tian, Yin-Long Yang, Long Zeng, Hai-Bing Fu

    Abstract: In the paper, we have carried out research on the $D\to\pi$ decay process. We employ two different currents to study the $D\to\pi$ transition form factors (TFFs) by using the light-cone sum rule within the framework of chiral current approach. Firstly, we follow the right-handed and left-handed currents for the correlators to present the expression of the vector form factors up to next-leading-order an…

    Submitted 16 March, 2026; originally announced March 2026.

    Comments: 11 pages, 6 figures, accepted for publication in Nucl.Phys.A

  21. arXiv:2603.15260  [pdf, ps, other]

    cs.AI cs.CV

    AGCD: Agent-Guided Cross-Modal Decoding for Weather Forecasting

    Authors: Jing Wu, Yang Liu, Lin Zhang, Junbo Zeng, Jiabin Wang, Zi Ye, Guowen Li, Shilei Cao, Jiashun Cheng, Fang Wang, Meng Jin, Yerong Feng, Hong Cheng, Yutong Lu, Haohuan Fu, Juepeng Zheng

    Abstract: Accurate weather forecasting is more than grid-wise regression: it must preserve coherent synoptic structures and physical consistency of meteorological fields, especially under autoregressive rollouts where small one-step errors can amplify into structural bias. Existing physics-priors approaches typically impose global, once-for-all constraints via architectures, regularization, or NWP coupling,…

    Submitted 16 March, 2026; originally announced March 2026.

  22. arXiv:2603.14342  [pdf, ps, other]

    cs.CV cs.AI

    AgroNVILA: Perception-Reasoning Decoupling for Multi-view Agricultural Multimodal Large Language Models

    Authors: Jiarui Zhang, Junqi Hu, Zurong Mai, Yuhang Chen, Shuohong Lou, Henglian Huang, Lingyuan Zhao, Jianxi Huang, Yutong Lu, Haohuan Fu, Juepeng Zheng

    Abstract: Agricultural multimodal reasoning requires robust spatial understanding across varying scales, from ground-level close-ups to top-down UAV and satellite imagery. Existing Multi-modal Large Language Models (MLLMs) suffer from a significant "terrestrial-centric" bias, causing scale confusion and logic drift during complex agricultural planning. To address this, we introduce the first large-scale Agr…

    Submitted 15 March, 2026; originally announced March 2026.

  23. arXiv:2603.09454  [pdf, ps, other]

    cs.CR

    ShapeMark: Robust and Diversity-Preserving Watermarking for Diffusion Models

    Authors: Yuqi Qian, Yun Cao, Haocheng Fu, Meiyang Lv, Meineng Zhu

    Abstract: Diffusion models have made substantial advances in recent years, enabling high-quality image synthesis; however, the widespread dissemination and reuse of their outputs have introduced new challenges in intellectual property protection and content provenance. Image watermarking offers a solution to these challenges, and recent work has increasingly explored Noise-as-Watermark (NaW) approaches that…

    Submitted 10 March, 2026; originally announced March 2026.

  24. arXiv:2603.07131  [pdf, ps, other]

    cs.CV cs.AI

    Deep Expert Injection for Anchoring Retinal VLMs with Domain-Specific Knowledge

    Authors: Shuai Lu, Meng Wang, Jia Guo, Jiawei Du, Bo Liu, Shengzhu Yang, Weihang Zhang, Huazhu Fu, Huiqi Li

    Abstract: Large Vision Language Models (LVLMs) show immense potential for automated ophthalmic diagnosis. However, their clinical deployment is severely hindered by lacking domain-specific knowledge. In this work, we identify two structural deficiencies hindering reliable medical reasoning: 1) the Perception Gap, where general-purpose visual encoders fail to resolve fine-grained pathological cues (e.g., mic…

    Submitted 19 March, 2026; v1 submitted 7 March, 2026; originally announced March 2026.

  25. arXiv:2603.07127  [pdf, ps, other]

    cs.IT

    Enhancing User Fairness in Two-Layer RSMA: A Movable Antenna Approach

    Authors: Ji Luo, Yaxuan Chen, Guangchi Zhang, Miao Cui, Hao Fu, Changsheng You

    Abstract: Enhancing user fairness in advanced multi-user systems like two-layer rate-splitting multiple access (RSMA) is a critical yet challenging task. This letter proposes a novel movable antenna (MA) approach to address this challenge. We formulate a max-min fairness problem, maximizing the minimum user rate, a key metric for fairness, through the joint optimization of the beamforming matrices, user clu…

    Submitted 7 March, 2026; originally announced March 2026.

  26. arXiv:2603.05911  [pdf, ps, other]

    cs.CV cs.AI

    CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning

    Authors: Yuxin Xie, Yuming Chen, Yishan Yang, Yi Zhou, Tao Zhou, Zhen Zhao, Jiacheng Liu, Huazhu Fu

    Abstract: Medical image segmentation is undergoing a paradigm shift from conventional visual pattern matching to cognitive reasoning analysis. Although Multimodal Large Language Models (MLLMs) have shown promise in integrating linguistic and visual knowledge, significant gaps remain: existing general MLLMs possess broad common sense but lack the specialized visual reasoning required for complex lesions, whe…

    Submitted 5 March, 2026; originally announced March 2026.

    Comments: Under Review with Computational Visual Media

  27. arXiv:2603.05866  [pdf]

    cond-mat.mtrl-sci

    Electrically tunable circular photocurrent via local-field induced symmetry breaking at a metal-MoTe2 interface

    Authors: Butian Zhang, Kexin Wang, Jun-Tao Ma, Yiya Guo, Chengyu Yan, Xin Yi, Luojun Du, Youwei Zhang, Hua-Hua Fu, Shun Wang

    Abstract: Transition metal dichalcogenides (TMDCs) constitute a promising platform for symmetry-engineered responses to circularly polarized light. The high crystal symmetry of centrosymmetric 2H-phase TMDCs inherently forbids the circular photogalvanic effect, thereby necessitating external stimuli such as electric fields or strain to lower the symmetry for its activation. While Schottky junctions provide…

    Submitted 5 March, 2026; originally announced March 2026.

  28. arXiv:2603.04896  [pdf, ps, other]

    cs.AI

    Authorize-on-Demand: Dynamic Authorization with Legality-Aware Intellectual Property Protection for VLMs

    Authors: Lianyu Wang, Meng Wang, Huazhu Fu, Daoqiang Zhang

    Abstract: The rapid adoption of vision-language models (VLMs) has heightened the demand for robust intellectual property (IP) protection of these high-value pretrained models. Effective IP protection should proactively confine model deployment within authorized domains and prevent unauthorized transfers. However, existing methods rely on static training-time definitions, limiting flexibility in dynamic envi…

    Submitted 5 March, 2026; originally announced March 2026.

  29. arXiv:2603.04639  [pdf, ps, other]

    cs.RO cs.AI

    RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

    Authors: Yinpei Dai, Hongze Fu, Jayjun Lee, Yuejiang Liu, Haoran Zhang, Jianing Yang, Chelsea Finn, Nima Fazeli, Joyce Chai

    Abstract: Memory is critical for long-horizon and history-dependent robotic manipulation. Such tasks often involve counting repeated actions or manipulating objects that become temporarily occluded. Recent vision-language-action (VLA) models have begun to incorporate memory mechanisms; however, their evaluations remain confined to narrow, non-standardized settings. This limits their systematic understanding…

    Submitted 4 March, 2026; originally announced March 2026.

  30. arXiv:2603.02587  [pdf, ps, other]

    hep-ph

    Nature of $K^*(1680)$ and $q\bar{q}$-hybrid mixing as the SU(3) partner of $\eta_{1}(1855)$ in the strange sector

    Authors: Samee Ullah, Ye Cao, Ming-Xiao Duan, Hai-Bing Fu, Qiang Zhao

    Abstract: We present an investigation of the $K^*(1680)$ state in its strong decays into two-body final states within the flux-tube model and quark pair creation model. Since the charge conjugation parity is not conserved in the strange sector, the conventional $q\bar{q}$ states of $J^{P(C)}=1^{-(-)}$ can mix with the lowest hybrid states with $J^{P(C)}=1^{-(+)}$. Our analysis of the $K^*(1680)$ two-body…

    Submitted 2 March, 2026; originally announced March 2026.

    Comments: 12 pages, 3 eps figures

  31. arXiv:2603.02138  [pdf, ps, other]

    cs.CV

    OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens

    Authors: Yiying Yang, Wei Cheng, Sijin Chen, Honghao Fu, Xianfang Zeng, Yujun Cai, Gang Yu, Xingjun Ma

    Abstract: OmniLottie is a versatile framework that generates high-quality vector animations from multi-modal instructions. For flexible motion and visual content control, we focus on Lottie, a lightweight JSON format for representing both shapes and animation behaviors. However, the raw Lottie JSON files contain extensive invariant structural metadata and formatting tokens, posing significant challen…

    Submitted 2 March, 2026; originally announced March 2026.

    Comments: Accepted by CVPR 2026. Project Page: https://openvglab.github.io/OmniLottie/

  32. arXiv:2603.00484  [pdf, ps, other]

    cs.CG

    MergeDJD: A Fast Constructive Algorithm with Piece Merging for the Two-Dimensional Irregular Bin Packing Problem

    Authors: Yi Zhou, Haocheng Fu, Yiping Liu, Jian Mao, Zhang-Hua Fu, Yuyi Wang

    Abstract: The two-dimensional irregular bin packing problem (2DIBPP) aims to pack a given set of irregular polygons, referred to as pieces, into fixed-size rectangular bins without overlap, while maximizing bin utilization. Although numerous metaheuristic algorithms have been proposed for the 2DIBPP, many industrial applications favor simpler constructive heuristics due to their deterministic behavior and l…

    Submitted 28 February, 2026; originally announced March 2026.

  33. arXiv:2602.23216  [pdf, ps, other]

    cs.PL cs.LO cs.SE

    Array-Carrying Symbolic Execution for Function Contract Generation

    Authors: Weijie Lu, Jingyu Ke, Hongfei Fu, Zhouyue Sun, Yi Zhou, Guoqiang Li, Haokun Li

    Abstract: Function contract generation is a classical problem in program analysis that targets the automated analysis of functions in a program with multiple procedures. The problem is fundamental in inter-procedural analysis where properties of functions are first obtained via the generation of function contracts and then the generated contracts are used as building blocks to analyze the whole program. Typ…

    Submitted 27 February, 2026; v1 submitted 26 February, 2026; originally announced February 2026.

    Comments: 30 pages, 2 figures. To appear in the 27th International Symposium on Formal Methods (FM 2026)

    MSC Class: 68Q60; 68N30 ACM Class: D.2.4; F.3.1

  34. arXiv:2602.22029  [pdf, ps, other]

    cs.SD eess.AS

    MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline

    Authors: Fang-Duo Tsai, Yi-An Lai, Fei-Yueh Chen, Hsueh-Wei Fu, Li Chai, Wei-Jaw Lee, Hao-Chung Cheng, Yi-Hsuan Yang

    Abstract: Song generation aims to produce full songs with vocals and accompaniment from lyrics and text descriptions, yet end-to-end models remain data- and compute-intensive and provide limited editability. We advocate a compositional alternative that decomposes the task into melody composition, singing voice synthesis, and singing accompaniment generation. Central to our approach is MIDI-informed singing…

    Submitted 24 February, 2026; originally announced February 2026.

  35. arXiv:2602.21589  [pdf, ps, other]

    cs.CV

    SEF-MAP: Subspace-Decomposed Expert Fusion for Robust Multimodal HD Map Prediction

    Authors: Haoxiang Fu, Lingfeng Zhang, Hao Li, Ruibing Hu, Zhengrong Li, Guanjing Liu, Zimu Tan, Long Chen, Hangjun Ye, Xiaoshuai Hao

    Abstract: High-definition (HD) maps are essential for autonomous driving, yet multi-modal fusion often suffers from inconsistency between camera and LiDAR modalities, leading to performance degradation under low-light conditions, occlusions, or sparse point clouds. To address this, we propose SEFMAP, a Subspace-Expert Fusion framework for robust multimodal HD map prediction. The key idea is to explicitly di…

    Submitted 25 February, 2026; originally announced February 2026.

  36. Probing $D_s^+ \to \eta^{(\prime)} \ell^+ \nu_\ell$ semileptonic decay within LCSR under chiral heavy quark effective field theory

    Authors: Ruiyu Zhou, Hai-Bing Fu, Yi Zhang, Wei Cheng

    Abstract: Motivated by the successful application of Heavy Quark Effective Field Theory in describing decays from heavy to light mesons, this work explores its applicability to the semileptonic decays of charmed mesons. So in this paper we investigate the $D_s^+\to \eta^{(\prime)} \ell^+ \nu_\ell$ transition form factors using the light-cone sum rules approach within the framework of heavy-quark effective field…

    Submitted 23 February, 2026; originally announced February 2026.

    Comments: 8 pages, 5 figures. Accepted by Physics Letters B

    Journal ref: Phys. Lett. B 875 (2026) 140297

  37. arXiv:2602.20284  [pdf, ps, other]

    cs.SE

    PhantomRun: Auto Repair of Compilation Errors in Embedded Open Source Software

    Authors: Han Fu, Andreas Ermedahl, Sigrid Eldh, Kristian Wiklund, Philipp Haller, Cyrille Artho

    Abstract: Continuous Integration (CI) pipelines for embedded software sometimes fail during compilation, consuming significant developer time for debugging. We study four major open-source embedded system projects, spanning over 4000 build failures from the projects' CI runs. We find that hardware dependencies account for the majority of compilation failures, followed by syntax errors and build-script issue…

    Submitted 23 February, 2026; originally announced February 2026.

    Comments: 13 pages, 5 figures, Mining Software Repositories 2026 (MSR 2026), Rio de Janeiro, Brazil, 13-14 April 2026

  38. arXiv:2602.20171  [pdf, ps, other]

    quant-ph cs.SE

    QSolver: A Quantum Constraint Solver

    Authors: Shangzhou Xia, Haitao Fu, Jianjun Zhao

    Abstract: With the growing interest in quantum programs, ensuring their correctness is a fundamental challenge. Although constraint-solving techniques can overcome some limitations of traditional testing and verification, they have not yet been sufficiently explored in the context of quantum programs. To address this gap, we present QSolver, the first quantum constraint solver. QSolver provides a structured…

    Submitted 10 February, 2026; originally announced February 2026.

  39. arXiv:2602.18094  [pdf, ps, other]

    cs.CV cs.AI cs.DB

    OODBench: Out-of-Distribution Benchmark for Large Vision-Language Models

    Authors: Ling Lin, Yang Bai, Heng Su, Congcong Zhu, Yaoxing Wang, Yang Zhou, Huazhu Fu, Jingrun Chen

    Abstract: Existing Visual-Language Models (VLMs) have achieved significant progress by being trained on massive-scale datasets, typically under the assumption that data are independent and identically distributed (IID). However, in real-world scenarios, it is often impractical to expect that all data processed by an AI system satisfy this assumption. Furthermore, failure to appropriately handle out-of-distr…

    Submitted 20 February, 2026; originally announced February 2026.

    Comments: 54 pages, 21 figures

  40. arXiv:2602.13778  [pdf, ps, other]

    cs.CV

    Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation

    Authors: Jidong Jia, Youjian Zhang, Huan Fu, Dacheng Tao

    Abstract: Despite advances in dance generation, most methods are trained in the skeletal domain and ignore mesh-level physical constraints. As a result, motions that look plausible as joint trajectories often exhibit body self-penetration and Foot-Ground Contact (FGC) anomalies when visualized with a human body mesh, reducing the aesthetic appeal of generated dances and limiting their real-world application…

    Submitted 14 February, 2026; originally announced February 2026.

  41. arXiv:2602.12972  [pdf, ps, other]

    cs.SI cs.LG

    Jointly Optimizing Debiased CTR and Uplift for Coupons Marketing: A Unified Causal Framework

    Authors: Siyun Yang, Shixiao Yang, Jian Wang, Di Fan, Kehe Cai, Haoyan Fu, Jiaming Zhang, Wenjin Wu, Peng Jiang

    Abstract: In online advertising, marketing interventions such as coupons introduce significant confounding bias into Click-Through Rate (CTR) prediction. Observed clicks reflect a mixture of users' intrinsic preferences and the uplift induced by these interventions. This causes conventional models to miscalibrate base CTRs, which distorts downstream ranking and billing decisions. Furthermore, marketing inte…

    Submitted 13 February, 2026; originally announced February 2026.

  42. arXiv:2602.11146  [pdf, ps, other]

    cs.CV cs.AI

    Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling

    Authors: Gongye Liu, Bo Yang, Yida Zhi, Zhizhou Zhong, Lei Ke, Didan Deng, Han Gao, Yongxiang Huang, Kaihao Zhang, Hongbo Fu, Wenhan Luo

    Abstract: Preference optimization for diffusion and flow-matching models relies on reward functions that are both discriminatively robust and computationally efficient. Vision-Language Models (VLMs) have emerged as the primary reward provider, leveraging their rich multimodal priors to guide alignment. However, their computation and memory cost can be substantial, and optimizing a latent diffusion generator…

    Submitted 11 February, 2026; originally announced February 2026.

    Comments: Code: https://github.com/HKUST-C4G/diffusion-rm

  43. arXiv:2602.08586  [pdf, ps, other]

    cs.AI

    PRISM: A Principled Framework for Multi-Agent Reasoning via Gain Decomposition

    Authors: Yiming Yang, Zhuoyuan Li, Fanxiang Zeng, Hao Fu, Yue Liu

    Abstract: Multi-agent collaboration has emerged as a promising paradigm for enhancing reasoning capabilities of Large Language Models (LLMs). However, existing approaches remain largely heuristic, lacking principled guidance on what drives performance gains and how to systematically optimize multi-agent reasoning. Specifically, it remains unclear why multi-agent collaboration outperforms single-agent reason…

    Submitted 10 February, 2026; v1 submitted 9 February, 2026; originally announced February 2026.

  44. arXiv:2602.07313  [pdf, ps, other]

    math.DG

    Manifolds with harmonic Weyl curvature and curvature operator of the second kind

    Authors: Haiping Fu, Yao Lu

    Abstract: We prove that a compact Riemannian manifold of dimension $n\ge 8$ with harmonic Weyl curvature and $\frac{3(n-1)(n+2)}{4(3n-1)}$-nonnegative curvature operator of the second kind is either globally conformally equivalent to a space of positive constant curvature or is isometric to a flat manifold. In particular, we also give a classification of four-dimensional manifolds with harmonic Weyl curvatu…

    Submitted 6 February, 2026; originally announced February 2026.

  45. arXiv:2602.00531  [pdf, ps, other]

    cs.CV

    Enhancing Open-Vocabulary Object Detection through Multi-Level Fine-Grained Visual-Language Alignment

    Authors: Tianyi Zhang, Antoine Simoulin, Kai Li, Sana Lakdawala, Shiqing Yu, Arpit Mittal, Hongyu Fu, Yu Lin

    Abstract: Traditional object detection systems are typically constrained to predefined categories, limiting their applicability in dynamic environments. In contrast, open-vocabulary object detection (OVD) enables the identification of objects from novel classes not present in the training set. Recent advances in visual-language modeling have led to significant progress in OVD. However, prior works face chal…

    Submitted 31 January, 2026; originally announced February 2026.

  46. arXiv:2601.23250  [pdf, ps, other]

    astro-ph.GA

    Too many or too massive? Investigating the high-$z$ demography of active SMBHs from JWST

    Authors: Daniel Roberts, Francesco Shankar, Vieri Cammelli, Fabio Fontanot, Alessandro Trinca, Laura Bisigello, Elena Dalla Bonta, Hao Fu, Roberto Gilli, Andrea Grazian, Luca Graziani, Andrea Lapi, Nicola Menci, Jan Scholtz, Karthik Mahesh Varadarajan

    Abstract: Recent JWST observations have unveiled a numerous population of low-luminosity active galactic nuclei (AGN) at $4 < z < 10$, with space densities roughly an order of magnitude above pre-JWST estimates, and many of these AGN have masses orders of magnitude above the local black hole mass-stellar mass ($M_{\rm BH}-M_{*}$) scaling relations. We investigate the consistency of these observations within a…

    Submitted 30 January, 2026; originally announced January 2026.

    Comments: 21 pages, 14 figures, accepted for publication in MNRAS

  47. arXiv:2601.22063  [pdf, ps, other]

    astro-ph.GA

    Mapping the Extended Lyman-Alpha Emission within the Circumgalactic Medium of Quasars Hosted by Dusty Starbursts with CubeCarve

    Authors: Kevin Hall, Hai Fu

    Abstract: We present a study of extended Ly$\alpha$ emission around four quasars hosted by dusty starbursts, which are composite systems thought to represent a transitional stage in quasar evolution. To extract faint CGM emission in the presence of bright point sources, we introduce CubeCarve, a dual-channel deconvolution algorithm that separates unresolved quasar emission from spatially extended structure…

    Submitted 29 January, 2026; originally announced January 2026.

    Comments: Submitted to ApJ, comments welcome. For CubeCarve source code, see https://github.com/kevhall23/CubeCarve

  48. arXiv:2601.20622  [pdf, ps, other]

    cs.HC

    SketchDynamics: Exploring Free-Form Sketches for Dynamic Intent Expression in Animation Generation

    Authors: Boyu Li, Lin-Ping Yuan, Zeyu Wang, Hongbo Fu

    Abstract: Sketching provides an intuitive way to convey dynamic intent in animation authoring (i.e., how elements change over time and space), making it a natural medium for automatic content creation. Yet existing approaches often constrain sketches to fixed command tokens or predefined visual forms, overlooking their freeform nature and the central role of humans in shaping intention. To address this, we…

    Submitted 28 January, 2026; originally announced January 2026.

    Comments: conditionally accepted by CHI'26

  49. arXiv:2601.20304  [pdf, ps, other]

    cs.CV cs.AI

    Structure-constrained Language-informed Diffusion Model for Unpaired Low-dose Computed Tomography Angiography Reconstruction

    Authors: Genyuan Zhang, Zihao Wang, Zhifan Gao, Lei Xu, Zhen Zhou, Haijun Yu, Jianjia Zhang, Xiujian Liu, Weiwei Zhang, Shaoyu Wang, Huazhu Fu, Fenglin Liu, Weiwen Wu

    Abstract: The application of iodinated contrast media (ICM) improves the sensitivity and specificity of computed tomography (CT) for a wide range of clinical indications. However, overdose of ICM can cause problems such as kidney damage and life-threatening allergic reactions. Deep learning methods can generate CT images of normal-dose ICM from low-dose ICM, reducing the required dose while maintaining diag…

    Submitted 28 January, 2026; originally announced January 2026.

  50. arXiv:2601.17635  [pdf, ps, other]

    physics.flu-dyn physics.ao-ph

    Gravity Wave Interactions in the Stratocumulus-Topped Boundary Layer

    Authors: Arun Balakrishna, Hao Fu, Parviz Moin, Morgan O'Neill

    Abstract: This work studies the breakup propensity of the stratocumulus-topped boundary layer (STBL) interacting with gravity waves using large-eddy simulation with a uniform vertical grid of $5$ m and horizontal spacing of $30$ m. A radiative-convective equilibrium (RCE) state is constructed to enforce stationarity in the STBL, and the gravity waves are introduced via a vertical momentum forcing mimicking…

    Submitted 24 January, 2026; originally announced January 2026.