arXiv:2604.11755

Ringing of rapidly rotating black holes in effective field theory

Authors: Tom van der Steen, Simon Maenaut, Stef J. B. Husken, Pedro G. S. Fernandes, Maxim D. Jockwer, Vitor Cardoso, Thomas Hertog, Tjonnie G. F. Li

Abstract: Within the effective field theory approach to gravity, deviations from general relativity can be systematically described by higher-curvature operators. However, computing the resulting corrections to black hole quasinormal mode spectra remains challenging in the rapidly rotating regime, where perturbative expansions in the spin break down. We use recently constructed numerical rotating black hole solutions to compute quasinormal mode frequency corrections at leading order in the effective field theory. Focusing on scalar perturbations, we evaluate cubic-curvature corrections, which constitute the leading modifications. We employ a pseudo-spectral collocation method to solve the resulting perturbation equations on these backgrounds, enabling accurate computation across a broad parameter range. We obtain frequency corrections for fundamental modes with $l\le5$ for all $m$, and the first overtone of $2 \le l \le 5$ modes for all $m$ for spins up to $a=0.99M$, with relative errors below $10^{-4}$. We observe that corrections to certain modes grow significantly as the spin approaches the near-extremal regime.

Submitted 13 April, 2026; originally announced April 2026.

Comments: 13 pages, 3 figures

arXiv:2604.11572

DA-PTQ: Drift-Aware Post-Training Quantization for Efficient Vision-Language-Action Models

Authors: Siyuan Xu, Tianshi Wang, Fengling Li, Lei Zhu, Heng Tao Shen

Abstract: Vision-Language-Action models (VLAs) have demonstrated strong potential for embodied AI, yet their deployment on resource-limited robots remains challenging due to high memory and computational demands. While Post-Training Quantization (PTQ) provides an efficient solution, directly applying PTQ to VLAs often results in severe performance degradation during sequential control. We identify temporal error accumulation as a key factor, where quantization perturbations at the vision-language-to-action interface are progressively amplified, leading to kinematic drift in executed trajectories. To address this issue, we propose Drift-Aware Post-Training Quantization (DA-PTQ), which formulates quantization as a drift-aware optimization problem over sequential decision processes. DA-PTQ consists of two components: (1) Cross-Space Representation Compensation, which mitigates structured distortions between multimodal representations and action space to improve action consistency, and (2) Motion-Driven Mixed-Precision Allocation, which assigns bit-widths by minimizing trajectory-level motion errors. Extensive experiments show that DA-PTQ significantly reduces kinematic drift and achieves comparable performance to full-precision models under low-bit settings, enabling practical deployment of VLAs on resource-limited robotic platforms.

Submitted 13 April, 2026; originally announced April 2026.

Comments: 13 pages, 6 figures

arXiv:2604.11538

ResearchCube: Multi-Dimensional Trade-off Exploration for Research Ideation

Authors: Zijian Ding, Fenghai Li, Ziyi Wang, Joel Chan

Abstract: Research ideation requires navigating trade-offs across multiple evaluative dimensions, yet most AI-assisted ideation tools leave this multi-dimensional reasoning unsupported, or reduce evaluation to unipolar scales where "more is better". We present ResearchCube, a system that reframes evaluation dimensions as bipolar trade-off spectra (e.g., theory-driven vs. data-driven) and renders research ideas as manipulable points in a user-constructed 3D evaluation space. Given a research intent, the system proposes candidate bipolar dimension pairs; users select up to three to define the axes of a personalized evaluation cube. Four spatial interactions -- AI-scaffolded dimension generation, 3D navigation with face snapping, drag-based idea steering, and drag-based synthesis -- enable researchers to explore and refine ideas through direct manipulation rather than text prompts. A qualitative study with 11 researchers revealed that (1) bipolar dimensions served as cognitive scaffolds that externalized evaluative thinking and offloaded working memory, (2) the spatial representation provided a sense of agency absent in chatbot-based AI tools, (3) participants desired fluid transitions across dimensionality levels -- from single-dimension focus to more than three dimensions, and (4) a productive tension emerged between AI-suggested starting dimensions and users' evolving desire for control. We distill these findings into design implications for multi-dimensional research ideation tools, including progressive dimensional control, fluid dimensionality, and transparent synthesis with provenance.

Submitted 13 April, 2026; originally announced April 2026.

arXiv:2604.11487

NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild

Authors: Aleksandr Gushchin, Khaled Abud, Ekaterina Shumitskaya, Artem Filippov, Georgii Bychkov, Sergey Lavrushkin, Mikhail Erofeev, Anastasia Antsiferova, Changsheng Chen, Shunquan Tan, Radu Timofte, Dmitry Vatolin, Chuanbiao Song, Zijian Yu, Hao Tan, Jun Lan, Zhiqiang Yang, Yongwei Tang, Zhiqiang Wu, Jia Wen Seow, Hong Vin Koay, Haodong Ren, Feng Xu, Shuai Chen, Ruiyang Xia, et al. (29 additional authors not shown)

Abstract: This paper presents an overview of the NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild, held in conjunction with the NTIRE workshop at CVPR 2026. The goal of this challenge was to develop detection models capable of distinguishing real images from generated ones in realistic scenarios: the images are often transformed (cropped, resized, compressed, blurred) for practical usage, and therefore, the detection models should be robust to such transformations. The challenge is based on a novel dataset consisting of 108,750 real and 185,750 AI-generated images from 42 generators comprising a large variety of open-source and closed-source models of various architectures, augmented with 36 image transformations. Methods were evaluated using ROC AUC on the full test set, including both transformed and untransformed images. A total of 511 participants registered, with 20 teams submitting valid final solutions. This report provides a comprehensive overview of the challenge, describes the proposed solutions, and can be used as a valuable reference for researchers and practitioners in increasing the robustness of the detection models to real-world transformations.

Submitted 13 April, 2026; originally announced April 2026.

Comments: CVPR 2026 NTIRE Workshop Paper, Robust AI-Generated Image Detection Technical Report

arXiv:2604.11129

DeCoVec: Building Decoding Space based Task Vector for Large Language Models via In-Context Learning

Authors: Feiyang Li, Yile Wang

Abstract: Task vectors, representing directions in model or activation spaces that encode task-specific behaviors, have emerged as a promising tool for steering large language models (LLMs). However, existing approaches typically require fine-tuning or invasive manipulation of internal states, limiting their flexibility and scalability. We propose DeCoVec (Decoding Space based Task Vector), a training-free and non-invasive framework that constructs task vectors directly in the decoding space by leveraging in-context learning (ICL). Specifically, DeCoVec captures the task essence as the difference between the output logit distributions of few-shot and zero-shot prompts, then steers generation by injecting this vector into the decoding process. Experiments across seven LLMs (0.5B--9B) on TruthfulQA, Math-500, and AQUA-RAT show that DeCoVec consistently outperforms standard few-shot baselines, with gains up to +5.50 average accuracy. Further analysis demonstrates that DeCoVec effectively suppresses generation degeneration and logical flaws while exhibiting strong robustness to demonstration ordering, all without incurring additional input token costs. Our method offers a training-free and non-invasive solution for LLM steering without requiring weight updates or auxiliary models.

Submitted 13 April, 2026; originally announced April 2026.

Comments: Accepted to ACL 2026 Findings
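
Illustrative note on the DeCoVec abstract: the idea of a task vector formed as the difference between few-shot and zero-shot output logits can be sketched as below. This is a minimal toy illustration, not the authors' implementation. The abstract does not say which logits the vector is injected into or how it is scaled, so adding it to the zero-shot logits and the scaling factor `alpha` are assumptions, and the function names and numbers are hypothetical.

```python
def task_vector(few_shot_logits, zero_shot_logits):
    """Per-token difference between few-shot and zero-shot logits."""
    return [f - z for f, z in zip(few_shot_logits, zero_shot_logits)]


def steer(base_logits, vector, alpha=1.0):
    """Inject the task vector into the base logits.

    Assumption: the vector is added to the zero-shot logits and scaled
    by a hypothetical strength parameter `alpha`.
    """
    return [b + alpha * v for b, v in zip(base_logits, vector)]


def greedy_token(logits):
    """Index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy next-token logits over a 3-token vocabulary (made-up values).
zero_shot = [2.0, 1.0, 0.5]
few_shot = [1.0, 2.5, 0.5]

vec = task_vector(few_shot, zero_shot)
steered = steer(zero_shot, vec, alpha=2.0)

print(greedy_token(zero_shot), greedy_token(steered))
```

In this toy case the steered distribution prefers a different token than the zero-shot one; with `alpha=1.0` the steered logits would simply equal the few-shot logits.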

arXiv:2604.10846

PFAgent: A Tractable and Self-Evolving Power-Flow Agent for Interactive Grid Analysis

Authors: Buxin She, Brian Chen, Luanzheng Guo, Fangxing Li

Abstract: Power system simulation workflows remain expert-intensive. Engineers must translate study intents into code or API calls, execute analyses, and interpret outputs. To automate this workflow, this paper presents PFAgent, a tractable and self-evolving power-flow agent for interactive grid analysis. PFAgent integrates four key capabilities: i) a tractable and interactive architecture for intent parsing, knowledge retrieval, tool execution, and structured reporting; ii) a self-evolution mechanism combining verification-driven refinement and human-in-the-loop feedback; iii) an AI-assisted evaluation and debugging loop that leverages conversational context, generated code, and execution errors for iterative fixing; and iv) an evaluation framework covering task success, convergence validity, numerical consistency, and explanation quality. Verification on IEEE benchmark systems shows that PFAgent can automate case change, analyze voltage violations, perform N-1 contingency analysis, generate plots and concise summaries, and return reproducible results with transparent execution logs. The proposed framework highlights a shift from conventional simulation tools to interactive, tractable, and self-evolving agents for power system analysis.

Submitted 12 April, 2026; originally announced April 2026.

Comments: 10 pages, 7 figures

arXiv:2604.10710

Causal mediation in cluster-randomized trials with multiple mediators: spillover-aware decomposition, identification, and semiparametric efficient inference

Authors: Jiaqi Tong, Chao Cheng, Fan Li

Abstract: Causal mediation analysis in cluster-randomized trials (CRTs) is complicated by the presence of multiple mediators, intracluster correlation, and within-cluster interference. Existing mediation methods often fall short in accommodating these features simultaneously, and semiparametric efficient estimators that fully address them remain unavailable. We develop a unified framework that defines a class of mediation effect estimands, including exit indirect effects, exit spillover mediation effects, and their interaction effects, to investigate causal mechanisms in CRTs with an arbitrary number of mediators under an unknown causal structure. We introduce a set of interpretable causal assumptions for point identification of each estimand. For optimal inference, we first derive the efficient influence functions for the proposed estimands and construct corresponding one-step and debiased machine learning estimators. In particular, to flexibly model the joint mediator density, we employ an elliptical copula marginal regression model that combines a nonparametric marginal regression with an interpretable association structure. We assess the finite-sample performance of the proposed estimators through simulation studies and illustrate the methodology by reanalyzing the PPACT CRT data with three causally unordered mediators.

Submitted 12 April, 2026; originally announced April 2026.

arXiv:2604.10548

Simple but Stable, Fast and Safe: Achieve End-to-end Control by High-Fidelity Differentiable Simulation

Authors: Fanxing Li, Shengyang Wang, Yuxiang Huang, Fangyu Sun, Yufei Yan, Danping Zou, Wenxian Yu

Abstract: Obstacle avoidance is a fundamental vision-based task essential for enabling quadrotors to perform advanced applications. When planning the trajectory, existing optimization-based and learning-based approaches typically treat the quadrotor as a point-mass model, issuing path or velocity commands that are then tracked by an outer-loop controller. However, at high speeds, planned trajectories sometimes become dynamically infeasible in actual flight, exceeding the capacity of the controller. In this paper, we propose a novel end-to-end policy that directly maps depth images to low-level bodyrate commands by reinforcement learning via differentiable simulation. The high-fidelity simulation used in training after parameter identification significantly reduces the gaps between training, simulation, and the real world. The analytical gradients provided by differentiable simulation are accurate, enabling efficient training of the low-level policy without expert guidance. The policy employs a lightweight, minimal inference pipeline that runs without explicit mapping, backbone networks, primitives, recurrent structures, or backend controllers, and without curriculum or privileged guidance. By sending low-level commands directly to the hardware controller, the method enables full flight envelope control and avoids the dynamic infeasibility issue. Experimental results demonstrate that the proposed approach achieves the highest success rate and the lowest jerk among state-of-the-art baselines across multiple benchmarks. The policy also exhibits strong generalization, deploying zero-shot in unseen outdoor environments at speeds of up to 7.5 m/s and flying stably in a very dense forest.

Submitted 12 April, 2026; originally announced April 2026.

arXiv:2604.10523

Measurement of the branching fractions of $χ_{cJ} \to π^{+}π^{-}π^{0}π^{0}$ via $ψ(3686) \to γχ_{cJ}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, H. R. Bao, X. L. Bao, M. Barbagiovanni, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, et al. (741 additional authors not shown)

Abstract: Using $(2712.4\pm14.3)\times 10^6$ $ψ(3686)$ events collected with the BESIII detector operating at BEPCII, the branching fractions of $χ_{cJ}\toπ^+π^-π^0π^0$ ($J=0,~1,~2$) are measured via the radiative transition $ψ(3686)\toγχ_{cJ}$. The results are $\mathcal{B}(χ_{c0} \to π^{+}π^{-}π^{0}π^{0}) = (3.10 \pm 0.01 \pm 0.14) \times 10^{-2}$, $\mathcal{B}(χ_{c1} \to π^{+}π^{-}π^{0}π^{0}) = (1.16 \pm 0.01 \pm 0.05) \times 10^{-2}$, and $\mathcal{B}(χ_{c2} \to π^{+}π^{-}π^{0}π^{0}) = (1.92 \pm 0.01 \pm 0.08) \times 10^{-2}$, where the first uncertainties are statistical and the second systematic. The dominant intermediate states are found to be $χ_{cJ}\toρ^+ρ^-$. These results supersede the previous most precise measurements and provide significantly improved precision.

Submitted 12 April, 2026; originally announced April 2026.

arXiv:2604.10444

First Observation of $D^+ \to a_0(980)ρ$ and $D^+ \to a_0(980)^+ f_0(500)$ in $D^+ \to π^+π^+π^-η$ and $D^+ \to π^+π^0π^0η$ Decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, X. L. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, et al. (734 additional authors not shown)

Abstract: We perform the first amplitude analysis of the singly Cabibbo-suppressed decays $D^+ \to π^+ π^{+(0)} π^{-(0)} η$, using $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 20.3 $\rm{fb}^{-1}$. The absolute branching fractions of the $D^+ \to π^+ π^+ π^- η$ and $D^+ \to π^+ π^0 π^0 η$ decays are measured to be $(3.20\pm0.06_{\text{stat.}}\pm0.03_{\text{syst.}})\times 10^{-3}$ and $(2.43 \pm 0.11_{\text{stat.}} \pm 0.04_{\text{syst.}}) \times 10^{-3}$, respectively. The decay process $D^{+}\to a_0(980)^{+}f_0(500)$ is observed for the first time with an unexpectedly large branching fraction. Moreover, we observe the decays $D^+ \to a_0(980)^{+(0)} ρ(770)^{0(+)}$ and measure the ratio $r_{+/0} \equiv \frac{\mathcal{B}(D^+ \to a_0(980)^+ ρ(770)^0)}{\mathcal{B}(D^+ \to a_0(980)^0 ρ(770)^+)}$ for the first time to be $0.55\pm0.08_{\text{stat.}}\pm0.05_{\text{syst.}}$. These results offer a novel insight into our comprehension of the nature of the $a_0(980)$ and $f_0(500)$ states.

Submitted 11 April, 2026; originally announced April 2026.

arXiv:2604.10417

LASQ: A Low-resource Aspect-based Sentiment Quadruple Extraction Dataset

Authors: Aizihaierjiang Yusufu, Jiang Liu, Kamran Aziz, Abidan Ainiwaer, Bobo Li, Fei Li, Donghong Ji, Aizierguli Yusufu

Abstract: In recent years, aspect-based sentiment analysis (ABSA) has made rapid progress and shown strong practical value. However, existing research and benchmarks are largely concentrated on high-resource languages, leaving fine-grained sentiment extraction in low-resource languages under-explored. To address this gap, we constructed the first Low-resource languages Aspect-based Sentiment Quadruple dataset, named LASQ, which covers two low-resource languages, Uzbek and Uyghur, and defines a fine-grained target-aspect-opinion-sentiment quadruple extraction task. To facilitate future research, we designed a grid-tagging model that integrates syntactic knowledge. This model incorporates part-of-speech (POS) and dependency knowledge through our Syntax Knowledge Embedding Module (SKEM), thereby alleviating the lexical sparsity problem caused by agglutinative languages. Experiments on LASQ demonstrate consistent gains over competitive baselines, validating both the dataset's utility and the effectiveness of the proposed modeling approach.

Submitted 11 April, 2026; originally announced April 2026.

arXiv:2604.10411

CIR: Lightweight Container Image for Cross-Platform Deployment

Authors: Fengzhi Li, Xiaohui Peng, Qingru Xu, Qisong Shi, Tuo Zhou, Yongxuan Dai, Yifan Wang, Ninghui Sun, Zhiwei Xu

Abstract: In modern cloud and heterogeneous distributed infrastructures, container images are widely used as the deployment unit for machine learning applications. An image bundles the application with its entire platform-specific execution environment and can be directly launched into a container instance. However, this approach forces developers to build and maintain separate images for each target deployment platform. This limitation is particularly evident for widely used interpreted languages such as Python and R in data analytics and machine learning, where application code is inherently cross-platform, yet the runtime dependencies are highly platform-specific. With emerging computing paradigms such as sky computing and edge computing, which demand seamless workload migration and cross-platform deployment, traditional images not only introduce inefficiencies in storage and network usage, but also impose substantial burdens on developers, who must repeatedly craft and manage platform-specific builds. To address these challenges, we propose a lazy-build approach that defers platform-specific construction to the deployment stage, thus keeping the image itself cross-platform. To enable this, we introduce a new image format, CIR (Container Intermediate Representation), together with its pre-builder and lazy-builder. CIR targets interpreted-language applications and only stores the identifiers of the application's direct dependencies, leaving platform adaptation to the lazy-builder, which at deployment time assembles the actual dependencies into runnable containers. A single CIR can therefore be deployed across heterogeneous platforms while reducing image size by 95% compared to conventional images that bundle all dependencies. In our evaluation, CIR reduces deployment time by 40-60% compared with pre-built images, outperforming state-of-the-art systems such as Docker, Buildah, and Apptainer.

Submitted 11 April, 2026; originally announced April 2026.

arXiv:2604.10052

Impact of Intelligent Technologies on IoV Security: Integrating Edge Computing and AI

Authors: Awais Bilal, Kashif Sharif, Liehuang Zhu, Chang Xu, Fan Li, Sadaf Bukhari, Sujit Biswas

Abstract: The rapid development and integration of intelligent technologies in the Internet of Vehicles (IoV) have revolutionized transportation systems by enhancing connectivity, automation, and safety. However, the complexity and connectivity of IoV networks also introduce security challenges, including data privacy concerns, cyber threats, and system vulnerabilities. This paper surveys the role of Edge Computing (EC), Machine Learning (ML), and Deep Learning (DL) in strengthening IoV security frameworks. It examines the synergy between these technologies, highlighting their individual capabilities and their collective impact on enhancing threat detection, response times, and adaptive security. Through real-world case studies and practical deployments, we demonstrate how EC, ML, and DL are currently improving security and operational efficiency in IoV systems. The paper also identifies key research gaps and future directions for further advancements in IoV security, including the need for scalable, privacy-preserving solutions and robust defense mechanisms against emerging cyber threats. By integrating EC, ML, and DL, this work lays the groundwork for developing adaptive, efficient, and resilient IoV security infrastructures capable of addressing evolving challenges in the transportation ecosystem.

Submitted 11 April, 2026; originally announced April 2026.

arXiv:2604.09359

Bringing Clustering to MLL: Weakly-Supervised Clustering for Partial Multi-Label Learning

Authors: Yu Chen, Weijun Lv, Yue Huang, Xuhuan Zhu, Fang Li

Abstract: Label noise in multi-label learning (MLL) poses significant challenges for model training, particularly in partial multi-label learning (PML) where candidate labels contain both relevant and irrelevant labels. While clustering offers a natural approach to exploit data structure for noise identification, traditional clustering methods cannot be directly applied to multi-label scenarios due to a fundamental incompatibility: clustering produces membership values that sum to one per instance, whereas multi-label assignments require binary values that can sum to any number. We propose a novel weakly-supervised clustering approach for PML (WSC-PML) that bridges clustering and multi-label learning through membership matrix decomposition. Our key innovation decomposes the clustering membership matrix $\mathbf{A}$ into two components: $\mathbf{A} = \mathbf{\Pi} \odot \mathbf{F}$, where $\mathbf{\Pi}$ maintains clustering constraints while $\mathbf{F}$ preserves multi-label characteristics. This decomposition enables seamless integration of unsupervised clustering with multi-label supervision for effective label noise handling. WSC-PML employs a three-stage process: initial prototype learning from noisy labels, adaptive confidence-based weak supervision construction, and joint optimization via iterative clustering refinement. Extensive experiments on 24 datasets demonstrate that our approach outperforms six state-of-the-art methods across all evaluation metrics.

Submitted 10 April, 2026; originally announced April 2026.
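
Illustrative note on the WSC-PML abstract: the elementwise (Hadamard) product in the stated decomposition, with one factor whose rows sum to one and one binary factor, can be pictured with a toy example. This is only a sketch of the notation, not the paper's optimization procedure. The matrix values are made up, the helper name `hadamard` is hypothetical, and the variable `Pi` stands in for the clustering factor.

```python
def hadamard(P, F):
    """Elementwise product of two equally shaped matrices (lists of rows)."""
    return [[p * f for p, f in zip(p_row, f_row)] for p_row, f_row in zip(P, F)]


# Clustering-style factor: each row (instance) sums to one.
Pi = [
    [0.5, 0.25, 0.25],
    [0.0, 1.0, 0.0],
]

# Multi-label factor: binary entries, with any number of ones per row.
F = [
    [1, 0, 1],
    [0, 1, 0],
]

A = hadamard(Pi, F)
for row in A:
    print(row)
```

Each row of `Pi` satisfies the sum-to-one constraint, while `F` can switch any number of labels on or off per instance; their product keeps weight only on the labels that `F` marks as present.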

arXiv:2604.08360

2D Ferroelectric Ruddlesden-Popper Perovskites: an Emerging Fully Electronically Controllable Shift Current and Persistent Spin Helix

Authors: Yue Zhao, Fu Li, Vikrant Chaudhary, Hongbin Zhang, Gaoyang Gou, Niuzhuang Yang, Yue Hao, Wenyi Liu

Abstract: Two-dimensional (2D) hybrid organic--inorganic perovskites (HOIPs) are promising candidates for next-generation optoelectronic and spintronic applications. This work systematically investigates the relationship between structural distortions and functional responses in three $C_{2v}$-symmetric Ruddlesden--Popper (RP) ferroelectric perovskites, $(4,4\text{-DFPD})_{2}\mathrm{PbI}_{4}$, $(\mathrm{DFCHA})_{2}\mathrm{PbI}_{4}$, and PEPI, using first-principles calculations combined with irreducible representation decomposition and wave-vector point-group symmetry (WPGS) analysis. The results reveal that the lead--iodide framework yields shift-current (SC) magnitudes comparable to, and in specific cases even an order of magnitude larger than, those of traditional ferroelectric oxides, with PEPI reaching a maximum of $69.16\ μ\mathrm{A}/\mathrm{V}^{2}$. The SC magnitude correlates positively with the octahedral distortion index ($D_i$), while a competition mechanism is identified between covalent bond strength and structural asymmetry, where increased average bond lengths can offset the enhancement induced by $D_i$. Regarding spintronics, $C_{2v}$ symmetry-protected persistent spin textures (PST) are identified. A transition to $C_2$-protected quasi-PST occurs in monoclinic $(4,4\text{-DFHHA})_{2}\mathrm{PbI}_{4}$, leading to a persistent spin helix (PSH) with long-distance spin transport. The synergy among ferroelectricity, SC, and PST enables nonvolatile electrical control of both photocurrent direction and spin configurations. This work provides evaluation criteria and practical guidance for designing high-performance integrated spintronic--photovoltaic devices.

Submitted 9 April, 2026; originally announced April 2026.

arXiv:2604.07782

Ghost imaging with zero photons

Authors: Meixue Chen, Yiqi Song, Yu Gu, Huafan Zhang, Huaibin Zheng, Yuchen He, Hui Chen, Yu Zhou, Fuli Li, Zhuo Xu, Jianbin Liu

Abstract: Ghost imaging was first demonstrated with entangled photon pairs and is well known for its peculiar properties. The signal beam that illuminates the object possesses no spatial resolution, whereas the reference beam, which never interacts with the object, is spatially resolved. Neither beam alone can retrieve the image, which can only be obtained when the signal and reference beams are correlated. Here we report a ghost imaging experiment with even more peculiar properties, in which the image can be reconstructed when no photon interacts with the object, or even when there is no photon in either the signal or the reference beam. All photons that interacted with the object are discarded. Only the time bins with zero photons are employed to retrieve the image, a process referred to as "ghost imaging with zero photons" hereafter. The reason a ghost image can be retrieved with zero photons is jointly determined by the photon-number projection measurement and the photon statistics of thermal light. The results help to resolve the debate on the physics of ghost imaging and to understand the relation between quantum and classical correlations.

Submitted 9 April, 2026; originally announced April 2026.

Comments: 6 pages, 6 figures

arXiv:2604.07756 [pdf, ps, other]

Fixed-Effects Models for Causal Inference in Longitudinal Cluster Randomized and Quasi-Experimental Trials

Authors: Kenneth M. Lee, Fan Li

Abstract: This article investigates the model-robustness of fixed-effects models for analyzing a broad class of longitudinal cluster trials (CTs) such as stepped-wedge, parallel-with-baseline and crossover designs, encompassing both randomized (CRTs) and quasi-experimental (CQTs) designs. We clarify a longstanding misconception in biostatistics, demonstrating that fixed-effects models, traditionally perceived as targeting only finite-sample conditional estimands, can effectively target super-population marginal estimands through an M-estimation framework. We comprehensively prove that linear and log-link fixed-effects models with correctly specified treatment effect structures can broadly yield consistent and asymptotically normal estimators for nonparametrically defined treatment effect estimands in longitudinal CRTs, even under arbitrary misspecification of other model components. We identify that the constant treatment effect estimator generally targets the period-average treatment effect for the overlap population (P-ATO); accordingly, some CRT designs don't even require correct specification of the treatment effect structure for model-robustness. We further characterize conditions where fixed-effects models can maintain consistency by adjusting for both cluster-level and individual-level time-invariant confounding in longitudinal CQTs. Altogether, supported by simulation and a case study re-analysis, we establish fixed-effects models as a robust and potentially preferable alternative to mixed-effects models for longitudinal CT analysis.

Submitted 8 April, 2026; originally announced April 2026.

Comments: 122 pages (35 main manuscript, 87 supplementary appendix), 10 figures (4 main manuscript, 6 supplementary appendix), 2 tables (2 supplementary appendix)

arXiv:2604.05712 [pdf, ps, other]

Precise measurement of the CKM angle $γ$ with a novel approach

Authors: The BESIII, LHCb Collaborations: M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, H. R. Bao, X. L. Bao, M. Barbagiovanni, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, et al. (1936 additional authors not shown)

Abstract: A measurement of the CKM angle $γ$ is performed by applying a novel, unbinned, model-independent approach to datasets of electron-positron collisions collected by the BESIII experiment and proton-proton collisions by the LHCb experiment, corresponding to integrated luminosities of 8 fb$^{-1}$ and 9 fb$^{-1}$, respectively. The $C\!P$-violating phase $γ$ is determined from ${B^{\pm}\rightarrow D(\rightarrow K_{\rm S}^{0} h^{\prime+}h^{\prime-}) h^{\pm}}$ decays in LHCb data, where $h^{(\prime)}$ is either a pion or kaon, while the corresponding strong-phase parameters are measured using doubly tagged ${D\rightarrow K_{\rm S/L}^0 h^{\prime+} h^{\prime-}}$ decays in the quantum-correlated $D\overline{D}$ system present in BESIII data. A joint fit to both datasets, which allows for a simultaneous determination of the associated $C\!P$-violating observables and strong-phase parameters, yields ${γ= (71.3\pm 5.0)^{\circ}}$. The result is the most precise to date and consistent with previous measurements and world averages.

Submitted 7 April, 2026; originally announced April 2026.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/5991/ (LHCb public pages)

Report number: LHCb-PAPER-2025-064, CERN-EP-2026-068

arXiv:2604.05701 [pdf, ps, other]

Measurement of the CKM angle $γ$ in $B^{\pm} \rightarrow D(\rightarrow K^{0}_{\rm S} h^{\prime+}h^{\prime-})h^{\pm}$ decays with a novel approach

Authors: The BESIII, LHCb Collaborations: M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, H. R. Bao, X. L. Bao, M. Barbagiovanni, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, et al. (1936 additional authors not shown)

Abstract: A measurement of the CKM angle $γ$ and related strong-phase parameters is performed using a novel, model-independent approach in ${B^{\pm}\rightarrow D(\rightarrow K^{0}_{\rm S} h^{\prime+}h^{\prime-}) h^{\pm}}$ decays, where $h^{(\prime)} \equiv π, K$. The analysis uses a joint data sample of electron-positron collisions collected by the BESIII experiment at the Beijing Electron-Positron Collider II during 2010--2011 and 2021--2022, corresponding to an integrated luminosity of 8 fb$^{-1}$, and proton-proton collisions collected by the LHCb experiment at the Large Hadron Collider during 2011--2018, corresponding to an integrated luminosity of 9 fb$^{-1}$. The two datasets are analyzed simultaneously by applying per-event weights based on the amplitude variation over the $D$-decay phase space to enhance the sensitivity to $C\!P$-violating observables. The CKM angle $γ$ is determined to be $γ= (71.3\pm 5.0)^{\circ}$, which constitutes the most precise single measurement to date.

Submitted 7 April, 2026; originally announced April 2026.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3989/ (LHCb public pages)

Report number: LHCb-PAPER-2025-063, CERN-EP-2026-067

arXiv:2604.05299 [pdf, ps, other]

Multi-Scale Kinetic Simulation: Asymptotic Preserving IMEX-BDF-DG Schemes with Three Implicit-Explicit Partitionings

Authors: Kimberly Matsuda, Fengyan Li

Abstract: Kinetic transport models are mesoscopic mathematical descriptions of the transport of particles as well as their interactions with the background media or among themselves, and they have wide applications in many areas of mathematical physics such as nuclear and biomedical engineering, rarefied gas dynamics, and plasma physics. They are often multi-scale, with different characteristics (e.g. hyperbolic, diffusive) depending on the material properties. As part of our continuing effort to design and analyze numerical methods for accurate and robust simulation of multi-scale kinetic transport models, in this work we consider a linear kinetic transport model, a simplified radiative transfer equation, in a diffusive scaling, and propose and analyze three families of asymptotic preserving (AP) methods. Numerical methods with the AP property, that is, methods that preserve the asymptotic behavior of the models at the discrete level on under-resolved meshes, can work uniformly well to simulate multi-scale models across a wide range of scales. The proposed methods start from the micro-macro decomposition of the model, and involve discontinuous Galerkin (DG) methods in space, the discrete ordinates method (i.e. $S_N$ method) in velocity, and implicit-explicit (IMEX) BDF methods in time, with three different IMEX partitionings. A systematic study, both analytical and computational, is presented regarding their differences in stability, accuracy, computational complexity and AP property. These methods, with multi-step time integrators, are also compared in terms of their accuracy and efficiency with the ones that only differ in using certain IMEX Runge-Kutta methods in time. Together with our previous developments, the present work further contributes to high order DG AP methods for multi-scale kinetic simulation, especially by utilizing the structure of the micro-macro decomposition of the models.

Submitted 6 April, 2026; originally announced April 2026.

arXiv:2604.05181 [pdf, ps, other]

General Multimodal Protein Design Enables DNA-Encoding of Chemistry

Authors: Jarrid Rector-Brooks, Théophile Lambert, Marta Skreta, Daniel Roth, Yueming Long, Zi-Qi Li, Xi Zhang, Miruna Cretu, Francesca-Zhoufan Li, Tanvi Ganapathy, Emily Jin, Avishek Joey Bose, Jason Yang, Kirill Neklyudov, Yoshua Bengio, Alexander Tong, Frances H. Arnold, Cheng-Hao Liu

Abstract: Evolution is an extraordinary engine for enzymatic diversity, yet the chemistry it has explored remains a narrow slice of what DNA can encode. Deep generative models can design new proteins that bind ligands, but none have created enzymes without pre-specifying catalytic residues. We introduce DISCO (DIffusion for Sequence-structure CO-design), a multimodal model that co-designs protein sequence and 3D structure around arbitrary biomolecules, as well as inference-time scaling methods that optimize objectives across both modalities. Conditioned solely on reactive intermediates, DISCO designs diverse heme enzymes with novel active-site geometries. These enzymes catalyze new-to-nature carbene-transfer reactions, including alkene cyclopropanation, spirocyclopropanation, B-H, and C(sp$^3$)-H insertions, with high activities exceeding those of engineered enzymes. Random mutagenesis of a selected design further confirmed that enzyme activity can be improved through directed evolution. By providing a scalable route to evolvable enzymes, DISCO broadens the potential scope of genetically encodable transformations. Code is available at https://github.com/DISCO-design/DISCO.

Submitted 6 April, 2026; originally announced April 2026.

arXiv:2604.04429 [pdf, ps, other]

Future Amplification of Moist Weather Extremes in the Midlatitudes

Authors: Funing Li, Talia Tamarin-Brodsky

Abstract: Moist heatwaves and convective storms frequently co-occur, posing compound risks. Although historically concentrated in the tropics, these moist weather extremes are projected to intensify substantially towards the midlatitudes, with regions downstream of major highland terrains, including northeastern Asia and eastern North America, emerging as hotspots of future change. Yet their physical drivers remain uncertain. Here we show that the intensification of concurrent moist heat and convection extremes in the midlatitudes is tightly constrained by changes in low-level atmospheric inversions. Specifically, we find that amplified warming over western highlands is transported downstream by prevailing westerlies, strengthening low-level thermal inversions and raising the attainable maxima of moist heat and convection. Targeted model experiments confirm the critical role of orographically elevated heating in driving these extremes. Our results reveal a mechanistic pathway for compound extremes and highlight low-level inversions as a key factor for emerging midlatitude risks of moist heat and severe weather under climate change.

Submitted 6 April, 2026; originally announced April 2026.

Comments: This manuscript is currently under peer-review by a journal

arXiv:2604.04360 [pdf, ps, other]

Generalized win fraction regression for composite survival endpoints

Authors: Zhiqiang Cao, Xi Fang, Fan Li

Abstract: We propose a generalized win fraction regression framework for prioritized composite survival outcomes. The framework models the conditional win fraction through a chosen link function (including identity, logit, or probit), thereby accommodating multi-component time-to-event endpoints within a unified regression structure. To handle right censoring, we construct inverse-probability-of-censoring-weighted estimating equations that target the win fraction as if censoring were absent. Under the identity link, regression parameters characterize covariate associations on the natural win fraction scale. Under the logit link, they characterize the log odds of winning -- a new and complementary effect measure that treats ties as failures to win, imposing a more conservative standard than the win ratio or win odds. When there are no ties, the logit win fraction model reduces to proportional win fraction regression; moreover, the unweighted version of our estimating equations numerically coincides with the proportional win fraction point estimator regardless of ties. We establish large-sample properties of the proposed estimators and derive a consistent sandwich variance estimator that accounts for uncertainty from the estimated censoring weights. Extensive simulations examine finite-sample performance across link functions and censoring rates, and our method is illustrated through a reanalysis of the HF-ACTION clinical trial.

Submitted 7 April, 2026; v1 submitted 5 April, 2026; originally announced April 2026.

arXiv:2604.04357 [pdf, ps, other]

Spatially-Weighted CLIP for Street-View Geo-localization

Authors: Ting Han, Fengjiao Li, Chunsong Chen, Haoling Huang, Yiping Chen, Meiliu Wu

Abstract: This paper proposes Spatially-Weighted CLIP (SW-CLIP), a novel framework for street-view geo-localization that explicitly incorporates spatial autocorrelation into vision-language contrastive learning. Unlike conventional CLIP-based methods that treat all non-matching samples as equally negative, SW-CLIP leverages Tobler's First Law of Geography to model geographic relationships through distance-aware soft supervision. Specifically, we introduce a location-as-text representation to encode geographic positions and replace one-hot InfoNCE targets with spatially weighted soft labels derived from geodesic distance. Additionally, a neighborhood-consistency regularization is employed to preserve local spatial structure in the embedding space. Experiments on a multi-city dataset demonstrate that SW-CLIP significantly improves geo-localization accuracy, reduces long-tail errors, and enhances spatial coherence compared to standard CLIP. The results highlight the importance of shifting from semantic alignment to geographic alignment for robust geo-localization and provide a general paradigm for integrating spatial principles into multimodal representation learning.

Submitted 5 April, 2026; originally announced April 2026.

arXiv:2604.03198 [pdf, ps, other]

The Eleventh NTIRE 2026 Efficient Super-Resolution Challenge Report

Authors: Bin Ren, Hang Guo, Yan Shu, Jiaqi Ma, Ziteng Cui, Shuhong Liu, Guofeng Mei, Lei Sun, Zongwei Wu, Fahad Shahbaz Khan, Salman Khan, Radu Timofte, Yawei Li, Hongyuan Yu, Pufan Xu, Chen Wu, Long Peng, Jiaojiao Yi, Siyang Yi, Yuning Cui, Jingyuan Xia, Xing Mou, Keji He, Jinlin Wu, Zongang Gao, et al. (38 additional authors not shown)

Abstract: This paper reviews the NTIRE 2026 challenge on efficient single-image super-resolution with a focus on the proposed solutions and results. The aim of this challenge is to devise a network that reduces one or several aspects, such as runtime, parameters, and FLOPs, while maintaining a PSNR of around 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. The challenge had 95 registered participants, and 15 teams made valid submissions. These submissions gauge the state of the art in efficient single-image super-resolution.

Submitted 3 April, 2026; originally announced April 2026.

Comments: CVPR 2026 NTIRE Workshop Paper, Efficient Super Resolution Technical Report

arXiv:2604.02935 [pdf, ps, other]

Modality-Specific Hierarchical Enhancement for RGB-D Camouflaged Object Detection

Authors: Yuzhen Niu, Yangqing Wang, Ri Cheng, Fusheng Li, Rongshen Wang, Zhichen Yang

Abstract: Camouflaged object detection (COD) is challenging due to high target-background similarity, and recent methods address this by complementarily using RGB-D texture and geometry cues. However, RGB-D COD methods still underutilize modality-specific cues, which limits fusion quality. We believe this is because RGB and depth features are fused directly after backbone extraction without modality-specific enhancement. To address this limitation, we propose MHENet, an RGB-D COD framework that performs modality-specific hierarchical enhancement and adaptive fusion of RGB and depth features. Specifically, we introduce a Texture Hierarchical Enhancement Module (THEM) to amplify subtle texture variations by extracting high-frequency information and a Geometry Hierarchical Enhancement Module (GHEM) to enhance geometric structures via learnable gradient extraction, while preserving cross-scale semantic consistency. Finally, an Adaptive Dynamic Fusion Module (ADFM) adaptively fuses the enhanced texture and geometry features with spatially varying weights. Experiments on four benchmarks demonstrate that MHENet surpasses 16 state-of-the-art methods qualitatively and quantitatively. Code is available at https://github.com/afdsgh/MHENet.

Submitted 3 April, 2026; originally announced April 2026.

Comments: 11 pages, 7 figures, including supplementary material. Accepted by IEEE ICME 2026

arXiv:2604.02467 [pdf, ps, other]

VERTIGO: Visual Preference Optimization for Cinematic Camera Trajectory Generation

Authors: Mengtian Li, Yuwei Lu, Feifei Li, Chenqi Gan, Zhifeng Xie, Xi Wang

Abstract: Cinematic camera control relies on a tight feedback loop between director and cinematographer, where camera motion and framing are continuously reviewed and refined. Recent generative camera systems can produce diverse, text-conditioned trajectories, but they lack this "director in the loop" and have no explicit supervision of whether a shot is visually desirable. This results in in-distribution camera motion but poor framing, off-screen characters, and undesirable visual aesthetics. In this paper, we introduce VERTIGO, the first framework for visual preference optimization of camera trajectory generators. Our framework leverages a real-time graphics engine (Unity) to render 2D visual previews from generated camera motion. A cinematically fine-tuned vision-language model then scores these previews using our proposed cyclic semantic similarity mechanism, which aligns renders with text prompts. This process provides the visual preference signals for Direct Preference Optimization (DPO) post-training. Both quantitative evaluations and user studies on Unity renders and diffusion-based Camera-to-Video pipelines show consistent gains in condition adherence, framing quality, and perceptual realism. Notably, VERTIGO reduces the character off-screen rate from 38% to nearly 0% while preserving the geometric fidelity of camera motion. User study participants further prefer VERTIGO over baselines across composition, consistency, prompt adherence, and aesthetic quality, confirming the perceptual benefits of our visual preference post-training.

Submitted 13 April, 2026; v1 submitted 2 April, 2026; originally announced April 2026.

Comments: 28 pages, 10 figures, ECCV 2026

arXiv:2604.02214 [pdf, ps, other]

Quadratic gravity corrections to scalar QNMs of rapidly rotating black holes

Authors: Stef J. B. Husken, Tom van der Steen, Simon Maenaut, Kelvin Ka-Ho Lam, Maxim D. Jockwer, Adrian Ka-Wai Chung, Thomas Hertog, Tjonnie G. F. Li, Nicolás Yunes

Abstract: In an effective-field-theory framework for gravity, black-hole quasinormal mode spectra acquire corrections in quadratic-curvature, scalar-tensor extensions of general relativity. Previous calculations of such corrections were limited to moderate spins, since the corresponding background solutions relied on expansions in the spin parameter. Using recently constructed numerical black-hole solutions valid for large spin, we compute the leading-order deviations from general relativity in the scalar quasinormal mode spectrum of rotating black holes in scalar Gauss-Bonnet and dynamical Chern-Simons gravity. We solve the resulting perturbation equations with pseudo-spectral collocation methods, allowing us to determine the quasinormal-mode corrections for dimensionless spins up to $a/M=0.99$, with accuracy better than $\lesssim 10^{-3}$ for the $l=m=0$ mode and $\lesssim 10^{-6}$ for higher multipoles. For spins $a/M>0.9$, the corrections to certain modes can increase by orders of magnitude.

Submitted 2 April, 2026; originally announced April 2026.

Comments: 13 pages, 5 figures

arXiv:2604.02010 [pdf, ps, other]

Decouple and Rectify: Semantics-Preserving Structural Enhancement for Open-Vocabulary Remote Sensing Segmentation

Authors: Jie Feng, Fengze Li, Junpeng Zhang, Siyu Chen, Yuping Liang, Junying Chen, Ronghua Shang

Abstract: Open-vocabulary semantic segmentation in the remote sensing (RS) field requires both language-aligned recognition and fine-grained spatial delineation. Although CLIP offers robust semantic generalization, its global-aligned visual representations inherently struggle to capture structural details. Recent methods attempt to compensate for this by introducing RS-pretrained DINO features. However, these methods treat CLIP representations as a monolithic semantic space and cannot localize where structural enhancement is required, failing to effectively delineate boundaries while risking the disruption of CLIP's semantic integrity. To address this limitation, we propose DR-Seg, a novel decouple-and-rectify framework in this paper. Our method is motivated by the key observation that CLIP feature channels exhibit distinct functional heterogeneity rather than forming a uniform semantic space. Building on this insight, DR-Seg decouples CLIP features into semantics-dominated and structure-dominated subspaces, enabling targeted structural enhancement by DINO without distorting language-aligned semantics. Subsequently, a prior-driven graph rectification module injects high-fidelity structural priors under DINO guidance to form a refined branch, while an uncertainty-guided adaptive fusion module dynamically integrates this refined branch with the original CLIP branch for final prediction. Comprehensive experiments across eight benchmarks demonstrate that DR-Seg establishes a new state-of-the-art.

Submitted 2 April, 2026; originally announced April 2026.

arXiv:2604.01826 [pdf, ps, other]

SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers

Authors: Xiang Yang, Feifei Li, Mi Zhang, Geng Hong, Xiaoyu You, Min Yang

Abstract: Recent Text-to-Image (T2I) models based on rectified-flow transformers (e.g., SD3, FLUX) achieve high generative fidelity but remain vulnerable to unsafe semantics, especially when triggered by multi-token interactions. Existing mitigation methods largely rely on fine-tuning or attention modulation for concept unlearning; however, their expensive computational overhead and design tailored to U-Net-based denoisers hinder direct adaptation to transformer-based diffusion models (e.g., MMDiT). In this paper, we conduct an in-depth analysis of the attention mechanism in MMDiT and find that unsafe semantics concentrate within interpretable, low-dimensional subspaces at head level, where a finite set of safety-critical heads is responsible for unsafe feature extraction. We further observe that perturbing the Rotary Positional Embedding (RoPE) applied to the query and key vectors can effectively modify some specific concepts in the generated images. Motivated by these insights, we propose SafeRoPE, a lightweight and fine-grained safe generation framework for MMDiT. Specifically, SafeRoPE first constructs head-wise unsafe subspaces by decomposing unsafe embeddings within safety-critical heads, and computes a Latent Risk Score (LRS) for each input vector via projection onto these subspaces. We then introduce head-wise RoPE perturbations that can suppress unsafe semantics without degrading benign content or image quality. SafeRoPE combines both head-wise LRS and RoPE perturbations to perform risk-specific head-wise rotation on query and key vector embeddings, enabling precise suppression of unsafe outputs while maintaining generation fidelity. Extensive experiments demonstrate that SafeRoPE achieves SOTA performance in balancing effective harmful content mitigation and utility preservation for safe generation of MMDiT. Code is available at https://github.com/deng12yx/SafeRoPE.

Submitted 2 April, 2026; originally announced April 2026.

Comments: CVPR26

arXiv:2604.01738 [pdf, ps, other]

AeroTherm-GPT: A Verification-Centered LLM Framework for Thermal Protection System Engineering Workflows

Authors: Chuhan Qiao, Jinglai Zheng, Jie Huang, Buyue Zhao, Fan Li, Haiming Huang

Abstract: Integrating Large Language Models (LLMs) into hypersonic thermal protection system (TPS) design is bottlenecked by cascading constraint violations when generating executable simulation artifacts. General-purpose LLMs, treating generation as single-pass text completion, fail to satisfy the sequential, multi-gate constraints inherent in safety-critical engineering workflows. To address this, we propose AeroTherm-GPT, the first TPS-specialized LLM Agent, instantiated through a Constraint-Closed-Loop Generation (CCLG) framework. CCLG organizes TPS artifact generation as an iterative workflow comprising generation, validation, CDG-guided repair, execution, and audit. The Constraint Dependency Graph (CDG) encodes empirical co-resolution structure among constraint categories, directing repair toward upstream fault candidates based on lifecycle ordering priors and empirical co-resolution probabilities. This upstream-priority mechanism resolves multiple downstream violations per action, achieving a Root-Cause Fix Efficiency of 4.16 versus 1.76 for flat-checklist repair. Evaluated on HyTPS-Bench and validated against external benchmarks, AeroTherm-GPT achieves 88.7% End-to-End Success Rate (95% CI: 87.5-89.9), a gain of +12.5 pp over the matched non-CDG ablation baseline, without catastrophic forgetting on scientific reasoning and code generation tasks.

Submitted 2 April, 2026; originally announced April 2026.

arXiv:2604.00813 [pdf, ps, other]

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

Authors: Sicheng Zuo, Zixun Xie, Wenzhao Zheng, Shaoqing Xu, Fang Li, Hanbing Li, Long Chen, Zhi-Xin Yang, Jiwen Lu

Abstract: End-to-end autonomous driving has evolved from the conventional paradigm based on sparse perception into vision-language-action (VLA) models, which focus on learning language descriptions as an auxiliary task to facilitate planning. In this paper, we propose an alternative Vision-Geometry-Action (VGA) paradigm that advocates dense 3D geometry as the critical cue for autonomous driving. As vehicles operate in a 3D world, we think dense 3D geometry provides the most comprehensive information for decision-making. However, most existing geometry reconstruction methods (e.g., DVGT) rely on computationally expensive batch processing of multi-frame inputs and cannot be applied to online planning. To address this, we introduce a streaming Driving Visual Geometry Transformer (DVGT-2), which processes inputs in an online manner and jointly outputs dense geometry and trajectory planning for the current frame. We employ temporal causal attention and cache historical features to support on-the-fly inference. To further enhance efficiency, we propose a sliding-window streaming strategy and use historical caches within a certain interval to avoid repetitive computations. Despite the faster speed, DVGT-2 achieves superior geometry reconstruction performance on various datasets. The same trained DVGT-2 can be directly applied to planning across diverse camera configurations without fine-tuning, including closed-loop NAVSIM and open-loop nuScenes benchmarks.

Submitted 7 April, 2026; v1 submitted 1 April, 2026; originally announced April 2026.

Comments: Code is available at https://github.com/wzzheng/DVGT

arXiv:2603.29854 [pdf, ps, other]

First energy scan measurement of $e^{+}e^{-}\to K^{+}K^{-}$ around the $ψ(2S)$ resonance

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, X. L. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (683 additional authors not shown)

Abstract: We report the first measurement of the $e^{+}e^{-}\to K^{+}K^{-}$ cross sections around the $ψ(2S)$ resonance using the energy scan method. The analysis is based on $e^{+}e^{-}$ collision data corresponding to an integrated luminosity of 495~pb$^{-1}$ collected with the BESIII detector at BEPCII. By analyzing the cross section line-shape, we extract the relative phase $Φ$ between the strong and electromagnetic amplitudes of the $ψ(2S)$ resonance, a fundamental parameter in charmonium physics, based on the assumption that the relative phase between the electromagnetic amplitude of the $ψ(2S)$ resonance and the continuum is zero. Two distinct solutions for the branching fraction $\mathcal{B}$ of $ψ(2S)\to K^{+}K^{-}$ are observed: a constructive interference solution with $\mathcal{B}=(7.49\pm0.41)\times10^{-5}$ and $Φ=(110.1 \pm6.7)^\circ$, and a destructive interference solution with $\mathcal{B}=(10.94\pm0.48)\times10^{-5}$ and $Φ=(-106.8\pm5.7)^\circ$. A significant correlation between $Φ$ and $\mathcal{B}$ is established, demonstrating that interference effects must be taken into account in the $ψ(2S)$ branching fraction measurements. Additionally, the first results for both the $ψ(2S)$ strong form factor, which characterizes the strong coupling between $ψ(2S)$ and $K^{+}K^{-}$, and the energy-dependent electromagnetic form factor of the charged kaon in this energy region are here reported.

Submitted 31 March, 2026; originally announced March 2026.

Comments: 9 pages, 4 figures

arXiv:2603.28232 [pdf, ps, other]

Observation of $Λ^+_c\to nπ^+η$ and search for $Λ^+_c\to na_0(980)^+$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, X. L. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (722 additional authors not shown)

Abstract: By analysing 6.1 ${\rm fb}^{-1}$ of data collected at center-of-mass energies between $\sqrt{s}=4.600$ and 4.843 $\rm GeV$ with the BESIII detector at the BEPCII collider, we observe the decay $Λ_c^+\to nπ^+η$ for the first time with a statistical significance of $9.5σ$. The ratio of branching fractions $\mathcal{B}(Λ_c^+\to nπ^+η)/\mathcal{B}(Λ_c^+\to Λπ^+η)$ is measured to be $0.155\pm0.031_{\rm stat.}\pm0.012_{\rm syst.}$ Taking the world average of $\mathcal{B}(Λ_c^+\to Λπ^+η)$ as reference, the absolute branching fraction is calculated to be $\mathcal{B}(Λ_c^+\to nπ^+η)=(2.94\pm0.59_{\rm stat.}\pm0.23_{\rm syst.}\pm0.13_{\rm ref.})\times10^{-3}$. The intermediate process $Λ_c^+\to na_0(980)^+$ is also searched for in the $π^+η$ invariant mass spectrum. Since no significant signal is found, the upper limit on $\mathcal{B}(Λ_c^+\to na_0(980)^+)\times\mathcal{B}(a_0(980)^+\toπ^+η)$ is set to $8.4\times10^{-4}$ at 90\% confidence level. A sophisticated deep learning approach using a Transformer-based architecture is employed to distinguish signals from prevalent hadronic backgrounds, complemented by thorough validation and systematic uncertainty quantification.

Submitted 30 March, 2026; originally announced March 2026.

Comments: 25 pages, 6 figures

arXiv:2603.28214 [pdf]

Staged Laser Wakefield Acceleration for Saturated Lasing of Bandwidth-Tunable Free-Electron Lasers from EUV to X-ray

Authors: Hengyuan Xiao, Fei Li, Shuang Liu, Yuchen Jiang, Siqin Ding, Zhi Song, Jianfei Hua, Wei Lu

Abstract: Free-electron lasers (FELs) provide a revolutionary tool for capturing the structure and dynamics of matter in real time at the atomic scale. The size and cost of FELs can be substantially reduced by using laser wakefield acceleration (LWFA), which offers acceleration gradients orders of magnitude beyond radiofrequency technology, producing multi-GeV electron beams within tens of centimeters. This compactness opens the possibility of integrating multiple operating modes - from the EUV to X-rays including broadband operation - into one facility. Realizing this vision, however, faces key challenges: current LWFA bunches are too short to sustain sufficient radiation slippage, limiting FEL pulse energy at EUV wavelengths, while the large energy spread and emittance make X-ray lasing even more demanding. Here we present a LWFA-driven FEL scheme that addresses these challenges, enabling multi-mode operation spanning different wavelengths and bandwidths within a single facility. The scheme employs staged acceleration to reach multi-GeV energies while preserving beam quality, combined with a dual-chicane beamline that stretches the bunch to mitigate the radiation slippage for EUV FEL and tailors the energy chirp for diverse FEL bandwidth modes. Simulations demonstrate that the scheme can generate high-quality electron beams with energies up to 7 GeV and tunable energy chirp, enabling both FEL saturation from the EUV to X-ray wavelengths and large bandwidth operation with a bandwidth of up to 11%. This work provides a roadmap for compact, multi-mode FELs based on plasma acceleration, and the high-energy, high-quality beams achieved also point toward compact injectors for next-generation storage-ring light sources.

Submitted 30 March, 2026; originally announced March 2026.

arXiv:2603.28162 [pdf, ps, other]

ColorFLUX: A Structure-Color Decoupling Framework for Old Photo Colorization

Authors: Bingchen Li, Zhixin Wang, Fan Li, Jiaqi Xu, Jiaming Guo, Renjing Pei, Xin Li, Zhibo Chen

Abstract: Old photos preserve invaluable historical memories, making their restoration and colorization highly desirable. While existing restoration models can address some degradation issues like denoising and scratch removal, they often struggle with accurate colorization. This limitation arises from the unique degradation inherent in old photos, such as faded brightness and altered color hues, which are different from modern photo distributions, creating a substantial domain gap during colorization. In this paper, we propose a novel old photo colorization framework based on the generative diffusion model FLUX. Our approach introduces a structure-color decoupling strategy that separates structure preservation from color restoration, enabling accurate colorization of old photos while maintaining structural consistency. We further enhance the model with a progressive Direct Preference Optimization (Pro-DPO) strategy, which allows the model to learn subtle color preferences through coarse-to-fine transitions in color augmentation. Additionally, we address the limitations of text-based prompts by introducing visual semantic prompts, which extract fine-grained semantic information directly from old photos, helping to eliminate the color bias inherent in old photos. Experimental results on both synthetic and real datasets demonstrate that our approach outperforms existing state-of-the-art colorization methods, including closed-source commercial models, producing high-quality and vivid colorization.

Submitted 30 March, 2026; originally announced March 2026.

Comments: Accepted by CVPR26

arXiv:2603.27925 [pdf, ps, other]

Universal $R$-matrix of double parameter quantum affine algebra $U_{q,Q}({\hat {sl_2}})$

Authors: Fengchang Li, Masatake Maruyama, Hiroyuki Yamane

Abstract: We give the explicit formula of the universal $R$-matrix of a double parameter (or two-parameter, or multi-parameter) quantum affine algebra of type ${\mathrm{A}}_1^{(1)}$. For $N$ with $q_{00}q_{01}$ being a primitive $N$-th root of unity, we introduce its $2N$-dimensional representation and explicitly calculate the $R$-matrix associated with it via the universal $R$-matrix.

Submitted 29 March, 2026; originally announced March 2026.

Comments: 32 pages, any comment is welcome

arXiv:2603.27703 [pdf, ps, other]

KAT-Coder-V2 Technical Report

Authors: Fengxiang Li, Han Zhang, Haoyang Huang, Jinghui Wang, Jinhua Hao, Kun Yuan, Mengtong Li, Minglei Zhang, Pengcheng Xu, Wenhao Zhuang, Yizhen Shao, Zongxian Feng, Can Tang, Chao Wang, Chengxiao Tong, Fan Yang, Gang Xiong, Haixuan Gao, Han Gao, Hao Wang, Haochen Liu, Hongliang Sun, Jiabao Li, Jingwen Chang, Jun Du , et al. (21 additional authors not shown)

Abstract: We present KAT-Coder-V2, an agentic coding model developed by the KwaiKAT team at Kuaishou. KAT-Coder-V2 adopts a "Specialize-then-Unify" paradigm that decomposes agentic coding into five expert domains - SWE, WebCoding, Terminal, WebSearch, and General - each undergoing independent supervised fine-tuning and reinforcement learning, before being consolidated into a single model via on-policy distillation. We develop KwaiEnv, a modular infrastructure sustaining tens of thousands of concurrent sandbox instances, and scale RL training along task complexity, intent alignment, and scaffold generalization. We further propose MCLA for stabilizing MoE RL training and Tree Training for eliminating redundant computation over tree-structured trajectories with up to 6.2x speedup. KAT-Coder-V2 achieves 79.6% on SWE-bench Verified (vs. Claude Opus 4.6 at 80.8%), 88.7 on PinchBench (surpassing GLM-5 and MiniMax M2.7), ranks first across all three frontend aesthetics scenarios, and maintains strong generalist scores on Terminal-Bench Hard (46.8) and tau^2-Bench (93.9). Our model is publicly available at https://streamlake.com/product/kat-coder.

Submitted 29 March, 2026; originally announced March 2026.

Comments: 22 pages, 7 figures

arXiv:2603.26174 [pdf, ps, other]

CREval: An Automated Interpretable Evaluation for Creative Image Manipulation under Complex Instructions

Authors: Chonghuinan Wang, Zihan Chen, Yuxiang Wei, Tianyi Jiang, Xiaohe Wu, Fan Li, Wangmeng Zuo, Hongxun Yao

Abstract: Instruction-based multimodal image manipulation has recently made rapid progress. However, existing evaluation methods lack a systematic and human-aligned framework for assessing model performance on complex and creative editing tasks. To address this gap, we propose CREval, a fully automated question-answer (QA)-based evaluation pipeline that overcomes the incompleteness and poor interpretability of opaque Multimodal Large Language Models (MLLMs) scoring. Simultaneously, we introduce CREval-Bench, a comprehensive benchmark specifically designed for creative image manipulation under complex instructions. CREval-Bench covers three categories and nine creative dimensions, comprising over 800 editing samples and 13K evaluation queries. Leveraging this pipeline and benchmark, we systematically evaluate a diverse set of state-of-the-art open and closed-source models. The results reveal that while closed-source models generally outperform open-source ones on complex and creative tasks, all models still struggle to complete such edits effectively. In addition, user studies demonstrate strong consistency between CREval's automated metrics and human judgments. Therefore, CREval provides a reliable foundation for evaluating image editing models on complex and creative image manipulation tasks, and highlights key challenges and opportunities for future research.

Submitted 27 March, 2026; originally announced March 2026.

Comments: Accepted by CVPR2026

arXiv:2603.25938 [pdf, ps, other]

Narrowband searches for continuous gravitational waves from known pulsars in the first two parts of the fourth LIGO--Virgo--KAGRA observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, A. Adam, C. Adamcewicz, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, A. Agapito, D. Agarwal, M. Agathos, N. Aggarwal, S. Aggarwal, O. D. Aguiar, I. -L. Ahrend, L. Aiello, A. Ain, P. Ajith , et al. (1831 additional authors not shown)

Abstract: Rotating non-axisymmetric neutron stars (NSs) are promising sources for continuous gravitational waves (CWs). Such CWs can, if detected, inform us about the internal structure and equation of state of NSs. Here, we present a narrowband search for CWs from known pulsars, for which an efficient and sensitive matched-filter search can be applied. Narrowband searches are designed to be robust to mismatches between the electromagnetic (EM) and gravitational emissions, in contrast to fully targeted searches where the CW emission is assumed to be phase-locked to the EM one. In this work, we search for the CW counterparts emitted by 34 pulsars using data from the first and second parts of the fourth LIGO--Virgo--KAGRA observing run. This is the largest number of pulsars so far targeted for narrowband searches in the advanced detector era. We use the 5n-vector narrowband pipeline, which applies frequency-domain matched filtering. In previous searches, it covered a narrow range in the frequency -- frequency time derivative ($f$ -- $\dot{f}$) space. Here, we also explore a range in the second time derivative of the frequency $\ddot{f}$ around the value indicated by EM observations. Additionally, for the first time, we target sources in a binary system with this kind of search. We find no evidence for CWs and therefore set upper limits on the strain amplitude emitted by each pulsar, using simulated signals added in real data. For 20 analyses, we report an upper limit below the theoretical spin-down limit. The tightest constraint is for pulsar PSR J0534+2200 (the Crab pulsar), for which our strain upper limit on the CW amplitude is $\lesssim 2\%$ of its spin-down limit, corresponding to less than $0.04\%$ of the spin-down power being radiated in the CW channel.
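
Worked check of the two Crab figures quoted in the abstract above, as a sketch under one assumption: the power radiated in gravitational waves scales as the square of the strain amplitude, so the power fraction is the square of the strain ratio. The symbols $h_0^{\rm UL}$ (strain upper limit), $h_0^{\rm sd}$ (spin-down limit), $\dot{E}_{\rm GW}$ and $\dot{E}_{\rm sd}$ are notation introduced here for illustration and do not appear in the abstract.

$$\frac{\dot{E}_{\rm GW}}{\dot{E}_{\rm sd}} = \left(\frac{h_0^{\rm UL}}{h_0^{\rm sd}}\right)^{2} \lesssim (0.02)^{2} = 4\times10^{-4} = 0.04\%$$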

Submitted 26 March, 2026; originally announced March 2026.

Comments: 30 pages, 6 figures, submitted to ApJ

Report number: LIGO-P2500612

arXiv:2603.25808 [pdf, ps, other]

Searches for Continuous Gravitational Waves from Supernova Remnants in the first part of the LIGO-Virgo-KAGRA Fourth Observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, A. Agapito, D. Agarwal, M. Agathos, N. Aggarwal, S. Aggarwal, O. D. Aguiar, I. -L. Ahrend, L. Aiello, A. Ain, P. Ajith, T. Akutsu , et al. (1742 additional authors not shown)

Abstract: We present results from directed searches for continuous gravitational waves from a sample of 15 nearby supernova remnants, likely hosting young neutron star candidates, using data from the first eight months of the fourth observing run (O4) of the LIGO-Virgo-KAGRA Collaboration. The analysis employs five pipelines: four semi-coherent methods -- the Band-Sampled-Data directed pipeline, Weave and two Viterbi pipelines (single- and dual-harmonic) -- and PyStoch, a cross-correlation-based pipeline. These searches cover wide frequency bands and do not assume prior knowledge of the targets' ephemerides. No evidence of a signal is found from any of the 15 sources. We set 95\% confidence-level upper limits on the intrinsic strain amplitude, with the most stringent constraints reaching $\sim 4 \times 10^{-26}$ near 300 Hz for the nearby source G266.2$-$1.2 (Vela Jr.). We also derive limits on neutron star ellipticity and $r$-mode amplitudes for the same source, with the best constraints reaching $\lesssim 10^{-7}$ and $\lesssim 10^{-5}$, respectively, at frequencies above 400 Hz. These results represent the most sensitive wide-band directed searches for continuous gravitational waves from supernova remnants to date.

Submitted 2 April, 2026; v1 submitted 26 March, 2026; originally announced March 2026.

arXiv:2603.25649 [pdf, ps, other]

Amplitude analysis and branching fraction measurement of the decay $D^0 \to K^+K^-π^0π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, M. S. Anderson, Y. Bai, O. Bakina, H. R. Bao, X. L. Bao, M. Barbagiovanni, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone , et al. (749 additional authors not shown)

Abstract: An amplitude analysis of the singly Cabibbo-suppressed decay $D^0 \to K^+ K^- π^0 π^0$ is performed, for the first time, to determine the relative magnitudes and phases of different intermediate processes. The analysis uses $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy 3.773~GeV corresponding to an integrated luminosity of 20.3 $\rm fb^{-1}$. The absolute branching fraction of $D^0 \to K^+ K^- π^0 π^0$ is measured to be \BF. The dominant intermediate process is $D^0 \to K^{*}(892)^+K^{*}(892)^-$, with a branching fraction of $(2.79 \pm 0.13_{\rm{stat.}} \pm 0.11_{\rm{syst.}}) \times 10^{-3}$. Amplitude analysis reveals that the $D^0 \to K^{*}(892)^+K^{*}(892)^-$ decay is S-wave dominant. The longitudinal polarization fraction of $D^0 \to K^{*}(892)^+ K^{*}(892)^-$ is measured to be $0.468\pm0.046_{\rm{stat.}}\pm0.011_{\rm{syst.}}$.

Submitted 30 March, 2026; v1 submitted 26 March, 2026; originally announced March 2026.

arXiv:2603.24272 [pdf, ps, other]

Cross Section Measurements of $\bar{n}p \rightarrow K^{+}K^{-}π^{+}(π^{0})$ via Antineutrons Produced by $J/ψ\to p π^{-} \bar{n}$ Decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Aliberti, A. Amoroso, Q. An, Y. H. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, X. L. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. B. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (737 additional authors not shown)

Abstract: Based on a novel method for producing antineutrons via $J/ψ$ decays, we report a study of $\bar{n}p$ inelastic scattering into final states containing kaons. The analysis uses $(10087\pm44)\times 10^6$ $J/ψ$ events collected at the BESIII detector operating at the BEPCII storage ring. Antineutrons are produced via $J/ψ\to p π^{-} \bar{n}$ decays and tagged by the detected protons and pions, resulting in antineutron momenta ranging from 0 to 1174~MeV/$c$, while target protons are provided by the hydrogen in the beam-pipe material. The cross sections of the reactions $\bar{n}p \rightarrow K^{+}K^{-}π^{+}$ and $\bar{n}p \rightarrow K^{+}K^{-}π^{+}π^{0}$ are measured to be $0.53^{+0.15}_{-0.12} \pm 0.08$~mb and $1.09^{+0.36}_{-0.30} \pm 0.31$~mb respectively, where the first uncertainties are statistical and the second systematic. Due to limited statistics, the intermediate states in these processes are not investigated. The observation of clean antineutron-proton scattering events indicates the potential of this approach for future investigations of antineutron-proton interactions.

Submitted 25 March, 2026; originally announced March 2026.

arXiv:2603.23737 [pdf, ps, other]

Risk-Aware Linear-Quadratic Regulation with Temporally Coupled States

Authors: Chuanning Wei, Kin Fung Li, Dionysis Kalogerias, Margaret P. Chapman

Abstract: We formulate and solve a discrete-time linear-quadratic regulation (LQR) problem in a finite horizon that penalizes temporal variability and stochastic variability of the state trajectory. Our approach enables the user to strike a balance between regulating the state and reducing temporal variability, with explicit sensitivity to risk. We achieve this by extending a risk measure called predictive variance to a setting with temporally coupled states. Numerical examples demonstrate the effect of temporal coupling in both risk-aware and risk-neutral control settings. Particularly, we observe that explicitly penalizing temporal variability alone can also reduce stochastic variability.