-
Full-polarization millimeter wavelength variability of Sagittarius A* during the 2018 EHT campaign
Authors:
Ezequiel Albentosa-Ruiz,
Jasmin E. Washington,
Nicola Marchili,
Iván Martí-Vidal,
Ciriaco Goddi,
Maciek Wielgus,
Alejandro Mus,
Angelo Ricarte,
Daniel P. Marrone,
León D. S. Salas,
Yuhei Iwata,
Douglas F. Carlos,
Alexandra J. Tetarenko,
Kotaro Moriyama,
Vedant Dhruv,
Kazunori Akiyama,
Antxon Alberdi,
Walter Alef,
Juan Carlos Algaba,
Richard Anantua,
Keiichi Asada,
Rebecca Azulay,
Uwe Bach,
Anne-Kathrin Baczko,
David Ball
et al. (250 additional authors not shown)
Abstract:
Sagittarius A* (Sgr A*), the supermassive black hole at the center of the Milky Way, provides a unique laboratory to study accretion dynamics and plasma processes near the event horizon. We investigated the variability and polarization properties of Sgr A* using ALMA observations during the 2018 Event Horizon Telescope campaign. We analyzed high-cadence full-polarization light curves from ALMA at millimeter wavelengths, performed time-series analysis, and investigated the temporal behavior during an X-ray flare observed by Chandra on 2018 April 24. The variability characteristics are compared with expectations from standard accretion flow models. We find low variability in total intensity ($σ/μ < 10\%$), but significantly higher variability in linear and circular polarization (~30% and ~50%, respectively). A time-series analysis reveals red-noise variability, with power spectral density slopes between -2 and -3 across all Stokes parameters. Polarized intensity shows stable intra-day timescales, while total intensity exhibits more variable timescales, suggesting distinct emission regions, with polarization likely arising from a coherent structure. On April 24, a statistically significant inter-band delay in polarized intensity coincides with a near-simultaneous X-ray and millimeter peak that deviates from the typical delayed-flare scenario. This event also features enhanced millimeter variability and coherent polarization loop evolution. The observed simultaneity challenges standard models of transient synchrotron emission with cooling delays, favoring instead a scenario of continuous energy injection in an optically thin region. Our results offer new constraints on the physical mechanisms driving variability in Sgr A*, and provide key observational input for refining theoretical models of accretion and plasma behavior in the vicinity of supermassive black holes.
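As a rough illustration of the two quantities quoted above (the modulation index σ/μ and a red-noise power spectral density slope), the following minimal numpy sketch estimates both from an evenly sampled light curve; the synthetic data, cadence, and fitting choices are placeholders, not the paper's pipeline.

```python
# Illustrative sketch (not the paper's analysis): estimate the modulation
# index sigma/mu and a red-noise PSD slope from an evenly sampled light curve.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a Stokes-I light curve: integrated (red) noise.
dt = 10.0  # assumed seconds between samples
flux = 3.0 + 0.05 * np.cumsum(rng.normal(size=4096)) / 40.0

# Modulation index sigma/mu quantifies fractional variability.
mod_index = flux.std() / flux.mean()

# Periodogram and a power-law fit P(f) ~ f^alpha in log-log space.
freqs = np.fft.rfftfreq(flux.size, d=dt)[1:]
power = np.abs(np.fft.rfft(flux - flux.mean())[1:]) ** 2
alpha, _ = np.polyfit(np.log10(freqs), np.log10(power), deg=1)

print(f"modulation index sigma/mu = {mod_index:.3f}")
print(f"PSD slope alpha ~ {alpha:.2f}  (red noise corresponds to slopes between -2 and -3)")
```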
Submitted 11 April, 2026;
originally announced April 2026.
-
Inhomogeneous Scaling Function and Heat Kernel Estimates on Fractals Satisfying Some Resistance Conditions
Authors:
Diwen Chang,
Guanhua Liu
Abstract:
In this paper, we focus on strongly local regular Dirichlet forms, especially those satisfying Morrey-type inequalities. We prove the equivalence between resistance estimates and heat kernel estimates in this case. Self-similar forms on fractals serve as a major application, where we construct a spatially inhomogeneous scaling function and characterize all the doubling self-similar measures. Further, on some special examples, the resistance conditions are reduced to geometric conditions, on which a complete theory of self-similar Dirichlet spaces is established. In particular, we construct a concrete example on rotated triangle fractals, where the optimal heat kernel estimate is not related at all to the lower scaling exponent.
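For orientation only: "heat kernel estimates" in this setting usually refers to two-sided sub-Gaussian bounds of the standard form below, written with the usual homogeneous scaling $t^{1/\beta}$. The paper's contribution is precisely to replace that scaling with a spatially inhomogeneous scaling function, so this display is a baseline reminder rather than the paper's statement.

```latex
% Standard sub-Gaussian bound HK(\beta) with a volume-doubling measure \mu;
% shown only as the homogeneous special case the paper generalizes.
p_t(x,y) \asymp \frac{C}{\mu\bigl(B(x,t^{1/\beta})\bigr)}
\exp\!\left(-c\left(\frac{d(x,y)^{\beta}}{t}\right)^{\frac{1}{\beta-1}}\right),
\qquad 0 < t \le \operatorname{diam}(X)^{\beta}.
```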
Submitted 4 April, 2026;
originally announced April 2026.
-
MuonEq: Balancing Before Orthogonalization with Lightweight Equilibration
Authors:
Da Chang,
Qiankun Shi,
Lvgang Zhang,
Yu Li,
Ruijie Zhang,
Yao Lu,
Yongxiang Liu,
Ganzhao Yuan
Abstract:
Orthogonalized-update optimizers such as Muon improve training of matrix-valued parameters, but existing extensions mostly act either after orthogonalization by rescaling updates or before it with heavier whitening-based preconditioners. We introduce MuonEq, a lightweight family of pre-orthogonalization equilibration schemes for Muon in three forms: two-sided row/column normalization (RC), row normalization (R), and column normalization (C). These variants rebalance the momentum matrix before finite-step Newton-Schulz using row/column squared-norm statistics and only $\mathcal{O}(m+n)$ auxiliary state. We show that finite-step orthogonalization is governed by input spectral properties, especially stable rank and condition number, and that row/column normalization is a zeroth-order whitening surrogate that removes marginal scale mismatch. For the hidden matrix weights targeted by MuonEq, the row-normalized variant R is the natural default and preserves the $\widetilde{\mathcal{O}}(T^{-1/4})$ stationarity guarantee of Muon-type methods. In LLaMA2 pretraining on C4, the default R variant consistently outperforms Muon on 130M and 350M models, yielding faster convergence and lower validation perplexity.
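A minimal numpy sketch of the idea, under stated simplifications: the "R" variant is taken to be per-row RMS normalization of the momentum, and the orthogonalization uses the classic cubic Newton-Schulz iteration rather than Muon's tuned polynomial; names, sizes, and step counts are illustrative.

```python
# Minimal sketch (assumed: cubic Newton-Schulz, not Muon's tuned quintic;
# "R" variant = per-row RMS normalization of the momentum matrix).
import numpy as np

def row_normalize(M, eps=1e-8):
    # Rescale each row by its RMS so row scales are balanced before
    # orthogonalization; needs only O(m) auxiliary statistics.
    rms = np.sqrt((M * M).mean(axis=1, keepdims=True)) + eps
    return M / rms

def newton_schulz_orth(M, steps=8):
    # Classic cubic iteration X <- 1.5 X - 0.5 X X^T X, applied to the
    # matrix pre-scaled by its spectral norm so the iteration converges.
    X = M / (np.linalg.norm(M, 2) + 1e-8)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ (X.T @ X)
    return X

rng = np.random.default_rng(0)
momentum = rng.normal(size=(256, 128)) * np.logspace(0, 3, 256)[:, None]  # badly scaled rows

for name, M in [("plain", momentum), ("row-normalized", row_normalize(momentum))]:
    U = newton_schulz_orth(M)
    err = np.linalg.norm(U.T @ U - np.eye(128)) / np.sqrt(128)
    print(f"{name:>15}: ||U^T U - I||_F / sqrt(n) = {err:.3f}")
```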
Submitted 30 March, 2026;
originally announced March 2026.
-
Entire Period Transient Stability of Synchronous Generators Considering LVRT Switching of Nearby Renewable Energy Sources
Authors:
Bingfang Li,
Songhao Yang,
Guosong Wang,
Yiwen Hu,
Xu Zhang,
Zhiguo Hao,
Dongxu Chang,
Baohui Zhang
Abstract:
In scenarios where synchronous generators (SGs) and grid-following renewable energy sources (GFLRs) are co-located, existing research, which mainly focuses on the first-swing stability of SGs, often overlooks ongoing dynamic interactions between GFLRs and SGs throughout the entire rotor swing period. To address this gap, this study first reveals that the rotor angle oscillations of SGs can cause periodic grid voltage fluctuations, potentially triggering low-voltage ride-through (LVRT) control switching of GFLRs repeatedly. Then, the periodic energy changes of SGs under "circular" and "rectangular" LVRT limits are analyzed. The results indicate that circular limits are detrimental to the SG's first-swing stability, while rectangular limits and their slow recovery strategies can lead to the SG's multi-swing instability. Conservative stability criteria are also proposed for these phenomena. Furthermore, an additional controller based on feedback linearization is introduced to enhance the entire-period transient stability of the SG by adjusting the post-fault GFLR output current. Finally, the efficacy of the analysis is validated through electromagnetic transient simulations and controller hardware-in-the-loop (CHIL) tests.
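The "circular" and "rectangular" LVRT current limits contrasted above can be sketched as follows; the limiter shapes, axis priorities, and per-unit values are illustrative assumptions, not the paper's exact GFLR control logic.

```python
# Toy sketch of the two LVRT current-limit shapes (illustrative values only).
import numpy as np

I_MAX = 1.1  # assumed per-unit converter current limit

def circular_limit(i_d, i_q):
    # "Circular" limit: cap the current-vector magnitude, keeping its angle.
    mag = np.hypot(i_d, i_q)
    scale = min(1.0, I_MAX / mag)
    return i_d * scale, i_q * scale

def rectangular_limit(i_d, i_q, id_max=0.5, iq_max=1.0):
    # "Rectangular" limit: clip the d- and q-axis currents independently.
    return float(np.clip(i_d, -id_max, id_max)), float(np.clip(i_q, -iq_max, iq_max))

demand = (0.9, 0.8)  # (active, reactive) current demanded during the voltage dip
print("circular   :", circular_limit(*demand))
print("rectangular:", rectangular_limit(*demand))
```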
Submitted 26 March, 2026;
originally announced March 2026.
-
Photon Ring Astrometry I: A Simple Spin Measurement Technique for High-Resolution Images of M87*
Authors:
Delilah E. A. Gates,
Dominic O. Chang,
Aaron Held,
Daniel C. M. Palumbo
Abstract:
The central supermassive black hole of the galaxy M87 is currently a target for precision spin measurement using high-resolution, horizon-scale imaging. Such observations aim to resolve the first lensed (${n}~{=}~{1}$) sub-image of the photon ring from the broader direct image. In this work, we identify a concrete observable -- the displacement between the centers of the ${n}~{=}~{1}$ photon-ring sub-image and the direct image -- and propose its use in a simple spin-measurement technique. Leveraging the assumption that the observed large-scale jet of M87 is aligned with the black-hole spin axis, we separate the relative position of the photon ring into components parallel and transverse to the projected spin axis, normalizing both components with respect to the measured diameter of the ${n}~{=}~{1}$ sub-image. We show that the parallel shift is primarily determined by inclination and emission radius, while the transverse shift is tightly correlated with inclination and spin. We demonstrate these effects both in a simple geometric model (to explain the underlying physics) and in GRMHD simulations with magnetically arrested disks (to provide realistic instantiations of the effect). We find that a relative astrometric resolution of ${\lesssim}~{0.1\;μ\rm{as}}$ is sufficient to constrain the spin to better than 9% if the accretion flow is prograde or 22% if the flow is retrograde. If the direction of the accretion flow is undetermined, the spin can be constrained to within 26%. More generally, this identifies relative photon ring astrometry as a promising method to constrain the underlying spacetime geometry and introduces a spin-constraint technique that does not rely on geometric modeling of the observed emission.
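The proposed observable can be illustrated with a few lines of geometry: project the ring-to-direct-image displacement onto axes parallel and transverse to the assumed jet/spin axis and normalize by the n=1 ring diameter. All numbers below (offsets, ring diameter, jet position angle) are placeholders.

```python
# Illustrative geometry only; not the paper's measurement pipeline.
import numpy as np

def normalized_shift(ring_center, direct_center, jet_pa_deg, ring_diameter):
    """Positions in microarcseconds; jet_pa_deg measured east of north."""
    d = np.asarray(ring_center) - np.asarray(direct_center)
    pa = np.deg2rad(jet_pa_deg)
    # Unit vectors along and perpendicular to the projected spin axis,
    # in (RA, Dec) coordinates (RA increases to the east).
    e_par = np.array([np.sin(pa), np.cos(pa)])
    e_perp = np.array([np.cos(pa), -np.sin(pa)])
    return d @ e_par / ring_diameter, d @ e_perp / ring_diameter

par, perp = normalized_shift(ring_center=(0.05, -0.02),   # placeholder offsets in uas
                             direct_center=(0.00, 0.00),
                             jet_pa_deg=288.0,             # approximate M87 jet position angle
                             ring_diameter=40.0)
print(f"parallel shift = {par:+.4f},  transverse shift = {perp:+.4f}  (in ring diameters)")
```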
Submitted 25 March, 2026;
originally announced March 2026.
-
CAF-Score: Calibrating CLAP with LALMs for Reference-free Audio Captioning Evaluation
Authors:
Insung Lee,
Taeyoung Jeong,
Haejun Yoo,
Du-Seong Chang,
Myoung-Wan Koo
Abstract:
While Large Audio-Language Models (LALMs) have advanced audio captioning, robust evaluation remains difficult. Reference-based metrics are expensive and often fail to assess acoustic fidelity, while Contrastive Language-Audio Pretraining (CLAP)-based approaches frequently overlook syntactic errors and fine-grained details. We propose CAF-Score, a reference-free metric that calibrates CLAP's coarse-grained semantic alignment with the fine-grained comprehension and syntactic awareness of LALMs. By combining contrastive audio-text embeddings with LALM reasoning, CAF-Score effectively detects syntactic inconsistencies and subtle hallucinations. Experiments on the BRACE benchmark demonstrate that our approach achieves the highest correlation with human judgments, even outperforming reference-based baselines in challenging scenarios. These results highlight the efficacy of CAF-Score for reference-free audio captioning evaluation. Code and results are available at https://github.com/inseong00/CAF-Score.
Submitted 19 March, 2026;
originally announced March 2026.
-
Recolour What Matters: Region-Aware Colour Editing via Token-Level Diffusion
Authors:
Yuqi Yang,
Dongliang Chang,
Yijia Ling,
Ruoyi Du,
Zhanyu Ma
Abstract:
Colour is one of the most perceptually salient yet least controllable attributes in image generation. Although recent diffusion models can modify object colours from user instructions, their results often deviate from the intended hue, especially for fine-grained and local edits. Early text-driven methods rely on discrete language descriptions that cannot accurately represent continuous chromatic variations. To overcome this limitation, we propose ColourCrafter, a unified diffusion framework that transforms colour editing from global tone transfer into a structured, region-aware generation process. Unlike traditional colour-driven methods, ColourCrafter performs token-level fusion of RGB colour tokens and image tokens in latent space, selectively propagating colour information to semantically relevant regions while preserving structural fidelity. A perceptual Lab-space loss further enhances pixel-level precision by decoupling luminance and chrominance and constraining edits within masked areas. Additionally, we build ColourfulSet, a large-scale dataset of high-quality image pairs with continuous and diverse colour variations. Extensive experiments demonstrate that ColourCrafter achieves state-of-the-art colour accuracy, controllability and perceptual fidelity in fine-grained colour editing. Our project is available at https://yangyuqi317.github.io/ColourCrafter.github.io/.
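A masked Lab-space loss of the kind described can be written in a minimal sketch; the luminance/chrominance weights, mask handling, and use of skimage's rgb2lab are illustrative assumptions, not the exact ColourCrafter objective.

```python
# Minimal sketch of a masked Lab-space loss decoupling luminance (L) from
# chrominance (a, b); weights and structure are assumptions for illustration.
import numpy as np
from skimage.color import rgb2lab

def masked_lab_loss(pred_rgb, target_rgb, mask, w_lum=0.2, w_chroma=1.0):
    """pred_rgb, target_rgb: float arrays in [0, 1], shape (H, W, 3); mask: (H, W) in {0, 1}."""
    diff = (rgb2lab(pred_rgb) - rgb2lab(target_rgb)) * mask[..., None]
    lum = np.mean(diff[..., 0] ** 2)       # L channel: luminance
    chroma = np.mean(diff[..., 1:] ** 2)   # a, b channels: chrominance
    return w_lum * lum + w_chroma * chroma

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))
edit = np.clip(img + 0.05 * rng.standard_normal(img.shape), 0, 1)
mask = np.zeros((64, 64))
mask[16:48, 16:48] = 1.0  # only penalize the edited region
print(f"loss = {masked_lab_loss(edit, img, mask):.4f}")
```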
Submitted 18 March, 2026;
originally announced March 2026.
-
Integrating Inductive Biases in Transformers via Distillation for Financial Time Series Forecasting
Authors:
Yu-Chen Den,
Kuan-Yu Chen,
Kendro Vincent,
Darby Tien-Hao Chang
Abstract:
Transformer-based models have been widely adopted for time-series forecasting due to their high representational capacity and architectural flexibility. However, many Transformer variants implicitly assume stationarity and stable temporal dynamics -- assumptions routinely violated in financial markets characterized by regime shifts and non-stationarity. Empirically, state-of-the-art time-series Transformers often underperform even vanilla Transformers on financial tasks, while simpler architectures with distinct inductive biases, such as CNNs and RNNs, can achieve stronger performance with substantially lower complexity. At the same time, no single inductive bias dominates across markets or regimes, suggesting that robust financial forecasting requires integrating complementary temporal priors. We propose TIPS (Transformer with Inductive Prior Synthesis), a knowledge distillation framework that synthesizes diverse inductive biases -- causality, locality, and periodicity -- within a unified Transformer. TIPS trains bias-specialized Transformer teachers via attention masking, then distills their knowledge into a single student model with regime-dependent alignment across inductive biases. Across four major equity markets, TIPS achieves state-of-the-art performance, outperforming strong ensemble baselines by 55%, 9%, and 16% in annual return, Sharpe ratio, and Calmar ratio, while requiring only 38% of the inference-time computation. Further analyses show that TIPS generates statistically significant excess returns beyond both vanilla Transformers and its teacher ensembles, and exhibits regime-dependent behavioral alignment with classical architectures during their profitable periods. These results highlight the importance of regime-dependent inductive bias utilization for robust generalization in non-stationary financial time series.
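The three inductive biases named above could be encoded as attention masks along the following lines; the window and period values, and the mask construction itself, are illustrative assumptions rather than the paper's teacher definitions.

```python
# Sketch of attention masks encoding causality, locality, and periodicity
# (illustrative parameters; not the TIPS teacher construction).
import numpy as np

def causal_mask(T):
    # Position t may only attend to positions <= t.
    return np.tril(np.ones((T, T), dtype=bool))

def local_mask(T, window=5):
    # Position t attends only to nearby positions (CNN-like bias).
    idx = np.arange(T)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def periodic_mask(T, period=5):
    # Position t attends to positions separated by multiples of a fixed period.
    idx = np.arange(T)
    return (idx[:, None] - idx[None, :]) % period == 0

T = 12
for name, m in [("causal", causal_mask(T)), ("local", local_mask(T)),
                ("periodic", periodic_mask(T))]:
    print(name, "allowed attention entries:", int(m.sum()))
```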
Submitted 17 March, 2026;
originally announced March 2026.
-
Quantifying surface losses in superconducting aluminum microwave resonators
Authors:
Elizabeth Hedrick,
Faranak Bahrami,
Alexander C. Pakpour-Tabrizi,
Atharv Joshi,
Q. Rumman Rahman,
Ambrose Yang,
Ray D. Chang,
Matthew P. Bland,
Apoorv Jindal,
Guangming Cheng,
Nan Yao,
Robert J. Cava,
Andrew A. Houck,
Nathalie P. de Leon
Abstract:
The recent realization of millisecond-scale coherence with tantalum-on-silicon transmon qubits showed that depositing the Al/AlOx/Al Josephson junction in a high purity, ultrahigh vacuum environment was critical for achieving lifetime-limited coherence, motivating careful examination of the aluminum surface two-level system (TLS) bath. Here, we measure the microwave absorption arising from surface TLSs in superconducting aluminum resonators, following methodology developed for tantalum resonators. We vary film and surface properties and correlate microwave measurements with materials characterization. We find that the lifetimes of superconducting aluminum resonators are primarily limited by surface losses associated with TLSs in the 2.7 nm-thick native AlOx. Treatment with 49% HF removes surface AlOx completely; however, rapid oxide regrowth limits improvements in surface loss and long term device stability. Using these measurements we estimate that TLSs in aluminum interfaces contribute around 27% of the relaxation rate of state-of-the-art tantalum-on-silicon qubits that incorporate aluminum-based Josephson junctions.
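For context, surface-TLS-limited quality factors are commonly interpreted with the standard saturable loss-tangent model sketched below; the parameter values (and the use of this particular model) are assumptions for illustration, not the fit reported in the paper.

```python
# Standard two-level-system (TLS) loss model, often used to interpret such
# measurements; parameter values below are placeholders.
import numpy as np

HBAR = 1.054571817e-34  # J*s
KB = 1.380649e-23       # J/K

def tls_loss(n_photon, T, f0=5e9, F_delta0=2e-6, n_c=10.0, beta=0.5):
    """Effective TLS loss tangent F*delta_TLS for a resonator at frequency f0."""
    hw = 2 * np.pi * f0 * HBAR
    return F_delta0 * np.tanh(hw / (2 * KB * T)) / (1 + n_photon / n_c) ** beta

for n in (1, 1e3, 1e6):
    q_tls = 1.0 / tls_loss(n, T=0.02)  # 20 mK, from single-photon to high power
    print(f"n = {n:>8.0f} photons -> TLS-limited Q ~ {q_tls:.2e}")
```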
Submitted 25 March, 2026; v1 submitted 13 March, 2026;
originally announced March 2026.
-
Interface Engineered Moiré Graphene Superlattices: Breaking the Auger Carrier Multiplication Limit for Infrared Single-Photon Detection
Authors:
Sichao Du,
Ning Li,
Zhufeng Pan,
Munir Ali,
Hengrui Zhang,
Duokai Chang,
Yuehang Zhang,
Qiang Wen,
Shuo Zhang,
Hao Wu,
Yunlei Sun,
Qiuting Wang,
Hao Xie,
Chaohao Chen,
Zhenyi Ni,
Qiangbing Guo,
Duo Xiao,
Wen-Yan Yin
Abstract:
Hot electrons that undergo Auger scattering during their relaxation experience a multiplication effect, which generates additional electrons above the Fermi level and thus improves the efficiency of photoelectric signal conversion. However, the photocurrent gain provided by Auger carrier multiplication is generally limited to values below 5, due to the rapid recombination of photo-generated charge carriers and the inherently low light absorption of two-dimensional materials. Herein, by twisting graphene to an interlayer angle of 10°, we report layer-dependent electronic correlations that lead to an efficient carrier multiplication gain of $10^{3}$. This is primarily provided by the additional localized density of states at the interface of the bilayer 10° moiré graphene, together with the enhanced interlayer coupling of electron waves in a five-layer moiré graphene superlattice structure. We can therefore harvest the hot electrons during their energy relaxation through a thermalized optical-phonon bottleneck effect. It is this effect that allows the accumulated hot electrons to reach a maximum Auger scattering rate of $\sim 10^{10}$ ps$^{-1}$ cm$^{-2}$. Furthermore, the ballistic transport of these hot electrons and the Schottky barrier formed with a 90 nm-thick silicon-on-insulator (SOI) silicon layer effectively block the thermal noise, leading to highly sensitive near-infrared detection. At a low incident light power of $\sim 10^{-13}$ W/cm$^{2}$, the resulting signal-to-noise ratio exceeds 100 dB. This work exploits the strengthened electromagnetic interaction from highly thermalized optical phonons in stacked moiré graphene. The observed hot-electron multiplication suggests the applicability of van der Waals moiré superlattice architectures for harvesting charge carriers, paving the way toward infrared single-photon avalanche detectors.
Submitted 10 March, 2026;
originally announced March 2026.
-
Black Hole Vision: An Interactive iOS Application for Visualizing Black Holes
Authors:
Roman Berens,
Dominic O. Chang,
Trevor Gravely,
Alexandru Lupsasca
Abstract:
The Black Hole Explorer (BHEX) is a proposed mission to launch a sub-millimeter radio telescope into Earth orbit that will take the sharpest images in the history of astronomy and reveal novel horizon-scale features of supermassive black holes. Black Hole Vision is an open-source application, freely available on the iOS App Store, that produces lensed images which highlight the key features expected to appear in the black hole images BHEX will capture. The app combines video feeds from the front- and rear-facing iPhone cameras and uses the black hole lensing equations to synthesize an onscreen image displaying the user's surroundings as if they were gravitationally lensed by a black hole within the cameras' field of view. Here, we describe how light rays are lensed by non-rotating (Schwarzschild) and rotating (Kerr) black holes, and we list the equations needed for computing black-hole-lensed images. We also describe their specific implementation within Black Hole Vision.
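A small worked example in the Schwarzschild case: the lensing equations give a critical impact parameter $b_c = \sqrt{27}\,GM/c^2$, so with round published values for M87* the shadow diameter comes out near 40 microarcseconds. The constants below are approximate and the calculation is illustrative only.

```python
# Worked example (Schwarzschild only): shadow angular diameter = 2*b_c/D,
# with b_c = sqrt(27)*GM/c^2. Round published values for M87* are assumed.
import numpy as np

G, C = 6.674e-11, 2.998e8                 # SI units
M_SUN, MPC = 1.989e30, 3.086e22

M = 6.5e9 * M_SUN                         # approximate M87* mass
D = 16.8 * MPC                            # approximate M87* distance
r_g = G * M / C**2                        # gravitational radius in meters

b_crit = np.sqrt(27) * r_g                # critical impact parameter
shadow_diameter_rad = 2 * b_crit / D
shadow_diameter_uas = np.degrees(shadow_diameter_rad) * 3600e6

print(f"r_g = {r_g:.3e} m, shadow diameter ~ {shadow_diameter_uas:.1f} microarcseconds")
```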
Submitted 5 March, 2026;
originally announced March 2026.
-
KARL: Knowledge Agents via Reinforcement Learning
Authors:
Jonathan D. Chang,
Andrew Drozdov,
Shubham Toshniwal,
Owen Oertell,
Alexander Trott,
Jacob Portes,
Abhay Gupta,
Pallavi Koppol,
Ashutosh Baheti,
Sean Kulinski,
Ivan Zhou,
Irene Dea,
Krista Opsahl-Ong,
Simon Favreau-Lessard,
Sean Owen,
Jose Javier Gonzalez Ortiz,
Arnav Singhvi,
Xabi Andrade,
Cindy Wang,
Kartik Sreenivasan,
Sam Havens,
Jialu Liu,
Peyton DeNiro,
Wen Sun,
Michael Bendersky
et al. (1 additional author not shown)
Abstract:
We present a system for training enterprise search agents via reinforcement learning that achieves state-of-the-art performance across a diverse suite of hard-to-verify agentic search tasks. Our work makes four core contributions. First, we introduce KARLBench, a multi-capability evaluation suite spanning six distinct search regimes, including constraint-driven entity search, cross-document report synthesis, tabular numerical reasoning, exhaustive entity retrieval, procedural reasoning over technical documentation, and fact aggregation over internal enterprise notes. Second, we show that models trained across heterogeneous search behaviors generalize substantially better than those optimized for any single benchmark. Third, we develop an agentic synthesis pipeline that employs long-horizon reasoning and tool use to generate diverse, grounded, and high-quality training data, with iterative bootstrapping from increasingly capable models. Fourth, we propose a new post-training paradigm based on iterative large-batch off-policy RL that is sample efficient, robust to train-inference engine discrepancies, and naturally extends to multi-task training with out-of-distribution generalization. Compared to Claude 4.6 and GPT 5.2, KARL is Pareto-optimal on KARLBench across cost-quality and latency-quality trade-offs, including tasks that were out-of-distribution during training. With sufficient test-time compute, it surpasses the strongest closed models. These results show that tailored synthetic data in combination with multi-task reinforcement learning enables cost-efficient and high-performing knowledge agents for grounded reasoning.
Submitted 5 March, 2026;
originally announced March 2026.
-
Distributional Reinforcement Learning with Information Bottleneck for Uncertainty-Aware DRAM Equalization
Authors:
Muhammad Usama,
Dong Eui Chang
Abstract:
Equalizer parameter optimization is critical for signal integrity in high-speed memory systems operating at multi-gigabit data rates. However, existing methods suffer from computationally expensive eye diagram evaluation, optimization of expected rather than worst-case performance, and absence of uncertainty quantification for deployment decisions. In this paper, we propose a distributional risk-sensitive reinforcement learning framework integrating Information Bottleneck latent representations with Conditional Value-at-Risk optimization. We introduce rate-distortion optimal signal compression achieving 51 times speedup over eye diagrams while quantifying epistemic uncertainty through Monte Carlo dropout. Distributional reinforcement learning with quantile regression enables explicit worst-case optimization, while PAC-Bayesian regularization certifies generalization bounds. Experimental validation on 2.4 million waveforms from eight memory units demonstrated mean improvements of 37.1% and 41.5% for 4-tap and 8-tap equalizer configurations with worst-case guarantees of 33.8% and 38.2%, representing 80.7% and 89.1% improvements over Q-learning baselines. The framework achieved 62.5% high-reliability classification eliminating manual validation for most configurations. These results suggest the proposed framework provides a practical solution for production-scale equalizer optimization with certified worst-case guarantees.
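The Conditional Value-at-Risk objective mentioned above can be illustrated directly from predicted return quantiles; the atom count, alpha level, and example numbers are placeholders, not the DRAM-specific reward model.

```python
# Illustrative only: CVaR computed from the return quantiles a distributional
# (quantile-regression) critic would predict.
import numpy as np

def cvar_from_quantiles(quantiles, alpha=0.1):
    """quantiles: estimates of the return distribution (e.g. 51 atoms).
    CVaR_alpha = mean of the worst alpha-fraction of outcomes."""
    q = np.sort(np.asarray(quantiles))
    k = max(1, int(np.ceil(alpha * q.size)))
    return q[:k].mean()

rng = np.random.default_rng(0)
# Two hypothetical equalizer settings with similar means but different tails.
returns_a = rng.normal(0.37, 0.02, size=51)
returns_b = rng.normal(0.37, 0.08, size=51)
for name, r in [("config A", returns_a), ("config B", returns_b)]:
    print(f"{name}: mean = {r.mean():.3f}, CVaR_0.1 = {cvar_from_quantiles(r):.3f}")
```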
Submitted 4 March, 2026;
originally announced March 2026.
-
Seeing as Experts Do: A Knowledge-Augmented Agent for Open-Set Fine-Grained Visual Understanding
Authors:
Junhan Chen,
Zilu Zhou,
Yujun Tong,
Dongliang Chang,
Yitao Luo,
Zhanyu Ma
Abstract:
Fine-grained visual understanding is shifting from static classification to knowledge-augmented reasoning, where models must justify as well as recognise. Existing approaches remain limited by closed-set taxonomies and single-label prediction, leading to significant degradation under open-set or context-dependent conditions. We present the Knowledge-Augmented Fine-Grained Reasoning Agent (KFRA), a unified framework that transforms fine-grained perception into evidence-driven reasoning. KFRA operates through a three-stage closed reasoning loop that emulates expert analysis. It first performs open-vocabulary detection and web-scale retrieval to generate category hypotheses. It then conducts discriminative region localisation by aligning textual knowledge with visual evidence through a global-to-local focusing mechanism. Finally, it integrates all multimodal evidence within a large multimodal model to perform interpretable reasoning. Unlike existing agents that treat retrieval and reasoning as independent processes, KFRA establishes a retrieval-grounding coupling that converts retrieved knowledge into spatially grounded evidence for verification. This design enables factual, interpretable, and task-agnostic reasoning across diverse fine-grained scenarios. To evaluate this capability, we construct FGExpertBench, a benchmark designed to assess reasoning depth and cross-task generalisation across six knowledge dimensions. Extensive experiments demonstrate that KFRA consistently surpasses both standalone large multimodal models and current agent frameworks, achieving up to 19 percent improvement in reasoning accuracy and delivering evidence-grounded interpretability in open-set fine-grained visual understanding.
Submitted 4 March, 2026;
originally announced March 2026.
-
HairWeaver: Few-Shot Photorealistic Hair Motion Synthesis with Sim-to-Real Guided Video Diffusion
Authors:
Di Chang,
Ji Hou,
Aljaz Bozic,
Assaf Neuberger,
Felix Juefei-Xu,
Olivier Maury,
Gene Wei-Chin Lin,
Tuur Stuyck,
Doug Roble,
Mohammad Soleymani,
Stephane Grabli
Abstract:
We present HairWeaver, a diffusion-based pipeline that animates a single human image with realistic and expressive hair dynamics. While existing methods successfully control body pose, they lack specific control over hair, and as a result, fail to capture the intricate hair motions, resulting in stiff and unrealistic animations. HairWeaver overcomes this limitation using two specialized modules: a Motion-Context-LoRA to integrate motion conditions and a Sim2Real-Domain-LoRA to preserve the subject's photoreal appearance across different data domains. These lightweight components are designed to guide a video diffusion backbone while maintaining its core generative capabilities. By training on a specialized dataset of dynamic human motion generated from a CG simulator, HairWeaver affords fine control over hair motion and ultimately learns to produce highly realistic hair that responds naturally to movement. Comprehensive evaluations demonstrate that our approach sets a new state of the art, producing lifelike human hair animations with dynamic details.
Submitted 11 February, 2026;
originally announced February 2026.
-
Self-Evolving Recommendation System: End-To-End Autonomous Model Optimization With LLM Agents
Authors:
Haochen Wang,
Yi Wu,
Daryl Chang,
Li Wei,
Lukasz Heldt
Abstract:
Optimizing large-scale machine learning systems, such as recommendation models for global video platforms, requires navigating a massive hyperparameter search space and, more critically, designing sophisticated optimizers, architectures, and reward functions to capture nuanced user behaviors. Achieving substantial improvements in these areas is a non-trivial task, traditionally relying on extensive manual iterations to test new hypotheses. We propose a self-evolving system that leverages Large Language Models (LLMs), specifically those from Google's Gemini family, to autonomously generate, train, and deploy high-performing, complex model changes within an end-to-end automated workflow. The self-evolving system is comprised of an Offline Agent (Inner Loop) that performs high-throughput hypothesis generation using proxy metrics, and an Online Agent (Outer Loop) that validates candidates against delayed north star business metrics in live production. Our agents act as specialized Machine Learning Engineers (MLEs): they exhibit deep reasoning capabilities, discovering novel improvements in optimization algorithms and model architecture, and formulating innovative reward functions that target long-term user engagement. The effectiveness of this approach is demonstrated through several successful production launches at YouTube, confirming that autonomous, LLM-driven evolution can surpass traditional engineering workflows in both development velocity and model performance.
Submitted 10 February, 2026;
originally announced February 2026.
-
Constraining Black Hole Parameters from Shadow and Inner-Shadow Morphology Considering Effects from Thick Disk Accretion Flows
Authors:
Julien A. Kearns,
Dominic O. Chang,
Daniel C. M. Palumbo,
Shane W. Davis
Abstract:
We study the effects of emission geometry on the capability to constrain black hole parameters from measurements of the shadow and inner-shadow of a Reissner-Nordström black hole. We investigate the capability to constrain mass, charge, observer inclination, and emission co-latitude from images of black hole accretion flows that would arise from thick and thin accretion disks. We confirm previous studies that have shown that independent radii measurements of the shadow and inner-shadow can constrain black hole parameters if the viewing inclination is known, but find that this is only possible if the true emission geometry is also assumed. We study the constraining capabilities of shadow and inner-shadow observations of M87*- and Sgr A*-like systems within the context of the BHEX and ngEHT future observatories.
Submitted 14 February, 2026; v1 submitted 29 January, 2026;
originally announced January 2026.
-
Measuring the Black Hole and Accretion Parameters of Sagittarius A* from EHT Observations using a Semi-Analytic Model
Authors:
Braden J. Marazzo-Nowicki,
Paul Tiede,
Dominic O. Chang,
Daniel C. M. Palumbo,
Michael D. Johnson
Abstract:
The Event Horizon Telescope (EHT) Collaboration produced the first image of the apparent shadow of the central black hole of Sagittarius A* (Sgr A*). The Sgr A* source structure varies significantly on timescales shorter than the duration of an observation, preventing improved data coverage through Earth-rotation aperture synthesis. This rapid variability provides the opportunity to quantify intrinsic variability and separate time-variable emission features from stable signatures of strong gravity and the accretion environment. To infer the properties of Sgr A* and its surrounding accretion flow, we perform Bayesian inference on a series of EHT data segments ("snapshots"). We directly fit parameters of a semi-analytic emission model jointly with complex station gains to snapshot visibilities, then extract estimates of the time-averaged, persistent source structure and temporal variability by stacking snapshots in a Bayesian hierarchical model. This approach successfully reproduces parameters of General Relativistic Magnetohydrodynamics simulations using synthetic EHT observations. Even with physically motivated assumptions about the Sgr A* environment, black hole spin and magnetic field parameters are poorly constrained by 2017 EHT observations. Our inference constrains other parameters, favoring a nearly face-on observer inclination ($θ_{\rm o} = 9.2° \pm 3.6° \pm_{\rm v} 11.6°$), an emission peak near the horizon ($R_{\rm peak} = 4.9 \pm 0.1 \pm_{\rm v} 0.5\,GM/c^2$), a near-vertical projected spin position angle ($p.a. = 7.3° \pm 7.08° \pm_{\rm v} 43.5°$ counterclockwise from vertical), and dominant emission $43.4° \pm 2.0° \pm_{\rm v} 5.9°$ above the equatorial plane, where we separate average structure uncertainty ($\pm$) from the impacts of temporal variability and model misspecification ($\pm_{\rm v}$).
Submitted 22 January, 2026;
originally announced January 2026.
-
Locating the missing large-scale emission in the jet of M87* with short EHT baselines
Authors:
Boris Georgiev,
Paul Tiede,
Sebastiano D. von Fellenberg,
Michael Janssen,
Iniyan Natarajan,
Lindy Blackburn,
Jongho Park,
Erandi Chavez,
Andrew T. West,
Kotaro Moriyama,
Jun Yi Koay,
Hendrik Müller,
Dhanya G. Nair,
Avery E. Broderick,
Maciek Wielgus,
Kazunori Akiyama,
Ezequiel Albentosa-Ruíz,
Antxon Alberdi,
Walter Alef,
Juan Carlos Algaba,
Richard Anantua,
Keiichi Asada,
Rebecca Azulay,
Uwe Bach,
Anne-Kathrin Baczko
et al. (258 additional authors not shown)
Abstract:
In very long baseline interferometric (VLBI) arrays, nearly co-located stations probe the largest scales and typically cannot resolve the observed source. In the absence of large-scale structure, closure phases constructed with these stations are zero and, since they are independent of station-based errors, they can be used to probe data issues. Here, we show, with an expansion about co-located stations, how these trivial closure phases become non-zero when the brightness distribution has structure on smaller scales than their short baseline would suggest. When applied to sources that are made up of a bright compact component and a large-scale diffuse component, the trivial closure phases directly measure the centroid relative to the compact source and higher-order image moments. We present a technique to measure these image moments with minimal model assumptions and validate it on synthetic Event Horizon Telescope (EHT) data. We then apply this technique to 2017 and 2018 EHT observations of M87* and find a weak preference for extended emission in the direction of the large-scale jet. We also apply it to 2021 EHT data and measure the source centroid about 1 mas northwest of the compact ring, consistent with the jet observed at lower frequencies.
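A toy visibility model makes the mechanism concrete: closure phases on a triangle containing a very short baseline vanish for a single compact component, but pick up a small non-zero value once a faint, offset, extended component is added. The component fluxes, positions, and baselines below are invented for illustration; this is not EHT calibration code.

```python
# Toy closure-phase model: a compact component plus a faint offset Gaussian.
import numpy as np

UAS = np.pi / 180.0 / 3600e6  # microarcseconds to radians

def visibility(u, v, components):
    """components: list of (flux_Jy, x_uas, y_uas, fwhm_uas) circular Gaussians."""
    vis = 0j
    for flux, x, y, fwhm in components:
        sigma = fwhm * UAS / (2 * np.sqrt(2 * np.log(2)))
        shape = np.exp(-2 * np.pi**2 * sigma**2 * (u**2 + v**2))
        vis += flux * shape * np.exp(-2j * np.pi * (u * x * UAS + v * y * UAS))
    return vis

def closure_phase(baselines, components):
    # (u3, v3) = (u1 + u2, v1 + v2) closes the triangle, hence the conjugate.
    (u1, v1), (u2, v2), (u3, v3) = baselines
    prod = (visibility(u1, v1, components) *
            visibility(u2, v2, components) *
            np.conj(visibility(u3, v3, components)))
    return np.degrees(np.angle(prod))

# One short (intra-site) baseline and two long ones, in wavelengths.
tri = [(1e8, 0.0), (5e9, 2e9), (5.1e9, 2e9)]
compact = [(0.5, 0.0, 0.0, 40.0)]                     # 0.5 Jy ring-scale component
with_jet = compact + [(0.06, 320.0, 60.0, 300.0)]     # faint, offset, extended component

print("compact only  :", closure_phase(tri, compact), "deg")
print("with jet base :", closure_phase(tri, with_jet), "deg")
```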
Submitted 19 January, 2026;
originally announced January 2026.
-
Shifting the Sweet Spot: High-Performance Matrix-Free Method for High-Order Elasticity
Authors:
Dali Chang,
Chong Zhang,
Kaiqi Zhang,
Mingguan Yang,
Huiyuan Li,
Weiqiang Kong
Abstract:
In high-order finite element analysis for elasticity, matrix-free partial assembly (PA) methods are a key technology for overcoming the memory bottleneck of traditional Full Assembly (FA). However, existing implementations fail to fully exploit the special structure of modern CPU architectures and tensor-product elements, causing their performance "sweet spot" to anomalously remain at the low order of $p \approx 2$, which severely limits the potential of high-order methods. To address this challenge, we design and implement a highly optimized PA operator within the MFEM framework, deeply integrated with a Geometric Multigrid (GMG) preconditioner. Our multi-level optimization strategy includes replacing the original $O(p^6)$ generic algorithm with an efficient $O(p^4)$ one based on tensor factorization, exploiting Voigt symmetry to reduce redundant computations for the elasticity problem, and employing macro-kernel fusion to enhance data locality and break the memory bandwidth bottleneck. Extensive experiments on mainstream x86 and ARM architectures demonstrate that our method successfully shifts the performance "sweet spot" to the higher-order region of $p \ge 6$. Compared to the MFEM baseline, the optimized core operator (kernel) achieves speedups of 7x to 83x, which translates to a 3.6x to 16.8x end-to-end performance improvement in the complete solution process. This paper provides a validated and efficient practical path for conducting large-scale, high-order elasticity simulations on mainstream CPU hardware.
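The $O(p^6)$ versus $O(p^4)$ distinction can be seen in a few lines: applying a 3D tensor-product operator as one assembled matrix versus as three successive 1D contractions (sum factorization). This is a generic numpy sketch, not the paper's optimized kernels.

```python
# Sum factorization sketch: B1 (x) B1 (x) B1 applied to an element vector.
import numpy as np

p = 7                                   # polynomial order
n = p + 1
rng = np.random.default_rng(0)
B1 = rng.random((n, n))                 # 1D basis-to-quadrature matrix
u = rng.random((n, n, n))               # element-local coefficients

# Naive: build the (n^3 x n^3) matrix explicitly -- O(p^6) work and memory.
B_full = np.kron(np.kron(B1, B1), B1)
v_naive = (B_full @ u.ravel()).reshape(n, n, n)

# Sum-factorized: three successive 1D contractions -- O(p^4) work.
v_fast = np.einsum('ai,bj,ck,ijk->abc', B1, B1, B1, u, optimize=True)

print("max |difference| =", np.abs(v_naive - v_fast).max())
```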
Submitted 13 January, 2026;
originally announced January 2026.
-
Solar Open Technical Report
Authors:
Sungrae Park,
Sanghoon Kim,
Jungho Cho,
Gyoungjin Gim,
Dawoon Jung,
Mikyoung Cha,
Eunhae Choo,
Taekgyu Hong,
Minbyul Jeong,
SeHwan Joo,
Minsoo Khang,
Eunwon Kim,
Minjeong Kim,
Sujeong Kim,
Yunsu Kim,
Hyeonju Lee,
Seunghyun Lee,
Sukyung Lee,
Siyoung Park,
Gyungin Shin,
Inseo Song,
Wonho Song,
Seonghoon Yang,
Seungyoun Yi,
Sanghoon Yoon
et al. (12 additional authors not shown)
Abstract:
We introduce Solar Open, a 102B-parameter bilingual Mixture-of-Experts language model for underserved languages. Solar Open demonstrates a systematic methodology for building competitive LLMs by addressing three interconnected challenges. First, to train effectively despite data scarcity for underserved languages, we synthesize 4.5T tokens of high-quality, domain-specific, and RL-oriented data. Second, we coordinate this data through a progressive curriculum jointly optimizing composition, quality thresholds, and domain coverage across 20 trillion tokens. Third, to enable reasoning capabilities through scalable RL, we apply our proposed framework SnapPO for efficient optimization. Across benchmarks in English and Korean, Solar Open achieves competitive performance, demonstrating the effectiveness of this methodology for underserved language AI development.
Submitted 11 January, 2026;
originally announced January 2026.
-
Ring Asymmetry and Spin in M87*
Authors:
Vadim Bernshteyn,
Nicholas S. Conroy,
Michi Bauböck,
Paul Tiede,
Abhishek V. Joshi,
Ben S. Prather,
Charles F. Gammie,
the Event Horizon Telescope Collaboration,
Kazunori Akiyama,
Ezequiel Albentosa-Ruíz,
Antxon Alberdi,
Walter Alef,
Juan Carlos Algaba,
Richard Anantua,
Keiichi Asada,
Rebecca Azulay,
Anne-Kathrin Baczko,
David Ball,
Bidisha Bandyopadhyay,
John Barrett,
Bradford A. Benson,
Dan Bintley,
Lindy Blackburn,
Raymond Blundell
et al. (241 additional authors not shown)
Abstract:
Event Horizon Telescope (EHT) images of the supermassive black hole M87* depict an asymmetric ring of emission. General relativistic magnetohydrodynamic (GRMHD) models of M87* and its accretion disk predict that the amplitude and location of the ring's peak brightness asymmetry should fluctuate due to turbulence in the source plasma. We compare the observed distribution of brightness asymmetry amplitudes to the simulated distribution in GRMHD models, across varying black hole spin $a_{*}$. We show that, for strongly magnetized (MAD) models, three epochs of EHT data marginally disfavor $|a_{*}| \lesssim 0.2$. This is consistent with the Blandford-Znajek model for M87's jet, which predicts that M87* should have nonzero spin. We show quantitatively how future observations could improve spin constraints, and discuss how improved spin constraints could distinguish between differing jet-launching mechanisms and black hole growth scenarios.
Submitted 1 April, 2026; v1 submitted 1 January, 2026;
originally announced January 2026.
-
Correctness of Extended RSA Public Key Cryptosystem
Authors:
Dar-jen Chang,
Suranjan Gautam
Abstract:
This paper proposes an alternative approach to formally establishing the correctness of the RSA public key cryptosystem. The methodology presented herein deviates slightly from conventional proofs found in existing literature. Specifically, this study explores the conditions under which the choice of the positive integer N, a fundamental component of RSA, can be extended beyond the standard selection criteria. We derive explicit conditions that determine when certain values of N are valid for the encryption scheme and explain why others may fail to satisfy the correctness requirements. The scope of this paper is limited to the mathematical proof of correctness for RSA-like schemes, deliberately omitting issues related to the cryptographic security of RSA.
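For reference, the standard correctness condition that such extensions build on can be checked numerically: decryption recovers every message when $N$ is square-free and $ed \equiv 1 \pmod{λ(N)}$, and can fail when $N$ has a repeated prime factor. The small primes and exponent below are illustrative; the paper's precise conditions may be stated differently.

```python
# Brute-force check of RSA-like correctness for different choices of N.
from math import gcd
from functools import reduce
from collections import Counter

def lcm(a, b):
    return a * b // gcd(a, b)

def carmichael_lambda(primes):
    # lambda(N) from the prime factorization (odd primes assumed for brevity):
    # lambda(p^k) = p^(k-1) * (p - 1), combined with lcm.
    counts = Counter(primes)
    return reduce(lcm, (p ** (k - 1) * (p - 1) for p, k in counts.items()))

def rsa_roundtrip_ok(primes, e=17):
    N = reduce(lambda a, b: a * b, primes)
    d = pow(e, -1, carmichael_lambda(primes))          # requires Python 3.8+
    return all(pow(pow(m, e, N), d, N) == m for m in range(N))

print("N = 61*53  :", rsa_roundtrip_ok([61, 53]))      # classic two-prime RSA -> True
print("N = 5*7*11 :", rsa_roundtrip_ok([5, 7, 11]))    # square-free N -> still True
print("N = 5*5*7  :", rsa_roundtrip_ok([5, 5, 7]))     # repeated factor -> False
```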
Submitted 30 December, 2025;
originally announced December 2025.
-
Reconstructing Relativistic Magnetohydrodynamics with Physics-Informed Neural Networks
Authors:
Corwin Cheung,
Marcos Johnson-Noya,
Michael Xiang,
Dominic Chang,
Alfredo Guevara
Abstract:
We construct the first physics-informed neural-network (PINN) surrogates for relativistic magnetohydrodynamics (RMHD) using a hybrid PDE and data-driven workflow. Instead of training for the conservative form of the equations, we work with Jacobians or PDE characteristics directly in terms of primitive variables. We further add to the trainable system the divergence-free condition, without the need of cleaning modes. Using a novel MUON optimizer implementation, we show that a baseline PINN trained on early-time snapshots can extrapolate RMHD dynamics in one and two spatial dimensions, and that posterior residual-guided networks can systematically reduce PDE violations.
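How a divergence-free condition can enter a physics-informed composite loss is sketched below with finite differences on a grid (pure numpy, no autodiff or network), as an illustration of the constraint term only; the actual RMHD residuals and training loop are not reproduced.

```python
# Illustration only: a div(B) = 0 penalty added alongside PDE residual terms.
import numpy as np

def divergence(Bx, By, dx, dy):
    return np.gradient(Bx, dx, axis=0) + np.gradient(By, dy, axis=1)

def composite_loss(pde_residuals, Bx, By, dx, dy, lam_div=1.0):
    # Mean-squared PDE residuals plus a soft divergence-free constraint.
    loss_pde = sum(np.mean(r ** 2) for r in pde_residuals)
    loss_div = np.mean(divergence(Bx, By, dx, dy) ** 2)
    return loss_pde + lam_div * loss_div

# Candidate field Bx = -dA/dy, By = dA/dx is divergence-free by construction,
# so its penalty term should be near zero (up to finite-difference error).
x, y = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64), indexing="ij")
A = np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)
dx = dy = 1.0 / 63
Bx, By = -np.gradient(A, dy, axis=1), np.gradient(A, dx, axis=0)
residual = np.zeros_like(A)                 # placeholder for the RMHD residuals
print("composite loss =", composite_loss([residual], Bx, By, dx, dy))
```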
Submitted 28 December, 2025;
originally announced December 2025.
-
Unbiased Visual Reasoning with Controlled Visual Inputs
Authors:
Zhaonan Li,
Shijie Lu,
Fei Wang,
Jacob Dineen,
Xiao Ye,
Zhikun Xu,
Siyi Liu,
Young Min Cho,
Bangzheng Li,
Daniel Chang,
Kenny Nguyen,
Qizheng Yang,
Muhao Chen,
Ben Zhou
Abstract:
End-to-end Vision-language Models (VLMs) often answer visual questions by exploiting spurious correlations instead of causal visual evidence, and can become more shortcut-prone when fine-tuned. We introduce VISTA (Visual-Information Separation for Text-based Analysis), a modular framework that decouples perception from reasoning via an explicit information bottleneck. A frozen VLM sensor is restricted to short, objective perception queries, while a text-only LLM reasoner decomposes each question, plans queries, and aggregates visual facts in natural language. This controlled interface defines a reward-aligned environment for training unbiased visual reasoning with reinforcement learning. Instantiated with Qwen2.5-VL and Llama3.2-Vision sensors, and trained with GRPO from only 641 curated multi-step questions, VISTA significantly improves robustness to real-world spurious correlations on SpuriVerse (+16.29% with Qwen-2.5-VL-7B and +6.77% with Llama-3.2-Vision-11B), while remaining competitive on MMVP and a balanced SeedBench subset. VISTA transfers robustly across unseen VLM sensors and is able to recognize and recover from VLM perception failures. Human analysis further shows that VISTA's reasoning traces are more neutral, less reliant on spurious attributes, and more explicitly grounded in visual evidence than end-to-end VLM baselines.
Submitted 19 December, 2025;
originally announced December 2025.
-
Probing jet base emission of M87* with the 2021 Event Horizon Telescope observations
Authors:
Saurabh,
Hendrik Müller,
Sebastiano D. von Fellenberg,
Paul Tiede,
Michael Janssen,
Lindy Blackburn,
Avery E. Broderick,
Erandi Chavez,
Boris Georgiev,
Thomas P. Krichbaum,
Kotaro Moriyama,
Dhanya G. Nair,
Iniyan Natarajan,
Jongho Park,
Andrew Thomas West,
Maciek Wielgus,
Kazunori Akiyama,
Ezequiel Albentosa-Ruíz,
Antxon Alberdi,
Walter Alef,
Juan Carlos Algaba,
Richard Anantua,
Keiichi Asada,
Rebecca Azulay,
Uwe Bach
et al. (260 additional authors not shown)
Abstract:
We investigate the presence and spatial characteristics of the jet base emission in M87* at 230 GHz, enabled by the enhanced uv coverage in the 2021 Event Horizon Telescope (EHT) observations. The addition of the 12-m Kitt Peak Telescope and NOEMA provides two key intermediate-length baselines to SMT and the IRAM 30-m, giving sensitivity to emission structures at scales of ~250 μas and ~2500 μas (0.02 pc and 0.2 pc). Without these baselines, earlier EHT observations lacked the capability to constrain emission on large scales, where a "missing flux" of order ~1 Jy is expected. To probe these scales, we analyzed closure phases, robust against station-based gain errors, and modeled the jet base emission using a simple Gaussian offset from the compact ring emission at separations >100 μas. Our analysis reveals a Gaussian feature centered at (ΔRA ≈ 320 μas, ΔDec ≈ 60 μas), a projected separation of ≈5500 AU, with a flux density of only ~60 mJy, implying that most of the missing flux in previous studies must arise from larger scales. Brighter emission at these scales is ruled out, and the data do not favor more complex models. This component aligns with the inferred direction of the large-scale jet and is consistent with emission from the jet base. While our findings indicate detectable jet base emission at 230 GHz, coverage from only two intermediate baselines limits reconstruction of its morphology. We therefore treat the recovered Gaussian as an upper limit on the jet base flux density. Future EHT observations with expanded intermediate-baseline coverage will be essential to constrain the structure and nature of this component.
Submitted 1 December, 2025;
originally announced December 2025.
-
Fast and Robust T1 Mapping Based on a 3D Dual-Echo UTE Sequence (PETALUTE) for SPION Biodistribution Assessment
Authors:
Zhen Jiang,
Stephen Sawiak,
Alexandra Lipka,
Xin Shen,
Uzay Emir,
Ali Özen,
Mark Chiew,
Justin Geise,
Joseph Speth,
Deng-Yuan Chang,
Jessica Veenstra,
Mitchell Gabalski,
Luis Solorio,
Gregory Tamer Jr.,
Matthew Scarpelli
Abstract:
Superparamagnetic iron oxide nanoparticles (SPIONs) such as ferumoxytol are promising theranostic agents detectable with MRI. Relaxation time mapping offers reproducible, quantitative biomarkers of SPION distribution, but conventional methods suffer from susceptibility artifacts, long echo times, and extended scan durations, limiting accurate quantification. This study developed a fast, B1-corrected T1-mapping protocol using PETALUTE, a 3D dual-echo ultrashort-echo MRI sequence with a rosette k-space trajectory and variable flip-angle acquisition for quantitative ferumoxytol imaging. Agarose phantoms containing 0-5000 ppm ferumoxytol were scanned at 7T with PETALUTE and vendor-supplied RARE-VTR. PETALUTE T1 maps were derived from two flip angles (4 deg and 20 deg), and mean R1 values were correlated with ferumoxytol concentration. For in vivo feasibility, mice bearing 4T1 mammary and flank tumors were scanned 24 h post-injection (ferumoxytol: n=2, 40 mg/kg; control: n=1). Regions of interest in muscle and tumors were analyzed to compare T1 and R1 values obtained with both methods. PETALUTE produced positive contrast for all phantom concentrations except 5000 ppm, whereas RARE-VTR did not. PETALUTE demonstrated a significant linear correlation between R1 and ferumoxytol concentration (R=0.975, p<0.01), in contrast to RARE-VTR (R=0.672, p=0.144). In vivo, PETALUTE enabled high-resolution, whole-abdominal imaging in 4 min 19 s. Ferumoxytol-injected mice showed T1 shortening in flank tumors, consistent with iron uptake, and PETALUTE revealed elevated T1 value with preserved T2*-weighted signal in one mammary tumor. PETALUTE-based T1 mapping provides fast, quantitative, positive-contrast ferumoxytol imaging with greater spatial coverage and a wider usable concentration range than conventional RARE-VTR.
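The two-flip-angle principle behind variable-flip-angle T1 mapping can be written in a few lines (a DESPOT1-style linearization of the spoiled gradient-echo signal); the TR, T1, and the omission of B1 correction are illustrative simplifications, not the PETALUTE reconstruction itself.

```python
# Generic two-flip-angle (DESPOT1-style) T1 estimate from spoiled
# gradient-echo signals; B1 correction and noise are omitted for brevity.
import numpy as np

def spgr_signal(M0, T1, TR, alpha_deg):
    E1 = np.exp(-TR / T1)
    a = np.deg2rad(alpha_deg)
    return M0 * np.sin(a) * (1 - E1) / (1 - E1 * np.cos(a))

def t1_from_two_flip_angles(S1, S2, a1_deg, a2_deg, TR):
    # Linearized SPGR model: S/sin(a) = E1 * S/tan(a) + M0*(1 - E1).
    a1, a2 = np.deg2rad([a1_deg, a2_deg])
    x = np.array([S1 / np.tan(a1), S2 / np.tan(a2)])
    y = np.array([S1 / np.sin(a1), S2 / np.sin(a2)])
    E1 = (y[1] - y[0]) / (x[1] - x[0])
    return -TR / np.log(E1)

TR, T1_true = 0.005, 0.8                       # seconds (assumed values)
S_low = spgr_signal(1.0, T1_true, TR, 4.0)     # the two flip angles quoted above
S_high = spgr_signal(1.0, T1_true, TR, 20.0)
print(f"recovered T1 = {t1_from_two_flip_angles(S_low, S_high, 4.0, 20.0, TR) * 1000:.1f} ms")
```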
Submitted 5 December, 2025;
originally announced December 2025.
-
Squeezing Classical Antiferromagnets into Quantum Spin Liquids via Global Cavity Fluctuations
Authors:
Charlie-Ray Mann,
Mark A. Oehlgrien,
Błażej Jaworowski,
Giuseppe Calajó,
Jamir Marino,
Kyung S. Choi,
Darrick E. Chang
Abstract:
Cavity quantum electrodynamics with atomic ensembles is typically associated with collective spin phenomena, such as superradiance and spin squeezing, in which the atoms evolve collectively as a macroscopic spin ($S\sim N/2$) on the Bloch sphere. Surprisingly, we show that the tendency toward a collective spin description need not imply collective spin phenomena; rather, it can be exploited to generate new forms of strongly correlated quantum matter. The key idea is to use uniform cavity-mediated interactions to energetically project the system into the total-spin singlet sector ($S=0$) - a highly entangled subspace where the physics is governed entirely by cavity fluctuations. Focusing on Rydberg atom arrays coupled to a single-mode cavity, we show that global cavity fluctuations can effectively squeeze classical antiferromagnets into quantum spin liquids, characterized by non-local entanglement, fractionalized excitations, and emergent gauge fields. This work suggests that cavity QED can be a surprising resource for inducing strongly correlated phenomena, which could be explored in the new generation of hybrid tweezer-cavity platforms.
Submitted 5 December, 2025;
originally announced December 2025.
-
Feedback Integrators Revisited
Authors:
Juho Bae,
Dong Eui Chang
Abstract:
We revisit the notion of Feedback Integrators introduced by D. E. Chang in 2016. Feedback integrators allow dynamical systems on manifolds to be integrated numerically while preserving the first integrals of the system. However, their performance was previously stated and proved only in an asymptotic manner, leaving a gap between their empirical success and theoretical understanding. In response, we prove preservation of first integrals over the entire integration region, up to an arbitrarily small deviation, within the Feedback Integrator framework. Furthermore, we propose an adaptive gain selection scheme that significantly improves the performance. Numerical demonstrations are conducted on free rigid body motion in SO(3), the Kepler problem, and a perturbed Kepler problem with rotational symmetry. All demonstration code is available at: https://github.com/johnbae1901/Feedback-Integrator.
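A minimal sketch of the feedback-integrator idea as we read it: integrate in the ambient space with an off-the-shelf scheme while adding a feedback term that makes the level sets of the first integrals attractive. The gain, the constraint choice, and the free-rigid-body example below are illustrative, not the paper's adaptive scheme.

    import numpy as np

    I_diag = np.array([1.0, 2.0, 3.0])             # principal moments of inertia (illustrative)

    def first_integrals(m):
        return np.array([m @ m,                    # |m|^2 (Casimir)
                         0.5 * m @ (m / I_diag)])  # kinetic energy

    def grad_integrals(m):
        return np.array([2.0 * m, m / I_diag])     # gradients of the two integrals

    def feedback_field(m, c0, k=5.0):
        # Free rigid body (m' = m x I^{-1} m) plus a feedback term
        # -k * sum_i (c_i(m) - c_i(m0)) grad c_i(m); the gain k is illustrative.
        euler = np.cross(m, m / I_diag)
        defect = first_integrals(m) - c0
        return euler - k * grad_integrals(m).T @ defect

    def rk4_step(m, c0, h):
        f = lambda x: feedback_field(x, c0)
        k1 = f(m); k2 = f(m + 0.5 * h * k1); k3 = f(m + 0.5 * h * k2); k4 = f(m + h * k3)
        return m + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

    m = np.array([1.0, 0.2, -0.5])
    c0 = first_integrals(m)                        # target values of the first integrals
    for _ in range(1000):
        m = rk4_step(m, c0, h=0.01)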
Submitted 1 December, 2025;
originally announced December 2025.
-
PIM or CXL-PIM? Understanding Architectural Trade-offs Through Large-Scale Benchmarking
Authors:
I-Ting Lee,
Bao-Kai Wang,
Liang-Chi Chen,
Wen Sheng Lim,
Da-Wei Chang,
Yu-Ming Chang,
Chieng-Chung Ho
Abstract:
Processing-in-memory (PIM) reduces data movement by executing near memory, but our large-scale characterization on real PIM hardware shows that end-to-end performance is often limited by disjoint host and device address spaces that force explicit staging transfers. In contrast, CXL-PIM provides a unified address space and cache-coherent access at the cost of higher access latency. These opposing interface models create workload-dependent tradeoffs that are not captured by small-scale studies. This work presents a side-by-side, large-scale comparison of PIM and CXL-PIM using measurements from real PIM hardware and trace-driven CXL modeling. We identify when unified-address access amortizes link latency enough to overcome transfer bottlenecks, and when tightly coupled PIM remains preferable. Our results reveal phase- and dataset-size regimes in which the relative ranking between the two architectures reverses, offering practical guidance for future near-memory system design.
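As a toy back-of-the-envelope model of the trade-off (not the paper's methodology): PIM pays an explicit staging-transfer cost, while CXL-PIM pays a per-access link-latency overhead that can be partially hidden behind compute. All bandwidth, latency, and overlap numbers below are assumptions.

    def pim_time_s(bytes_staged, kernel_s, staging_bw_bytes_per_s=20e9):
        # Explicit host<->device staging, then the near-memory kernel.
        return bytes_staged / staging_bw_bytes_per_s + kernel_s

    def cxl_pim_time_s(n_coherent_accesses, kernel_s, extra_link_latency_s=150e-9, overlap=0.8):
        # Unified-address coherent accesses pay extra link latency,
        # partially overlapped with compute (overlap fraction is an assumption).
        return kernel_s + (1.0 - overlap) * n_coherent_accesses * extra_link_latency_s

    # A transfer-bound phase (large staged input, short kernel) tends to favor CXL-PIM,
    # while a compute-bound phase with many fine-grained accesses can favor PIM.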
Submitted 18 November, 2025; v1 submitted 18 November, 2025;
originally announced November 2025.
-
Calibrating and Rotating: A Unified Framework for Weight Conditioning in PEFT
Authors:
Da Chang,
Peng Xue,
Yu Li,
Yongxiang Liu,
Pengxiang Xu,
Shixun Zhang
Abstract:
Parameter-Efficient Fine-Tuning (PEFT) methods are crucial for adapting large pre-trained models. Among these, LoRA is considered a foundational approach. Building on this, the influential DoRA method enhances performance by decomposing weight updates into magnitude and direction. However, its underlying mechanism remains unclear, and it introduces significant computational overhead. In this work, we first identify that DoRA's success stems from its capacity to increase the singular value entropy of the weight update matrix, which promotes a more uniform update distribution akin to full fine-tuning. We then reformulate DoRA into a mathematically equivalent and more efficient matrix form, revealing it as a learnable weight conditioning method. Based on this insight, we propose a unified framework for designing advanced PEFT methods by exploring two orthogonal dimensions: the architectural placement and the transformation type of the conditioning matrix. Within this framework, we introduce two novel methods: (1) \textbf{Pre-Diag}, which applies a diagonal conditioning matrix before the LoRA update to efficiently calibrate the pre-trained weights, thereby enhancing performance while reducing training time; and (2) \textbf{S}kewed \textbf{O}rthogonal \textbf{R}otation \textbf{A}daptation (\textbf{SORA}), which employs a parameter-efficient orthogonal rotation to perform a more powerful, norm-preserving transformation of the feature space. Extensive experiments on natural language understanding and generation tasks demonstrate that our proposed methods achieve superior performance and efficiency compared to both LoRA and DoRA. The code is available at https://github.com/MaeChd/SORA.
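To make the "calibrate, then apply a low-rank update" idea concrete, here is one plausible reading of Pre-Diag as a weight-conditioned linear layer; the placement, initialization, and scaling below are assumptions, not the paper's exact formulation.

    import torch
    import torch.nn as nn

    class PreDiagLoRALinear(nn.Module):
        # Sketch: y = x @ (diag(d) * W0 + B @ A)^T + bias, with W0 frozen,
        # d a learnable per-output conditioning scale, and (A, B) a standard
        # low-rank pair. This is an illustrative reading, not the released code.
        def __init__(self, base: nn.Linear, rank: int = 8):
            super().__init__()
            self.register_buffer("w0", base.weight.detach().clone())   # frozen (out, in)
            self.bias = base.bias
            out_f, in_f = self.w0.shape
            self.d = nn.Parameter(torch.ones(out_f))                   # diagonal conditioning
            self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
            self.B = nn.Parameter(torch.zeros(out_f, rank))            # zero init: W' = W0 at start

        def forward(self, x):
            w = self.d.unsqueeze(1) * self.w0 + self.B @ self.A
            return nn.functional.linear(x, w, self.bias)

    layer = PreDiagLoRALinear(nn.Linear(64, 64), rank=8)
    out = layer(torch.randn(2, 64))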
Submitted 10 November, 2025; v1 submitted 28 October, 2025;
originally announced November 2025.
-
When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation
Authors:
Xunyi Jiang,
Dingyi Chang,
Julian McAuley,
Xin Xu
Abstract:
The rapid evolution of large language models (LLMs) and the real world has outpaced the static nature of widely used evaluation benchmarks, raising concerns about their reliability for evaluating LLM factuality. While substantial work continues to rely on these popular but aging benchmarks, their temporal misalignment with real-world facts and modern LLMs, and the effect of this misalignment on LLM factuality evaluation, remain underexplored. Therefore, in this work, we present a systematic investigation of this issue by examining five popular factuality benchmarks and eight LLMs released across different years. An up-to-date fact retrieval pipeline and three metrics are tailored to quantify benchmark aging and its impact on LLM factuality evaluation. Experimental results and analysis illustrate that a considerable portion of samples in the widely used factuality benchmarks are outdated, leading to unreliable assessments of LLM factuality. We hope our work can provide a testbed to assess the reliability of a benchmark for LLM factuality evaluation and inspire more research on the benchmark aging issue. Code is available at https://github.com/JiangXunyi/BenchAge.
Submitted 20 January, 2026; v1 submitted 8 October, 2025;
originally announced October 2025.
-
The (PXP)$^2$ model: long-range quantum scars in optical cavities
Authors:
Hossein Hosseinabadi,
Riccardo J. Valencia-Tortora,
Aleksandr N. Mikheev,
Darrick E. Chang,
Johannes Zeiher,
Roderich Moessner,
Jamir Marino
Abstract:
Rydberg-cavity systems are emerging as promising platforms for quantum simulation and quantum information processing. These hybrid architectures combine two complementary interaction mechanisms: cavity photons mediate collective long-range couplings, while Rydberg excitations generate strong short-range interactions. Together, they offer a setting for engineering many-body phases characterized by a hierarchy of interactions across widely different length scales. In this work, we introduce a minimal and scalable model for such systems. Focusing on the strong Rydberg blockade regime, we restrict the Hilbert space to the subspace enforced by the blockade, yielding a kinetically constrained long-range model in one spatial dimension. This approach both captures the physics of Rydberg-cavity experiments in the regime of strong Rydberg interactions and provides a conceptually transparent framework for studying the interplay of long-range and short-range interactions. At equilibrium, in addition to paramagnetic and Néel-ordered phases, the system supports a blockaded ferromagnetic/superradiant phase, distinct from the conventional superradiant phase. Out of equilibrium, we identify long-range quantum many-body scars, which are atypical nonthermal eigenstates that evade the eigenstate thermalization hypothesis and give rise to slow entanglement growth. In contrast to the linear-in-time entanglement growth characteristic of short-range scarred models, these long-range scars exhibit logarithmic entanglement dynamics. Our results establish a minimal yet versatile framework for Rydberg-cavity systems, and provide a stepping stone for future theoretical and experimental studies of this frontier platform in quantum many-body physics.
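As a concrete example of restricting to the blockade-constrained subspace, the sketch below enumerates the allowed configurations of a chain with no two adjacent excitations, the usual kinetically constrained (PXP-type) Hilbert space. It illustrates the constraint only, not the cavity-mediated long-range term of the model.

    from itertools import product

    def blockaded_states(n_sites, periodic=False):
        # Configurations with no two adjacent Rydberg excitations; for an open
        # chain the dimension follows the Fibonacci sequence.
        states = []
        for cfg in product((0, 1), repeat=n_sites):
            if periodic:
                pairs = list(zip(cfg, cfg[1:] + (cfg[0],)))
            else:
                pairs = list(zip(cfg, cfg[1:]))
            if all(not (a and b) for a, b in pairs):
                states.append(cfg)
        return states

    print(len(blockaded_states(10)))   # 144 allowed configurations for a 10-site open chain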
Submitted 2 October, 2025;
originally announced October 2025.
-
Relative-Absolute Fusion: Rethinking Feature Extraction in Image-Based Iterative Method Selection for Solving Sparse Linear Systems
Authors:
Kaiqi Zhang,
Mingguan Yang,
Dali Chang,
Chun Chen,
Yuxiang Zhang,
Kexun He,
Jing Zhao
Abstract:
Iterative method selection is crucial for solving sparse linear systems because these methods inherently lack robustness. Though image-based selection approaches have shown promise, their feature extraction techniques might encode distinct matrices into identical image representations, leading to the same, potentially suboptimal, method being selected for distinct matrices. In this paper, we introduce RAF (Relative-Absolute Fusion), an efficient feature extraction technique to enhance image-based selection approaches. By simultaneously extracting and fusing image representations as relative features with corresponding numerical values as absolute features, RAF achieves comprehensive matrix representations that prevent feature ambiguity across distinct matrices, thus improving selection accuracy and unlocking the potential of image-based selection approaches. We conducted comprehensive evaluations of RAF on SuiteSparse and our developed BMCMat (Balanced Multi-Classification Matrix dataset), demonstrating solution time reductions of 0.08s-0.29s for sparse linear systems, which is 5.86%-11.50% faster than conventional image-based selection approaches and achieves state-of-the-art (SOTA) performance. BMCMat is available at https://github.com/zkqq/BMCMat.
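One way to picture the relative-absolute split: an image-like, per-matrix-normalized representation (relative) concatenated with raw numerical scalars (absolute) that restore the scale information lost by normalization. The specific features and sizes below are illustrative, not RAF's exact design.

    import numpy as np
    from scipy.sparse import random as sparse_random

    def relative_absolute_features(A, img_size=32):
        # Relative: a normalized "image" of summed |values| per cell, which on its
        # own could collide for distinct matrices. Absolute: raw scalars that
        # disambiguate them. The concrete features used by RAF may differ.
        A = A.tocoo()
        n, m = A.shape
        img = np.zeros((img_size, img_size))
        rows = (A.row * img_size // n).astype(int)
        cols = (A.col * img_size // m).astype(int)
        np.add.at(img, (rows, cols), np.abs(A.data))
        relative = img / (img.max() + 1e-12)
        absolute = np.array([n, m, A.nnz,
                             np.max(np.abs(A.data), initial=0.0),
                             np.min(np.abs(A.data), initial=0.0)])
        return np.concatenate([relative.ravel(), absolute])

    feats = relative_absolute_features(sparse_random(1000, 1000, density=1e-3, format="coo"))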
Submitted 1 October, 2025;
originally announced October 2025.
-
Encoding Structural Constraints into Segment Anything Models via Probabilistic Graphical Models
Authors:
Yu Li,
Da Chang,
Xi Xiao
Abstract:
While the Segment Anything Model (SAM) has achieved remarkable success in image segmentation, its direct application to medical imaging remains hindered by fundamental challenges, including ambiguous boundaries, insufficient modeling of anatomical relationships, and the absence of uncertainty quantification. To address these limitations, we introduce KG-SAM, a knowledge-guided framework that synergistically integrates anatomical priors with boundary refinement and uncertainty estimation. Specifically, KG-SAM incorporates (i) a medical knowledge graph to encode fine-grained anatomical relationships, (ii) an energy-based Conditional Random Field (CRF) to enforce anatomically consistent predictions, and (iii) an uncertainty-aware fusion module to enhance reliability in high-stakes clinical scenarios. Extensive experiments across multi-center medical datasets demonstrate the effectiveness of our approach: KG-SAM achieves an average Dice score of 82.69% on prostate segmentation and delivers substantial gains in abdominal segmentation, reaching 78.05% on MRI and 79.68% on CT. These results establish KG-SAM as a robust and generalizable framework for advancing medical image segmentation.
Submitted 10 January, 2026; v1 submitted 25 September, 2025;
originally announced September 2025.
-
Instruction-tuned Self-Questioning Framework for Multimodal Reasoning
Authors:
You-Won Jang,
Yu-Jung Heo,
Jaeseok Kim,
Minsu Lee,
Du-Seong Chang,
Byoung-Tak Zhang
Abstract:
The field of vision-language understanding has been actively researched in recent years, thanks to the development of Large Language Models (LLMs). However, current systems still struggle with problems requiring multi-step reasoning, even for very simple questions. Recent studies adopt LLMs to tackle this problem by iteratively generating sub-questions and answers. However, this approach has disadvantages: 1) the fine-grained visual contents of images are not available to LLMs that cannot read visual information, and 2) the internal mechanisms of black-box LLMs are inaccessible and difficult to reproduce. To address these problems, we propose SQ (Self-Questioning)-InstructBLIP, which improves inference performance by iteratively generating image-aware, informative sub-questions and sub-answers. SQ-InstructBLIP consists of a Questioner, an Answerer, and a Reasoner that share the same architecture. The Questioner and Answerer generate sub-questions and sub-answers to help infer the main question, and the Reasoner answers the main question considering the generated sub-question information. Our experiments show that SQ-InstructBLIP, which uses the generated sub-questions as additional information when solving the VQA task, performs more accurate reasoning than previous works.
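The iterative scheme can be summarized as a simple loop; the three callables below stand in for the instruction-tuned Questioner, Answerer, and Reasoner, and the prompts, round count, and stopping rule are placeholders.

    def self_questioning_inference(image, main_question, questioner, answerer, reasoner, n_rounds=3):
        # Questioner proposes image-aware sub-questions, Answerer answers them,
        # and Reasoner answers the main question given the accumulated sub-QA
        # context. Interfaces and round count are illustrative assumptions.
        sub_qa = []
        for _ in range(n_rounds):
            sub_q = questioner(image, main_question, sub_qa)   # image-aware sub-question
            sub_a = answerer(image, sub_q)                     # grounded sub-answer
            sub_qa.append((sub_q, sub_a))
        return reasoner(image, main_question, sub_qa)          # final reasoning step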
Submitted 25 September, 2025;
originally announced September 2025.
-
On the Convergence of Muon and Beyond
Authors:
Da Chang,
Yongxiang Liu,
Ganzhao Yuan
Abstract:
The Muon optimizer has demonstrated remarkable empirical success in handling matrix-structured parameters for training neural networks. However, a significant gap remains between its practical performance and theoretical understanding. Existing analyses show that the Muon variants achieve only a suboptimal iteration complexity of $\mathcal{O}(T^{-1/4})$ in stochastic non-convex settings, where $T$ denotes the number of iterations. To study the theoretical limits of Muon, we analyze two momentum-based variance-reduced variants: the one-batch Muon-MVR1 and the two-batch Muon-MVR2. We provide the first rigorous proof that, under horizon-free learning-rate schedules, variance reduction enables Muon-MVR2 to attain the optimal anytime convergence rate $\tilde{\mathcal{O}}(T^{-1/3})$, matching the lower bound for this problem class. Under the Polyak--Łojasiewicz (PL) condition, we further establish anytime best-iterate guarantees for the expected square-root suboptimality: Muon-MVR1 achieves $\widetilde{\mathcal{O}}(T^{-1/4})$, while Muon-MVR2 achieves $\widetilde{\mathcal{O}}(T^{-1/3})$. Experiments on CIFAR-10 and C4 support the practical effectiveness of the proposed variance-reduced Muon variants.
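For orientation, a sketched Muon-style step with a STORM-like variance-reduced momentum is shown below; the Newton-Schulz coefficients follow common Muon implementations, and the precise estimator, scaling, and schedules of Muon-MVR1/2 may differ from this sketch.

    import torch

    def newton_schulz_orthogonalize(M, steps=5):
        # Approximate the orthogonal (polar) factor of M, as used in Muon-style updates.
        X = M / (M.norm() + 1e-7)
        a, b, c = 3.4445, -4.7750, 2.0315      # commonly used quintic coefficients
        for _ in range(steps):
            A = X @ X.T
            X = a * X + (b * A + c * A @ A) @ X
        return X

    def muon_mvr_step(W, grad_now, grad_prev_same_batch, m, lr=0.02, beta=0.9):
        # STORM-style variance-reduced momentum:
        #   m_t = g(x_t) + (1 - beta) * (m_{t-1} - g(x_{t-1}) on the same batch),
        # followed by an orthogonalized (Muon-style) weight update. Illustrative only.
        m = grad_now + (1.0 - beta) * (m - grad_prev_same_batch)
        W = W - lr * newton_schulz_orthogonalize(m)
        return W, m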
Submitted 31 March, 2026; v1 submitted 19 September, 2025;
originally announced September 2025.
-
Association and Consolidation: Evolutionary Memory-Enhanced Incremental Multi-View Clustering
Authors:
Zisen Kong,
Bo Zhong,
Pengyuan Li,
Dongxia Chang,
Yiming Wang,
Yongyong Chen
Abstract:
Incremental multi-view clustering aims to achieve stable clustering results while addressing the stability-plasticity dilemma (SPD) in view-incremental scenarios. The core challenge is that the model must have enough plasticity to quickly adapt to new data, while maintaining sufficient stability to consolidate long-term knowledge. To address this challenge, we propose a novel Evolutionary Memory-Enhanced Incremental Multi-View Clustering (EMIMC) method, inspired by the memory regulation mechanisms of the human brain. First, we design a rapid association module to establish connections between new and historical views, thereby ensuring the plasticity required for learning new knowledge. Second, a cognitive forgetting module with a decay mechanism is introduced, which dynamically adjusts the contribution of historical views to optimize knowledge integration. Finally, we propose a knowledge consolidation module to progressively refine short-term knowledge into stable long-term memory using temporal tensors, thereby ensuring model stability. By integrating these modules, EMIMC achieves strong knowledge retention capabilities in scenarios with growing views. Extensive experiments demonstrate that EMIMC exhibits remarkable advantages over existing state-of-the-art methods.
Submitted 11 November, 2025; v1 submitted 17 September, 2025;
originally announced September 2025.
-
Local-Canonicalization Equivariant Graph Neural Networks for Sample-Efficient and Generalizable Swarm Robot Control
Authors:
Keqin Wang,
Tao Zhong,
David Chang,
Christine Allen-Blanchette
Abstract:
Multi-agent reinforcement learning (MARL) has emerged as a powerful paradigm for coordinating swarms of agents in complex decision-making, yet major challenges remain. In competitive settings such as pursuer-evader tasks, simultaneous adaptation can destabilize training; non-kinetic countermeasures often fail under adverse conditions; and policies trained in one configuration rarely generalize to environments with a different number of agents. To address these issues, we propose the Local-Canonicalization Equivariant Graph Neural Networks (LEGO) framework, which integrates seamlessly with popular MARL algorithms such as MAPPO. LEGO employs graph neural networks to capture permutation equivariance and generalization to different agent numbers, canonicalization to enforce E(n)-equivariance, and heterogeneous representations to encode role-specific inductive biases. Experiments on cooperative and competitive swarm benchmarks show that LEGO outperforms strong baselines and improves generalization. In real-world experiments, LEGO demonstrates robustness to varying team sizes and agent failure.
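A minimal 2D sketch of local canonicalization: express each agent's neighborhood in a frame attached to that agent, so that a global rotation or translation of all inputs leaves the features unchanged and the downstream policy network becomes E(2)-equivariant once outputs are rotated back. The frame choice and tie-breaking below are assumptions.

    import numpy as np

    def canonicalize_local_frame(pos_i, vel_i, neighbor_pos):
        # Origin at pos_i, x-axis along agent i's velocity (assumed frame choice).
        heading = np.arctan2(vel_i[1], vel_i[0])
        c, s = np.cos(-heading), np.sin(-heading)
        R_inv = np.array([[c, -s], [s, c]])            # world -> local rotation
        local = (neighbor_pos - pos_i) @ R_inv.T
        return local, R_inv

    def decanonicalize_action(local_action, R_inv):
        return local_action @ R_inv                    # local -> world frame

    local_feats, R_inv = canonicalize_local_frame(
        pos_i=np.array([0.0, 0.0]), vel_i=np.array([1.0, 1.0]),
        neighbor_pos=np.array([[1.0, 0.0], [0.0, 2.0]]))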
Submitted 17 September, 2025;
originally announced September 2025.
-
Controllable-Continuous Color Editing in Diffusion Model via Color Mapping
Authors:
Yuqi Yang,
Dongliang Chang,
Yuanchen Fang,
Yi-Zhe Song,
Zhanyu Ma,
Jun Guo
Abstract:
In recent years, text-driven image editing has made significant progress. However, due to the inherent ambiguity and discreteness of natural language, color editing still faces challenges such as insufficient precision and difficulty in achieving continuous control. Although linearly interpolating the embedding vectors of different textual descriptions can guide the model to generate a sequence of images with varying colors, this approach lacks precise control over the range of color changes in the output images. Moreover, the relationship between the interpolation coefficient and the resulting image color is unknown and uncontrollable. To address these issues, we introduce a color mapping module that explicitly models the correspondence between the text embedding space and image RGB values. This module predicts the corresponding embedding vector based on a given RGB value, enabling precise color control of the generated images while maintaining semantic consistency. Users can specify a target RGB range to generate images with continuous color variations within the desired range, thereby achieving finer-grained, continuous, and controllable color editing. Experimental results demonstrate that our method performs well in terms of color continuity and controllability.
Submitted 17 September, 2025;
originally announced September 2025.
-
ACT: Automated Constraint Targeting for Multi-Objective Recommender Systems
Authors:
Daryl Chang,
Yi Wu,
Jennifer She,
Li Wei,
Lukasz Heldt
Abstract:
Recommender systems often must maximize a primary objective while ensuring secondary ones satisfy minimum thresholds, or "guardrails." This is critical for maintaining a consistent user experience and platform ecosystem, but enforcing these guardrails despite orthogonal system changes is challenging and often requires manual hyperparameter tuning. We introduce the Automated Constraint Targeting (ACT) framework, which automatically finds the minimal set of hyperparameter changes needed to satisfy these guardrails. ACT uses an offline pairwise evaluation on unbiased data to find solutions and continuously retrains to adapt to system and user behavior changes. We empirically demonstrate its efficacy and describe its deployment in a large-scale production environment.
Submitted 3 September, 2025;
originally announced September 2025.
-
Retrieval Augmented Large Language Model System for Comprehensive Drug Contraindications
Authors:
Byeonghun Bang,
Jongsuk Yoon,
Dong-Jin Chang,
Seho Park,
Yong Oh Lee
Abstract:
The versatility of large language models (LLMs) has been explored across various sectors, but their application in healthcare poses challenges, particularly in the domain of pharmaceutical contraindications where accurate and reliable information is required. This study enhances the capability of LLMs to address contraindications effectively by implementing a Retrieval Augmented Generation (RAG) pipeline. Utilizing OpenAI's GPT-4o-mini as the base model, and the text-embedding-3-small model for embeddings, our approach integrates Langchain to orchestrate a hybrid retrieval system with re-ranking. This system leverages Drug Utilization Review (DUR) data from public databases, focusing on contraindications for specific age groups, pregnancy, and concomitant drug use. The dataset includes 300 question-answer pairs across three categories, with baseline model accuracy ranging from 0.49 to 0.57. Post-integration of the RAG pipeline, we observed a significant improvement in model accuracy, achieving rates of 0.94, 0.87, and 0.89 for contraindications related to age groups, pregnancy, and concomitant drug use, respectively. The results indicate that augmenting LLMs with a RAG framework can substantially reduce uncertainty in prescription and drug intake decisions by providing more precise and reliable drug contraindication information.
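The pipeline structure described above can be sketched generically as hybrid retrieval followed by re-ranking and generation; the functions below are placeholders rather than the specific Langchain or OpenAI calls used in the study, and the blending weight and top-k values are assumptions.

    import numpy as np

    def hybrid_retrieve(query, passages, embed, keyword_score, k=20, alpha=0.5):
        # Blend a dense (embedding cosine) score with a sparse keyword score,
        # then keep the top-k passages for re-ranking. `embed` and `keyword_score`
        # stand in for an embedding model and a BM25-style scorer.
        q = embed(query)
        dense = np.array([np.dot(q, embed(p)) /
                          (np.linalg.norm(q) * np.linalg.norm(embed(p)) + 1e-9) for p in passages])
        sparse = np.array([keyword_score(query, p) for p in passages])
        sparse = sparse / (sparse.max() + 1e-9)
        blended = alpha * dense + (1 - alpha) * sparse
        return [passages[i] for i in np.argsort(-blended)[:k]]

    def answer_with_rag(question, dur_passages, retrieve, rerank, llm):
        # Retrieve DUR contraindication passages, re-rank, and condition the LLM
        # on the top passages; `retrieve`, `rerank`, and `llm` are placeholders.
        candidates = retrieve(question, dur_passages)
        context = "\n".join(rerank(question, candidates)[:5])
        return llm(f"Using only the passages below, answer:\n{context}\n\nQ: {question}")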
Submitted 8 August, 2025;
originally announced August 2025.
-
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
Authors:
Shijie Zhou,
Alexander Vilesov,
Xuehai He,
Ziyu Wan,
Shuwang Zhang,
Aditya Nagachandra,
Di Chang,
Dongdong Chen,
Xin Eric Wang,
Achuta Kadambi
Abstract:
Vision language models (VLMs) have shown remarkable capabilities in integrating linguistic and visual reasoning but remain fundamentally limited in understanding dynamic spatiotemporal interactions. Humans effortlessly track and reason about object movements, rotations, and perspective shifts-abilities essential for robust dynamic real-world understanding yet notably lacking in current VLMs. In this paper, we introduce VLM4D, the first benchmark specifically designed to evaluate the spatiotemporal reasoning capabilities of VLMs. Our benchmark comprises diverse real-world and synthetic videos accompanied by carefully curated question-answer pairs emphasizing translational and rotational motions, perspective awareness, and motion continuity. Through comprehensive evaluations of state-of-the-art open and closed-source VLMs, we identify significant performance gaps compared to human baselines, highlighting fundamental deficiencies in existing models. Extensive analysis reveals that VLMs struggle particularly with integrating multiple visual cues and maintaining temporal coherence. We further explore promising directions, such as leveraging 4D feature field reconstruction and targeted spatiotemporal supervised fine-tuning, demonstrating their effectiveness in enhancing spatiotemporal comprehension. Our work aims to encourage deeper exploration into improving VLMs' spatial and temporal grounding, paving the way towards more capable and reliable visual intelligence for dynamic environments.
Submitted 6 August, 2025; v1 submitted 4 August, 2025;
originally announced August 2025.
-
On the Convergence Speed of Spatially Coupled LDPC Ensembles Under Window Decoding
Authors:
Qingqing Peng,
Dongxu Chang,
Guanghui Wang,
Guiying Yan
Abstract:
It is known that windowed decoding (WD) can effectively balance the performance and complexity of spatially coupled low-density parity-check (LDPC) codes. In this study, we show that information can propagate in a wave-like manner at a constant speed under WD. Additionally, we provide an upper bound for the information propagation speed on the binary erasure channel, which can assist in choosing the number of iterations required within each window.
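For intuition about the decoding wave, the sketch below runs textbook density evolution for a terminated spatially coupled ensemble on the binary erasure channel and tracks how the undecoded region shrinks per iteration. It uses full (un-windowed) BP rather than windowed decoding, and the ensemble parameters and channel erasure rate are illustrative, not the paper's setting.

    import numpy as np

    def sc_ldpc_de_bec(eps=0.45, dv=3, dc=6, w=3, L=100, iters=500):
        # Density evolution for a (dv, dc, w) spatially coupled ensemble on BEC(eps).
        # The per-position erasure probability decays as a wave moving in from the
        # terminated boundaries; a roughly linear decrease of the undecoded count
        # per iteration corresponds to a constant propagation speed.
        x = np.full(L, eps)
        undecoded = []
        for _ in range(iters):
            xp = np.zeros(L + 2 * w)           # zero padding models perfect termination
            xp[w:w + L] = x
            new = np.zeros(L)
            for i in range(L):
                acc = 0.0
                for j in range(w):
                    y = xp[i + j + 1: i + j + 1 + w].mean()
                    acc += (1.0 - (1.0 - y) ** (dc - 1)) ** (dv - 1)
                new[i] = eps * acc / w
            x = new
            undecoded.append(int(np.sum(x > 1e-6)))
        return np.array(undecoded)

    profile = sc_ldpc_de_bec()                 # inspect the slope to read off the wave speed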
Submitted 9 July, 2025;
originally announced July 2025.
-
Deep Reinforcement Learning-Based DRAM Equalizer Parameter Optimization Using Latent Representations
Authors:
Muhammad Usama,
Dong Eui Chang
Abstract:
Equalizer parameter optimization for signal integrity in high-speed Dynamic Random Access Memory systems is crucial but often computationally demanding or model-reliant. This paper introduces a data-driven framework employing learned latent signal representations for efficient signal integrity evaluation, coupled with a model-free Advantage Actor-Critic reinforcement learning agent for parameter optimization. The latent representation captures vital signal integrity features, offering a fast alternative to direct eye diagram analysis during optimization, while the reinforcement learning agent derives optimal equalizer settings without explicit system models. Applied to industry-standard Dynamic Random Access Memory waveforms, the method achieved significant eye-opening window area improvements: 42.7\% for cascaded Continuous-Time Linear Equalizer and Decision Feedback Equalizer structures, and 36.8\% for Decision Feedback Equalizer-only configurations. These results demonstrate superior performance, computational efficiency, and robust generalization across diverse Dynamic Random Access Memory units compared to existing techniques. Core contributions include an efficient latent signal integrity metric for optimization, a robust model-free reinforcement learning strategy, and validated superior performance for complex equalizer architectures.
Submitted 3 July, 2025;
originally announced July 2025.
-
Is Lindblad for me?
Authors:
Martino Stefanini,
Aleksandra A. Ziolkowska,
Dmitry Budker,
Ulrich Poschinger,
Ferdinand Schmidt-Kaler,
Antoine Browaeys,
Atac Imamoglu,
Darrick Chang,
Jamir Marino
Abstract:
The Lindblad master equation is a foundational tool for modeling the dynamics of open quantum systems. As its use has extended far beyond its original domain, the boundaries of its validity have grown opaque. In particular, the rise of new research areas including open quantum many-body systems, non-equilibrium condensed matter, and the possibility to test its limits in driven-open quantum simulators, call for a critical revision of its regimes of applicability. In this pedagogical review, we re-examine the folklore surrounding its three standard approximations (Born, Markov, and Rotating Wave Approximation), as we build our narrative by employing a series of examples and case studies accessible to any reader with a solid background on the fundamentals of quantum mechanics. As a synthesis of our work, we offer a checklist that contrasts common lore with refined expectations, offering a practical guideline for assessing the breakdown of the Lindblad framework in the problem at hand.
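For reference, the equation under discussion is the standard Gorini-Kossakowski-Sudarshan-Lindblad (GKSL) form,

$$ \frac{d\rho}{dt} = -\frac{i}{\hbar}\,[H,\rho] + \sum_k \gamma_k \Big( L_k \rho L_k^\dagger - \tfrac{1}{2}\{ L_k^\dagger L_k, \rho \} \Big), $$

with system Hamiltonian $H$, jump operators $L_k$, and non-negative rates $\gamma_k$; the Born, Markov, and rotating-wave approximations examined in the review are the conditions under which a microscopic model reduces to this form.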
Submitted 8 April, 2026; v1 submitted 27 June, 2025;
originally announced June 2025.
-
Learning High-Quality Latent Representations for Anomaly Detection and Signal Integrity Enhancement in High-Speed Signals
Authors:
Muhammad Usama,
Hee-Deok Jang,
Soham Shanbhag,
Yoo-Chang Sung,
Seung-Jun Bae,
Dong Eui Chang
Abstract:
This paper addresses the dual challenge of improving anomaly detection and signal integrity in high-speed dynamic random access memory signals. To achieve this, we propose a joint training framework that integrates an autoencoder with a classifier to learn more distinctive latent representations by focusing on valid data features. Our approach is evaluated across three anomaly detection algorithms and consistently outperforms two baseline methods. Detailed ablation studies further support these findings. Furthermore, we introduce a signal integrity enhancement algorithm that improves signal integrity by an average of 11.3%. The source code and data used in this study are available at https://github.com/Usama1002/learning-latent-representations.
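A minimal sketch of joint autoencoder-classifier training of the kind described: the classification head shapes the latent code toward features of valid signals, and the latent vectors are then passed to downstream anomaly detectors. The architecture sizes, loss weighting, and input dimensions below are assumptions, not the released configuration.

    import torch
    import torch.nn as nn

    class JointAEClassifier(nn.Module):
        # Encoder/decoder for reconstruction plus a classifier head on the latent code.
        def __init__(self, in_dim=256, latent_dim=32, n_classes=2):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                         nn.Linear(128, latent_dim))
            self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                         nn.Linear(128, in_dim))
            self.classifier = nn.Linear(latent_dim, n_classes)

        def forward(self, x):
            z = self.encoder(x)
            return self.decoder(z), self.classifier(z), z

    model = JointAEClassifier()
    x = torch.randn(16, 256)                   # a batch of waveform segments (illustrative)
    y = torch.randint(0, 2, (16,))             # valid / anomalous labels (illustrative)
    recon, logits, z = model(x)
    loss = nn.functional.mse_loss(recon, x) + 0.5 * nn.functional.cross_entropy(logits, y)
    loss.backward()                            # the latent z then feeds the anomaly detectors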
Submitted 23 June, 2025;
originally announced June 2025.
-
Individual Causal Inference with Structural Causal Model
Authors:
Daniel T. Chang
Abstract:
Individual causal inference (ICI) uses causal inference methods to understand and predict the effects of interventions on individuals, considering their specific characteristics / facts. It aims to estimate individual causal effect (ICE), which varies across individuals. Estimating ICE can be challenging due to the limited data available for individuals, and the fact that most causal inference methods are population-based. Structural Causal Model (SCM) is fundamentally population-based. Therefore, causal discovery (structural learning and parameter learning), association queries and intervention queries are all naturally population-based. However, exogenous variables (U) in SCM can encode individual variations and thus provide the mechanism for individualized population per specific individual characteristics / facts. Based on this, we propose ICI with SCM as a "rung 3" causal inference, because it involves "imagining" what would be the causal effect of a hypothetical intervention on an individual, given the individual's observed characteristics / facts. Specifically, we propose the indiv-operator, indiv(W), to formalize/represent the population individualization process, and the individual causal query, P(Y | indiv(W), do(X), Z), to formalize/represent ICI. We show and argue that ICI with SCM is inference on individual alternatives (possible), not individual counterfactuals (non-actual).
Submitted 11 July, 2025; v1 submitted 17 June, 2025;
originally announced June 2025.
-
Dynamic Layered Decoding Scheduling for LDPC Codes Aided by Check Node Error Probabilities
Authors:
Chenyuan Jia,
Dongxu Chang,
Ruiyuan Wang,
Guanghui Wang,
Guiying Yan,
Cunquan Qu
Abstract:
In this study, new scheduling strategies for low-density parity-check (LDPC) codes under layered belief propagation (LBP) are designed. Based on the criterion of prioritizing the update of check nodes with lower error probabilities, we propose two dynamic scheduling methods: dynamic error belief propagation (Dyn-EBP) and dynamic penalty error belief propagation (Dyn-PEBP). In Dyn-EBP, each check node is restricted to being updated the same number of times, whereas Dyn-PEBP removes this restriction and instead introduces a penalty term to balance the number of updates. Simulation results show that, for 5G new radio (NR) LDPC codes, our proposed scheduling methods can outperform existing dynamic and offline scheduling strategies under various blocklengths and code rates. This demonstrates that prioritizing the update of check nodes with lower error probabilities can lead to higher decoding efficiency and validates the effectiveness of our algorithms.
Submitted 16 June, 2025;
originally announced June 2025.
-
ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions
Authors:
Di Chang,
Mingdeng Cao,
Yichun Shi,
Bo Liu,
Shengqu Cai,
Shijie Zhou,
Weilin Huang,
Gordon Wetzstein,
Mohammad Soleymani,
Peng Wang
Abstract:
Editing images with instructions to reflect non-rigid motions, camera viewpoint shifts, object deformations, human articulations, and complex interactions, poses a challenging yet underexplored problem in computer vision. Existing approaches and datasets predominantly focus on static scenes or rigid transformations, limiting their capacity to handle expressive edits involving dynamic motion. To address this gap, we introduce ByteMorph, a comprehensive framework for instruction-based image editing with an emphasis on non-rigid motions. ByteMorph comprises a large-scale dataset, ByteMorph-6M, and a strong baseline model built upon the Diffusion Transformer (DiT), named ByteMorpher. ByteMorph-6M includes over 6 million high-resolution image editing pairs for training, along with a carefully curated evaluation benchmark ByteMorph-Bench. Both capture a wide variety of non-rigid motion types across diverse environments, human figures, and object categories. The dataset is constructed using motion-guided data generation, layered compositing techniques, and automated captioning to ensure diversity, realism, and semantic coherence. We further conduct a comprehensive evaluation of recent instruction-based image editing methods from both academic and commercial domains.
Submitted 11 June, 2025; v1 submitted 3 June, 2025;
originally announced June 2025.