-
Seedance 2.0: Advancing Video Generation for World Complexity
Authors:
Team Seedance,
De Chen,
Liyang Chen,
Xin Chen,
Ying Chen,
Zhuo Chen,
Zhuowei Chen,
Feng Cheng,
Tianheng Cheng,
Yufeng Cheng,
Mojie Chi,
Xuyan Chi,
Jian Cong,
Qinpeng Cui,
Fei Ding,
Qide Dong,
Yujiao Du,
Haojie Duanmu,
Junliang Fan,
Jiarui Fang,
Jing Fang,
Zetao Fang,
Chengjian Feng,
Yu Gao,
Diandian Gu
, et al. (146 additional authors not shown)
Abstract:
Seedance 2.0 is a new native multi-modal audio-video generation model, officially released in China in early February 2026. Compared with its predecessors, Seedance 1.0 and 1.5 Pro, Seedance 2.0 adopts a unified, highly efficient, and large-scale architecture for multi-modal audio-video joint generation. This allows it to support four input modalities: text, image, audio, and video, by integrating one of the most comprehensive suites of multi-modal content reference and editing capabilities available in the industry to date. It delivers substantial, well-rounded improvements across all key sub-dimensions of video and audio generation. In both expert evaluations and public user tests, the model has demonstrated performance on par with the leading levels in the field. Seedance 2.0 supports direct generation of audio-video content with durations ranging from 4 to 15 seconds, with native output resolutions of 480p and 720p. For multi-modal inputs as reference, its current open platform supports up to 3 video clips, 9 images, and 3 audio clips. In addition, we provide Seedance 2.0 Fast version, an accelerated variant of Seedance 2.0 designed to boost generation speed for low-latency scenarios. Seedance 2.0 has delivered significant improvements to its foundational generation capabilities and multi-modal generation performance, bringing an enhanced creative experience for end users.
Submitted 15 April, 2026;
originally announced April 2026.
-
A Natural $\gtrsim 100\times$ Telescope: Discovery of the Strongly Lensed Type II SN 2025mkn at $z=1.37$
Authors:
Cameron Lemon,
Ariel Goobar,
Joel Johansson,
Edvard Mörtsell,
Steve Schulze,
Igor Andreoni,
Aleksandra Bochenek,
Seán J. Brennan,
Malte Busmann,
Michael Coughlin,
Kaustav K. Das,
Suhail Dhawan,
Christoffer Fremling,
Anjasha Gangopadhyay,
Daniel Gruen,
Xander J. Hall,
Anna Y. Q. Ho,
Mansi M. Kasliwal,
Daniel A. Perley,
Mickael Rigault,
Genevieve Schroeder,
Mathew Smith,
Jesper Sollerman,
Jean J. Somalwar,
Robert Stein
, et al. (68 additional authors not shown)
Abstract:
We present the discovery of SN 2025mkn, a gravitationally lensed Type II supernova. First detected as a blue transient in ZTF, 0.83$^{\prime\prime}$ from a $z=0.42$ elliptical galaxy, follow-up SNIFS/UH2.2m and LRIS/Keck spectra revealed absorption lines at $z=1.371$. Later JWST NIRCam imaging shows that the bright transient is a close pair of point sources separated by $\sim 0.07^{\prime\prime}$, and a 30 times fainter counterimage opposite the lens, for which NIRSpec reveals strong H$α$ emission also at $z=1.371$. The light curves and spectra are consistent with the Type II supernova source being magnified $\gtrsim 100$ times, with $\sim 250$ required to reconcile its luminosity with that of nearby events such as SN 2023ixf. Lens models are consistent with such high magnifications, and always show that the faint image arrived first (undetected in earlier ZTF imaging), consistent with the later spectral phase of this fainter image. A fourth image is also predicted and possibly detected in the NIRSpec data. Light-curve-based time-delay measurements are not possible due to the first image being the faintest; however, the resolved NIRSpec spectra offer a future opportunity for time-delay cosmography through supernova phase measurements.
Submitted 9 April, 2026;
originally announced April 2026.
-
Geometrically-Constrained Radar-Inertial Odometry via Continuous Point-Pose Uncertainty Modeling
Authors:
Wooseong Yang,
Dongjae Lee,
Minwoo Jung,
Ayoung Kim
Abstract:
Radar odometry is crucial for robust localization in challenging environments; however, the sparsity of reliable returns and distinctive noise characteristics impede its performance. This paper introduces geometrically-constrained radar-inertial odometry and mapping that jointly consolidates point and pose uncertainty. We employ the continuous trajectory model to estimate the pose uncertainty at any arbitrary timestamp by propagating uncertainties of the control points. These pose uncertainties are continuously integrated with heteroscedastic measurement uncertainty during point projection, thereby enabling dynamic evaluation of observation confidence and adaptive down-weighting of uninformative radar points. By leveraging quantified uncertainties in radar mapping, we construct a high-fidelity map that improves odometry accuracy under imprecise radar measurements. Moreover, we reveal the effectiveness of explicit geometrical constraints in radar-inertial odometry when incorporated with the proposed uncertainty-aware mapping framework. Extensive experiments on diverse real-world datasets demonstrate the superiority of our method, yielding substantial performance improvements in both accuracy and efficiency compared to existing baselines.
Submitted 3 April, 2026;
originally announced April 2026.
-
The Self Driving Portfolio: Agentic Architecture for Institutional Asset Management
Authors:
Andrew Ang,
Nazym Azimbayev,
Andrey Kim
Abstract:
Agentic AI shifts the investor's role from analytical execution to oversight. We present an agentic strategic asset allocation pipeline in which approximately 50 specialized agents produce capital market assumptions, construct portfolios using over 20 competing methods, and critique and vote on each other's output. A researcher agent proposes new portfolio construction methods not yet represented, and a meta-agent compares past forecasts against realized returns and rewrites agent code and prompts to improve future performance. The entire pipeline is governed by the Investment Policy Statement--the same document that guides human portfolio managers can now constrain and direct autonomous agents.
Submitted 2 April, 2026;
originally announced April 2026.
-
Accelerating Low-Frequency Convergence for Limited-Angle DBT via Two-Channel Fidelity in PDHG
Authors:
Taro Iyadomi,
Ricardo Parada,
Anna Kim,
Lily Jiang,
Emil Sidky,
William Chang
Abstract:
Reconstruction in limited-angle digital breast tomosynthesis (DBT) suffers from slow convergence of low spatial-frequency components when using weighted data-fidelity terms within primal-dual optimization. We introduce a two-channel fidelity strategy that decomposes the sinogram residual into complementary low-pass and high-pass bands using square-root Hanning (Hann^{1/2}) filter families, each driven by an independent \ell_2-ball constraint and dual update in the PDHG (Chambolle-Pock) algorithm with He-Yuan predictor-corrector relaxation. By assigning a larger dual step size and slightly looser tolerance to the low-frequency channel, the method delivers stronger per-iteration correction to the near-DC band without violating global PDHG stability. Experiments on a 2D digital breast phantom across multiple resolutions demonstrate that the two-channel approach yields 19%--61% RMSE improvement over the single-channel baseline, with larger gains at coarser discretizations where problem conditioning is more favorable, supporting more balanced spectral convergence in clinically realistic limited-angle regimes.
Submitted 25 March, 2026;
originally announced March 2026.
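The two-channel split described in this abstract can be illustrated with a minimal sketch. It assumes that "complementary" means power-complementary responses (low² + high² = 1), builds them from a Hann-shaped taper with a hypothetical `cutoff` parameter, and applies them to a 1D residual with a naive DFT. The function names, the cutoff value, and the 1D setting are illustrative assumptions, not the authors' implementation, and the PDHG updates themselves are not shown.

```python
import cmath
import math

def sqrt_hann_pair(n, cutoff):
    """Frequency responses (low, high) on n DFT bins with low**2 + high**2 == 1.

    Assumption: the low-pass band is the square root of a Hann taper that
    falls from 1 at DC to 0 at the normalized frequency `cutoff`.
    """
    low, high = [], []
    for k in range(n):
        f = min(k, n - k) / n  # normalized |frequency| in [0, 0.5]
        hann = 0.5 * (1.0 + math.cos(math.pi * f / cutoff)) if f < cutoff else 0.0
        low.append(math.sqrt(hann))
        high.append(math.sqrt(max(0.0, 1.0 - hann)))
    return low, high

def dft(x, sign=-1):
    """Naive O(n^2) DFT; sign=+1 gives the unnormalized inverse transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def split_residual(r, cutoff=0.25):
    """Split a real residual into low-band and high-band components."""
    n = len(r)
    low, high = sqrt_hann_pair(n, cutoff)
    spec = dft(r)
    r_low = [v.real / n for v in dft([s * w for s, w in zip(spec, low)], sign=+1)]
    r_high = [v.real / n for v in dft([s * w for s, w in zip(spec, high)], sign=+1)]
    return r_low, r_high

residual = [math.sin(0.3 * t) + 0.5 for t in range(16)]
r_low, r_high = split_residual(residual)
```

With power-complementary filters the band energies add up to the energy of the full residual, so separate tolerances on the two channels partition a single overall misfit budget. The two bands do not sum back to the original residual, because low + high is not identically 1.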
-
Mixture of Mini Experts: Overcoming the Linear Layer Bottleneck in Multiple Instance Learning
Authors:
Daniel Shao,
Joel Runevic,
Richard J. Chen,
Drew F. K. Williamson,
Ahrong Kim,
Andrew H. Song,
Faisal Mahmood
Abstract:
Multiple Instance Learning (MIL) is the predominant framework for classifying gigapixel whole-slide images in computational pathology. MIL follows a sequence of 1) extracting patch features, 2) applying a linear layer to obtain task-specific patch features, and 3) aggregating the patches into a slide feature for classification. While substantial efforts have been devoted to optimizing patch feature extraction and aggregation, none have yet addressed the second point, the critical layer which transforms general-purpose features into task-specific features. We hypothesize that this layer constitutes an overlooked performance bottleneck and that stronger representations can be achieved with a low-rank transformation tailored to each patch's phenotype, yielding synergistic effects with any of the existing MIL approaches. To this end, we introduce MAMMOTH, a parameter-efficient, multi-head mixture of experts module designed to improve the performance of any MIL model with minimal alterations to the total number of parameters. Across eight MIL methods and 19 different classification tasks, we find that such task-specific transformation has a larger effect on performance than the choice of aggregation method. For instance, when equipped with MAMMOTH, even simple methods such as max or mean pooling attain higher average performance than any method with the standard linear layer. Overall, MAMMOTH improves performance in 130 of the 152 examined configurations, with an average $+3.8\%$ change in performance. Code is available at https://github.com/mahmoodlab/mammoth.
Submitted 23 March, 2026;
originally announced March 2026.
-
ROBOGATE: Adaptive Failure Discovery for Safe Robot Policy Deployment via Two-Stage Boundary-Focused Sampling
Authors:
Azuki Kim
Abstract:
Deploying learned robot manipulation policies in industrial settings requires rigorous pre-deployment validation, yet exhaustive testing across high-dimensional parameter spaces is intractable. We present ROBOGATE, a deployment risk management framework that combines physics-based simulation with a two-stage adaptive sampling strategy to efficiently discover failure boundaries in the operational parameter space. Stage 1 employs Latin Hypercube Sampling (LHS) across an 8-dimensional parameter space; Stage 2 applies boundary-focused sampling concentrated in the 30-70% success rate transition zone. Using NVIDIA Isaac Sim with Newton physics, we evaluate a scripted pick-and-place controller across four robot embodiments -- Franka Panda (7-DOF), UR3e (6-DOF), UR5e (6-DOF), and UR10e (6-DOF) -- totaling over 50,000 experiments. Our logistic regression risk model achieves AUC 0.780 and identifies a closed-form failure boundary equation. We further benchmark eight VLA (Vision-Language-Action) policies, including a fine-tuned NVIDIA GR00T N1.6 (3B) trained on LIBERO-Spatial for 20K steps. The same checkpoint achieves 97.65% success rate on LIBERO (MuJoCo) but 0% on RoboGate's 68 industrial scenarios in NVIDIA Isaac Sim -- a 97.65 percentage point cross-simulator gap on a single model that underscores the deployment validation challenge. Inspired by the validation-layer paradigm NVIDIA codified for quantum computing with Ising, ROBOGATE provides this validation layer for Physical AI. Open-source.
Submitted 15 April, 2026; v1 submitted 23 March, 2026;
originally announced March 2026.
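The two-stage sampling idea in this abstract can be sketched as follows. This is a minimal illustration using only the Python standard library. The success-rate function is a hypothetical stand-in for physics simulation, and the Gaussian perturbation in stage 2 (with an assumed `sigma`) is one simple way to concentrate samples near the 30-70% transition zone, not necessarily the strategy ROBOGATE uses.

```python
import random

def latin_hypercube(n, dims, rng):
    """Return n points in the unit cube with one point per stratum on every axis."""
    columns = []
    for _ in range(dims):
        # One jittered sample per stratum [k/n, (k+1)/n), then shuffle the strata.
        col = [(k + rng.random()) / n for k in range(n)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n)]

def boundary_focused(points, success_rate, n_new, rng, lo=0.3, hi=0.7, sigma=0.05):
    """Draw new points near stage-1 points whose success rate lies in [lo, hi]."""
    seeds = [p for p in points if lo <= success_rate(p) <= hi]
    if not seeds:
        return []
    new_points = []
    for _ in range(n_new):
        base = rng.choice(seeds)
        # Gaussian perturbation around a boundary seed, clipped to the unit cube.
        new_points.append(
            tuple(min(1.0, max(0.0, x + rng.gauss(0.0, sigma))) for x in base)
        )
    return new_points

def toy_success_rate(p):
    """Hypothetical stand-in for simulation: success falls as parameters grow."""
    return 1.0 - sum(p) / len(p)

rng = random.Random(0)
stage1 = latin_hypercube(200, 8, rng)   # stage 1: space-filling coverage
stage2 = boundary_focused(stage1, toy_success_rate, 100, rng)  # stage 2: refine
```

In practice the success rate at each stage-1 point would be estimated from repeated simulated trials rather than a closed-form function, and the stage-2 samples would then feed the risk model fit.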
-
Scientific Rigor and Human Warmth: Remembering Vladimir Sidorenko (1949-2025)
Authors:
Christian Deppe,
Haider Al Kim,
Jessica Bariffi,
Hannes Bartz,
Minglai Cai,
Pau Colomer,
Gohar Kyureghyan
Abstract:
During the Foundations of Future Communication Systems (FFCS) conference in Braunschweig, a dedicated memorial session was held in honor of Dr. Vladimir (Volodya) Sidorenko (1949-2025). The session, chaired by Minglai Cai, brought together colleagues, collaborators, and former students to commemorate his scientific achievements and his exceptional human qualities. This report summarizes the biographical tribute, the personal recollections shared by speakers, and the broader impact of Volodya's work in coding theory, cryptography, telecommunications, and quantum error correction. Beyond his more than 150 publications and substantial technical contributions, the session highlighted his intellectual rigor, mentorship, humor, generosity, and lasting influence on the international research community.
Submitted 10 March, 2026;
originally announced March 2026.
-
TreeLoc++: Robust 6-DoF LiDAR Localization in Forests with a Compact Digital Forest Inventory
Authors:
Minwoo Jung,
Dongjae Lee,
Nived Chebrolu,
Haedam Oh,
Maurice Fallon,
Ayoung Kim
Abstract:
Reliable localization is essential for sustainable forest management, as it allows robots or sensor systems to revisit and monitor the status of individual trees over long periods. In modern forestry, this management is structured around Digital Forest Inventories (DFIs), which encode stems using compact geometric attributes rather than raw data. Despite their central role, DFIs have been overlooked in localization research, and most methods still rely on dense gigabyte-sized point clouds that are costly to store and maintain. To improve upon this, we propose TreeLoc++, a global localization framework that operates directly on DFIs as a discriminative representation, eliminating the need to use the raw point clouds. TreeLoc++ reduces false matches in structurally ambiguous forests and improves the reliability of full 6-DoF pose estimation. It augments coarse retrieval with a pairwise distance histogram that encodes local tree-layout context, subsequently refining candidates via DBH-based filtering and yaw-consistent inlier selection to further reduce mismatches. Furthermore, a constrained optimization leveraging tree geometry jointly estimates roll, pitch, and height, enhancing pose stability and enabling accurate localization without reliance on dense 3D point cloud data. Evaluations on 27 sequences recorded in forests across three datasets and four countries show that TreeLoc++ achieves precise localization with centimeter-level accuracy. We further demonstrate robustness to long-term change by localizing data recorded in 2025 against inventories built from 2023 data, spanning a two-year interval. The system represents 15 sessions spanning 7.98 km of trajectories using only 250KB of map data and outperforms both hand-crafted and learning-based baselines that rely on point cloud maps. This demonstrates the scalability of TreeLoc++ for long-term deployment.
Submitted 3 March, 2026;
originally announced March 2026.
-
FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution
Authors:
Aro Kim,
Myeongjin Jang,
Chaewon Moon,
Youngjin Shin,
Jinwoo Jeong,
Sang-hyo Park
Abstract:
Diffusion-based approaches have recently driven remarkable progress in real-world image super-resolution (SR). However, existing methods still struggle to simultaneously preserve fine details and ensure high-fidelity reconstruction, often resulting in suboptimal visual quality. In this paper, we propose FiDeSR, a high-fidelity and detail-preserving one-step diffusion super-resolution framework. During training, we introduce a detail-aware weighting strategy that adaptively emphasizes regions where the model exhibits higher prediction errors. During inference, low- and high-frequency adaptive enhancers further refine the reconstruction without requiring model retraining, enabling flexible enhancement control. To further improve the reconstruction accuracy, FiDeSR incorporates a residual-in-residual noise refinement, which corrects prediction errors in the diffusion noise and enhances fine detail recovery. FiDeSR achieves superior real-world SR performance compared to existing diffusion-based methods, producing outputs with both high perceptual quality and faithful content restoration. The source code will be released at: https://github.com/Ar0Kim/FiDeSR.
Submitted 3 March, 2026;
originally announced March 2026.
-
LLM Novice Uplift on Dual-Use, In Silico Biology Tasks
Authors:
Chen Bo Calvin Zhang,
Christina Q. Knight,
Nicholas Kruus,
Jason Hausenloy,
Pedro Medeiros,
Nathaniel Li,
Aiden Kim,
Yury Orlovskiy,
Coleman Breen,
Bryce Cai,
Jasper Götting,
Andrew Bo Liu,
Samira Nedungadi,
Paula Rodriguez,
Yannis Yiming He,
Mohamed Shaaban,
Zifan Wang,
Seth Donoughe,
Julian Michael
Abstract:
Large language models (LLMs) perform increasingly well on biology benchmarks, but it remains unclear whether they uplift novice users -- i.e., enable humans to perform better than with internet-only resources. This uncertainty is central to understanding both scientific acceleration and dual-use risk. We conducted a multi-model, multi-benchmark human uplift study comparing novices with LLM access versus internet-only access across eight biosecurity-relevant task sets. Participants worked on complex problems with ample time (up to 13 hours for the most involved tasks). We found that LLM access provided substantial uplift: novices with LLMs were 4.16 times more accurate than controls (95% CI [2.63, 6.87]). On four benchmarks with available expert baselines (internet-only), novices with LLMs outperformed experts on three of them. Perhaps surprisingly, standalone LLMs often exceeded LLM-assisted novices, indicating that users were not eliciting the strongest available contributions from the LLMs. Most participants (89.6%) reported little difficulty obtaining dual-use-relevant information despite safeguards. Overall, LLMs substantially uplift novices on biological tasks previously reserved for trained practitioners, underscoring the need for sustained, interactive uplift evaluations alongside traditional benchmarks.
Submitted 13 March, 2026; v1 submitted 26 February, 2026;
originally announced February 2026.
-
Measurement of the near-threshold J$/ψ$ photoproduction cross section with the CLAS12 experiment
Authors:
P. Chatagnon,
V. Kubarovsky,
R. Paremuzyan,
S. Stepanyan,
M. Tenorio,
R. Tyson,
A. G. Acar,
P. Achenbach,
J. S. Alvarado,
M. J. Amaryan,
W. R. Armstrong,
H. Avakian,
N. A. Baltzell,
L. Barion,
M. Bashkanov,
M. Battaglieri,
F. Benmokhtar,
A. Bianconi,
A. S. Biselli,
S. Boiarinov,
M. Bondi,
F. Bossù,
K. -Th. Brinkmann,
W. J. Briscoe,
S. Bueltmann
, et al. (125 additional authors not shown)
Abstract:
We present measurements of the total and differential cross sections for near-threshold J/$ψ$ photoproduction obtained with the CLAS12 detector at the Thomas Jefferson National Accelerator Facility. The results are based on data collected during the Fall 2018 and Spring 2019 running periods, using electron beams with energies of 10.6 and 10.2 GeV, respectively, scattered off a liquid-hydrogen target. Near-threshold J$/ψ$ photoproduction offers a unique sensitivity to the strong interaction in the non-perturbative regime of Quantum Chromodynamics (QCD). The energy dependence of the cross section constrains the underlying J$/ψ$ production mechanisms, including multi-gluon exchange and potential baryonic excitations. Additionally, the $t$-dependence of the differential cross section can be related to the transverse spatial distribution of gluons in the proton, providing critical input for theoretical descriptions of the gluonic structure of the proton. An interpretation of the results in terms of the gluon content of the proton is presented, providing new experimental constraints on QCD-inspired models of the proton structure and the role of gluonic degrees of freedom in hadronic mass generation.
Submitted 25 February, 2026;
originally announced February 2026.
-
Satellite-Based Detection of Looted Archaeological Sites Using Machine Learning
Authors:
Girmaw Abebe Tadesse,
Titien Bartette,
Andrew Hassanali,
Allen Kim,
Jonathan Chemla,
Andrew Zolli,
Yves Ubelmann,
Caleb Robinson,
Inbal Becker-Reshef,
Juan Lavista Ferres
Abstract:
Looting at archaeological sites poses a severe risk to cultural heritage, yet monitoring thousands of remote locations remains operationally difficult. We present a scalable and satellite-based pipeline to detect looted archaeological sites, using PlanetScope monthly mosaics (4.7m/pixel) and a curated dataset of 1,943 archaeological sites in Afghanistan (898 looted, 1,045 preserved) with multi-year imagery (2016--2023) and site-footprint masks. We compare (i) end-to-end CNN classifiers trained on raw RGB patches and (ii) traditional machine learning (ML) trained on handcrafted spectral/texture features and embeddings from recent remote-sensing foundation models. Results indicate that ImageNet-pretrained CNNs combined with spatial masking reach an F1 score of 0.926, clearly surpassing the strongest traditional ML setup, which attains an F1 score of 0.710 using SatCLIP-V+RF+Mean, i.e., location and vision embeddings fed into a Random Forest with mean-based temporal aggregation. Ablation studies demonstrate that ImageNet pretraining (even in the presence of domain shift) and spatial masking enhance performance. In contrast, geospatial foundation model embeddings perform competitively with handcrafted features, suggesting that looting signatures are extremely localized. The repository is available at https://github.com/microsoft/looted_site_detection.
Submitted 23 February, 2026;
originally announced February 2026.
-
TherA: Thermal-Aware Visual-Language Prompting for Controllable RGB-to-Thermal Infrared Translation
Authors:
Dong-Guw Lee,
Tai Hyoung Rhee,
Hyunsoo Jang,
Young-Sik Shin,
Ukcheol Shin,
Ayoung Kim
Abstract:
Despite the inherent advantages of thermal infrared (TIR) imaging, large-scale data collection and annotation remain a major bottleneck for TIR-based perception. A practical alternative is to synthesize pseudo TIR data via image translation; however, most RGB-to-TIR approaches heavily rely on RGB-centric priors that overlook thermal physics, yielding implausible heat distributions. In this paper, we introduce TherA, a controllable RGB-to-TIR translation framework that produces diverse and thermally plausible images at both scene and object level. TherA couples TherA-VLM with a latent-diffusion-based translator. Given a single RGB image and a user-prompted condition pair, TherA-VLM yields a thermal-aware embedding that encodes scene, object, material, and heat-emission context reflecting the input scene-condition pair. Conditioning the diffusion model on this embedding enables realistic TIR synthesis and fine-grained control across time of day, weather, and object state. Compared to other baselines, TherA achieves state-of-the-art translation performance, improving zero-shot translation performance by up to 33%, averaged across all metrics.
Submitted 23 February, 2026; v1 submitted 22 February, 2026;
originally announced February 2026.
-
RoEL: Robust Event-based 3D Line Reconstruction
Authors:
Gwangtak Bae,
Jaeho Shin,
Seunggu Kang,
Junho Kim,
Ayoung Kim,
Young Min Kim
Abstract:
Event cameras in motion tend to detect object boundaries or texture edges, which produce lines of brightness changes, especially in man-made environments. While lines can constitute a robust intermediate representation that is consistently observed, the sparse nature of lines may lead to drastic deterioration with minor estimation errors. Only a few previous works, often accompanied by additional sensors, utilize lines to compensate for the severe domain discrepancies of event sensors along with unpredictable noise characteristics. We propose a method that can stably extract tracks of varying appearances of lines using a clever algorithmic process that observes multiple representations from various time slices of events, compensating for potential adversaries within the event data. We then propose geometric cost functions that can refine the 3D line maps and camera poses, eliminating projective distortions and depth ambiguities. The 3D line maps are highly compact and can be equipped with our proposed cost function, which can be adapted for any observations that can detect and extract line structures or projections of them, including 3D point cloud maps or image observations. We demonstrate that our formulation is powerful enough to exhibit a significant performance boost in event-based mapping and pose refinement across diverse datasets, and can be flexibly applied to multimodal scenarios. Our results confirm that the proposed line-based formulation is a robust and effective approach for the practical deployment of event-based perceptual modules. Project page: https://gwangtak.github.io/roel/
Submitted 20 February, 2026;
originally announced February 2026.
-
High-temperature $η$-pairing superconductivity in the photodoped Hubbard model
Authors:
Lei Geng,
Aaram J. Kim,
Philipp Werner
Abstract:
We investigate superconductivity emerging in the photodoped Mott insulating Hubbard model using steady-state dynamical mean-field theory implemented on the real-frequency axis. By employing high-order strong-coupling impurity solvers, we obtain the nonequilibrium phase diagram for photoinduced $η$-pairing superconductivity with a remarkably high effective critical temperature. We further identify a superconducting gap in the momentum-resolved spectral function and optical conductivity, providing spectroscopic signatures accessible to experiments. Our results highlight a route to a controllable form of high-temperature superconductivity in nonequilibrium strongly correlated systems, fundamentally distinct from the equilibrium $s$-wave pairing state in the attractive Hubbard model or cuprate-like $d$-wave superconductors.
Submitted 19 February, 2026;
originally announced February 2026.
-
ESO White Paper on Intensity Interferometry: Cosmology, Fundamental Physics, Quantum Optics
Authors:
Robin Kaiser,
William Guerin,
Farrokh Vakili,
Jean-Philippe Berger,
Andrei Nomerotski,
Sergei Kulkov,
Peter Svihra,
Eva Santos,
Colin Carlile,
Dainis Dravins,
Stefan Funk,
Prasenjit Saha,
Roland Walter,
Marcelo Borges Fernandes,
Alex G. Kim,
David Dunsky,
Ken Van Tilburg,
Masha Baryakhtar,
Marios Galanis,
Robert V. Wagoner,
Neal Dalal,
Junwu Huang,
Charles Gammie,
Norman W. Murray
Abstract:
In this whitepaper, we outline how recent technological advances and ongoing developments open qualitatively new science opportunities in cosmology, fundamental physics, and quantum astrophysics. First, intensity interferometry can contribute to one of the most foundational observables in cosmology: the expansion rate of the Universe. Its angular resolution allows it to resolve the angular extent of extragalactic objects such as supernovae or quasars; combined with a physical scale local to the source, this yields an angular diameter distance and hence a 'Hubble diagram'. Second, the nature of dark matter can be probed via the astrometric lensing signatures of tiny dark matter halos. Third, intensity interferometry gives direct access to second-order coherence properties of astrophysical emission, opening a window onto genuinely quantum aspects of astrophysical light.
Submitted 13 February, 2026;
originally announced February 2026.
-
Clutt3R-Seg: Sparse-view 3D Instance Segmentation for Language-grounded Grasping in Cluttered Scenes
Authors:
Jeongho Noh,
Tai Hyoung Rhee,
Eunho Lee,
Jeongyun Kim,
Sunwoo Lee,
Ayoung Kim
Abstract:
Reliable 3D instance segmentation is fundamental to language-grounded robotic manipulation. Its critical application lies in cluttered environments, where occlusions, limited viewpoints, and noisy masks degrade perception. To address these challenges, we present Clutt3R-Seg, a zero-shot pipeline for robust 3D instance segmentation for language-grounded grasping in cluttered scenes. Our key idea is to introduce a hierarchical instance tree of semantic cues. Unlike prior approaches that attempt to refine noisy masks, our method leverages them as informative cues: through cross-view grouping and conditional substitution, the tree suppresses over- and under-segmentation, yielding view-consistent masks and robust 3D instances. Each instance is enriched with open-vocabulary semantic embeddings, enabling accurate target selection from natural language instructions. To handle scene changes during multi-stage tasks, we further introduce a consistency-aware update that preserves instance correspondences from only a single post-interaction image, allowing efficient adaptation without rescanning. Clutt3R-Seg is evaluated on both synthetic and real-world datasets, and validated on a real robot. Across all settings, it consistently outperforms state-of-the-art baselines in cluttered and sparse-view scenarios. Even on the most challenging heavy-clutter sequences, Clutt3R-Seg achieves an AP@25 of 61.66, over 2.2x higher than baselines, and with only four input views it surpasses MaskClustering with eight views by more than 2x. The code is available at: https://github.com/jeonghonoh/clutt3r-seg.
Submitted 12 February, 2026;
originally announced February 2026.
-
CompSplat: Compression-aware 3D Gaussian Splatting for Real-world Video
Authors:
Hojun Song,
Heejung Choi,
Aro Kim,
Chae-yeong Song,
Gahyeon Kim,
Soo Ye Kim,
Jaehyup Lee,
Sang-hyo Park
Abstract:
High-quality novel view synthesis (NVS) from real-world videos is crucial for applications such as cultural heritage preservation, digital twins, and immersive media. However, real-world videos typically contain long sequences with irregular camera trajectories and unknown poses, leading to pose drift, feature misalignment, and geometric distortion during reconstruction. Moreover, lossy compression amplifies these issues by introducing inconsistencies that gradually degrade geometry and rendering quality. While recent studies have addressed either long-sequence NVS or unposed reconstruction, compression-aware approaches still focus on specific artifacts or limited scenarios, leaving diverse compression patterns in long videos insufficiently explored. In this paper, we propose CompSplat, a compression-aware training framework that explicitly models frame-wise compression characteristics to mitigate inter-frame inconsistency and accumulated geometric errors. CompSplat incorporates compression-aware frame weighting and an adaptive pruning strategy to enhance robustness and geometric consistency, particularly under heavy compression. Extensive experiments on challenging benchmarks, including Tanks and Temples, Free, and Hike, demonstrate that CompSplat achieves state-of-the-art rendering quality and pose accuracy, significantly surpassing most recent state-of-the-art NVS approaches under severe compression conditions.
Submitted 10 February, 2026;
originally announced February 2026.
-
Informative Object-centric Next Best View for Object-aware 3D Gaussian Splatting in Cluttered Scenes
Authors:
Seunghoon Jeong,
Eunho Lee,
Jeongyun Kim,
Ayoung Kim
Abstract:
In cluttered scenes with inevitable occlusions and incomplete observations, selecting informative viewpoints is essential for building a reliable representation. In this context, 3D Gaussian Splatting (3DGS) offers a distinct advantage, as it can explicitly guide the selection of subsequent viewpoints and then refine the representation with new observations. However, existing approaches rely solely on geometric cues, neglect manipulation-relevant semantics, and tend to prioritize exploitation over exploration. To tackle these limitations, we introduce an instance-aware Next Best View (NBV) policy that prioritizes underexplored regions by leveraging object features. Specifically, our object-aware 3DGS distills instance-level information into one-hot object vectors, which are used to compute confidence-weighted information gain that guides the identification of regions associated with erroneous and uncertain Gaussians. Furthermore, our method can be easily adapted to an object-centric NBV, which focuses view selection on a target object, thereby improving reconstruction robustness to object placement. Experiments demonstrate that our NBV policy reduces depth error by up to 77.14% on the synthetic dataset and 34.10% on the real-world GraspNet dataset compared to baselines. Moreover, compared to targeting the entire scene, performing NBV on a specific object yields an additional reduction of 25.60% in depth error for that object. We further validate the effectiveness of our approach through real-world robotic manipulation tasks.
Submitted 8 February, 2026;
originally announced February 2026.
-
PLATO Hand: Shaping Contact Behavior with Fingernails for Precise Manipulation
Authors:
Dong Ho Kang,
Aaron Kim,
Mingyo Seo,
Kazuto Yokoyama,
Tetsuya Narita,
Luis Sentis
Abstract:
We present the PLATO Hand, a dexterous robotic hand with a hybrid fingertip that embeds a rigid fingernail within a compliant pulp. This design shapes contact behavior to enable diverse interaction modes across a range of object geometries. We develop a strain-energy-based bending-indentation model to guide the fingertip design and to explain how guided contact preserves local indentation while suppressing global bending. Experimental results show that the proposed robotic hand design demonstrates improved pinching stability, enhanced force observability, and successful execution of edge-sensitive manipulation tasks, including paper singulation, card picking, and orange peeling. Together, these results show that coupling structured contact geometry with a force-motion transparent mechanism provides a principled, physically embodied approach to precise manipulation.
Submitted 4 February, 2026;
originally announced February 2026.
-
Tokenization and Morphological Fidelity in Uralic NLP: A Cross-Lingual Evaluation
Authors:
Nuo Xu,
Ahrii Kim
Abstract:
Subword tokenization critically affects Natural Language Processing (NLP) performance, yet its behavior in morphologically rich and low-resource language families remains under-explored. This study systematically compares three subword paradigms -- Byte Pair Encoding (BPE), Overlap BPE (OBPE), and Unigram Language Model -- across six Uralic languages with varying resource availability and typological diversity. Using part-of-speech (POS) tagging as a controlled downstream task, we show that OBPE consistently achieves stronger morphological alignment and higher tagging accuracy than conventional methods, particularly within the Latin-script group. These gains arise from reduced fragmentation in open-class categories and a better balance across the frequency spectrum. Transfer efficacy further depends on the downstream tagging architecture, interacting with both training volume and genealogical proximity. Taken together, these findings highlight that morphology-sensitive tokenization is not merely a preprocessing choice but a decisive factor in enabling effective cross-lingual transfer for agglutinative, low-resource languages.
Submitted 14 February, 2026; v1 submitted 4 February, 2026;
originally announced February 2026.
-
Liouvillian Gap in Dissipative Haar-Doped Clifford Circuits
Authors:
Ha Eum Kim,
Andrew D. Kim,
Jong Yeon Lee
Abstract:
Quantum chaos is commonly assessed through probe-dependent signatures that need not coincide. Recently, a dissipative signature was proposed for chaotic Floquet systems, where infinitesimal bulk dissipation induces a non-zero constant intrinsic relaxation rate quantified by the Liouvillian gap. This raises a question: what minimal departure from Clifford dynamics is required to generate such intrinsic relaxation? To address this, we study a Floquet two-qubit Clifford circuit doped with Haar-random single-qubit gates and subject to local dissipation of strength $γ$. We find a structure-dependent crossover. The undoped iSWAP-class circuit exhibits a weak-dissipation singularity, with a gap that grows with $N$ for any $γ>0$. Haar doping preserves this undoped-like growth for any subextensive doping pattern. At finite doping density, there exist patterns that yield an $\mathcal{O}(1)$ gap for any fixed $γ$ as $N\to\infty$, yet remain singular as $γ\to0^+$. Because our bounds depend only on the spatial doping pattern, they remain valid even when the Haar rotations are independently redrawn each Floquet period. Overall, our findings provide a circuit-level perspective on intrinsic relaxation, and thus irreversibility, in open many-body systems.
Submitted 22 February, 2026; v1 submitted 3 February, 2026;
originally announced February 2026.
-
Validating the Angular Sizes of Red Clump Stars with Intensity Interferometry
Authors:
Alex G. Kim,
Robin Kaiser
Abstract:
The surface-brightness-color (SBC) relationship for Red Clump stars provides a critical foundation for precision distance ladder measurements, including the 1\% distance determination to the Large Magellanic Cloud. Current SBC calibrations rely on angular diameter measurements of nearby Red Clump stars obtained through long-baseline optical interferometry using the Very Large Telescope Interferometer. We explore the application of intensity interferometry to measure limb-darkened angular diameters of Red Clump stars, offering a complementary approach to traditional amplitude interferometry. We describe the framework for extracting angular diameters from squared visibility measurements in intensity interferometry, accounting for limb darkening through the stellar atmosphere models. For the Red Clump star HD~17652, we show that intensity interferometry in the $H$ band at baselines matching PIONIER ($\sim$100~m) could achieve $<1$\% angular size uncertainties in 2-hour exposures by measuring the primary peak of the visibility function, enabling direct comparison with existing measurements. Critically, observations at shorter wavelengths probe the secondary visibility maximum, providing independent checks of both measurement and systematic errors that are largely insensitive to limb-darkening assumptions. Exploiting the multiplex advantage of simultaneous multi-bandpass observations and the large number of baselines available with telescope arrays such as the Cherenkov Telescope Array Observatory can reduce observing times to practical levels, making intensity interferometry a viable tool for validating the angular sizes for a subset of the Red Clump star calibration sample.
Submitted 8 April, 2026; v1 submitted 2 February, 2026;
originally announced February 2026.
-
TreeLoc: 6-DoF LiDAR Global Localization in Forests via Inter-Tree Geometric Matching
Authors:
Minwoo Jung,
Nived Chebrolu,
Lucas Carvalho de Lima,
Haedam Oh,
Maurice Fallon,
Ayoung Kim
Abstract:
Reliable localization is crucial for navigation in forests, where GPS is often degraded and LiDAR measurements are repetitive, occluded, and structurally complex. These conditions weaken the assumptions of traditional urban-centric localization methods, which assume that consistent features arise from unique structural patterns, necessitating forest-centric solutions to achieve robustness in these environments. To address these challenges, we propose TreeLoc, a LiDAR-based global localization framework for forests that handles place recognition and 6-DoF pose estimation. We represent scenes using tree stems and their Diameter at Breast Height (DBH), which are aligned to a common reference frame via their axes and summarized using the tree distribution histogram (TDH) for coarse matching, followed by fine matching with a 2D triangle descriptor. Finally, pose estimation is achieved through a two-step geometric verification. On diverse forest benchmarks, TreeLoc outperforms baselines, achieving precise localization. Ablation studies validate the contribution of each component. We also propose applications for long-term forest management using descriptors from a compact global tree database. TreeLoc is open-sourced for the robotics community at https://github.com/minwoo0611/TreeLoc.
Submitted 12 February, 2026; v1 submitted 1 February, 2026;
originally announced February 2026.
-
Do LLMs Truly Benefit from Longer Context in Automatic Post-Editing?
Authors:
Ahrii Kim,
Seong-heum Kim
Abstract:
Automatic post-editing (APE) aims to refine machine translations by correcting residual errors. Although recent large language models (LLMs) demonstrate strong translation capabilities, their effectiveness for APE--especially under document-level context--remains insufficiently understood. We present a systematic comparison of proprietary and open-weight LLMs under a naive document-level prompting setup, analyzing APE quality, contextual behavior, robustness, and efficiency.
Our results show that proprietary LLMs achieve near human-level APE quality even with simple one-shot prompting, regardless of whether document context is provided. While these models exhibit higher robustness to data poisoning attacks than open-weight counterparts, this robustness also reveals a limitation: they largely fail to exploit document-level context for contextual error correction. Furthermore, standard automatic metrics do not reliably reflect these qualitative improvements, highlighting the continued necessity of human evaluation. Despite their strong performance, the substantial cost and latency overheads of proprietary LLMs render them impractical for real-world APE deployment. Overall, our findings elucidate both the promise and current limitations of LLM-based document-aware APE, and point toward the need for more efficient long-context modeling approaches for translation refinement.
Submitted 12 March, 2026; v1 submitted 27 January, 2026;
originally announced January 2026.
-
Biphasic Meniscus Coating for Scalable and Material Efficient Quantum Dot Films
Authors:
Shlok Joseph Paul,
Letian Li,
Zheng Li,
Andrew Kim,
Mia Klopfestein,
Stephanie S. Lee,
Ayaskanta Sahu
Abstract:
Colloidal quantum dots (cQDs) have emerged as a cornerstone of next-generation optoelectronics, offering unparalleled spectral tunability and solution-processability. However, the transition from laboratory-scale devices to sustainable industrial manufacturing is fundamentally hindered by spin-coating workflows, which are intrinsically wasteful and restricted to planar geometries. These limitations are particularly acute for high-performance cQDs containing regulated elements such as lead, cadmium, or mercury, where poor material utilization exacerbates both environmental burden and cost. Here we report a biphasic dip-coating strategy that redefines the material efficiency of nanocrystal film fabrication. By utilizing an immiscible underlayer to displace ~88% of the active reservoir volume, we demonstrate a deposition geometry that decouples material consumption from total precursor volume. Infrared PbS photodetectors fabricated via this approach maintain their performance against spin-coated benchmarks while reducing ink consumption by up to 20-fold. Our technoeconomic analysis reveals that this biphasic architecture achieves cost parity at film thicknesses an order of magnitude lower than conventional monophasic dip-coating. Our results establish a low-waste framework for solution-processed materials, providing a viable pathway for the resource-efficient manufacturing of optoelectronic devices.
Submitted 21 January, 2026;
originally announced January 2026.
-
Pedagogical Alignment for Vision-Language-Action Models: A Comprehensive Framework for Data, Architecture, and Evaluation in Education
Authors:
Unggi Lee,
Jahyun Jeong,
Sunyoung Shin,
Haeun Park,
Jeongsu Moon,
Youngchang Song,
Jaechang Shim,
JaeHwan Lee,
Yunju Noh,
Seungwon Choi,
Ahhyun Kim,
TaeHyeon Kim,
Kyungtae Joo,
Taeyeong Kim,
Gyeonggeon Lee
Abstract:
Science demonstrations are important for effective STEM education, yet teachers face challenges in conducting them safely and consistently across multiple occasions, where robotics can be helpful. However, current Vision-Language-Action (VLA) models require substantial computational resources and sacrifice language generation capabilities to maximize efficiency, making them unsuitable for resource-constrained educational settings that require interpretable, explanation-generating systems. We present Pedagogical VLA Framework, a framework that applies pedagogical alignment to lightweight VLA models through four components: text healing to restore language generation capabilities, large language model (LLM) distillation to transfer pedagogical knowledge, safety training for educational environments, and pedagogical evaluation adjusted to science education contexts. We evaluate Pedagogical VLA Framework across five science demonstrations spanning physics, chemistry, biology, and earth science, using an evaluation framework developed in collaboration with science education experts. Our evaluation assesses both task performance (success rate, protocol compliance, efficiency, safety) and pedagogical quality through teacher surveys and LLM-as-Judge assessment. We additionally provide qualitative analysis of generated texts. Experimental results demonstrate that Pedagogical VLA Framework achieves comparable task performance to baseline models while producing contextually appropriate educational explanations.
Submitted 20 January, 2026;
originally announced January 2026.
-
Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment
Authors:
Cameron Tice,
Puria Radmard,
Samuel Ratnam,
Andy Kim,
David Africa,
Kyle O'Brien
Abstract:
Pretraining corpora contain extensive discourse about AI systems, yet the causal influence of this discourse on downstream alignment remains poorly understood. If prevailing descriptions of AI behaviour are predominantly negative, LLMs may internalise corresponding behavioural priors, giving rise to self-fulfilling misalignment. This paper provides the first controlled study of this hypothesis by pretraining 6.9B-parameter LLMs with varying amounts of (mis)alignment discourse. We find that discussion of AI contributes to misalignment. Upsampling synthetic training documents about AI misalignment leads to a notable increase in misaligned behaviour. Conversely, upsampling documents about aligned behaviour reduces misalignment scores from 45% to 9%. We consider this evidence of self-fulfilling alignment. These effects are dampened, but persist through post-training. Our findings establish the study of how pretraining data shapes alignment priors, or alignment pretraining, as a complement to post-training. We recommend practitioners consider pretraining for alignment alongside capabilities. We share our models, data, and evaluations at AlignmentPretraining.ai.
Submitted 19 February, 2026; v1 submitted 15 January, 2026;
originally announced January 2026.
-
Spatiotemporal Change-Points in Development Discourse: Insights from Social Media in Low-Resource Contexts
Authors:
Woojin Jung,
Charles Chear,
Andrew H. Kim,
Vatsal Shah,
Tawfiq Ammari
Abstract:
This study investigates the spatiotemporal evolution of development discourse in low-resource settings. Analyzing more than two years of geotagged X data from Zambia, we introduce a mixed-methods pipeline utilizing topic modeling, change-point detection, and qualitative coding to identify critical shifts in public debate. We identify seven recurring themes, including public health challenges and frustration with government policy, shaped by regional events and national interventions. Notably, we detect discourse change-points linked to the COVID-19 pandemic and a geothermal project, illustrating how online conversations mirror policy flashpoints. Our analysis distinguishes between the ephemeral nature of acute crises like COVID-19 and the persistent, structural reorientations driven by long-term infrastructure projects. We conceptualize "durable discourse" as sustained narrative engagement with development issues. Contributing to HCI and ICTD, we examine technology's socioeconomic impact, providing practical implications and future work for direct local engagement.
Submitted 19 January, 2026; v1 submitted 9 January, 2026;
originally announced January 2026.
-
Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs
Authors:
Sara Papi,
Javier Garcia Gilabert,
Zachary Hopton,
Vilém Zouhar,
Carlos Escolano,
Gerard I. Gállego,
Jorge Iranzo-Sánchez,
Ahrii Kim,
Dominik Macháček,
Patricia Schmidtova,
Maike Züfle
Abstract:
As Large Language Models (LLMs) expand beyond text, integrating speech as a native modality has given rise to SpeechLLMs, which directly process spoken language and enable speech-to-text translation (ST) and other downstream tasks, bypassing traditional transcription-based pipelines. Whether this integration improves ST quality over established cascaded architectures, however, remains an open question. We present Hearing to Translate, the first comprehensive test suite rigorously benchmarking 6 state-of-the-art SpeechLLMs against 16 strong direct and cascade systems that couple leading speech foundation models (SFMs) with multilingual LLMs. Our analysis spans 16 benchmarks, 13 language pairs, and 9 challenging conditions, including disfluent, noisy, and long-form speech. Across this extensive evaluation, we find that cascaded systems remain the most reliable solution overall, but most recent SpeechLLMs can match or even outperform cascades in various settings while SFMs lag behind both, highlighting that integrating an LLM, either within the model or in a pipeline, is essential for high-quality speech translation.
Submitted 27 March, 2026; v1 submitted 18 December, 2025;
originally announced December 2025.
-
A Special Case of Quadratic Extrapolation Under the Neural Tangent Kernel
Authors:
Abiel Kim
Abstract:
It has been demonstrated both theoretically and empirically that the ReLU MLP tends to extrapolate linearly for an out-of-distribution evaluation point. The machine learning literature provides ample analysis of the mechanisms by which linearity is induced. However, the analysis of extrapolation at the origin under the NTK regime remains a largely unexplored special case. In particular, the infinite-dimensional feature map induced by the neural tangent kernel is not translationally invariant. This means that the study of an out-of-distribution evaluation point very far from the origin is not equivalent to the evaluation of a point very near the origin. And since the feature map is rotation invariant, these two special cases may represent the most canonically extreme bounds of ReLU NTK extrapolation. Ultimately, it is this loose recognition of the two special cases of extrapolation that motivates the discovery of quadratic extrapolation for an evaluation close to the origin.
Submitted 10 December, 2025;
originally announced December 2025.
-
DBT-DINO: Towards Foundation model based analysis of Digital Breast Tomosynthesis
Authors:
Felix J. Dorfner,
Manon A. Dorster,
Ryan Connolly,
Oscar Gentilhomme,
Edward Gibbs,
Steven Graham,
Seth Wander,
Thomas Schultz,
Manisha Bahl,
Dania Daye,
Albert E. Kim,
Christopher P. Bridge
Abstract:
Foundation models have shown promise in medical imaging but remain underexplored for three-dimensional imaging modalities. No foundation model currently exists for Digital Breast Tomosynthesis (DBT), despite its use for breast cancer screening.
To develop and evaluate a foundation model for DBT (DBT-DINO) across multiple clinical tasks and assess the impact of domain-specific pre-training.
Self-supervised pre-training was performed using the DINOv2 methodology on over 25 million 2D slices from 487,975 DBT volumes from 27,990 patients. Three downstream tasks were evaluated: (1) breast density classification using 5,000 screening exams; (2) 5-year risk of developing breast cancer using 106,417 screening exams; and (3) lesion detection using 393 annotated volumes.
For breast density classification, DBT-DINO achieved an accuracy of 0.79 (95% CI: 0.76--0.81), outperforming both the MetaAI DINOv2 baseline (0.73, 95% CI: 0.70--0.76, p<.001) and DenseNet-121 (0.74, 95% CI: 0.71--0.76, p<.001). For 5-year breast cancer risk prediction, DBT-DINO achieved an AUROC of 0.78 (95% CI: 0.76--0.80) compared to DINOv2's 0.76 (95% CI: 0.74--0.78, p=.57). For lesion detection, DINOv2 achieved a higher average sensitivity of 0.67 (95% CI: 0.60--0.74) compared to DBT-DINO with 0.62 (95% CI: 0.53--0.71, p=.60). DBT-DINO demonstrated better performance on cancerous lesions specifically with a detection rate of 78.8% compared to DINOv2's 77.3%.
Using a dataset of unprecedented size, we developed DBT-DINO, the first foundation model for DBT. DBT-DINO demonstrated strong performance on breast density classification and cancer risk prediction. However, domain-specific pre-training showed variable benefits on the detection task, with ImageNet baseline outperforming DBT-DINO on general lesion detection, indicating that localized detection tasks require further methodological development.
Submitted 15 December, 2025;
originally announced December 2025.
-
Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
Authors:
Team Seedance,
Heyi Chen,
Siyan Chen,
Xin Chen,
Yanfei Chen,
Ying Chen,
Zhuo Chen,
Feng Cheng,
Tianheng Cheng,
Xinqi Cheng,
Xuyan Chi,
Jian Cong,
Jing Cui,
Qinpeng Cui,
Qide Dong,
Junliang Fan,
Jing Fang,
Zetao Fang,
Chengjian Feng,
Han Feng,
Mingyuan Gao,
Yu Gao,
Dong Guo,
Qiushan Guo,
Boyang Hao
, et al. (172 additional authors not shown)
Abstract:
Recent strides in video generation have paved the way for unified audio-visual generation. In this work, we present Seedance 1.5 pro, a foundational model engineered specifically for native, joint audio-video generation. Leveraging a dual-branch Diffusion Transformer architecture, the model integrates a cross-modal joint module with a specialized multi-stage data pipeline, achieving exceptional audio-visual synchronization and superior generation quality. To ensure practical utility, we implement meticulous post-training optimizations, including Supervised Fine-Tuning (SFT) on high-quality datasets and Reinforcement Learning from Human Feedback (RLHF) with multi-dimensional reward models. Furthermore, we introduce an acceleration framework that boosts inference speed by over 10X. Seedance 1.5 pro distinguishes itself through precise multilingual and dialect lip-syncing, dynamic cinematic camera control, and enhanced narrative coherence, positioning it as a robust engine for professional-grade content creation. Seedance 1.5 pro is now accessible on Volcano Engine at https://console.volcengine.com/ark/region:ark+cn-beijing/experience/vision?type=GenVideo.
Submitted 23 December, 2025; v1 submitted 15 December, 2025;
originally announced December 2025.
-
THE-Pose: Topological Prior with Hybrid Graph Fusion for Estimating Category-Level 6D Object Pose
Authors:
Eunho Lee,
Chaehyeon Song,
Seunghoon Jeong,
Ayoung Kim
Abstract:
Category-level object pose estimation requires both global context and local structure to ensure robustness against intra-class variations. However, 3D graph convolution (3D-GC) methods only focus on local geometry and depth information, making them vulnerable to complex objects and visual ambiguities. To address this, we present THE-Pose, a novel category-level 6D pose estimation framework that leverages a topological prior via surface embedding and hybrid graph fusion. Specifically, we extract consistent and invariant topological features from the image domain, effectively overcoming the limitations inherent in existing 3D-GC based methods. Our Hybrid Graph Fusion (HGF) module adaptively integrates the topological features with point-cloud features, seamlessly bridging 2D image context and 3D geometric structure. These fused features ensure stability for unseen or complicated objects, even under significant occlusions. Extensive experiments on the REAL275 dataset show that THE-Pose achieves a 35.8% improvement over the 3D-GC baseline (HS-Pose) and surpasses the previous state-of-the-art by 7.2% across all key metrics. The code is available at https://github.com/EHxxx/THE-Pose
Submitted 10 December, 2025;
originally announced December 2025.
-
First Study of the Nuclear Response to Fast Hadrons via Angular Correlations between Pions and Slow Protons in Electron-Nucleus Scattering
Authors:
S. J. Paul,
M. Arratia,
H. Hakobyan,
W. Brooks,
A. Acar,
P. Achenbach,
J. S. Alvarado,
W. R. Armstrong,
N. A. Baltzell,
L. Barion,
M. Bashkanov,
M. Battaglieri,
F. Benmokhtar,
A. Bianconi,
A. S. Biselli,
F. Bossù,
S. Boiarinov,
K. -T. Brinkmann,
W. J. Briscoe,
V. Burkert,
T. Cao,
D. S. Carman,
P. Chatagnon,
H. Chinchay,
G. Ciullo
, et al. (105 additional authors not shown)
Abstract:
We report on the first measurement of angular correlations between high-energy pions and slow protons in electron-nucleus ($eA$) scattering, providing a new probe of how a nucleus responds to a fast-moving quark. The experiment employed the CLAS detector with a 5-GeV electron beam incident on deuterium, carbon, iron, and lead targets. For heavier nuclei, the pion-proton correlation function is more spread-out in azimuth than for lighter ones, and this effect is more pronounced in the $πp$ channel than in earlier $ππ$ studies. The proton-to-pion yield ratio likewise rises with nuclear mass, although the increase appears to saturate for the heaviest targets. These trends are qualitatively reproduced by state-of-the-art $eA$ event generators, including BeAGLE, eHIJING, and GiBUU, indicating that current descriptions of target fragmentation rest on sound theoretical footing. At the same time, the precision of our data exposes model-dependent discrepancies, delineating a clear path for future improvements in the treatment of cold-nuclear matter effects in $eA$ scattering.
Submitted 4 February, 2026; v1 submitted 4 December, 2025;
originally announced December 2025.
-
LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation
Authors:
Huynh Trinh Ngoc,
Hoang Anh Nguyen Kim,
Toan Nguyen Hai,
Long Tran Quoc
Abstract:
Generative models have achieved remarkable progress with the emergence of flow matching (FM). It has demonstrated strong generative capabilities and attracted significant attention as a simulation-free flow-based framework capable of learning exact data densities. Motivated by these advances, we propose LatentFM, a flow-based model operating in the latent space for medical image segmentation. To model the data distribution, we first design two variational autoencoders (VAEs) to encode both medical images and their corresponding masks into a lower-dimensional latent space. We then estimate a conditional velocity field that guides the flow based on the input image. By sampling multiple latent representations, our method synthesizes diverse segmentation outputs whose pixel-wise variance reliably captures the underlying data distribution, enabling both highly accurate and uncertainty-aware predictions. Furthermore, we generate confidence maps that quantify the model certainty, providing clinicians with richer information for deeper analysis. We conduct experiments on two datasets, ISIC-2018 and CVC-Clinic, and compare our method with several prior baselines, including both deterministic and generative models. Through comprehensive evaluations, both qualitative and quantitative results show that our approach achieves superior segmentation accuracy while remaining highly efficient in the latent space.
Submitted 31 March, 2026; v1 submitted 4 December, 2025;
originally announced December 2025.
-
The DESI DR1 Peculiar Velocity Survey: global zero-point and $H_0$ constraints
Authors:
A. Carr,
C. Howlett,
A. J. Amsellem,
Tamara M. Davis,
K. Said,
D. Parkinson,
A. Palmese,
J. Aguilar,
S. Ahlen,
J. Bautista,
S. BenZvi,
D. Bianchi,
C. Blake,
D. Brooks,
T. Claybaugh,
A. Cuceu,
A. de la Macorra,
P. Doel,
K. Douglass,
S. Ferraro,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho,
G. Gutierrez,
H. K. Herrera-Alcantar
, et al. (33 additional authors not shown)
Abstract:
The Dark Energy Spectroscopic Instrument (DESI) in its first Data Release (DR1) already provides more than 100,000 galaxies with relative distance measurements. The primary purpose of this paper is to perform the calibration of the zero-point for the DESI Fundamental Plane and Tully-Fisher relations, which allows us to measure the Hubble constant, $H_0$. This sample has a lower statistical uncertainty than any previously used to measure $H_0$, and we investigate the systematic uncertainties in absolute calibration that could limit the accuracy of that measurement. We improve upon the DESI Early Data Release Fundamental Plane $H_0$ measurement by a) using a group catalog to increase the number of calibrator galaxies and b) investigating alternative calibrators in the nearby universe. Our baseline measurement calibrates to the SH0ES/Pantheon+ type Ia supernovae, and finds $H_0=73.7\pm 0.06\;(\text{stat.})\pm 1.1\;(\text{syst.})$ km s$^{-1}$ Mpc$^{-1}$. Calibrating to surface brightness fluctuation (SBF) distances yields a similar $H_0$. We explore measurements using other calibrators, but these are currently less precise since the overlap with DESI peculiar velocity tracers is much smaller. In future data releases with an even larger peculiar velocity sample, we plan to calibrate directly to Cepheids and the tip of the red giant branch, which will enable the uncertainty to decrease towards a percent-level measurement of $H_0$. This will provide an alternative to supernovae as the Hubble flow sample for $H_0$ measurements.
Submitted 2 December, 2025;
originally announced December 2025.
-
The DESI DR1 peculiar velocity survey: growth rate measurements from the galaxy power spectrum
Authors:
F. Qin,
C. Blake,
C. Howlett,
R. J. Turner,
K. Lodha,
J. Bautista,
Y. Lai,
A. J. Amsellem,
J. Aguilar,
S. Ahlen,
D. Bianchi,
D. Brooks,
S. BenZvi,
A. Carr,
E. Chaussidon,
T. Claybaugh,
A. Cuceu,
A. de la Macorra,
K. Douglass,
P. Doel,
S. Ferraro,
A. Font-Ribera,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho
, et al. (41 additional authors not shown)
Abstract:
The large-scale structure of the Universe and its evolution encapsulate a wealth of cosmological information. A powerful means of unlocking this knowledge lies in measuring the auto-power spectrum and/or the cross-power spectrum of the galaxy density and momentum fields, followed by the estimation of cosmological parameters based on these spectrum measurements. In this study, we generalize the cross-power spectrum model to accommodate scenarios where the density and momentum fields are derived from distinct galaxy surveys. The growth rate of the large-scale structures of the Universe, commonly represented as $fσ_8$, is extracted by jointly fitting the monopole and quadrupole moments of the auto-density power spectrum, the monopole of the auto-momentum power spectrum, and the dipole of the cross-power spectrum. Our estimators, theoretical models and parameter-fitting framework have been tested using mocks, confirming their robustness and accuracy in retrieving the fiducial growth rate from simulation. These techniques are then applied to analyze the power spectrum of the DESI Bright Galaxy Survey and Peculiar Velocity Survey, and the fit result of the growth rate is $fσ_8=0.440^{+0.080}_{-0.096}$ at effective redshift $z_{\rm eff}=0.07$. Synthesizing the fitting outcomes from the correlation function, maximum likelihood estimation and power spectrum analyses yields a consensus value of $fσ_8(z_{\rm eff}=0.07) = 0.450 ^{+0.055}_{-0.055}$, and correspondingly we obtain $γ=0.580^{+0.110}_{-0.110}$, $Ω_\mathrm{m}=0.301^{+0.011}_{-0.011}$ and $σ_8=0.834^{+0.032}_{-0.032}$. The measured $fσ_8$ and $γ$ are consistent with the prediction of the $Λ$ Cold Dark Matter Model and General Relativity.
Submitted 15 February, 2026; v1 submitted 2 December, 2025;
originally announced December 2025.
-
The DESI DR1 Peculiar Velocity Survey: growth rate measurements from galaxy and momentum correlation functions
Authors:
R. J. Turner,
C. Blake,
F. Qin,
J. Aguilar,
S. Ahlen,
A. J. Amsellem,
J. Bautista,
S. BenZvi,
D. Bianchi,
D. Brooks,
A. Carr,
E. Chaussidon,
T. Claybaugh,
A. Cuceu,
A. de la Macorra,
P. Doel,
K. Douglass,
S. Ferraro,
A. Font-Ribera,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho,
G. Gutierrez,
J. Guy,
H. K. Herrera-Alcantar
, et al. (39 additional authors not shown)
Abstract:
Joint analysis of the local peculiar velocity and galaxy density fields offers a promising route to testing cosmological models of gravity. We present a measurement of the normalised growth rate of structure, $fσ_8$, from the two-point correlations of velocity and density tracers from the DESI DR1 Peculiar Velocity and Bright Galaxy Surveys, the largest catalogues of their kind assembled to date. We fit the two-point correlation measurements with non-linear correlation function models, constructed from density and momentum power spectra generated using 1-loop Eulerian perturbation theory, and validate our methodology using representative mock catalogues. We find $fσ_8 = 0.391^{+0.080}_{-0.081}$, consistent to within $1σ$ with accompanying analyses of the same datasets using power spectrum and maximum-likelihood fields methods. Combining these growth rate results from different methods including appropriate correlations, we find a consensus determination $fσ_8(z = 0.07) = 0.4497 \pm 0.0548$, consistent with predictions from \textit{Planck}$+Λ$CDM cosmology. Jointly fitting to this consensus low-redshift growth rate and the DESI DR1 full-shape clustering dataset, we measure gravitational growth index $γ_{\rm L} = 0.580^{+0.110}_{-0.110}$, consistent with the prediction of general relativity.
Submitted 2 December, 2025;
originally announced December 2025.
-
The DESI DR1 Peculiar Velocity Survey: growth rate measurements from the maximum likelihood fields method
Authors:
Y. Lai,
C. Howlett,
J. Aguilar,
S. Ahlen,
A. J. Amsellem,
J. Bautista,
S. BenZvi,
D. Bianchi,
C. Blake,
D. Brooks,
A. Carr,
T. Claybaugh,
T. M. Davis,
A. de la Macorra,
P. Doel,
K. Douglass,
S. Ferraro,
A. Font-Ribera,
J. E. Forero-Romero,
E. Gaztañaga,
G. Gutierrez,
J. Guy,
H. K. Herrera-Alcantar,
D. Huterer,
M. Ishak
, et al. (38 additional authors not shown)
Abstract:
We present the constraint on the growth rate of structure from the combination of DESI DR1 BGS sample, Fundamental Plane, and Tully-Fisher peculiar velocity catalogues using the maximum likelihood fields method. The combined catalogue contains 415,523 galaxy redshifts and 76,616 peculiar velocity measurements. To handle the large amount of data in the DESI DR1 peculiar velocity catalogue, we significantly improve the computational efficiency by rewriting the algorithm with JAX. After removing outliers and Tully-Fisher galaxies that are affected by systematics, we find $fσ_8 = 0.483_{-0.043}^{+0.080}(\mathrm{stat}) \pm 0.018(\mathrm{sys})$, consistent within $1σ$ with the power spectrum and correlation function analysis using the same dataset. Combining all three measurements with appropriate correlations, the consensus measurement is $fσ_8 (z_{\mathrm{eff}}=0.07) = 0.450\pm0.055$, consistent with Planck $+Λ$CDM cosmology $(fσ_8 = 0.449 \pm 0.008)$. Combining with the high redshift growth rate of structure measurements from DESI ShapeFit, the constraint on the growth index is $γ= 0.58\pm0.11$, consistent with GR.
Submitted 26 January, 2026; v1 submitted 2 December, 2025;
originally announced December 2025.
-
The DESI DR1 Peculiar Velocity Survey: The Tully-Fisher Distance Catalog
Authors:
K. Douglass,
S. BenZvi,
A. G. Kim,
S. Moore,
A. Carr,
J. Largett,
N. Ravi,
J. Aguilar,
S. Ahlen,
A. J. Amsellem,
J. Bautista,
D. Bianchi,
C. Blake,
D. Brooks,
T. Claybaugh,
A. Cuceu,
A. de la Macorra,
R. Demina,
P. Doel,
S. Ferraro,
A. Font-Ribera,
J. E. Forero-Romero,
E. Gaztanaga,
S. Gontcho A Gontcho,
G. Gutierrez
, et al. (42 additional authors not shown)
Abstract:
We calibrate the Tully-Fisher relation (TFR) using observations of spiral galaxies taken during the first year (DR1) of the DESI galaxy redshift survey. The rotational velocities of 10,262 galaxies are measured at 0.4 R26 by comparing the redshifts at 0.4 R26 with those at the galaxy centers of spatially-resolved galaxies targeted as part of the DESI Peculiar Velocity Survey. The DESI DR1 TFR slope is calibrated by separating the spiral galaxies into redshift bins of width dz = 0.005 from 0.03 < z < 0.1 and jointly fitting the TFR across all bins. We find a slope of -7.22+/-0.01 AB mag in the r-band for the TFR, with an intrinsic scatter of 0.466+/-0.001 AB mag. We present a catalog of the distances and peculiar velocities to these 10,262 galaxies using our calibrated TFR. For cosmological analyses, we also present a clustering catalog and associated random catalogs using a subset of 6807 of the DESI DR1 TF galaxies.
Submitted 2 December, 2025;
originally announced December 2025.
-
The DESI DR1 Peculiar Velocity Survey: Fundamental Plane Catalogue
Authors:
C. E. Ross,
C. Howlett,
J. R. Lucey,
K. Said,
T. M. Davis,
J. Aguilar,
S. Ahlen,
A. J. Amsellem,
J. Bautista,
S. BenZvi,
D. Bianchi,
C. Blake,
D. Brooks,
A. Carr,
T. Claybaugh,
A. Cuceu,
A. de la Macorra,
B. Dey,
P. Doel,
K. Douglass,
S. Ferraro,
A. Font-Ribera,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho
, et al. (36 additional authors not shown)
Abstract:
Measurements of peculiar velocities in the local Universe are a powerful tool to study the nature of dark energy at low ($z < 0.1$) redshifts. Here we present the largest single set of $z<0.1$ peculiar velocity measurements to date, obtained using the Fundamental Plane (FP) of galaxies in the first data release (DR1) of the Dark Energy Spectroscopic Instrument (DESI). We describe the photometric and spectroscopic selection criteria used to define the sample, as well as extensive quality control checks on the photometry and velocity dispersion measurements. Additionally, we perform detailed systematics checks for the many analysis parameters in our pipeline. Our DESI DR1 catalogue contains FP-based distances and peculiar velocities for $98,292$ unique early-type galaxies, increasing the total number of $z < 0.1$ FP distances ever measured by a factor of $\sim2$. We achieve a precision of $26\%$ random error in our distance measurements which is comparable to previous surveys. A series of companion DESI papers use the distances and peculiar velocities presented in this paper to measure cosmological parameters.
Submitted 2 December, 2025;
originally announced December 2025.
-
OzDES Reverberation Mapping of Active Galactic Nuclei: Final Data Release, Black-Hole Mass Results, & Scaling Relations
Authors:
H. McDougall,
T. M. Davis,
Z. Yu,
P. Martini,
C. Lidman,
U. Malik,
A. Penton,
G. F. Lewis,
B. E. Tucker,
B. J. S. Pope,
S. Allam,
F. Andrade-Oliveira,
J. Asorey,
D. Bacon,
S. Bocquet,
D. Brooks,
A. Carnero Rosell,
D. Carollo,
A. Carr,
J. Carretero,
T. Y. Cheng,
L. N. da Costa,
M. E. da Silva Pereira,
J. De Vicente,
H. T. Diehl
, et al. (31 additional authors not shown)
Abstract:
Over the last decade, the Australian Dark Energy (OzDES) collaboration has used Reverberation Mapping to measure the masses of high redshift supermassive black holes. Here we present the final review and analysis of this OzDES reverberation mapping campaign. These observations use 6-7 years of photometric and spectroscopic observations of 735 Active Galactic Nuclei (AGN) in the redshift range 0.13-3.85 and bolometric luminosity range 44.3 - 47.5 erg/s. Both photometry and spectra are observed in visible wavelengths, allowing for the physical scale of the AGN broad line region to be estimated from reverberations of the Hβ, MgII and CIV emission lines. We successfully use reverberation mapping to constrain the masses of 62 super-massive black holes, and combine with existing data to fit a power law to the lag-luminosity relation for the Hβ and MgII lines with a scatter of ~0.25 dex, the tightest yet identified, fit specifically for consistency with high redshift AGN. We fit a similarly constrained relation for CIV, resolving a tension with the low luminosity literature AGN by accounting for selection effects arising from finite survey length. We also examine the impact of emission line width and luminosity (related to accretion rate) in reducing the scatter of these scaling relationships and find no significant improvement over the lag-only approach for any of the three lines. Using these relations, we further estimate the masses and accretion rates of 246 AGN with single epoch methods. We also use these relations to estimate the relative sizes of the Hβ, MgII and CIV emitting regions. In short, we provide a comprehensive benchmark of high redshift AGN reverberation mapping at the close of this most recent generation of surveys, including light curves, time-delays, and a set of significantly improved radius-luminosity relations for use with high-redshift populations.
Submitted 19 February, 2026; v1 submitted 30 November, 2025;
originally announced December 2025.
-
A Catalog of Galactic Atomic Hydrogen Position-Position-Velocity Filaments
Authors:
M. E. Putman,
D. A. Kim,
S. E. Clark,
L. Li,
C. Holm-Hansen,
J. E. G. Peek
Abstract:
We present a catalog of 3D Galactic HI filaments over 1/3 of the sky using Galactic Arecibo L-band Feed Array HI (GALFA-HI) data. The 3D filaments are defined to be linear HI features that are continuous in position-position-velocity (PPV) and are found with fil3d, an algorithm that expands on the 2D FilFinder. The catalog contains 3333 HI filaments between +/- 50 km/s at a range of Galactic positions. 1542 of the PPV filaments are identified as local at the distance of the wall of the Local Bubble, and 209 are likely at the disk-halo interface of our Galaxy. The catalog and properties of the PPV filaments are obtained after an unsharp mask (USM) is applied to the data. The widths of the filaments are consistently ~12' (0.34 pc at 100 pc), and constrained by the 4' resolution. The local filaments have median properties of N_HI = $6 \times 10^{18}$ cm$^{-2}$, M_HI = 0.17 M_sun, FWHM = 3.2 km/s, and length of 6.4 pc. The disk-halo population has similar column densities, but the median FWHM = 7.7 km/s, consistent with them being higher z-height, warmer structures. The L $\propto$ M$^{0.5}$ relationship found for the HI filaments and their bundling on the sky are consistent with a hierarchical structure, and are likely related to turbulence playing a role in their formation.
Submitted 23 November, 2025;
originally announced November 2025.
-
GaRLILEO: Gravity-aligned Radar-Leg-Inertial Enhanced Odometry
Authors:
Chiyun Noh,
Sangwoo Jung,
Hanjun Kim,
Yafei Hu,
Laura Herlant,
Ayoung Kim
Abstract:
Deployment of legged robots for navigating challenging terrains (e.g., stairs, slopes, and unstructured environments) has gained increasing preference over wheel-based platforms. In such scenarios, accurate odometry estimation is a preliminary requirement for stable locomotion, localization, and mapping. Traditional proprioceptive approaches, which rely on leg kinematics sensor modalities and inertial sensing, suffer from irrepressible vertical drift caused by frequent contact impacts, foot slippage, and vibrations, particularly affected by inaccurate roll and pitch estimation. Existing methods incorporate exteroceptive sensors such as LiDAR or cameras. Further enhancement has been introduced by leveraging gravity vector estimation to add additional observations on roll and pitch, thereby increasing the accuracy of vertical pose estimation. However, these approaches tend to degrade in feature-sparse or repetitive scenes and are prone to errors from double-integrated IMU acceleration. To address these challenges, we propose GaRLILEO, a novel gravity-aligned continuous-time radar-leg-inertial odometry framework. GaRLILEO decouples velocity from the IMU by building a continuous-time ego-velocity spline from SoC radar Doppler and leg kinematics information, enabling seamless sensor fusion which mitigates odometry distortion. In addition, GaRLILEO can reliably capture accurate gravity vectors leveraging a novel soft S2-constrained gravity factor, improving vertical pose accuracy without relying on LiDAR or cameras. Evaluated on a self-collected real-world dataset with diverse indoor-outdoor trajectories, GaRLILEO demonstrates state-of-the-art accuracy, particularly in vertical odometry estimation on stairs and slopes. We open-source both our dataset and algorithm to foster further research in legged robot odometry and SLAM. https://garlileo.github.io/GaRLILEO
Submitted 17 November, 2025;
originally announced November 2025.
-
XPRESS: X-Band Radar Place Recognition via Elliptical Scan Shaping
Authors:
Hyesu Jang,
Wooseong Yang,
Ayoung Kim,
Dongje Lee,
Hanguen Kim
Abstract:
X-band radar serves as the primary sensor on maritime vessels; however, its application in autonomous navigation has been limited due to low sensor resolution and insufficient information content. To enable X-band radar-only autonomous navigation in maritime environments, this paper proposes a place recognition algorithm specifically tailored for X-band radar, incorporating an object density-based rule for efficient candidate selection and intentional degradation of radar detections to achieve robust retrieval performance. The proposed algorithm was evaluated on both public maritime radar datasets and our own collected dataset, and its performance was compared against state-of-the-art radar place recognition methods. An ablation study was conducted to assess the sensitivity of the algorithm's performance to key parameters.
Submitted 11 November, 2025;
originally announced November 2025.
-
Election and Subjective Well-Being: Evidence from the 2024 U.S. Presidential Election
Authors:
Dongyoung Kim,
Young-Il Albert Kim,
Haedong Aiden Rho
Abstract:
This paper uses daily Behavioral Risk Factor Surveillance System data to estimate the causal effect of the 2024 U.S. presidential election, a highly competitive race whose outcome resolved lingering uncertainty on election day, on mental-health and life-satisfaction outcomes through a regression discontinuity design. Following the resolution of electoral uncertainty on election day, we find a sharp and persistent post-election decline in subjective well-being, concentrated among female, non-White, urban, and more-educated respondents. These findings reveal an expected-outcome shock, showing that political polarization itself, not electoral surprise, can act as a chronic psychological stressor.
Submitted 6 November, 2025;
originally announced November 2025.
-
Momentum-Transfer Framework Unifies High-Velocity Impact and Failure Across Materials, Geometries, and Scales
Authors:
Yasara Dharmadasa,
Nicholas Jaegersberg,
Ara Kim,
Jizhe Cai,
Ramathasan Thevamaran
Abstract:
Materials that dissipate energy efficiently under high-speed impacts, from micrometeoroid strikes on spacecraft to ballistic penetration in protective systems, are essential for maintaining structural integrity in extreme environments. Yet, despite decades of study, predicting and comparing impact performance across materials, geometries, and length scales remains challenging because conventional projectile-impact models often rely on conservation-based or empirically partitioned descriptions that assume the projectile-target interaction is a closed system. Here, we relax this assumption and directly observe the momentum and energy transferred out of the projectile during impact. We find that the momentum transferred to the target consistently reaches its maximum at the ballistic-limit velocity, demonstrated through a coordinated suite of micro-projectile impact experiments spanning varied projectile diameters, target thicknesses, and impact velocities, and further supported by targeted macroscale tests. This behavior is reinforced across a broad range of independent studies encompassing metals, polymers, composites, sandwich panels, and reinforced concrete, with thicknesses ranging from nanometers to hundreds of millimeters and projectiles of spherical, blunt, ogive, and conical shape, under both normal and oblique impacts. Together, these observations reveal a consistent impact behavior across all available data: maximum momentum transfer occurs at the ballistic limit. Extending this bound into the energy absorption landscape addresses an entrenched misconception in the field by revealing that specific energy absorption inherently inflates the performance of thinner targets due to geometric normalization, rather than reflecting genuine material enhancement.
Submitted 4 January, 2026; v1 submitted 30 October, 2025;
originally announced October 2025.
-
PlanarMesh: Building Compact 3D Meshes from LiDAR using Incremental Adaptive Resolution Reconstruction
Authors:
Jiahao Wang,
Nived Chebrolu,
Yifu Tao,
Lintong Zhang,
Ayoung Kim,
Maurice Fallon
Abstract:
Building an online 3D LiDAR mapping system that produces a detailed surface reconstruction while remaining computationally efficient is a challenging task. In this paper, we present PlanarMesh, a novel incremental, mesh-based LiDAR reconstruction system that adaptively adjusts mesh resolution to achieve compact, detailed reconstructions in real time. It introduces a new representation, planar-mesh, which combines plane modeling and meshing to capture both large surfaces and detailed geometry. The planar-mesh can be incrementally updated considering both local surface curvature and free-space information from sensor measurements. We employ a multi-threaded architecture with a Bounding Volume Hierarchy (BVH) for efficient data storage and fast search operations, enabling real-time performance. Experimental results show that our method achieves reconstruction accuracy on par with, or exceeding, state-of-the-art techniques, including truncated signed distance functions, occupancy mapping, and voxel-based meshing, while producing smaller output file sizes (10 times smaller than raw input and more than 5 times smaller than mesh-based methods) and maintaining real-time performance (around 2 Hz for a 64-beam sensor).
Submitted 15 October, 2025;
originally announced October 2025.