arXiv:2604.08516

MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

Authors: Tanmay Gupta, Piper Wolters, Zixian Ma, Peter Sushko, Rock Yuren Pang, Diego Llanes, Yue Yang, Taira Anderson, Boyuan Zheng, Zhongzheng Ren, Harsh Trivedi, Taylor Blanton, Caleb Ouellette, Winson Han, Ali Farhadi, Ranjay Krishna

Abstract: Web agents--autonomous systems that navigate and execute tasks on the web on behalf of users--have the potential to transform how people interact with the digital world. However, the most capable web agents today rely on proprietary models with undisclosed training data and recipes, limiting scientific understanding, reproducibility, and community-driven progress. We believe agents for the open web should be built in the open. To this end, we introduce (1) MolmoWebMix, a large and diverse mixture of browser task demonstrations and web-GUI perception data and (2) MolmoWeb, a family of fully open multimodal web agents. Specifically, MolmoWebMix combines over 100K synthetic task trajectories from multiple complementary generation pipelines with 30K+ human demonstrations, atomic web-skill trajectories, and GUI perception data, including referring expression grounding and screenshot question answering. MolmoWeb agents operate as instruction-conditioned visual-language action policies: given a task instruction and a webpage screenshot, they predict the next browser action, requiring no access to HTML, accessibility trees, or specialized APIs. Available in 4B and 8B sizes, MolmoWeb agents achieve state-of-the-art results on browser-use benchmarks like WebVoyager, Online-Mind2Web, and DeepShop, outperforming similar-scale open-weight-only models such as Fara-7B, UI-Tars-1.5-7B, and Holo1-7B. MolmoWeb-8B also surpasses set-of-marks (SoM) agents built on much larger closed frontier models like GPT-4o. We further demonstrate consistent gains through test-time scaling via parallel rollouts with best-of-N selection, achieving 94.7% and 60.5% pass@4 (compared to 78.2% and 35.3% pass@1) on WebVoyager and Online-Mind2Web respectively. We will release model checkpoints, training data, code, and a unified evaluation harness to enable reproducibility and accelerate open research on web agents.

Submitted 9 April, 2026; originally announced April 2026.

Comments: https://allenai.org/blog/molmoweb

arXiv:2603.17490

Modeling Decay Heat with a Simplified Depletion Chain in OpenMC

Authors: Tanmay Gupta, Benoit Forget

Abstract: OpenMC can be used to computationally model depletion and produce estimates of decay heat. As an input to depletion simulations, OpenMC requires a depletion chain that details nuclide transmutation pathways. The simplified CASL depletion chain was designed to track relatively few nuclides while still accurately modeling the effective neutron multiplication factor and nuclide number densities. However, the CASL chain dramatically underestimates decay heat due to the many nuclides it does not contain. In this work, we modify the CASL depletion chain to improve its accuracy while maintaining its computational efficiency. We demonstrate the effectiveness of adding pseudo-nuclides to the CASL chain, with each pseudo-nuclide capturing the behavior of a large group of nuclides. We further introduce "delay nuclides," which dramatically improve the accuracy of decay heat estimates.

Submitted 18 March, 2026; originally announced March 2026.

Comments: 28 pages, 14 figures

arXiv:2603.05452

Local strategies are pretty good at computing Boolean properties of quantum sequences

Authors: Tathagata Gupta, Ankith Mohan, Shayeef Murshid, Vincent Russo, Jamie Sikora, Alice Zheng

Abstract: Quantum memory is a scarce and costly resource, yet little is known about which learning tasks remain feasible under severe memory constraints. We study the problem of computing global properties of quantum sequences when quantum systems must be measured individually, without storing or jointly processing them. In our setting, a bit string $x \in \{0,1\}^n$ is encoded into an $n$-qubit product state $|ψ_{x_1}\rangle \otimes \cdots \otimes |ψ_{x_n}\rangle$, and the goal is to infer $f(x) \in \{0,1\}$ from measurements of this quantum encoding. We consider a simple local strategy, which we call the greedy strategy, that applies the same optimal single-system measurement independently to each subsystem and then infers $f(x)$ from the outcomes. Our main result gives a complete characterization of when the greedy strategy is optimal: it achieves the same maximum success probability as an unrestricted global measurement if and only if the target Boolean function is affine (in all but finitely many cases). We establish a universal performance guarantee for general Boolean functions, showing that the success probability of the greedy strategy is always at least the square of the optimal global success probability, in direct analogy with the Barnum-Knill bound for the pretty good measurement. These results demonstrate that even under extreme memory constraints, simple local measurement strategies can remain provably competitive for learning global properties of quantum sequences.

Submitted 5 March, 2026; originally announced March 2026.

Comments: 26 pages, 2 figures. Comments are welcome!

arXiv:2602.02936

Thurston geometries and parameter constraints from SNIa data

Authors: Tanay Gupta, Anshul Verma, Sukanta Panda, Pavan K. Aluri

Abstract: Following the numerous pieces of evidence for large-scale cosmic isotropy violation with the advent of the `precision cosmology' era, we explore the possible advantages of extending the flat $Λ$CDM model to more general models in order to constrain anisotropies in the universe, otherwise absent in the standard model based on FLRW spacetime. Such extensions are offered by the topologically unique Thurston geometries, which are homogeneous but anisotropic spacetime models. In this work, we attempt to distinguish Thurston geometries from one another by introducing anisotropies via different scale factors in different directions, thereby introducing additional model parameters such as shear, eccentricity, curvature, and a preferred axis. We used the latest compilation of Pantheon+ \& SH0ES Type Ia supernova data for deriving model constraints, and found mild evidence of large-scale isotropy violation.

Submitted 4 February, 2026; v1 submitted 2 February, 2026; originally announced February 2026.

Comments: v2: Minor syntax changes. 25 pages (including citations), 04 figures & 03 tables

arXiv:2601.14410

Quantum state exclusion with many copies

Authors: Debanjan Roy, Tathagata Gupta, Pratik Ghosal, Samrat Sen, Somshubhro Bandyopadhyay

Abstract: Quantum state exclusion is the task of identifying at least one state from a known set that was not used in the preparation of a quantum system. A set of quantum states is said to admit state exclusion if there exists a measurement whose outcomes can be put in one-to-one correspondence with the states in the set, such that each outcome rules out its corresponding state with certainty (while possibly also ruling out other states), and each outcome occurs with nonzero probability for at least one state in the set. State exclusion, however, is not always possible in the single-copy setting. In this paper, we investigate whether access to multiple identical copies of the system enables state exclusion. We prove that for any set of three or more pure states, state exclusion becomes possible with a finite number of copies. Moreover, we show that the number of copies required may be arbitrarily large: in particular, for every natural number $N$, we construct sets of states for which state exclusion remains impossible with $N$ or fewer copies.

Submitted 5 February, 2026; v1 submitted 20 January, 2026; originally announced January 2026.

Comments: Updated version; reference and clarification about terminology added

arXiv:2601.07595

Deep Search for Joint Sources of Gravitational Waves and High-Energy Neutrinos with IceCube During the Third Observing Run of LIGO and Virgo

Authors: The IceCube Collaboration, R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, S. Ali, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, S. N. Axani, R. Babu, X. Bai, J. Baines-Holmes, A. Balagopal V., S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, et al. (2193 additional authors not shown)

Abstract: The discovery of joint sources of high-energy neutrinos and gravitational waves has been a primary target for the LIGO, Virgo, KAGRA, and IceCube observatories. The joint detection of high-energy neutrinos and gravitational waves would provide insight into cosmic processes, from the dynamics of compact object mergers and stellar collapses to the mechanisms driving relativistic outflows. The joint detection of multiple cosmic messengers can also elevate the significance of the common observation even when some or all of the constituent messengers are sub-threshold, i.e. not significant enough to declare their detection individually. Using data from the LIGO, Virgo, and IceCube observatories, including sub-threshold events, we searched for common sources of gravitational waves and high-energy neutrinos during the third observing run of Advanced LIGO and Advanced Virgo detectors. Our search did not identify significant joint sources. We derive constraints on the rate densities of joint sources. Our results constrain the isotropic neutrino emission from gravitational-wave sources for very high values of the total energy emitted in neutrinos (> $10^{52} - 10^{54}$ erg).

Submitted 28 January, 2026; v1 submitted 12 January, 2026; originally announced January 2026.

Comments: Data release at: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/34B5AP

arXiv:2512.21893

Evaluating Supervised Learning Approaches for Quantification of Quantum Entanglement

Authors: Shruti Aggarwal, Trasha Gupta, R. K. Agrawal, S. Indu

Abstract: Quantum entanglement is a key resource in quantum computing and quantum information processing tasks. However, its quantification remains a major challenge since it cannot be directly extracted from physical observables. To address this issue, we study a few machine-learning-based models to estimate the amount of entanglement in two-qubit as well as three-qubit systems. We use measurement outcomes as the input features and entanglement measures as the training labels. Our models predict entanglement without requiring the full state information. This demonstrates the potential of machine learning as an efficient and powerful tool for characterizing quantum entanglement.

Submitted 26 December, 2025; originally announced December 2025.

Comments: 8 pages, 10 figures

arXiv:2512.16509

Supersolid crystals of dipolar excitons in a lattice

Authors: C. Morin, C. Lagoin, T. Gupta, N. Reinic, K. Baldwin, L. Pfeiffer, G. Pupillo, F. Dubin

Abstract: In condensed-matter physics, long-range correlations introduce quantum states of matter that challenge intuition. For instance, supersolids combine symmetry-breaking crystalline structure, i.e. density order, and frictionless superfluid flow. Envisioned over fifty years ago, supersolids have proven to only exist under very stringent conditions, with experimental evidence limited to few observations. Many-body phases with supersolid properties in fact reduce to a few recent observations for weakly interacting Bose gases. Here, we demonstrate a new framework to realize supersolid crystals in the strong interaction regime, by confining dipolar bosons in a lattice with long-range hopping. We study dipolar excitons that genuinely realize this lattice model. At fractional lattice fillings - 1/4, 1/3 and 1/2 - we report mesoscopic quantum solids, across over 100 sites, spontaneously breaking translational symmetry. At the same time, we show that off-diagonal long-range order is induced by long-range hopping, such that exciton solids are superfluids. State-of-the-art numerical methods quantitatively confirm that supersolidity builds up in the ground-state of the lattice Hamiltonian. Our studies of strongly-correlated supersolid crystals open new frontiers for exploration in condensed matter physics.

Submitted 18 December, 2025; originally announced December 2025.

Comments: 13 pages, 7 figures

arXiv:2512.13874

SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning

Authors: Jitesh Jain, Jialuo Li, Zixian Ma, Jieyu Zhang, Chris Dongjoo Kim, Sangho Lee, Rohun Tripathi, Tanmay Gupta, Christopher Clark, Humphrey Shi

Abstract: As humans, we are natural any-horizon reasoners, i.e., we can decide whether to iteratively skim long videos or watch short ones in full when necessary for a given task. With this in mind, one would expect video reasoning models to reason flexibly across different durations. However, SOTA models are still trained to predict answers in a single turn while processing a large number of frames, akin to watching an entire long video, requiring significant resources. This raises the question: Is it possible to develop performant any-horizon video reasoning systems? Inspired by human behavior, we first propose SAGE, an agent system that performs multi-turn reasoning on long videos while handling simpler problems in a single turn. Secondly, we introduce an easy synthetic data generation pipeline using Gemini-2.5-Flash to train the orchestrator, SAGE-MM, which lies at the core of SAGE. We further propose an effective RL post-training recipe essential for instilling any-horizon reasoning ability in SAGE-MM. Thirdly, we curate SAGE-Bench with an average duration of greater than 700 seconds for evaluating video reasoning ability in real-world entertainment use cases. Lastly, we empirically validate the effectiveness of our system, data, and RL recipe, observing notable improvements of up to 6.1% on open-ended video reasoning tasks, as well as an impressive 8.2% improvement on videos longer than 10 minutes.

Submitted 29 March, 2026; v1 submitted 15 December, 2025; originally announced December 2025.

Comments: Project Page: https://praeclarumjj3.github.io/sage/

arXiv:2512.11698

Diederich-Fornæss index and global regularity of the complex Green operator: domains with comparable Levi eigenvalues

Authors: Tanuj Gupta, Emil J. Straube

Abstract: Let $Ω\subset \mathbb{C}^{n}$, with $n \geq 3$, be a smooth bounded pseudoconvex domain satisfying the symmetric eigenvalue comparability condition $D(q_0)$ for some $1\le q_0\le n-2$. We show that if the Diederich-Fornæss index of $Ω$ is one, then the complex Green operator $G_q$, associated with $Ω$, is globally regular for $q$ in the range $\min\{q_0,\, n - 1 - q_0\} \leq q \leq \max\{q_0,\, n - 1 - q_0\}$.

Submitted 5 January, 2026; v1 submitted 12 December, 2025; originally announced December 2025.

Comments: Corrected the formula for the Hessian of $\varphi$ in inequality (6.1), taking into account that the vector fields $L^{K}_{u}$ are not invariant under a change of frame when $q>1$. This entailed corresponding changes in various places throughout the paper. None of the results are affected

MSC Class: 32W10; 35N15

arXiv:2512.10935

Any4D: Unified Feed-Forward Metric 4D Reconstruction

Authors: Jay Karhade, Nikhil Keetha, Yuchen Zhang, Tanisha Gupta, Akash Sharma, Sebastian Scherer, Deva Ramanan

Abstract: We present Any4D, a scalable multi-view transformer for metric-scale, dense feed-forward 4D reconstruction. Any4D directly generates per-pixel motion and geometry predictions for N frames, in contrast to prior work that typically focuses on either 2-view dense scene flow or sparse 3D point tracking. Moreover, unlike other recent methods for 4D reconstruction from monocular RGB videos, Any4D can process additional modalities and sensors such as RGB-D frames, IMU-based egomotion, and Radar Doppler measurements, when available. One of the key innovations that allows for such a flexible framework is a modular representation of a 4D scene; specifically, per-view 4D predictions are encoded using a variety of egocentric factors (depthmaps and camera intrinsics) represented in local camera coordinates, and allocentric factors (camera extrinsics and scene flow) represented in global world coordinates. We achieve superior performance across diverse setups - both in terms of accuracy (2-3X lower error) and compute efficiency (15X faster), opening avenues for multiple downstream applications.

Submitted 11 December, 2025; originally announced December 2025.

Comments: Project Website: https://any-4d.github.io/

arXiv:2511.15136

Novel sparse matrix algorithm expands the feasible size of a self-organizing map of the knowledge indexed by a database of peer-reviewed medical literature

Authors: Andrew Amos, Joanne Lee, Tarun Sen Gupta, Bunmi S. Malau-Aduli

Abstract: Past efforts to map the Medline database have been limited to small subsets of the available data because of the exponentially increasing memory and processing demands of existing algorithms. We designed a novel algorithm for sparse matrix multiplication that allowed us to apply a self-organizing map to the entire Medline dataset, allowing for a more complete map of existing medical knowledge. The algorithm also increases the feasibility of refining the self-organizing map to account for changes in the dataset over time.

Submitted 19 November, 2025; originally announced November 2025.

arXiv:2511.12239

Beyond World Models: Rethinking Understanding in AI Models

Authors: Tarun Gupta, Danish Pruthi

Abstract: World models have garnered substantial interest in the AI community. These are internal representations that simulate aspects of the external world, track entities and states, capture causal relationships, and enable prediction of consequences. This contrasts with representations based solely on statistical correlations. A key motivation behind this research direction is that humans possess such mental world models, and finding evidence of similar representations in AI models might indicate that these models "understand" the world in a human-like way. In this paper, we use case studies from the philosophy of science literature to critically examine whether the world model framework adequately characterizes human-level understanding. We focus on specific philosophical analyses where the distinction between world model capabilities and human understanding is most pronounced. While these represent particular views of understanding rather than universal definitions, they help us explore the limits of world models.

Submitted 15 November, 2025; originally announced November 2025.

Comments: Accepted to AAAI 2026 (Main Track)

arXiv:2510.18969

doi 10.1103/n6nb-klgc

Freeze-in and Freeze-out in a Right-Handed Neutrino Extended MSSM with a Seesaw Mechanism

Authors: Tushar Gupta, Matti Heikinheimo, Katri Huitu, Harri Waltari

Abstract: We investigate the possibility of saturating the relic density bound with light Higgsinos. When the minimal supersymmetric Standard Model is extended with right-handed neutrino superfields and the seesaw scale is very low, right-handed sneutrinos can be produced via the freeze-in mechanism. In such a case we can have essentially two independent sources for dark matter, the traditional freeze-out of Higgsinos and the freeze-in of right-handed sneutrinos. The heavier of these two will decay to the lighter species with a delay. We rule out such a scenario for all seesaw models as the lifetime of sterile neutrinos produced over-abundantly via the Dodelson-Widrow mechanism exceeds the age of the universe and will contribute to the relic density.

Submitted 19 February, 2026; v1 submitted 21 October, 2025; originally announced October 2025.

Comments: 13 pages, 6 figures, 1 table; v2 updated title, minor textual and grammatical changes to match published version

Report number: HIP-2025-30/TH

Journal ref: Phys. Rev. D 113, 035019 (2026)

arXiv:2509.16553

Cosmological viability of anisotropic inflation in Thurston spacetimes

Authors: Devika J. S., Tanay Gupta, Sukanta Panda

Abstract: Recent observations of large-scale statistical isotropy violations have prompted the adoption of anisotropic cosmological models that account for inherent directional curvature. Studies of these anisotropic spacetimes have shown how they can explain the evolutionary dynamics and light propagation in the universe. Here, we consider one such interesting set of spacetimes that preserve homogeneity but place no constraint on isotropy during the inflationary epoch, to examine whether we can address the possibility of anisotropic inflation in the universe. Researchers have proposed inflationary models in which a vector field coupled to the inflaton is found to violate the cosmic no-hair theorem for the anisotropic Bianchi type I spacetime, due to the existence of a stable anisotropically inflationary fixed point. Lately, this study has been extended to axisymmetric spacetimes of Bianchi type II, III, and the Kantowski-Sachs metric, and it has been inferred that the entire family of spacetimes is attracted to the anisotropic Bianchi I fixed point. By constructing inflationary models where the spatial slices are anisotropic Thurston 3-geometries, we demonstrate that the intrinsic eccentricity of the background geometry induces an isotropy-violating vector field. This field, through its coupling to the inflaton, triggers a secondary phase of anisotropic inflation. We perform dynamical stability and phase-space analyses to assess the feasibility of anisotropic inflation. The results for the considered set of Thurston geometries showed the presence of a unique, stable inflationary fixed point that converges, similar to those in Bianchi spacetimes, thereby indicating the cosmological viability of inflation with anisotropic hair.

Submitted 20 March, 2026; v1 submitted 20 September, 2025; originally announced September 2025.

Comments: This version corrects minor typographical errors and inconsistent references present in v2. No scientific conclusions are changed

arXiv:2508.18083

GWTC-4.0: Population Properties of Merging Compact Binaries

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, S. Ahmadzadeh, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1783 additional authors not shown)

Abstract: We detail the population properties of merging compact objects using 158 mergers from the cumulative Gravitational-Wave Transient Catalog 4.0, which includes three types of binary mergers: binary neutron star, neutron star--black hole binary, and binary black hole mergers. We resolve multiple over- and under-densities in the black hole mass distribution: features persist at primary masses of $10\,M_\odot$ and $35\,M_\odot$ with a possible third feature at $\sim 20\,M_\odot$. These are departures from an otherwise power-law-like continuum that steepens above $35\,M_\odot$. Binary black holes with primary masses near $10\,M_\odot$ are more likely to have less massive secondaries, with a mass ratio distribution peaking at $q = 0.74^{+0.13}_{-0.13}$, potentially a signature of stable mass transfer during binary evolution. Black hole spins are inferred to be non-extremal, with 90\% of black holes having $χ< 0.57$, and preferentially aligned with binary orbits, implying many merging binaries form in isolation. However, we find a significant fraction, 0.24-0.42, of binaries have negative effective inspiral spins, suggesting many could be formed dynamically in gas-free environments. We find evidence for correlation between effective inspiral spin and mass ratio, though it is unclear if this is driven by variation in the mode of the distribution or the width. (Abridged)

Submitted 17 September, 2025; v1 submitted 25 August, 2025; originally announced August 2025.

Comments: As part of the Astrophysical Journal Letters Focus Issue on the Gravitational Wave Transient Catalog

Report number: LIGO-P2400004

arXiv:2508.18081

GWTC-4.0: Methods for Identifying and Characterizing Gravitational-wave Transients

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, S. Ahmadzadeh, L. Aiello, A. Ain, P. Ajith, S. Akcay, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1787 additional authors not shown)

Abstract: The Gravitational-Wave Transient Catalog (GWTC) is a collection of candidate gravitational-wave transient signals identified and characterized by the LIGO-Virgo-KAGRA Collaboration. Producing the contents of the GWTC from detector data requires complex analysis methods. These comprise techniques to model the signal; identify the transients in the data; evaluate the quality of the data and mitigate possible instrumental issues; infer the parameters of each transient; compare the data with the waveform models for compact binary coalescences; and handle the large amount of results associated with all these different analyses. In this paper, we describe the methods employed to produce the catalog's fourth release, GWTC-4.0, focusing on the analysis of the first part of the fourth observing run of Advanced LIGO, Advanced Virgo and KAGRA.

Submitted 19 February, 2026; v1 submitted 25 August, 2025; originally announced August 2025.

Comments: As part of the Astrophysical Journal Letters Focus Issue on the Gravitational Wave Transient Catalog. Version accepted for publication

Report number: LIGO-P2400300

arXiv:2508.18080 [pdf, ps, other]

GWTC-4.0: An Introduction to Version 4.0 of the Gravitational-Wave Transient Catalog

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, D. Adhikari, N. Adhikari, R. X. Adhikari, V. K. Adkins, S. Afroz, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, S. Ahmadzadeh, L. Aiello, A. Ain, P. Ajith, S. Akcay, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1786 additional authors not shown)

Abstract: The Gravitational-Wave Transient Catalog (GWTC) is a collection of short-duration (transient) gravitational-wave signals identified by the LIGO-Virgo-KAGRA Collaboration in gravitational-wave data produced by the eponymous detectors. The catalog provides information about the identified candidates, such as the arrival time and amplitude of the signal and properties of the signal's source as inferred from the observational data. GWTC is the data release of this dataset and version 4.0 extends the catalog to include observations made during the first part of the fourth LIGO-Virgo-KAGRA observing run up until 2024 January 31. This paper marks an introduction to a collection of articles related to this version of the catalog, GWTC-4.0. The collection of articles accompanying the catalog provides documentation of the methods used to analyze the data, summaries of the catalog of events, observational measurements drawn from the population, and detailed discussions of selected candidates.

Submitted 23 September, 2025; v1 submitted 25 August, 2025; originally announced August 2025.

Comments: As part of the Astrophysical Journal Letters Focus Issue on the Gravitational Wave Transient Catalog. Update following peer review

Report number: LIGO-P2400293

arXiv:2508.06949 [pdf, ps, other]

Convergence Sans Synchronization

Authors: Arya Tanmay Gupta

Abstract: We currently see a steady rise in the usage and size of multiprocessor systems, and so the community is ever more interested in developing fast parallel processing algorithms. However, most algorithms require a synchronization mechanism, which is costly in terms of computational resources and time. If an algorithm can be executed in asynchrony, then it can use all the available computation power, and the nodes can execute without being scheduled or locked. However, to show that an algorithm guarantees convergence in asynchrony, we need to generate the entire global state transition graph and check for the absence of cycles. This takes time exponential in the size of the global state space. In this dissertation, we present a theory that explains the necessary and sufficient properties of a multiprocessor algorithm that guarantees convergence even without synchronization. We develop algorithms for various problems that do not require synchronization. Additionally, we show for several existing algorithms that they can be executed without any synchronization mechanism. A significant theoretical benefit of our work is in proving that an algorithm can converge even in asynchrony. Our theory implies that we can make such conclusions about an algorithm by only showing that the local state transition graph of a computing node forms a partial order, rather than generating the entire global state space and determining the absence of cycles in it. Thus, the complexity of rendering such proofs, formal or social, is phenomenally reduced.
Experiments show a significant reduction in time taken to converge when we compare the execution time of algorithms in the literature with that of the algorithms we design. We get similar results when we run an algorithm that guarantees convergence in asynchrony under a scheduler versus in asynchrony.

Submitted 9 August, 2025; originally announced August 2025.

Comments: PhD thesis

arXiv:2508.00486 [pdf, ps, other]

doi 10.1103/vx5s-cx17

The Bose-Hubbard polaron from weak to strong coupling

Authors: Tom Hartweg, Tanul Gupta, Guido Pupillo

Abstract: We investigate the zero-temperature properties of a mobile impurity immersed in a bath of bosonic particles confined to a square lattice. We analyze the regimes of attractive and repulsive coupling between the impurity and the bath particles for different strengths of boson-boson interactions in the bath, using exact large-scale quantum Monte-Carlo simulations in the grand canonical ensemble. For weak coupling, the polaron mass ratio is found to decrease around the Mott insulator (MI) to superfluid (SF) transition of the bath, as predicted by recent theory, confirming the possible use of the impurity as a probe for the transition. For strong coupling in the MI regime, instead, the impurity is found to modify the bath density by binding to an extra bath particle or a hole, depending on the sign of the polaron-bath interactions. While the binding prevents the aforementioned use of the polaron mass ratio as an MI-SF transition probe, we show that it can be used instead as a probe of the binding itself. Our exact numerical results provide a benchmark for comparing lattice Bose polaron theories and are relevant for experiments with cold atoms trapped in optical lattices, where the presence of a confining harmonic potential can be modeled by a slowly varying local chemical potential.

Submitted 17 December, 2025; v1 submitted 1 August, 2025; originally announced August 2025.

Journal ref: Phys. Rev. B 112, L220201 (2025)

arXiv:2506.20560 [pdf, ps, other]

Quantum nonlocality without entanglement and state discrimination measures

Authors: Shayeef Murshid, Tathagata Gupta, Vincent Russo, Somshubhro Bandyopadhyay

Abstract: An ensemble of product states is said to exhibit "quantum nonlocality without entanglement" if the states cannot be optimally discriminated by local operations and classical communication (LOCC). We show that this property can depend on the measure of state discrimination. We present a family of ensembles, each consisting of six linearly independent, equally probable product states for which LOCC fails to achieve optimal minimum-error discrimination but succeeds in achieving optimal unambiguous discrimination.

Submitted 25 June, 2025; originally announced June 2025.

Comments: 22 pages

arXiv:2504.12299 [pdf, other]

Adapting a World Model for Trajectory Following in a 3D Game

Authors: Marko Tot, Shu Ishida, Abdelhak Lemkhenter, David Bignell, Pallavi Choudhury, Chris Lovett, Luis França, Matheus Ribeiro Furtado de Mendonça, Tarun Gupta, Darren Gehring, Sam Devlin, Sergio Valcarcel Macua, Raluca Georgescu

Abstract: Imitation learning is a powerful tool for training agents by leveraging expert knowledge, and being able to replicate a given trajectory is an integral part of it. In complex environments, like modern 3D video games, distribution shift and stochasticity necessitate robust approaches beyond simple action replay. In this study, we apply Inverse Dynamics Models (IDM) with different encoders and policy heads to trajectory following in a modern 3D video game -- Bleeding Edge. Additionally, we investigate several future alignment strategies that address the distribution shift caused by the aleatoric uncertainty and imperfections of the agent. We measure both the trajectory deviation distance and the first significant deviation point between the reference and the agent's trajectory and show that the optimal configuration depends on the chosen setting. Our results show that in a diverse data setting, a GPT-style policy head with an encoder trained from scratch performs the best, DINOv2 encoder with the GPT-style policy head gives the best results in the low data regime, and both GPT-style and MLP-style policy heads had comparable results when pre-trained on a diverse setting and fine-tuned for a specific behaviour setting.

Submitted 16 April, 2025; originally announced April 2025.

arXiv:2504.07468 [pdf, other]

Novel Pooling-based VGG-Lite for Pneumonia and Covid-19 Detection from Imbalanced Chest X-Ray Datasets

Authors: Santanu Roy, Ashvath Suresh, Palak Sahu, Tulika Rudra Gupta

Abstract: This paper proposes a novel pooling-based VGG-Lite model in order to mitigate class imbalance issues in Chest X-Ray (CXR) datasets. Automatic Pneumonia detection from CXR images by deep learning models has emerged as a prominent and dynamic area of research since the inception of the new Covid-19 variant in 2020. However, the standard Convolutional Neural Network (CNN) models encounter challenges associated with class imbalance, a prevalent issue found in many medical datasets. The innovations introduced in the proposed model architecture include: (I) A very lightweight CNN model, "VGG-Lite", is proposed as a base model, inspired by VGG-16 and MobileNet-V2 architecture. (II) On top of this base model, we leverage an "Edge Enhanced Module (EEM)" through a parallel branch, consisting of a "negative image layer" and a novel custom pooling layer, "2Max-Min Pooling". This 2Max-Min Pooling layer is entirely novel in this investigation, providing more attention to edge components within pneumonia CXR images. Thus, it works as an efficient spatial attention module (SAM). We have implemented the proposed framework on two separate CXR datasets. The first dataset is obtained from a readily available source on the internet, and the second dataset is a more challenging CXR dataset, assembled by our research team from three different sources.
Experimental results reveal that our proposed framework has outperformed pre-trained CNN models and three recent existing models, "Vision Transformer", "Pooling-based Vision Transformer (PiT)" and "PneuNet", by substantial margins on both datasets. The proposed framework, VGG-Lite with EEM, has achieved a macro average of 95% accuracy, 97.1% precision, 96.1% recall, and 96.6% F1 score on the "Pneumonia Imbalance CXR dataset", without employing any pre-processing technique.

Submitted 10 April, 2025; originally announced April 2025.

Comments: 12 pages

arXiv:2504.04922 [pdf]

Real-time tuneable bright bonding plasmonic modes in Ga nanostructures

Authors: Renu Raman Sahu, Tapajyoti Das Gupta

Abstract: The precise control of nanogaps is crucial for plasmonic nanoassemblies, where plasmon hybridization is highly sensitive to gap size and geometry. This sensitivity enables fine-tuning of the resonance wavelength and near-field enhancement, offering the potential for advanced optical applications. However, conventional lithographic techniques for gap modulation are constrained to discrete values and face challenges in achieving separations on the order of nanometers. Such limitations hinder the comprehensive study of plasmon coupling across varying interaction regimes. Overcoming these challenges is essential for advancing nanoplasmonic research and its practical applications. Herein, we demonstrate a tuneable plasmonic device in which real-time tunability of this hybridization mode is achieved via manipulation of the inter-droplet gap of liquid metal nanoparticles by macroscopic physical deformation. In particular, we show that the optical spectra obtained from the sample shift towards higher energy on the application of a linear strain, resulting in an increase of inter-droplet gaps leading to a direct probing of the bright modes in situ. Our method thus offers a novel means of exploring the fundamental concept of real-time tuneable plasmon hybridization as well as tuning of nanoparticle assembly with any desired gap in a controlled manner.

Submitted 7 April, 2025; originally announced April 2025.

Comments: 19 Pages (Manuscript 10 pages, Supplementary Document 9 pages), 5 Figures, 10 SI Figures

arXiv:2502.18293 [pdf, ps, other]

AMPO: Active Multi-Preference Optimization for Self-play Preference Selection

Authors: Taneesh Gupta, Rahul Madhavan, Xuchao Zhang, Chetan Bansal, Saravan Rajmohan

Abstract: Multi-preference optimization enriches language-model alignment beyond pairwise preferences by contrasting entire sets of helpful and undesired responses, thereby enabling richer training signals for large language models. During self-play alignment, these models often produce numerous candidate answers per query, rendering it computationally infeasible to include all responses in the training objective. In this work, we propose Active Multi-Preference Optimization (AMPO), a novel approach that combines on-policy generation, a multi-preference group-contrastive loss, and active subset selection. Specifically, we score and embed large candidate pools of responses and then select a small, yet informative, subset that covers reward extremes and distinct semantic clusters for preference optimization. Our contrastive training scheme is capable of identifying not only the best and worst answers but also subtle, underexplored modes that are crucial for robust alignment. Theoretically, we provide guarantees for expected reward maximization using our active selection method, and empirically, AMPO achieves state-of-the-art results on AlpacaEval using Llama 8B and Mistral 7B. We release our datasets at https://huggingface.co/Multi-preference-Optimization.

Submitted 8 June, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

Comments: Accepted at ICML 2025

arXiv:2502.16487 [pdf, ps, other]

All That Glitters is Not Novel: Plagiarism in AI Generated Research

Authors: Tarun Gupta, Danish Pruthi

Abstract: Automating scientific research is considered the final frontier of science. Recently, several papers claim autonomous research agents can generate novel research ideas. Amidst the prevailing optimism, we document a critical concern: a considerable fraction of such research documents are smartly plagiarized. Unlike past efforts where experts evaluate the novelty and feasibility of research ideas, we request 13 experts to operate under a different situational logic: to identify similarities between LLM-generated research documents and existing work. Concerningly, the experts identify 24% of the 50 evaluated research documents to be either paraphrased (with one-to-one methodological mapping), or significantly borrowed from existing work. These reported instances are cross-verified by authors of the source papers. The remaining 76% of documents show varying degrees of similarity with existing work, with only a small fraction appearing completely novel. Problematically, these LLM-generated research documents do not acknowledge original sources, and bypass inbuilt plagiarism detectors. Lastly, through controlled experiments we show that automated plagiarism detectors are inadequate at catching plagiarized ideas from such systems. We recommend a careful assessment of LLM-generated research, and discuss the implications of our findings on academic publishing.

Submitted 4 September, 2025; v1 submitted 23 February, 2025; originally announced February 2025.

Comments: Accepted to ACL 2025 (main) conference

arXiv:2502.15872 [pdf, other]

MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use

Authors: Zaid Khan, Ali Farhadi, Ranjay Krishna, Luca Weihs, Mohit Bansal, Tanmay Gupta

Abstract: When a human requests an LLM to complete a coding task using functionality from a large code repository, how do we provide context from the repo to the LLM? One approach is to add the entire repo to the LLM's context window. However, most tasks involve only a fraction of the symbols from a repo, longer contexts are detrimental to the LLM's reasoning abilities, and context windows are not unlimited. Alternatively, we could emulate the human ability to navigate a large repo, pick out the right functionality, and form a plan to solve the task. We propose MutaGReP (Mutation-guided Grounded Repository Plan Search), an approach to search for plans that decompose a user request into natural language steps grounded in the codebase. MutaGReP performs neural tree search in plan space, exploring by mutating plans and using a symbol retriever for grounding. On the challenging LongCodeArena benchmark, our plans use less than 5% of the 128K context window for GPT-4o but rival the coding performance of GPT-4o with a context window filled with the repo. Plans produced by MutaGReP allow Qwen 2.5 Coder 32B and 72B to match the performance of GPT-4o with full repo context and enable progress on the hardest LongCodeArena tasks. Project page: zaidkhan.me/MutaGReP

Submitted 21 February, 2025; originally announced February 2025.

Comments: Project page: zaidkhan.me/MutaGReP

arXiv:2502.14846 [pdf, other]

Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation

Authors: Yue Yang, Ajay Patel, Matt Deitke, Tanmay Gupta, Luca Weihs, Andrew Head, Mark Yatskar, Chris Callison-Burch, Ranjay Krishna, Aniruddha Kembhavi, Christopher Clark

Abstract: Reasoning about images with rich text, such as charts and documents, is a critical application of vision-language models (VLMs). However, VLMs often struggle in these domains due to the scarcity of diverse text-rich vision-language data. To address this challenge, we present CoSyn, a framework that leverages the coding capabilities of text-only large language models (LLMs) to automatically create synthetic text-rich multimodal data. Given input text describing a target domain (e.g., "nutrition fact labels"), CoSyn prompts an LLM to generate code (Python, HTML, LaTeX, etc.) for rendering synthetic images. With the underlying code as textual representations of the synthetic images, CoSyn can generate high-quality instruction-tuning data, again relying on a text-only LLM. Using CoSyn, we constructed a dataset comprising 400K images and 2.7M rows of vision-language instruction-tuning data. Comprehensive experiments on seven benchmarks demonstrate that models trained on our synthetic data achieve state-of-the-art performance among competitive open-source models, including Llama 3.2, and surpass proprietary models such as GPT-4V and Gemini 1.5 Flash. Furthermore, CoSyn can produce synthetic pointing data, enabling VLMs to ground information within input images, showcasing its potential for developing multimodal agents capable of acting in real-world environments.

Submitted 21 May, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

Comments: Published in ACL 2025, project page: https://yueyang1996.github.io/cosyn/

arXiv:2501.05736 [pdf]

Tailored Thin Films: Modulating Soft Photonics with Dynamically Tunable Large Area Microstructures via Controlled Thermal Processing

Authors: Srijeeta Biswas, Renu Raman Sahu, Omkar Deokinandan Nayak Shinkre, Shubham Meena, Ramnishanth, Mark Vailshery, Tapajyoti Das Gupta

Abstract: Self-assembled nano and micro-structures, particularly those capable of responsive erasure and regeneration, have garnered significant interest for their applications in smart photonics and electronics. However, current techniques for modulating these architectures largely depend on network rearrangement, posing challenges for in situ regeneration. Furthermore, their common fabrication techniques are complex and uncontrolled, and the structures formed are not amenable to large-area applications, thus compromising their economic viability. Herein, we present a controlled thermal process strategy for fabricating large-area, dynamically tunable 1D, 2D and 3D micro and nanostructures on a wide range of compatible materials including metals, semiconductors and polymers. By tuning the temperature changes in the system, the thermal expansion coefficients of thin films and substrates, and the surface energy, Young's modulus and thickness of the thin films, we achieve robust, uniform, periodic structures over extensive areas on soft and stretchable substrates. The process is further supported by a theoretical model that we developed and validated by experiments and simulations. To showcase the robustness of our approach, we present prototypes of dynamically tunable diffraction gratings, optical diffusers, large-area reflective displays, camouflage devices, out-coupling efficiency enhancers, wearable devices and mechanochromic sensors.

Submitted 19 January, 2025; v1 submitted 10 January, 2025; originally announced January 2025.

Comments: The legend to figure 1 was missing in the previous version

arXiv:2501.01495 [pdf, ps, other]

doi 10.3847/1538-4357/adb3a0

Search for continuous gravitational waves from known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, et al. (1794 additional authors not shown)

Abstract: Continuous gravitational-wave (CW) emission from neutron stars carries information about their internal structure and equation of state, and it can provide tests of General Relativity. We present a search for CWs from a set of 45 known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run, known as O4a. We conducted a targeted search for each pulsar using three independent analysis methods considering the single-harmonic and the dual-harmonic emission models. We find no evidence of a CW signal in O4a data for either model and set upper limits on the signal amplitude and on the ellipticity, which quantifies the asymmetry in the neutron star mass distribution. For the single-harmonic emission model, 29 targets have the upper limit on the amplitude below the theoretical spin-down limit. The lowest upper limit on the amplitude is $6.4\!\times\!10^{-27}$ for the young energetic pulsar J0537-6910, while the lowest constraint on the ellipticity is $8.8\!\times\!10^{-9}$ for the bright nearby millisecond pulsar J0437-4715. Additionally, for a subset of 16 targets we performed a narrowband search that is more robust regarding the emission model, with no evidence of a signal. We also found no evidence of non-standard polarizations as predicted by the Brans-Dicke theory.

Submitted 26 September, 2025; v1 submitted 2 January, 2025; originally announced January 2025.

Comments: main paper: 12 pages, 6 figures, 4 tables

Report number: LIGO-P2400315

Journal ref: Astrophys.J. 983 (2025) 2, 99

arXiv:2412.16378 [pdf, ps, other]

REFA: Reference Free Alignment for multi-preference optimization

Authors: Taneesh Gupta, Rahul Madhavan, Xuchao Zhang, Chetan Bansal, Saravan Rajmohan

Abstract: To mitigate reward hacking from response verbosity, modern preference optimization methods are increasingly adopting length normalization (e.g., SimPO, ORPO, LN-DPO). While effective against this bias, we demonstrate that length normalization itself introduces a failure mode: the URSLA shortcut. Here models learn to satisfy the alignment objective by prematurely truncating low-quality responses rather than learning from their semantic content. To address this, we introduce REFA, a new alignment framework that proposes probabilistic control on a structural token that controls termination. Our core innovation is a new class of regularizers that operate directly on the probability of the End-of-Sequence (EOS) token, a previously unexploited control lever. This token-level intervention provides a principled solution to the URSLA shortcut, ensuring genuine quality improvements. Furthermore, it unlocks a versatile mechanism for managing the alignment-efficiency tradeoff, enabling practitioners to fine-tune models that adhere to specific token budgets. Empirically, REFA achieves a 60.29% win rate and a 52.17% length-controlled win rate on AlpacaEval2 with Llama-3-8B-Instruct, demonstrating the power of our token-level control paradigm.

Submitted 5 November, 2025; v1 submitted 20 December, 2024; originally announced December 2024.

arXiv:2412.12122 [pdf, other]

AI-driven Inverse Design of Band-Tunable Mechanical Metastructures for Tailored Vibration Mitigation

Authors: Tanuj Gupta, Arun Kumar Sharma, Ankur Dwivedi, Vivek Gupta, Subhadeep Sahana, Suryansh Pathak, Ashish Awasthi, Bishakh Bhattacharya

Abstract: On-demand vibration mitigation in a mechanical system needs the suitable design of multiscale metastructures, involving complex unit cells. In this study, by immersing in the world of patterns and examining their structural details, some interesting motifs are extracted from the mechanical metastructure perspective. Nine interlaced metastructures are fabricated using additive manufacturing, and corresponding vibration characteristics are studied experimentally and numerically. Further, the band-gap modulation with metallic inserts in the honeycomb interlaced metastructures is also studied. AI-driven inverse design of such complex metastructures with a desired vibration mitigation profile can pave the way for addressing engineering challenges in high-precision manufacturing. The current inverse design methodologies are limited to designing simple periodic structures based on limited variants of unit cells. Therefore, a novel forward analysis model with multi-head FEM-inspired spatial attention (FSA) is proposed to learn the complex geometry of the metastructures and predict corresponding transmissibility. Subsequently, a multiscale Gaussian self-attention (MGSA) based inverse design model with Gaussian function for 1D spectrum position encoding is developed to produce a suitable metastructure for the desired vibration transmittance. The proposed AI framework demonstrated outstanding performance corresponding to the expected locally resonant bandgaps in a targeted frequency range.

Submitted 28 February, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

arXiv:2412.04628 [pdf, ps, other]

Multi-Preference Optimization: Generalizing DPO via Set-Level Contrasts

Authors: Taneesh Gupta, Rahul Madhavan, Xuchao Zhang, Nagarajan Natarajan, Chetan Bansal, Saravan Rajmohan

Abstract: Direct Preference Optimization (DPO) has become a popular approach for aligning language models using pairwise preferences. However, in practical post-training pipelines, on-policy generation typically yields multiple candidate responses per prompt, which are scored by a reward model to guide learning. In this setting, we propose $\textbf{Multi-Preference Optimization (MPO)}$, a generalization of DPO that optimizes over entire sets of responses by extending the Bradley-Terry model to groupwise comparisons between chosen and rejected sets. To further enhance learning, MPO employs deviation-based weighting, which emphasizes outlier responses that deviate most from the mean reward, effectively inducing a self-paced curriculum. We theoretically prove that MPO reduces alignment bias at a rate of $\mathcal{O}\left(\frac{1}{\sqrt{n}}\right)$ with respect to the number of responses per query. Empirically, MPO achieves state-of-the-art performance on the UltraFeedback benchmark and yields up to $\sim 17.5\%$ improvement over the state-of-the-art baseline in length-controlled win rate on AlpacaEval2, establishing a new baseline for preference-based alignment.

Submitted 19 June, 2025; v1 submitted 5 December, 2024; originally announced December 2024.

arXiv:2412.01571 [pdf, other]

Bose-Hubbard model with power-law hopping in one dimension

Authors: Tanul Gupta, Nikolay V. Prokof'ev, Guido Pupillo

Abstract: We investigate the zero-temperature phase diagram of the one-dimensional Bose-Hubbard model with power-law hopping decaying with distance as $1/r^α$ using exact large-scale Quantum Monte-Carlo simulations. For all $1<α\leq 3$ the quantum phase transition between a superfluid and a Mott insulator at unit filling is found to be continuous and scale invariant, in a way incompatible with the Berezinskii-Kosterlitz-Thouless (BKT) scenario, which is recovered for $α>3$. We characterise the new universality class by providing the critical exponents by means of data collapse analysis near the critical point for each $α$ and from careful analysis of the spectrum. Large-scale simulations of the grand canonical phase diagram and of the decay of correlation functions demonstrate an overall behavior akin to higher dimensional systems with long-range order in the ground state for $α\leq 2$ and intermediate between one and higher dimensions for $2<α\leq 3$. Our exact numerical results provide a benchmark to compare theories of long-range quantum models and are relevant for experiments with cold neutral atoms, molecules, and ion chains.

Submitted 8 January, 2025; v1 submitted 2 December, 2024; originally announced December 2024.

arXiv:2411.11973 [pdf, other]

Boosted Dark Matter Driven by Cosmic Rays and Diffuse Supernova Neutrinos

Authors: Dilip Kumar Ghosh, Tushar Gupta, Matti Heikinheimo, Katri Huitu, Sk Jeesun

Abstract: Direct detection of light dark matter can be significantly enhanced by up-scattering of dark matter with energetic particles in the cosmic ambient. This boosted dark matter flux can reach kinetic energies up to tens of MeV, while the typical kinetic energies of GeV mass dark matter particles in the Milky Way halo are of the order of keV. Dark matter boosted by energetic diffuse supernova background neutrinos can be detected only through nuclear or electron scattering in ground-based detectors requiring a non-zero interaction of dark matter with nucleon or electron, in addition to its interaction with neutrino. However, in the presence of dark matter-nucleon (electron) interaction, the scattering of dark matter with cosmic rays is unavoidable. Thus, we consider boosted dark matter resulting from diffuse supernova neutrinos as well as cosmic protons (electrons) considering both energy-dependent and energy-independent scattering cross-sections between dark matter and standard model particles. We explore this scenario in dark matter detectors such as XENONnT and neutrino detectors like Super-Kamiokande.

Submitted 17 March, 2025; v1 submitted 18 November, 2024; originally announced November 2024.

Comments: 27 pages, 7 figures, Accepted in PRD

Journal ref: Phys. Rev. D 111, 063019 (2025)

arXiv:2410.21545 [pdf, other]

CARMO: Dynamic Criteria Generation for Context-Aware Reward Modelling

Authors: Taneesh Gupta, Shivam Shandilya, Xuchao Zhang, Rahul Madhavan, Supriyo Ghosh, Chetan Bansal, Huaxiu Yao, Saravan Rajmohan

Abstract: Reward modeling in large language models is susceptible to reward hacking, causing models to latch onto superficial features such as the tendency to generate lists or unnecessarily long responses. In reinforcement learning from human feedback (RLHF) and more generally during post-training, flawed reward signals often lead to outputs that optimize for these spurious correlates instead of genuine quality or correctness. We propose Context-Aware Reward Modeling (CARMO), a novel approach that first generates dynamic, context-relevant criteria to ground the reward model before producing reward scores. Unlike prior methods that rely on static rubrics, CARMO leverages large language models (LLMs) to adaptively create evaluation criteria such as logical consistency, clarity, and depth tailored to the user query. Our theoretical analysis shows that such criteria generation can mitigate reward hacking. We further demonstrate that CARMO can be distilled into smaller models, reducing the computational cost of alignment. We establish a new state-of-the-art performance in zero-shot settings for generative models, achieving a 2.1% improvement on Reward Bench. Furthermore, alignment performed on the CARMO-curated preference dataset achieves 22.5% and 21.1% LC-WR and WR, respectively, on Mistral-Base (7B).

Submitted 17 February, 2025; v1 submitted 28 October, 2024; originally announced October 2024.

arXiv:2410.16565 [pdf, other]

doi 10.3847/1538-4357/adc681

Search for gravitational waves emitted from SN 2023ixf

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, A. Allocca, et al. (1758 additional authors not shown)

Abstract: We present the results of a search for gravitational-wave transients associated with core-collapse supernova SN 2023ixf, which was observed in the galaxy Messier 101 via optical emission on 2023 May 19th, during the LIGO-Virgo-KAGRA 15th Engineering Run. We define a five-day on-source window during which an accompanying gravitational-wave signal may have occurred. No gravitational waves have been identified in data when at least two gravitational-wave observatories were operating, which covered $\sim 14\%$ of this five-day window. We report the search detection efficiency for various possible gravitational-wave emission models. Considering the distance to M101 (6.7 Mpc), we derive constraints on the gravitational-wave emission mechanism of core-collapse supernovae across a broad frequency spectrum, ranging from 50 Hz to 2 kHz where we assume the gravitational-wave emission occurred when coincident data are available in the on-source window. Considering an ellipsoid model for a rotating proto-neutron star, our search is sensitive to gravitational-wave energy $1 \times 10^{-4} M_{\odot} c^2$ and luminosity $2.6 \times 10^{-4} M_{\odot} c^2/s$ for a source emitting at 82 Hz. These constraints are around an order of magnitude more stringent than those obtained so far with gravitational-wave data. The constraint on the ellipticity of the proto-neutron star that is formed is as low as 1.08, at frequencies above 1200 Hz, surpassing past results.

Submitted 11 March, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

Comments: Main paper: 6 pages, 4 figures and 1 table. Total with appendices: 20 pages, 4 figures, and 1 table

Report number: LIGO-P2400125

Journal ref: ApJ 985 183 (2025)

arXiv:2410.12822 [pdf, other]

AVID: Adapting Video Diffusion Models to World Models

Authors: Marc Rigter, Tarun Gupta, Agrin Hilmkil, Chao Ma

Abstract: Large-scale generative models have achieved remarkable success in a number of domains. However, for sequential decision-making problems, such as robotics, action-labelled data is often scarce and therefore scaling-up foundation models for decision-making remains a challenge. A potential solution lies in leveraging widely-available unlabelled videos to train world models that simulate the consequences of actions. If the world model is accurate, it can be used to optimize decision-making in downstream tasks. Image-to-video diffusion models are already capable of generating highly realistic synthetic videos. However, these models are not action-conditioned, and the most powerful models are closed-source, which means they cannot be finetuned. In this work, we propose to adapt pretrained video diffusion models to action-conditioned world models, without access to the parameters of the pretrained model. Our approach, AVID, trains an adapter on a small domain-specific dataset of action-labelled videos. AVID uses a learned mask to modify the intermediate outputs of the pretrained model and generate accurate action-conditioned videos. We evaluate AVID on video game and real-world robotics data, and show that it outperforms existing baselines for diffusion model adaptation. Our results demonstrate that if utilized correctly, pretrained video models have the potential to be powerful tools for embodied AI.

Submitted 24 November, 2024; v1 submitted 1 October, 2024; originally announced October 2024.

Comments: Project Webpage: https://sites.google.com/view/avid-world-model-adapters/home

arXiv:2410.09151 [pdf, other]

doi 10.3847/1538-4357/ad8de0

A search using GEO600 for gravitational waves coincident with fast radio bursts from SGR 1935+2154

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, et al. (1758 additional authors not shown)

Abstract: The magnetar SGR 1935+2154 is the only known Galactic source of fast radio bursts (FRBs). FRBs from SGR 1935+2154 were first detected by CHIME/FRB and STARE2 in 2020 April, after the conclusion of the LIGO, Virgo, and KAGRA Collaborations' O3 observing run. Here we analyze four periods of gravitational wave (GW) data from the GEO600 detector coincident with four periods of FRB activity detected by CHIME/FRB, as well as X-ray glitches and X-ray bursts detected by NICER and NuSTAR close to the time of one of the FRBs. We do not detect any significant GW emission from any of the events. Instead, using a short-duration GW search (for bursts $\leq$ 1 s) we derive 50% (90%) upper limits of $10^{48}$ ($10^{49}$) erg for GWs at 300 Hz and $10^{49}$ ($10^{50}$) erg at 2 kHz, and constrain the GW-to-radio energy ratio to $\leq 10^{14} - 10^{16}$. We also derive upper limits from a long-duration search for bursts with durations between 1 and 10 s. These represent the strictest upper limits on concurrent GW emission from FRBs.

Submitted 21 May, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

Comments: 15 pages of text including references, 4 figures, 5 tables

Report number: LIGO-P2400192

Journal ref: ApJ 977 255 (2024)

arXiv:2410.08507 [pdf, ps, other]

Decentralized Uncertainty-Aware Active Search with a Team of Aerial Robots

Authors: Wennie Tabib, John Stecklein, Caleb McDowell, Kshitij Goel, Felix Jonathan, Abhishek Rathod, Meghan Kokoski, Edsel Burkholder, Brian Wallace, Luis Ernesto Navarro-Serment, Nikhil Angad Bakshi, Tejus Gupta, Norman Papernick, David Guttendorf, Erik E. Kahn, Jessica Kasemer, Jesse Holdaway, Jeff Schneider

Abstract: Rapid search and rescue is critical to maximizing survival rates following natural disasters. However, these efforts are challenged by the need to search large disaster zones, lack of reliability in the communications infrastructure, and a priori unknown numbers of objects of interest (OOIs), such as injured survivors. Aerial robots are increasingly being deployed for search and rescue due to their high mobility, but there remains a gap in deploying multi-robot autonomous aerial systems for methodical search of large environments. Prior works have relied on preprogrammed paths from human operators or are evaluated only in simulation. We bridge these gaps in the state of the art by developing and demonstrating a decentralized active search system, which biases its trajectories to take additional views of uncertain OOIs. The methodology leverages stochasticity for rapid coverage in communication-denied scenarios. When communications are available, robots share poses, goals, and OOI information to accelerate the rate of search. Detections from multiple images and vehicles are fused to provide a mean and covariance for each OOI location. Extensive simulations and hardware experiments in Bloomingdale, OH, are conducted to validate the approach. The results demonstrate the active search approach outperforms greedy coverage-based planning in communication-denied scenarios while maintaining comparable performance in communication-enabled scenarios. The results also demonstrate the ability to detect and localize all a priori unknown OOIs with a mean error of approximately 3 m at flight altitudes of 50-60 m.

Submitted 10 June, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

Comments: accepted at ISER 2025

arXiv:2409.17146 [pdf, other]

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

Authors: Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, YenSung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, et al. (25 additional authors not shown)

Abstract: Today's most advanced vision-language models (VLMs) remain proprietary. The strongest open-weight models rely heavily on synthetic data from proprietary VLMs to achieve good performance, effectively distilling these closed VLMs into open ones. As a result, the community has been missing foundational knowledge about how to build performant VLMs from scratch. We present Molmo, a new family of VLMs that are state-of-the-art in their class of openness. Our key contribution is a collection of new datasets called PixMo, including a dataset of highly detailed image captions for pre-training, a free-form image Q&A dataset for fine-tuning, and an innovative 2D pointing dataset, all collected without the use of external VLMs. The success of our approach relies on careful modeling choices, a well-tuned training pipeline, and, most critically, the quality of our newly collected datasets. Our best-in-class 72B model not only outperforms others in the class of open weight and data models, but also outperforms larger proprietary models including Claude 3.5 Sonnet, and Gemini 1.5 Pro and Flash, second only to GPT-4o based on both academic benchmarks and on a large human evaluation. Our model weights, new datasets, and source code are available at https://molmo.allenai.org/blog.

Submitted 5 December, 2024; v1 submitted 25 September, 2024; originally announced September 2024.

Comments: Updated with ablations and more technical details

arXiv:2409.08705 [pdf, ps, other]

doi 10.1103/PhysRevA.110.062426

Optimal discrimination of quantum sequences

Authors: Tathagata Gupta, Shayeef Murshid, Vincent Russo, Somshubhro Bandyopadhyay

Abstract: A key concept of quantum information theory is that accessing information encoded in a quantum system requires us to discriminate between several possible states the system could be in. A natural generalization of this problem, namely, quantum sequence discrimination, appears in various quantum information processing tasks, the objective being to determine the state of a finite sequence of quantum states. Since such a sequence is a composite quantum system, the fundamental question is whether an optimal measurement is local, i.e., comprising measurements on the individual members, or collective, i.e., requiring joint measurement(s). In some known instances of this problem, the optimal measurement is local, whereas in others, it is collective. But, so far, a definite prescription based solely on the problem description has been lacking. In this paper, we prove that if the members of a given sequence are drawn secretly and independently from an ensemble or even from different ensembles, the optimum success probability is achievable by fixed local measurements on the individual members of the sequence, and no collective measurement is necessary. This holds for both minimum-error and unambiguous state discrimination paradigms.

Submitted 4 January, 2025; v1 submitted 13 September, 2024; originally announced September 2024.

Comments: published version

Journal ref: Phys. Rev. A 110, 062426 (2024)

arXiv:2407.17766 [pdf, other]

Strategic Pseudo-Goal Perturbation for Deadlock-Free Multi-Agent Navigation in Social Mini-Games

Authors: Abhishek Jha, Tanishq Gupta, Sumit Singh Rawat, Girish Kumar

Abstract: This work introduces a Strategic Pseudo-Goal Perturbation (SPGP) technique, a novel approach to resolve deadlock situations in multi-agent navigation scenarios. Leveraging the robust framework of Safety Barrier Certificates, our method integrates a strategic perturbation mechanism that guides agents through social mini-games where deadlock and collision occur frequently. The method adopts a strategic calculation process where agents, upon encountering a deadlock, select a pseudo-goal within a predefined radius around the current position to resolve the deadlock among agents. The calculation is based on a controlled strategic algorithm, ensuring that deviation towards the pseudo-goal is both purposeful and effective in resolving the deadlock. Once the agent reaches the pseudo-goal, it resumes the path towards the original goal, thereby enhancing navigational efficiency and safety. Experimental results demonstrate SPGP's efficacy in reducing deadlock instances and improving overall system throughput in a variety of multi-agent navigation scenarios.

Submitted 25 July, 2024; originally announced July 2024.

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalog (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum-likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than $10^{-3}$ Hz, we compute the GW-BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 27 March, 2025; v1 submitted 13 July, 2024; originally announced July 2024.

Comments: Update to version accepted for publication in ApJ. 50 pages, 10 figures, 4 tables

Journal ref: ApJ, Volume 980, 2025, 207

arXiv:2407.08726 [pdf, other]

Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data

Authors: Cherie Ho, Jiaye Zou, Omar Alama, Sai Mitheran Jagadesh Kumar, Benjamin Chiang, Taneesh Gupta, Chen Wang, Nikhil Keetha, Katia Sycara, Sebastian Scherer

Abstract: Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more scalable approach towards generalizable map prediction can be enabled by using two large-scale crowd-sourced mapping platforms, Mapillary for FPV images and OpenStreetMap for BEV semantic maps. We introduce Map It Anywhere (MIA), a data engine that enables seamless curation and modeling of labeled map prediction data from existing open-source map platforms. Using our MIA data engine, we display the ease of automatically collecting a dataset of 1.2 million pairs of FPV images & BEV maps encompassing diverse geographies, landscapes, environmental factors, camera models & capture scenarios. We further train a simple camera model-agnostic model on this data for BEV map prediction. Extensive evaluations using established benchmarks and our dataset show that the data curated by MIA enables effective pretraining for generalizable BEV map prediction, with zero-shot performance far exceeding baselines trained on existing datasets by 35%. Our analysis highlights the promise of using large-scale public maps for developing & testing generalizable BEV perception, paving the way for more robust autonomous navigation. Website: https://mapitanywhere.github.io/

Submitted 5 December, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

Comments: Accepted at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks. Website: https://mapitanywhere.github.io/

arXiv:2406.12276 [pdf, other]

CodeNav: Beyond tool-use to using real-world codebases with LLM agents

Authors: Tanmay Gupta, Luca Weihs, Aniruddha Kembhavi

Abstract: We present CodeNav, an LLM agent that navigates and leverages previously unseen code repositories to solve user queries. In contrast to tool-use LLM agents that require "registration" of all relevant tools via manual descriptions within the LLM context, CodeNav automatically indexes and searches over code blocks in the target codebase, finds relevant code snippets, imports them, and uses them to iteratively generate a solution with execution feedback. To highlight the core capabilities of CodeNav, we first showcase three case studies where we use CodeNav for solving complex user queries using three diverse codebases. Next, on three benchmarks, we quantitatively compare the effectiveness of code-use (which only has access to the target codebase) to tool-use (which has privileged access to all tool names and descriptions). Finally, we study the effect of varying kinds of tool and library descriptions on code-use performance, as well as investigate the advantage of the agent seeing source code as opposed to natural descriptions of code. All code will be made open source under a permissive license.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.11775 [pdf, other]

Task Me Anything

Authors: Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna

Abstract: Benchmarks for large multimodal language models (MLMs) now serve to simultaneously assess the general capabilities of models instead of evaluating for a specific capability. As a result, when a developer wants to identify which models to use for their application, they are overwhelmed by the number of benchmarks and remain uncertain about which benchmark's results are most reflective of their specific use case. This paper introduces Task-Me-Anything, a benchmark generation engine which produces a benchmark tailored to a user's needs. Task-Me-Anything maintains an extendable taxonomy of visual assets and can programmatically generate a vast number of task instances. Additionally, it algorithmically addresses user queries regarding MLM performance efficiently within a computational budget. It contains 113K images, 10K videos, 2K 3D object assets, over 365 object categories, 655 attributes, and 335 relationships. It can generate 750M image/video question-answering pairs, which focus on evaluating MLM perceptual capabilities. Task-Me-Anything reveals critical insights: open-source MLMs excel in object and attribute recognition but lack spatial and temporal understanding; each model exhibits unique strengths and weaknesses; larger models generally perform better, though exceptions exist; and GPT4o demonstrates challenges in recognizing rotating/moving objects and distinguishing colors.

Submitted 27 January, 2025; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: NeurIPS 2024 Track on Datasets and Benchmarks. Website: https://www.task-me-anything.org

arXiv:2404.11719 [pdf, other]

Detecting gravitational wave signals using a flexible model for the amplitude and frequency evolution

Authors: Toral Gupta, Neil Cornish

Abstract: We currently lack good waveform models for many gravitational wave sources. Examples where models are lacking include neutron star post merger signals, core collapse supernovae, and signals of unknown origin. Wavelet based techniques have proven effective at detecting and characterizing these signals. Here we introduce a new method that uses collections of evolving amplitude-frequency tracks, or "voices", to model generic gravitational wave signals. The analysis is implemented using trans-dimensional Bayesian inference, building on the earlier wavelet-based BayesWave algorithm. The new algorithm, BayesWaveVoices, outperforms the original for long duration signals.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 13 pages, 12 figures

arXiv:2404.05366 [pdf, other]

CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery

Authors: Sai Bhargav Rongali, Sarthak Mehrotra, Ankit Jha, Mohamad Hassan N C, Shirsha Bose, Tanisha Gupta, Mainak Singha, Biplab Banerjee

Abstract: In Generalized Category Discovery (GCD), we cluster unlabeled samples of known and novel classes, leveraging a training dataset of known classes. A salient challenge arises due to domain shifts between these datasets. To address this, we present a novel setting: Across Domain Generalized Category Discovery (AD-GCD) and bring forth CDAD-NET (Class Discoverer Across Domains) as a remedy. CDAD-NET is architected to synchronize potential known class samples across both the labeled (source) and unlabeled (target) datasets, while emphasizing the distinct categorization of the target data. To facilitate this, we propose an entropy-driven adversarial learning strategy that accounts for the distance distributions of target samples relative to source-domain class prototypes. Parallelly, the discriminative nature of the shared space is upheld through a fusion of three metric learning objectives. In the source domain, our focus is on refining the proximity between samples and their affiliated class prototypes, while in the target domain, we integrate a neighborhood-centric contrastive learning mechanism, enriched with an adept neighbors-mining approach. To further accentuate the nuanced feature interrelation among semantically aligned images, we champion the concept of conditional image inpainting, underscoring the premise that semantically analogous images prove more efficacious to the task than their disjointed counterparts. Experimentally, CDAD-NET eclipses existing literature with a performance increment of 8-15% on three AD-GCD benchmarks we present.

Submitted 8 April, 2024; originally announced April 2024.

Comments: Accepted in L3D-IVU, CVPR Workshop, 2024

arXiv:2404.04248 [pdf, other]

doi 10.3847/2041-8213/ad5beb

Observation of Gravitational Waves from the Coalescence of a $2.5\text{-}4.5~M_\odot$ Compact Object and a Neutron Star

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, S. Akçay, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, et al. (1771 additional authors not shown)

Abstract: We report the observation of a coalescing compact binary with component masses $2.5\text{-}4.5~M_\odot$ and $1.2\text{-}2.0~M_\odot$ (all measurements quoted at the 90% credible level). The gravitational-wave signal GW230529_181500 was observed during the fourth observing run of the LIGO-Virgo-KAGRA detector network on 2023 May 29 by the LIGO Livingston Observatory. The primary component of the source has a mass less than $5~M_\odot$ at 99% credibility. We cannot definitively determine from gravitational-wave data alone whether either component of the source is a neutron star or a black hole. However, given existing estimates of the maximum neutron star mass, we find the most probable interpretation of the source to be the coalescence of a neutron star with a black hole that has a mass between the most massive neutron stars and the least massive black holes observed in the Galaxy. We provisionally estimate a merger rate density of $55^{+127}_{-47}~\text{Gpc}^{-3}\,\text{yr}^{-1}$ for compact binary coalescences with properties similar to the source of GW230529_181500; assuming that the source is a neutron star-black hole merger, GW230529_181500-like sources constitute about 60% of the total merger rate inferred for neutron star-black hole coalescences. The discovery of this system implies an increase in the expected rate of neutron star-black hole mergers with electromagnetic counterparts and provides further evidence for compact objects existing within the purported lower mass gap.

Submitted 26 July, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 45 pages (10 pages author list, 13 pages main text, 1 page acknowledgements, 13 pages appendices, 8 pages bibliography), 17 figures, 16 tables. Update to match version published in The Astrophysical Journal Letters. Data products available from https://zenodo.org/records/10845779

Report number: LIGO-P2300352

Journal ref: ApJL 970, L34 (2024)
