-
What Do Temporal Graph Learning Models Learn?
Authors:
Abigail J. Hayes,
Tobias Schumacher,
Markus Strohmaier
Abstract:
Learning on temporal graphs has become a central topic in graph representation learning, with numerous benchmarks indicating the strong performance of state-of-the-art models. However, recent work has raised concerns about the reliability of benchmark results, noting issues with commonly used evaluation protocols and the surprising competitiveness of simple heuristics. This contrast raises the question of which characteristics of the underlying graphs temporal graph learning models actually use to form their predictions. We address this by systematically evaluating eight models on their ability to capture eight fundamental characteristics related to the link structure of temporal graphs. These include structural characteristics such as density, temporal patterns such as recency, and edge formation mechanisms such as homophily. Using both synthetic and real-world datasets, we analyze how well models learn these characteristics. Our findings reveal a mixed picture: models capture some characteristics well but fail to reproduce others. With this, we expose important limitations. Overall, we believe that our results provide practical insights for the application of temporal graph learning models and motivate more interpretability-driven evaluations in graph learning research.
Submitted 1 April, 2026; v1 submitted 10 October, 2025;
originally announced October 2025.
-
GPT-4o System Card
Authors:
OpenAI,
Aaron Hurst,
Adam Lerer,
Adam P. Goucher,
Adam Perelman,
Aditya Ramesh,
Aidan Clark,
AJ Ostrow,
Akila Welihinda,
Alan Hayes,
Alec Radford,
Aleksander Mądry,
Alex Baker-Whitcomb,
Alex Beutel,
Alex Borzunov,
Alex Carney,
Alex Chow,
Alex Kirillov,
Alex Nichol,
Alex Paino,
Alex Renzin,
Alex Tachard Passos,
Alexander Kirillov,
Alexi Christakis
, et al. (395 additional authors not shown)
Abstract:
GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models. In line with our commitment to building AI safely and consistent with our voluntary commitments to the White House, we are sharing the GPT-4o System Card, which includes our Preparedness Framework evaluations. In this System Card, we provide a detailed look at GPT-4o's capabilities, limitations, and safety evaluations across multiple categories, focusing on speech-to-speech while also evaluating text and image capabilities, and measures we've implemented to ensure the model is safe and aligned. We also include third-party assessments on dangerous capabilities, as well as discussion of potential societal impacts of GPT-4o's text and vision capabilities.
Submitted 25 October, 2024;
originally announced October 2024.
-
Atmospheric Density-Compensating Model Predictive Control for Targeted Reentry of Drag-Modulated Spacecraft
Authors:
Alex D. Hayes,
Ryan J. Caverly
Abstract:
This paper presents an estimation and control framework that enables the targeted reentry of a drag-modulated spacecraft in the presence of atmospheric density uncertainty. In particular, an extended Kalman filter (EKF) is used to estimate the in-flight density errors relative to the atmospheric density used to generate the nominal guidance trajectory. This information is leveraged within a model predictive control (MPC) strategy to improve tracking performance, reduce control effort, and increase robustness to actuator saturation compared to the state-of-the-art approach. The estimation and control framework is tested in a Monte Carlo simulation campaign with historical space weather data. These simulation efforts demonstrate that the proposed framework is able to stay within 100 km of the guidance trajectory at all points in time for 98.4% of cases. The remaining 1.6% of cases were pushed away from the guidance by large density errors, many due to significant solar storms and flares, that could not physically be compensated for by the drag control device. For the successful cases, the proposed framework was able to guide the spacecraft to the desired location at the entry interface altitude with a mean error of 12.1 km and 99.7% of cases below 100 km.
Submitted 9 June, 2025; v1 submitted 26 July, 2024;
originally announced July 2024.
-
Frontier AI Regulation: Managing Emerging Risks to Public Safety
Authors:
Markus Anderljung,
Joslyn Barnhart,
Anton Korinek,
Jade Leung,
Cullen O'Keefe,
Jess Whittlestone,
Shahar Avin,
Miles Brundage,
Justin Bullock,
Duncan Cass-Beggs,
Ben Chang,
Tantum Collins,
Tim Fist,
Gillian Hadfield,
Alan Hayes,
Lewis Ho,
Sara Hooker,
Eric Horvitz,
Noam Kolt,
Jonas Schuett,
Yonadav Shavit,
Divya Siddarth,
Robert Trager,
Kevin Wolf
Abstract:
Advanced AI models hold the promise of tremendous benefits for humanity, but society needs to proactively manage the accompanying risks. In this paper, we focus on what we term "frontier AI" models: highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety. Frontier AI models pose a distinct regulatory challenge: dangerous capabilities can arise unexpectedly; it is difficult to robustly prevent a deployed model from being misused; and, it is difficult to stop a model's capabilities from proliferating broadly. To address these challenges, at least three building blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models. Industry self-regulation is an important first step. However, wider societal discussions and government intervention will be needed to create standards and to ensure compliance with them. We consider several options to this end, including granting enforcement powers to supervisory authorities and licensure regimes for frontier AI models. Finally, we propose an initial set of safety standards. These include conducting pre-deployment risk assessments; external scrutiny of model behavior; using risk assessments to inform deployment decisions; and monitoring and responding to new information about model capabilities and uses post-deployment. We hope this discussion contributes to the broader conversation on how to balance public safety risks and innovation benefits from advances at the frontier of AI development.
Submitted 7 November, 2023; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Science Platforms for Heliophysics Data Analysis
Authors:
Monica G. Bobra,
Will T. Barnes,
Thomas Y. Chen,
Mark C. M. Cheung,
Laura A. Hayes,
Jack Ireland,
Miho Janvier,
Michael S. F. Kirk,
James P. Mason,
Stuart J. Mumford,
Paul J. Wright
Abstract:
We recommend that NASA maintain and fund science platforms that enable interactive and scalable data analysis in order to maximize the scientific return of data collected from space-based instruments.
Submitted 2 January, 2023;
originally announced January 2023.
-
Adaptive Passivity-Based Pose Tracking Control of Cable-Driven Parallel Robots for Multiple Attitude Parameterizations
Authors:
Sze Kwan Cheah,
Alex Hayes,
Ryan J. Caverly
Abstract:
The proposed control method uses an adaptive feedforward-based controller to establish a passive input-output mapping for the cable-driven parallel robot (CDPR) that is used alongside a linear time-invariant strictly positive real feedback controller to guarantee robust closed-loop input-output stability and asymptotic pose trajectory tracking via the passivity theorem. A novelty of the proposed controller is its formulation for use with a range of payload attitude parameterizations, including any unconstrained attitude parameterization, the quaternion, or the direction cosine matrix (DCM). The performance and robustness of the proposed controller are demonstrated through numerical simulations of a CDPR with rigid and flexible cables. The results demonstrate the importance of carefully defining the CDPR's pose error, which is performed in multiplicative fashion when using the quaternion and DCM, and in a specific additive fashion when using unconstrained attitude parameters (e.g., an Euler-angle sequence).
Submitted 7 September, 2022;
originally announced September 2022.
-
A Synergistic Compilation Workflow for Tackling Crosstalk in Quantum Machines
Authors:
Fei Hua,
Yuwei Jin,
Ang Li,
Chenxu Liu,
Meng Wang,
Yanhao Chen,
Chi Zhang,
Ari Hayes,
Samuel Stein,
Minghao Guo,
Yipeng Huang,
Eddy Z. Zhang
Abstract:
Near-term quantum systems tend to be noisy. Crosstalk noise has been recognized as one of several major types of noises in superconducting Noisy Intermediate-Scale Quantum (NISQ) devices. Crosstalk arises from the concurrent execution of two-qubit gates on nearby qubits, such as CX. It might significantly raise the error rate of gates in comparison to running them individually. Crosstalk can be mitigated through scheduling or hardware machine tuning. Prior scientific studies, however, manage crosstalk at a really late phase in the compilation process, usually after hardware mapping is done. It may miss great opportunities of optimizing algorithm logic, routing, and crosstalk at the same time. In this paper, we push the envelope by considering all these factors simultaneously at the very early compilation stage. We propose a crosstalk-aware quantum program compilation framework called CQC that can enhance crosstalk mitigation while achieving satisfactory circuit depth. Moreover, we identify opportunities for translation from intermediate representation to the circuit for application-specific crosstalk mitigation, for instance, the CX ladder construction in variational quantum eigensolvers (VQE). Evaluations through simulation and on real IBM-Q devices show that our framework can significantly reduce the error rate by up to 6×, with only ~60% circuit depth compared to state-of-the-art gate scheduling approaches. In particular, for VQE, we demonstrate 49% circuit depth reduction with 9.6% fidelity improvement over prior art on the H4 molecule using IBMQ Guadalupe. Our CQC framework will be released on GitHub.
Submitted 8 December, 2023; v1 submitted 12 July, 2022;
originally announced July 2022.
-
srlearn: A Python Library for Gradient-Boosted Statistical Relational Models
Authors:
Alexander L. Hayes
Abstract:
We present srlearn, a Python library for boosted statistical relational models. We adapt the scikit-learn interface to this setting and provide examples for how this can be used to express learning and inference problems.
Submitted 17 December, 2019;
originally announced December 2019.
-
User Friendly Automatic Construction of Background Knowledge: Mode Construction from ER Diagrams
Authors:
Alexander L. Hayes,
Mayukh Das,
Phillip Odom,
Sriraam Natarajan
Abstract:
One of the key advantages of Inductive Logic Programming systems is the ability of the domain experts to provide background knowledge as modes that allow for efficient search through the space of hypotheses. However, there is an inherent assumption that this expert should also be an ILP expert to provide effective modes. We relax this assumption by designing a graphical user interface that allows the domain expert to interact with the system using Entity Relationship diagrams. These interactions are used to construct modes for the learning system. We evaluate our algorithm on a probabilistic logic learning system where we demonstrate that the user is able to construct effective background knowledge on par with the expert-encoded knowledge on five data sets.
Submitted 16 December, 2019;
originally announced December 2019.
-
PennyLane: Automatic differentiation of hybrid quantum-classical computations
Authors:
Ville Bergholm,
Josh Izaac,
Maria Schuld,
Christian Gogolin,
Shahnawaz Ahmed,
Vishnu Ajith,
M. Sohaib Alam,
Guillermo Alonso-Linaje,
B. AkashNarayanan,
Ali Asadi,
Juan Miguel Arrazola,
Utkarsh Azad,
Sam Banning,
Carsten Blank,
Thomas R Bromley,
Benjamin A. Cordier,
Jack Ceroni,
Alain Delgado,
Olivia Di Matteo,
Amintor Dusko,
Tanya Garg,
Diego Guala,
Anthony Hayes,
Ryan Hill,
Aroosa Ijaz
, et al. (43 additional authors not shown)
Abstract:
PennyLane is a Python 3 software framework for differentiable programming of quantum computers. The library provides a unified architecture for near-term quantum computing devices, supporting both qubit and continuous-variable paradigms. PennyLane's core feature is the ability to compute gradients of variational quantum circuits in a way that is compatible with classical techniques such as backpropagation. PennyLane thus extends the automatic differentiation algorithms common in optimization and machine learning to include quantum and hybrid computations. A plugin system makes the framework compatible with any gate-based quantum simulator or hardware. We provide plugins for hardware providers including the Xanadu Cloud, Amazon Braket, and IBM Quantum, allowing PennyLane optimizations to be run on publicly accessible quantum devices. On the classical front, PennyLane interfaces with accelerated machine learning libraries such as TensorFlow, PyTorch, JAX, and Autograd. PennyLane can be used for the optimization of variational quantum eigensolvers, quantum approximate optimization, quantum machine learning models, and many other applications.
Submitted 29 July, 2022; v1 submitted 12 November, 2018;
originally announced November 2018.
-
A Graph-based Model for GPU Caching Problems
Authors:
Lingda Li,
Ari B. Hayes,
Stephen A. Hackler,
Eddy Z. Zhang,
Mario Szegedy,
Shuaiwen Leon Song
Abstract:
Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling among different threads. Traditionally, in the field of parallel computing, graph partition models are used to model data communication and guide task scheduling. However, we discover that the previous methods are either inaccurate or expensive when applied to GPU programs. In this paper, we propose a novel task partition model that is accurate and gives rise to the development of fast and high quality task/data reorganization algorithms. We demonstrate the effectiveness of the proposed model by rigorous theoretical analysis of the algorithm bounds and extensive experimental analysis. The experimental results show that it achieves significant performance improvement across a representative set of GPU applications.
Submitted 6 May, 2016;
originally announced May 2016.