-
FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks
Authors:
Sabrina McCallum,
Amit Parekh,
Alessandro Suglia
Abstract:
Current approaches to embodied AI tend to learn policies from expert demonstrations. However, without a mechanism to evaluate the quality of demonstrated actions, they are limited to learning from optimal behaviour, or they risk replicating errors and inefficiencies. While reinforcement learning offers one alternative, the associated exploration typically results in sacrificing data efficiency. Th…
▽ More
Current approaches to embodied AI tend to learn policies from expert demonstrations. However, without a mechanism to evaluate the quality of demonstrated actions, they are limited to learning from optimal behaviour, or they risk replicating errors and inefficiencies. While reinforcement learning offers one alternative, the associated exploration typically results in sacrificing data efficiency. This work explores how agents trained with imitation learning can learn robust representations from both optimal and suboptimal demonstrations when given access to constructive language feedback as a means to contextualise different modes of behaviour. We directly provide language feedback embeddings as part of the input sequence into a Transformer-based policy, and optionally complement the traditional next action prediction objective with auxiliary self-supervised learning objectives for feedback prediction. We test our approach on a range of embodied Vision-and-Language tasks in our custom BabyAI-XGen environment and show significant improvements in agents' compositional generalisation abilities and robustness, suggesting that our data-efficient method allows models to successfully convert suboptimal behaviour into learning opportunities. Overall, our results suggest that language feedback is a competitive and intuitive alternative to intermediate scalar rewards for language-specified embodied tasks.
△ Less
Submitted 13 October, 2025;
originally announced October 2025.
-
Linear Regression in p-adic metric spaces
Authors:
Gregory D. Baker,
Scott McCallum,
Dirk Pattinson
Abstract:
Many real-world machine learning problems involve inherently hierarchical data, yet traditional approaches rely on Euclidean metrics that fail to capture the discrete, branching nature of hierarchical relationships. We present a theoretical foundation for machine learning in p-adic metric spaces, which naturally respect hierarchical structure. Our main result proves that an n-dimensional plane min…
▽ More
Many real-world machine learning problems involve inherently hierarchical data, yet traditional approaches rely on Euclidean metrics that fail to capture the discrete, branching nature of hierarchical relationships. We present a theoretical foundation for machine learning in p-adic metric spaces, which naturally respect hierarchical structure. Our main result proves that an n-dimensional plane minimizing the p-adic sum of distances to points in a dataset must pass through at least n + 1 of those points -- a striking contrast to Euclidean regression that highlights how p-adic metrics better align with the discrete nature of hierarchical data. As a corollary, a polynomial of degree n constructed to minimise the p-adic sum of residuals will pass through at least n + 1 points. As a further corollary, a polynomial of degree n approximating a higher degree polynomial at a finite number of points will yield a difference polynomial that has distinct rational roots. We demonstrate the practical significance of this result through two applications in natural language processing: analyzing hierarchical taxonomies and modeling grammatical morphology. These results suggest that p-adic metrics may be fundamental to properly handling hierarchical data structures in machine learning. In hierarchical data, interpolation between points often makes less sense than selecting actual observed points as representatives.
△ Less
Submitted 27 September, 2025;
originally announced October 2025.
-
Reversible Deep Equilibrium Models
Authors:
Sam McCallum,
Kamran Arora,
James Foster
Abstract:
Deep Equilibrium Models (DEQs) are an interesting class of implicit model where the model output is implicitly defined as the fixed point of a learned function. These models have been shown to outperform explicit (fixed-depth) models in large-scale tasks by trading many deep layers for a single layer that is iterated many times. However, gradient calculation through DEQs is approximate. This often…
▽ More
Deep Equilibrium Models (DEQs) are an interesting class of implicit model where the model output is implicitly defined as the fixed point of a learned function. These models have been shown to outperform explicit (fixed-depth) models in large-scale tasks by trading many deep layers for a single layer that is iterated many times. However, gradient calculation through DEQs is approximate. This often leads to unstable training dynamics and requires regularisation or many function evaluations to fix. Here, we introduce Reversible Deep Equilibrium Models (RevDEQs) that allow for exact gradient calculation, no regularisation and far fewer function evaluations than DEQs. We show that RevDEQs significantly improve performance on language modelling and image classification tasks against comparable implicit and explicit models.
△ Less
Submitted 3 December, 2025; v1 submitted 16 September, 2025;
originally announced September 2025.
-
Efficient, Accurate and Stable Gradients for Neural ODEs
Authors:
Sam McCallum,
James Foster
Abstract:
Training Neural ODEs requires backpropagating through an ODE solve. The state-of-the-art backpropagation method is recursive checkpointing that balances recomputation with memory cost. Here, we introduce a class of algebraically reversible ODE solvers that significantly improve upon both the time and memory cost of recursive checkpointing. The reversible solvers presented calculate exact gradients…
▽ More
Training Neural ODEs requires backpropagating through an ODE solve. The state-of-the-art backpropagation method is recursive checkpointing that balances recomputation with memory cost. Here, we introduce a class of algebraically reversible ODE solvers that significantly improve upon both the time and memory cost of recursive checkpointing. The reversible solvers presented calculate exact gradients, are high-order and numerically stable -- strictly improving on previous reversible architectures.
△ Less
Submitted 29 January, 2025; v1 submitted 15 October, 2024;
originally announced October 2024.
-
Iterated Resultants and Rational Functions in Real Quantifier Elimination
Authors:
James H. Davenport,
Matthew England,
Scott McCallum,
Ali K. Uncu
Abstract:
This paper builds and extends on the authors' previous work related to the algorithmic tool, Cylindrical Algebraic Decomposition (CAD), and one of its core applications, Real Quantifier Elimination (QE). These topics are at the heart of symbolic computation and were first implemented in computer algebra systems decades ago, but have recently received renewed interest as part of the ongoing develop…
▽ More
This paper builds and extends on the authors' previous work related to the algorithmic tool, Cylindrical Algebraic Decomposition (CAD), and one of its core applications, Real Quantifier Elimination (QE). These topics are at the heart of symbolic computation and were first implemented in computer algebra systems decades ago, but have recently received renewed interest as part of the ongoing development of SMT solvers for non-linear real arithmetic.
First, we consider the use of iterated univariate resultants in traditional CAD, and how this leads to inefficiencies, especially in the case of an input with multiple equational constraints. We reproduce the workshop paper [Davenport and England, 2023], adding important clarifications to our suggestions first made there to make use of multivariate resultants in the projection phase of CAD. We then consider an alternative approach to this problem first documented in [McCallum and Brown, 2009] which redefines the actual object under construction, albeit only in the case of two equational constraints. We correct an unhelpful typo and provide a proof missing from that paper.
We finish by revising the topic of how to deal with SMT or Real QE problems expressed using rational functions (as opposed to the usual polynomial ones) noting that these are often found in industrial applications. We revisit a proposal made in [Uncu, Davenport and England, 2023] for doing this in the case of satisfiability, explaining why such an approach does not trivially extend to more complicated quantification structure and giving a suitable alternative.
△ Less
Submitted 26 December, 2024; v1 submitted 23 December, 2023;
originally announced December 2023.
-
Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning
Authors:
Sabrina McCallum,
Max Taylor-Davies,
Stefano V. Albrecht,
Alessandro Suglia
Abstract:
Despite numerous successes, the field of reinforcement learning (RL) remains far from matching the impressive generalisation power of human behaviour learning. One possible way to help bridge this gap be to provide RL agents with richer, more human-like feedback expressed in natural language. To investigate this idea, we first extend BabyAI to automatically generate language feedback from the envi…
▽ More
Despite numerous successes, the field of reinforcement learning (RL) remains far from matching the impressive generalisation power of human behaviour learning. One possible way to help bridge this gap be to provide RL agents with richer, more human-like feedback expressed in natural language. To investigate this idea, we first extend BabyAI to automatically generate language feedback from the environment dynamics and goal condition success. Then, we modify the Decision Transformer architecture to take advantage of this additional signal. We find that training with language feedback either in place of or in addition to the return-to-go or goal descriptions improves agents' generalisation performance, and that agents can benefit from feedback even when this is only available during training, but not at inference.
△ Less
Submitted 7 December, 2023;
originally announced December 2023.
-
Truth Table Invariant Cylindrical Algebraic Decomposition
Authors:
Russell Bradford,
James H. Davenport,
Matthew England,
Scott McCallum,
David Wilson
Abstract:
When using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is likely not the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This observation motivates our article and definition of a Truth Table Invariant CAD (TTICAD).
In ISSAC 2013 the current author…
▽ More
When using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is likely not the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This observation motivates our article and definition of a Truth Table Invariant CAD (TTICAD).
In ISSAC 2013 the current authors presented an algorithm that can efficiently and directly construct a TTICAD for a list of formulae in which each has an equational constraint. This was achieved by generalising McCallum's theory of reduced projection operators. In this paper we present an extended version of our theory which can be applied to an arbitrary list of formulae, achieving savings if at least one has an equational constraint. We also explain how the theory of reduced projection operators can allow for further improvements to the lifting phase of CAD algorithms, even in the context of a single equational constraint.
The algorithm is implemented fully in Maple and we present both promising results from experimentation and a complexity analysis showing the benefits of our contributions.
△ Less
Submitted 13 November, 2015; v1 submitted 3 January, 2014;
originally announced January 2014.
-
Cylindrical Algebraic Decompositions for Boolean Combinations
Authors:
Russell Bradford,
James H. Davenport,
Matthew England,
Scott McCallum,
David Wilson
Abstract:
This article makes the key observation that when using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is not always the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This motivates our definition of a Truth Table Invariant CAD (TTICAD). We generalise…
▽ More
This article makes the key observation that when using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is not always the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This motivates our definition of a Truth Table Invariant CAD (TTICAD). We generalise the theory of equational constraints to design an algorithm which will efficiently construct a TTICAD for a wide class of problems, producing stronger results than when using equational constraints alone. The algorithm is implemented fully in Maple and we present promising results from experimentation.
△ Less
Submitted 29 April, 2013;
originally announced April 2013.