arXiv:2510.11307 [pdf, ps, other]

FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks

Authors: Sabrina McCallum, Amit Parekh, Alessandro Suglia

Abstract: Current approaches to embodied AI tend to learn policies from expert demonstrations. However, without a mechanism to evaluate the quality of demonstrated actions, they are limited to learning from optimal behaviour, or they risk replicating errors and inefficiencies. While reinforcement learning offers one alternative, the associated exploration typically results in sacrificing data efficiency. Th… ▽ More Current approaches to embodied AI tend to learn policies from expert demonstrations. However, without a mechanism to evaluate the quality of demonstrated actions, they are limited to learning from optimal behaviour, or they risk replicating errors and inefficiencies. While reinforcement learning offers one alternative, the associated exploration typically results in sacrificing data efficiency. This work explores how agents trained with imitation learning can learn robust representations from both optimal and suboptimal demonstrations when given access to constructive language feedback as a means to contextualise different modes of behaviour. We directly provide language feedback embeddings as part of the input sequence into a Transformer-based policy, and optionally complement the traditional next action prediction objective with auxiliary self-supervised learning objectives for feedback prediction. We test our approach on a range of embodied Vision-and-Language tasks in our custom BabyAI-XGen environment and show significant improvements in agents' compositional generalisation abilities and robustness, suggesting that our data-efficient method allows models to successfully convert suboptimal behaviour into learning opportunities. Overall, our results suggest that language feedback is a competitive and intuitive alternative to intermediate scalar rewards for language-specified embodied tasks. △ Less

Submitted 13 October, 2025; originally announced October 2025.

Comments: EMNLP 2025 Findings

arXiv:2510.00043 [pdf, ps, other]

Linear Regression in p-adic metric spaces

Authors: Gregory D. Baker, Scott McCallum, Dirk Pattinson

Abstract: Many real-world machine learning problems involve inherently hierarchical data, yet traditional approaches rely on Euclidean metrics that fail to capture the discrete, branching nature of hierarchical relationships. We present a theoretical foundation for machine learning in p-adic metric spaces, which naturally respect hierarchical structure. Our main result proves that an n-dimensional plane min… ▽ More Many real-world machine learning problems involve inherently hierarchical data, yet traditional approaches rely on Euclidean metrics that fail to capture the discrete, branching nature of hierarchical relationships. We present a theoretical foundation for machine learning in p-adic metric spaces, which naturally respect hierarchical structure. Our main result proves that an n-dimensional plane minimizing the p-adic sum of distances to points in a dataset must pass through at least n + 1 of those points -- a striking contrast to Euclidean regression that highlights how p-adic metrics better align with the discrete nature of hierarchical data. As a corollary, a polynomial of degree n constructed to minimise the p-adic sum of residuals will pass through at least n + 1 points. As a further corollary, a polynomial of degree n approximating a higher degree polynomial at a finite number of points will yield a difference polynomial that has distinct rational roots. We demonstrate the practical significance of this result through two applications in natural language processing: analyzing hierarchical taxonomies and modeling grammatical morphology. These results suggest that p-adic metrics may be fundamental to properly handling hierarchical data structures in machine learning. In hierarchical data, interpolation between points often makes less sense than selecting actual observed points as representatives. △ Less

Submitted 27 September, 2025; originally announced October 2025.

MSC Class: 11D88; 62J99; 68T50 ACM Class: G.3; I.2.6; I.2.7; I.5.1; I.5.4

Journal ref: p-Adic Numbers, Ultrametric Analysis and Applications, volume 17(4), 2025

arXiv:2509.12917 [pdf, ps, other]

Reversible Deep Equilibrium Models

Authors: Sam McCallum, Kamran Arora, James Foster

Abstract: Deep Equilibrium Models (DEQs) are an interesting class of implicit model where the model output is implicitly defined as the fixed point of a learned function. These models have been shown to outperform explicit (fixed-depth) models in large-scale tasks by trading many deep layers for a single layer that is iterated many times. However, gradient calculation through DEQs is approximate. This often… ▽ More Deep Equilibrium Models (DEQs) are an interesting class of implicit model where the model output is implicitly defined as the fixed point of a learned function. These models have been shown to outperform explicit (fixed-depth) models in large-scale tasks by trading many deep layers for a single layer that is iterated many times. However, gradient calculation through DEQs is approximate. This often leads to unstable training dynamics and requires regularisation or many function evaluations to fix. Here, we introduce Reversible Deep Equilibrium Models (RevDEQs) that allow for exact gradient calculation, no regularisation and far fewer function evaluations than DEQs. We show that RevDEQs significantly improve performance on language modelling and image classification tasks against comparable implicit and explicit models. △ Less

Submitted 3 December, 2025; v1 submitted 16 September, 2025; originally announced September 2025.

arXiv:2410.11648 [pdf, other]

Efficient, Accurate and Stable Gradients for Neural ODEs

Authors: Sam McCallum, James Foster

Abstract: Training Neural ODEs requires backpropagating through an ODE solve. The state-of-the-art backpropagation method is recursive checkpointing that balances recomputation with memory cost. Here, we introduce a class of algebraically reversible ODE solvers that significantly improve upon both the time and memory cost of recursive checkpointing. The reversible solvers presented calculate exact gradients… ▽ More Training Neural ODEs requires backpropagating through an ODE solve. The state-of-the-art backpropagation method is recursive checkpointing that balances recomputation with memory cost. Here, we introduce a class of algebraically reversible ODE solvers that significantly improve upon both the time and memory cost of recursive checkpointing. The reversible solvers presented calculate exact gradients, are high-order and numerically stable -- strictly improving on previous reversible architectures. △ Less

Submitted 29 January, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

Comments: Preprint

arXiv:2312.16210 [pdf, ps, other]

doi 10.1007/s11786-025-00606-4

Iterated Resultants and Rational Functions in Real Quantifier Elimination

Authors: James H. Davenport, Matthew England, Scott McCallum, Ali K. Uncu

Abstract: This paper builds and extends on the authors' previous work related to the algorithmic tool, Cylindrical Algebraic Decomposition (CAD), and one of its core applications, Real Quantifier Elimination (QE). These topics are at the heart of symbolic computation and were first implemented in computer algebra systems decades ago, but have recently received renewed interest as part of the ongoing develop… ▽ More This paper builds and extends on the authors' previous work related to the algorithmic tool, Cylindrical Algebraic Decomposition (CAD), and one of its core applications, Real Quantifier Elimination (QE). These topics are at the heart of symbolic computation and were first implemented in computer algebra systems decades ago, but have recently received renewed interest as part of the ongoing development of SMT solvers for non-linear real arithmetic. First, we consider the use of iterated univariate resultants in traditional CAD, and how this leads to inefficiencies, especially in the case of an input with multiple equational constraints. We reproduce the workshop paper [Davenport and England, 2023], adding important clarifications to our suggestions first made there to make use of multivariate resultants in the projection phase of CAD. We then consider an alternative approach to this problem first documented in [McCallum and Brown, 2009] which redefines the actual object under construction, albeit only in the case of two equational constraints. We correct an unhelpful typo and provide a proof missing from that paper. We finish by revising the topic of how to deal with SMT or Real QE problems expressed using rational functions (as opposed to the usual polynomial ones) noting that these are often found in industrial applications. We revisit a proposal made in [Uncu, Davenport and England, 2023] for doing this in the case of satisfiability, explaining why such an approach does not trivially extend to more complicated quantification structure and giving a suitable alternative. △ Less

Submitted 26 December, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

Comments: Submitted to Mathematics in Computer Science

MSC Class: 14W30 (primary) 68W30 (secondary) ACM Class: I.1.2

Journal ref: Mathematics in Computer Science, volume 19, article number 12, Springer, 2025

arXiv:2312.04736 [pdf, other]

Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning

Authors: Sabrina McCallum, Max Taylor-Davies, Stefano V. Albrecht, Alessandro Suglia

Abstract: Despite numerous successes, the field of reinforcement learning (RL) remains far from matching the impressive generalisation power of human behaviour learning. One possible way to help bridge this gap be to provide RL agents with richer, more human-like feedback expressed in natural language. To investigate this idea, we first extend BabyAI to automatically generate language feedback from the envi… ▽ More Despite numerous successes, the field of reinforcement learning (RL) remains far from matching the impressive generalisation power of human behaviour learning. One possible way to help bridge this gap be to provide RL agents with richer, more human-like feedback expressed in natural language. To investigate this idea, we first extend BabyAI to automatically generate language feedback from the environment dynamics and goal condition success. Then, we modify the Decision Transformer architecture to take advantage of this additional signal. We find that training with language feedback either in place of or in addition to the return-to-go or goal descriptions improves agents' generalisation performance, and that agents can benefit from feedback even when this is only available during training, but not at inference. △ Less

Submitted 7 December, 2023; originally announced December 2023.

Comments: Accepted at Workshop on Goal-conditioned Reinforcement Learning, NeurIPS 2023

arXiv:1401.0645 [pdf, other]

doi 10.1016/j.jsc.2015.11.002

Truth Table Invariant Cylindrical Algebraic Decomposition

Authors: Russell Bradford, James H. Davenport, Matthew England, Scott McCallum, David Wilson

Abstract: When using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is likely not the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This observation motivates our article and definition of a Truth Table Invariant CAD (TTICAD). In ISSAC 2013 the current author… ▽ More When using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is likely not the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This observation motivates our article and definition of a Truth Table Invariant CAD (TTICAD). In ISSAC 2013 the current authors presented an algorithm that can efficiently and directly construct a TTICAD for a list of formulae in which each has an equational constraint. This was achieved by generalising McCallum's theory of reduced projection operators. In this paper we present an extended version of our theory which can be applied to an arbitrary list of formulae, achieving savings if at least one has an equational constraint. We also explain how the theory of reduced projection operators can allow for further improvements to the lifting phase of CAD algorithms, even in the context of a single equational constraint. The algorithm is implemented fully in Maple and we present both promising results from experimentation and a complexity analysis showing the benefits of our contributions. △ Less

Submitted 13 November, 2015; v1 submitted 3 January, 2014; originally announced January 2014.

Comments: 40 pages

MSC Class: 68W30; 03C10 ACM Class: I.1.2

Journal ref: Journal of Symbolic Computation 76, pp. 1-35, 2016

arXiv:1304.7603 [pdf, other]

doi 10.1145/2465506.2465516

Cylindrical Algebraic Decompositions for Boolean Combinations

Authors: Russell Bradford, James H. Davenport, Matthew England, Scott McCallum, David Wilson

Abstract: This article makes the key observation that when using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is not always the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This motivates our definition of a Truth Table Invariant CAD (TTICAD). We generalise… ▽ More This article makes the key observation that when using cylindrical algebraic decomposition (CAD) to solve a problem with respect to a set of polynomials, it is not always the signs of those polynomials that are of paramount importance but rather the truth values of certain quantifier free formulae involving them. This motivates our definition of a Truth Table Invariant CAD (TTICAD). We generalise the theory of equational constraints to design an algorithm which will efficiently construct a TTICAD for a wide class of problems, producing stronger results than when using equational constraints alone. The algorithm is implemented fully in Maple and we present promising results from experimentation. △ Less

Submitted 29 April, 2013; originally announced April 2013.

Comments: To appear in the proceedings of the 38th International Symposium on Symbolic and Algebraic Computation (ISSAC '13)

MSC Class: 68W30; 03C10 ACM Class: I.1.2

Journal ref: In: Proceedings of the 38th International Symposium on Symbolic and Algebraic Computation, (ISSAC '13), pp 125-132, 2013

Showing 1–8 of 8 results for author: McCallum, S