-
KernelCraft: Benchmarking for Agentic Close-to-Metal Kernel Generation on Emerging Hardware
Authors:
Jiayi Nie,
Haoran Wu,
Yao Lai,
Zeyu Cao,
Cheng Zhang,
Binglei Lou,
Erwei Wang,
Jianyi Cheng,
Timothy M. Jones,
Robert Mullins,
Rika Antonova,
Yiren Zhao
Abstract:
New AI accelerators with novel instruction set architectures (ISAs) often require developers to manually craft low-level kernels -- a time-consuming, laborious, and error-prone process that cannot scale across diverse hardware targets. This prevents emerging hardware platforms from reaching the market efficiently. While prior LLM-based code generation has shown promise in mature GPU ecosystems, it remains unclear whether agentic LLM systems can quickly produce valid and efficient kernels for emerging hardware with new ISAs. We present KernelCraft: the first benchmark to evaluate an LLM agent's ability to generate and optimize low-level kernels for customized accelerators via a function-calling, feedback-driven workflow. Within KernelCraft, the agent refines kernels under ISA and hardware constraints using automated feedback derived from compilation checks, simulation, and correctness validation against ground truth. In our experiments, we assess agent performance across three emerging accelerator platforms on more than 20 ML tasks, each with five diverse task configurations, paying particular attention to the effect of configuration complexity. Across four leading reasoning models, top agents produce functionally valid kernels for previously unseen ISAs within a few refinement steps, with optimized kernels that match or outperform template-based compiler baselines. These results demonstrate the potential to reduce the cost of kernel development for accelerator designers and kernel developers.
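The compile-simulate-validate loop described above can be illustrated with a toy refinement driver. Everything below is a hypothetical sketch -- the two-instruction "ISA", the stub agent, and all function names are stand-ins, not KernelCraft's actual interface:

```python
# Toy sketch of a feedback-driven kernel-refinement loop. The "ISA" accepts
# only ADD and MUL; the stub agent repairs its kernel from tool feedback.

def compile_kernel(kernel):
    """Compilation check: reject any op outside the toy ISA."""
    bad = [op for op in kernel if op not in ("ADD", "MUL")]
    return (not bad, f"unknown ops: {bad}" if bad else "ok")

def simulate(kernel, x):
    """Cycle-level stand-in: ADD -> x+1, MUL -> x*2, one cycle per op."""
    for op in kernel:
        x = x + 1 if op == "ADD" else x * 2
    return x, len(kernel)

def query_agent(feedback):
    """Stand-in for the LLM agent: first attempt uses an unsupported op."""
    if "unknown ops" in feedback:
        return ["ADD", "MUL"]            # repaired kernel
    return ["ADD", "SUB", "MUL"]         # initial (invalid) kernel

def refine_kernel(x, expected, max_steps=4):
    feedback = "write an initial kernel"
    for _ in range(max_steps):
        kernel = query_agent(feedback)
        ok, log = compile_kernel(kernel)
        if not ok:                         # compilation feedback
            feedback = log
            continue
        out, cycles = simulate(kernel, x)  # simulation + correctness validation
        if out == expected:
            return kernel, cycles
        feedback = "output mismatch versus ground truth"
    return None, None

kernel, cycles = refine_kernel(x=3, expected=8)   # (3 + 1) * 2 = 8
```

The real benchmark replaces each stub with an actual compiler, cycle-level simulator, and LLM call, but the control flow is the same: tool output becomes the next prompt.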
Submitted 10 February, 2026;
originally announced March 2026.
-
Adversarial Query Synthesis via Bayesian Optimization
Authors:
Jeffrey Tao,
Yimeng Zeng,
Haydn Thomas Jones,
Natalie Maus,
Osbert Bastani,
Jacob R. Gardner,
Ryan Marcus
Abstract:
Benchmark workloads are extremely important to the database management research community, especially as more machine learning components are integrated into database systems. Here, we propose a Bayesian optimization technique to automatically search for difficult benchmark queries, significantly reducing the amount of manual effort usually required. In preliminary experiments, we show that our approach can generate queries with more than double the optimization headroom compared to existing benchmarks.
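The search described above can be sketched as a standard Bayesian-optimization loop over query parameters. The one-dimensional synthetic headroom function below is purely illustrative (a real objective would execute the query and measure the gap between the optimizer's plan and the best known plan), the fixed RBF kernel is a deliberate simplification, and NumPy is assumed:

```python
import numpy as np

def headroom(x):
    """Synthetic stand-in for a query's optimization headroom; peaks near 0.7."""
    return float(np.exp(-8.0 * (x - 0.7) ** 2))

def gp_posterior(X, y, Xs, ls=0.15, noise=1e-4):
    """GP posterior mean/std with a fixed RBF kernel (no hyperparameter fitting)."""
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 3)                      # initial random query parameters
y = np.array([headroom(x) for x in X])
cand = np.linspace(0, 1, 201)                 # candidate query parameters

for _ in range(10):                           # BO iterations
    mu, sd = gp_posterior(X, y, cand)
    x_next = cand[np.argmax(mu + 2.0 * sd)]   # UCB acquisition
    X = np.append(X, x_next)
    y = np.append(y, headroom(x_next))

best = X[np.argmax(y)]                        # hardest "query" found
```

Each iteration spends one expensive evaluation where the surrogate expects either high headroom or high uncertainty, which is what replaces manual trial-and-error query crafting.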
Submitted 2 March, 2026;
originally announced March 2026.
-
Purely Agentic Black-Box Optimization for Biological Design
Authors:
Natalie Maus,
Yimeng Zeng,
Haydn Thomas Jones,
Yining Huang,
Gaurav Ng Goel,
Alden Rose,
Kyurae Kim,
Hyun-Su Lee,
Marcelo Der Torossian Torres,
Fangping Wan,
Cesar de la Fuente-Nunez,
Mark Yatskar,
Osbert Bastani,
Jacob R. Gardner
Abstract:
Many key challenges in biological design, such as small-molecule drug discovery, antimicrobial peptide development, and protein engineering, can be framed as black-box optimization over vast, complex structured spaces. Existing methods rely mainly on raw structural data and struggle to exploit the rich scientific literature. While large language models (LLMs) have been added to these pipelines, they have been confined to narrow roles within structure-centered optimizers. We instead cast biological black-box optimization as a fully agentic, language-based reasoning process. We introduce Purely Agentic BLack-box Optimization (PABLO), a hierarchical agentic system that uses scientific LLMs pretrained on chemistry and biology literature to generate and iteratively refine biological candidates. On both the standard GuacaMol molecular design and antimicrobial peptide optimization tasks, PABLO achieves state-of-the-art performance, substantially improving sample efficiency and final objective values over established baselines. Compared to prior optimization methods that incorporate LLMs, PABLO achieves competitive token usage per run despite relying on LLMs throughout the optimization loop. Beyond raw performance, the agentic formulation offers key advantages for realistic design: it naturally incorporates semantic task descriptions, retrieval-augmented domain knowledge, and complex constraints. In follow-up in vitro validation, PABLO-optimized peptides showed strong activity against drug-resistant pathogens, underscoring the practical potential of PABLO for therapeutic discovery.
Submitted 29 January, 2026;
originally announced January 2026.
-
Attention-Informed Surrogates for Navigating Power-Performance Trade-offs in HPC
Authors:
Ashna Nawar Ahmed,
Banooqa Banday,
Terry Jones,
Tanzima Z. Islam
Abstract:
High-Performance Computing (HPC) schedulers must balance user performance with facility-wide resource constraints. The task boils down to selecting the optimal number of nodes for a given job. We present a surrogate-assisted multi-objective Bayesian optimization (MOBO) framework to automate this complex decision. Our core hypothesis is that surrogate models informed by attention-based embeddings of job telemetry can capture performance dynamics more effectively than standard regression techniques. We pair this with an intelligent sample acquisition strategy to ensure the approach is data-efficient. On two production HPC datasets, our embedding-informed method consistently identified higher-quality Pareto fronts of runtime-power trade-offs compared to baselines. Furthermore, our intelligent data sampling strategy drastically reduced training costs while improving the stability of the results. To our knowledge, this is the first work to successfully apply embedding-informed surrogates in a MOBO framework to the HPC scheduling problem, jointly optimizing for performance and power on production workloads.
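As a concrete illustration of the runtime-power trade-off, the sketch below filters surrogate predictions for candidate node counts down to their Pareto front, the final step of any MOBO iteration. The numbers are invented for illustration and are not from the paper's datasets:

```python
# Keep only non-dominated (runtime, power) configurations, both minimized.

def pareto_front(points):
    """Return points not dominated in (runtime, power)."""
    front = []
    for i, (r1, p1) in enumerate(points):
        dominated = any(
            (r2 <= r1 and p2 <= p1) and (r2 < r1 or p2 < p1)
            for j, (r2, p2) in enumerate(points) if j != i
        )
        if not dominated:
            front.append((r1, p1))
    return front

# node count -> (predicted runtime [s], predicted power [kW]); made-up values
preds = {16: (420.0, 35.0), 32: (240.0, 60.0),
         64: (150.0, 115.0), 128: (160.0, 210.0)}
front = pareto_front(list(preds.values()))   # 128 nodes is dominated by 64
```

In the paper's framework, the quality of this front (e.g. its hypervolume) is what the embedding-informed surrogates improve relative to standard regression baselines.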
Submitted 21 January, 2026;
originally announced January 2026.
-
$π^{*}_{0.6}$: a VLA That Learns From Experience
Authors:
Physical Intelligence,
Ali Amin,
Raichelle Aniceto,
Ashwin Balakrishna,
Kevin Black,
Ken Conley,
Grace Connors,
James Darpinian,
Karan Dhabalia,
Jared DiCarlo,
Danny Driess,
Michael Equi,
Adnan Esmail,
Yunhao Fang,
Chelsea Finn,
Catherine Glossop,
Thomas Godden,
Ivan Goryachev,
Lachy Groom,
Hunter Hancock,
Karol Hausman,
Gashon Hussein,
Brian Ichter,
Szymon Jakubczak,
Rowan Jen
, et al. (31 additional authors not shown)
Abstract:
We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL). We present a general-purpose method, RL with Experience and Corrections via Advantage-conditioned Policies (RECAP), that provides for RL training of VLAs via advantage conditioning. Our method incorporates heterogeneous data into the self-improvement process, including demonstrations, data from on-policy collection, and expert teleoperated interventions provided during autonomous execution. RECAP starts by pre-training a generalist VLA with offline RL, which we call $π^{*}_{0.6}$, that can then be specialized to attain high performance on downstream tasks through on-robot data collection. We show that the $π^{*}_{0.6}$ model trained with the full RECAP method can fold laundry in real homes, reliably assemble boxes, and make espresso drinks using a professional espresso machine. On some of the hardest tasks, RECAP more than doubles task throughput and roughly halves the task failure rate.
Submitted 18 November, 2025; v1 submitted 18 November, 2025;
originally announced November 2025.
-
Developing Strategies to Increase Capacity in AI Education
Authors:
Noah Q. Cowit,
Sri Yash Tadimalla,
Stephanie T. Jones,
Mary Lou Maher,
Tracy Camp,
Enrico Pontelli
Abstract:
Many institutions are currently grappling with teaching artificial intelligence (AI) in the face of its growing demand and relevance in our world. The Computing Research Association (CRA) has conducted 32 moderated virtual roundtable discussions with 202 experts committed to improving AI education. These discussions fall into four focus areas: AI Knowledge Areas and Pedagogy, Infrastructure Challenges in AI Education, Strategies to Increase Capacity in AI Education, and AI Education for All. Roundtables were organized around institution type to consider the particular goals and resources of different AI education environments. We identified the following high-level community needs to increase capacity in AI education. A significant digital divide creates major infrastructure hurdles, especially for smaller and under-resourced institutions. These challenges manifest as a shortage of faculty with AI expertise, who also face limited time for reskilling; a lack of computational infrastructure for students and faculty to develop and test AI models; and insufficient institutional technical support. Compounding these issues is the large burden associated with updating curricula and creating new programs. To address the faculty gap, accessible and continuous professional development is crucial for faculty to learn about AI and its ethical dimensions. This support is particularly needed at under-resourced institutions and must extend to faculty both within and outside of computing programs to ensure all students have access to AI education. We have compiled and organized a list of resources that our participant experts mentioned throughout this study. These resources address a frequent request heard during the roundtables: a central repository of AI education resources for institutions to freely use across higher education.
Submitted 25 September, 2025;
originally announced September 2025.
-
Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference
Authors:
Haoran Wu,
Can Xiao,
Jiayi Nie,
Xuan Guo,
Binglei Lou,
Jeffrey T. H. Wong,
Zhiwen Mo,
Cheng Zhang,
Przemyslaw Forys,
Wayne Luk,
Hongxiang Fan,
Jianyi Cheng,
Timothy M. Jones,
Rika Antonova,
Robert Mullins,
Aaron Zhao
Abstract:
LLMs now form the backbone of AI agents for a diverse array of applications, including tool use, command-line agents, and web or computer use agents. These agentic LLM inference tasks are fundamentally different from chatbot-focused inference -- they often have much larger context lengths to capture complex, prolonged inputs, such as entire webpage DOMs or complicated tool call trajectories. This, in turn, generates significant off-chip memory traffic for the underlying hardware at the inference stage and causes the workload to be constrained by two memory walls, namely the bandwidth and capacity memory walls, preventing the on-chip compute units from achieving high utilization.
In this paper, we introduce PLENA, a hardware-software co-designed system that applies three core optimization pathways to tackle these challenges. PLENA includes an efficient hardware implementation of compute and memory units supporting an asymmetric quantization scheme. PLENA also features a novel flattened systolic array architecture that has native support for FlashAttention to tackle these memory walls in the scenario of inference serving for long-context LLMs. Additionally, PLENA is developed with a complete stack, including a custom ISA, a compiler, a cycle-emulated simulator, and an automated design space exploration flow. The simulated results show that PLENA achieves up to 8.5x higher utilization than existing accelerators, and delivers 2.24x higher throughput than the A100 GPU and 3.85x higher throughput than the TPU v6e, under the same multiplier count and memory settings. The full PLENA system will also be open-sourced.
Submitted 24 September, 2025; v1 submitted 11 September, 2025;
originally announced September 2025.
-
Newton to Einstein: Axiom-Based Discovery via Game Design
Authors:
Pingchuan Ma,
Benjamin Tod Jones,
Tsun-Hsuan Wang,
Minghao Guo,
Michal Piotr Lipiec,
Chuang Gan,
Wojciech Matusik
Abstract:
This position paper argues that machine learning for scientific discovery should shift from inductive pattern recognition to axiom-based reasoning. We propose a game design framework in which scientific inquiry is recast as a rule-evolving system: agents operate within environments governed by axioms and modify them to explain outlier observations. Unlike conventional ML approaches that operate within fixed assumptions, our method enables the discovery of new theoretical structures through systematic rule adaptation. We demonstrate the feasibility of this approach through preliminary experiments in logic-based games, showing that agents can evolve axioms that solve previously unsolvable problems. This framework offers a foundation for building machine learning systems capable of creative, interpretable, and theory-driven discovery.
Submitted 5 September, 2025;
originally announced September 2025.
-
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Authors:
Matthias Maiterth,
Wesley H. Brewer,
Jaya S. Kuruvella,
Arunavo Dey,
Tanzima Z. Islam,
Kevin Menear,
Dmitry Duplyakin,
Rashadul Kabir,
Tapasya Patki,
Terry Jones,
Feiyi Wang
Abstract:
Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis, or to simulators that do not model the associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarding changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, and (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as well as (5) evaluate machine-learning-based scheduling, in this novel digital-twin-based meta-framework for prototyping scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability and the impact on the simulated system.
Submitted 27 August, 2025; v1 submitted 27 August, 2025;
originally announced August 2025.
-
A Dataset for Distilling Knowledge Priors from Literature for Therapeutic Design
Authors:
Haydn Thomas Jones,
Natalie Maus,
Josh Magnus Ludan,
Maggie Ziyu Huan,
Jiaming Liang,
Marcelo Der Torossian Torres,
Jiatao Liang,
Zachary Ives,
Yoseph Barash,
Cesar de la Fuente-Nunez,
Jacob R. Gardner,
Mark Yatskar
Abstract:
AI-driven discovery can greatly reduce design time and enhance new therapeutics' effectiveness. Models using simulators explore broad design spaces but risk violating implicit constraints due to a lack of experimental priors. For example, in a new analysis we performed on a diverse set of models on the GuacaMol benchmark using supervised classifiers, over 60\% of molecules proposed had a high probability of being mutagenic. In this work, we introduce Medex, a dataset of priors for design problems extracted from literature describing compounds used in lab settings. It is constructed with LLM pipelines for discovering therapeutic entities in relevant paragraphs and summarizing information in concise fair-use facts. Medex consists of 32.3 million pairs of natural language facts and appropriate entity representations (i.e., SMILES or RefSeq IDs). To demonstrate the potential of the data, we train LLM, CLIP, and LLaVA architectures to reason jointly about text and design targets and evaluate on tasks from the Therapeutics Data Commons (TDC). Medex is highly effective for creating models with strong priors: in supervised prediction problems that use our data for pretraining, our best models with 15M learnable parameters outperform the larger 2B TxGemma on both regression and classification TDC tasks, and perform comparably to 9B models on average. Models built with Medex can be used as constraints while optimizing for novel molecules in GuacaMol, resulting in proposals that are safer and nearly as effective. We release our dataset at https://huggingface.co/datasets/medexanon/Medex, and will provide expanded versions as available literature grows.
Submitted 11 September, 2025; v1 submitted 14 August, 2025;
originally announced August 2025.
-
Momentum Point-Perplexity Mechanics in Large Language Models
Authors:
Lorenzo Tomaz,
Judd Rosenblatt,
Thomas Berry Jones,
Diogo Schwerz de Lucena
Abstract:
We take a physics-based approach to studying how the internal hidden states of large language models change from token to token during inference. Across 20 open-source transformer models (135M-3B parameters), we find that a quantity combining the rate of change in hidden states and the model's next-token certainty, analogous to energy in physics, remains nearly constant. Random-weight models conserve this "energy" more tightly than pre-trained ones, while training shifts models into a faster, more decisive regime with greater variability. Using this "log-Lagrangian" view, we derive a control method called Jacobian steering, which perturbs hidden states in the minimal way needed to favor a target token. This approach maintained near-constant energy in two tested models and produced continuations rated higher in semantic quality than the models' natural outputs. Viewing transformers through this mechanics lens offers a principled basis for interpretability, anomaly detection, and low-risk steering. This could help make powerful models more predictable and aligned with human intent.
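The minimal-perturbation idea behind Jacobian steering can be illustrated with a linear readout standing in for the LM head: for a linear map, the Jacobian row of the target logit is the unembedding vector $w_t$, so the smallest hidden-state perturbation raising that logit by a margin $\epsilon$ is $\delta = \epsilon\, w_t / \lVert w_t \rVert^2$. This is a toy sketch of the geometry, not the paper's exact procedure; NumPy and a random toy model are assumed:

```python
import numpy as np

rng = np.random.default_rng(1)
d, vocab = 16, 10
W = rng.normal(size=(vocab, d))   # toy "unembedding" matrix (LM head)
h = rng.normal(size=d)            # current hidden state
target = 3                        # token whose logit we want to favor

def steer(W, target, margin=1.0):
    """Minimal-norm hidden-state perturbation raising the target logit by `margin`."""
    w_t = W[target]
    return margin * w_t / (w_t @ w_t)   # least-squares solution to w_t @ delta = margin

delta = steer(W, target)
before = W @ h
after = W @ (h + delta)           # target logit rises by exactly `margin`
```

In a real transformer the map from hidden state to logits is still linear in the final hidden state, which is why this closed form is a useful mental model for the steering step.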
Submitted 11 August, 2025;
originally announced August 2025.
-
A chart review process aided by natural language processing and multi-wave adaptive sampling to expedite validation of code-based algorithms for large database studies
Authors:
Shirley V Wang,
Georg Hahn,
Sushama Kattinakere Sreedhara,
Mufaddal Mahesri,
Haritha S. Pillai,
Rajendra Aldis,
Joyce Lii,
Sarah K. Dutcher,
Rhoda Eniafe,
Jamal T. Jones,
Keewan Kim,
Jiwei He,
Hana Lee,
Sengwee Toh,
Rishi J Desai,
Jie Yang
Abstract:
Background: One of the ways to enhance analyses conducted with large claims databases is by validating the measurement characteristics of code-based algorithms used to identify health outcomes or other key study parameters of interest. These metrics can be used in quantitative bias analyses to assess the robustness of results for an inferential study given potential bias from outcome misclassification. However, extensive time and resource allocation are typically required to create reference-standard labels through manual chart review of free-text notes from linked electronic health records. Methods: We describe an expedited process that introduces efficiency in a validation study using two distinct mechanisms: 1) use of natural language processing (NLP) to reduce time spent by human reviewers to review each chart, and 2) a multi-wave adaptive sampling approach with pre-defined criteria to stop the validation study once performance characteristics are identified with sufficient precision. We illustrate this process in a case study that validates the performance of a claims-based outcome algorithm for intentional self-harm in patients with obesity. Results: We empirically demonstrate that the NLP-assisted annotation process reduced the time spent on review per chart by 40% and use of the pre-defined stopping rule with multi-wave samples would have prevented review of 77% of patient charts with limited compromise to precision in derived measurement characteristics. Conclusion: This approach could facilitate more routine validation of code-based algorithms used to define key study parameters, ultimately enhancing understanding of the reliability of findings derived from database studies.
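The multi-wave stopping rule can be sketched as follows: review charts in fixed-size waves, recompute the positive predictive value (PPV) and its 95% Wilson interval after each wave, and stop once the interval is tight enough. The simulated adjudication, the wave size, and the 0.05 half-width threshold are illustrative assumptions, not the study's actual parameters:

```python
import math, random

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

def validate_in_waves(true_ppv=0.85, wave_size=50, max_charts=1000, tol=0.05):
    """Stop chart review once the PPV's CI half-width drops below `tol`."""
    rng = random.Random(42)
    confirmed = reviewed = 0
    while reviewed < max_charts:
        for _ in range(wave_size):                 # one wave of chart review
            reviewed += 1
            confirmed += rng.random() < true_ppv   # simulated adjudication
        lo, hi = wilson_ci(confirmed, reviewed)
        if (hi - lo) / 2 < tol:                    # pre-defined stopping rule
            break
    return confirmed / reviewed, reviewed

ppv_hat, n_reviewed = validate_in_waves()
```

For a PPV near 0.85 the rule triggers after a few hundred charts rather than the full sample, which is the mechanism behind the reported 77% reduction in charts reviewed.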
Submitted 25 July, 2025;
originally announced July 2025.
-
Hierarchical Reinforcement Learning Framework for Adaptive Walking Control Using General Value Functions of Lower-Limb Sensor Signals
Authors:
Sonny T. Jones,
Grange M. Simpson,
Patrick M. Pilarski,
Ashley N. Dalrymple
Abstract:
Rehabilitation technology is a natural setting to study the shared learning and decision-making of human and machine agents. In this work, we explore the use of Hierarchical Reinforcement Learning (HRL) to develop adaptive control strategies for lower-limb exoskeletons, aiming to enhance mobility and autonomy for individuals with motor impairments. Inspired by prominent models of biological sensorimotor processing, our investigated HRL approach breaks down the complex task of exoskeleton control adaptation into a higher-level framework for terrain strategy adaptation and a lower-level framework for providing predictive information; this latter element is implemented via the continual learning of general value functions (GVFs). GVFs generated temporal abstractions of future signal values from multiple wearable lower-limb sensors, including electromyography, pressure insoles, and goniometers. We investigated two methods for incorporating actual and predicted sensor signals into a policy network with the intent to improve the decision-making capacity of the control system of a lower-limb exoskeleton during ambulation across varied terrains. As a key result, we found that the addition of predictions made from GVFs increased overall network accuracy. Terrain-specific performance increases were seen while walking on even ground, uneven ground, up and down ramps, and turns, terrains that are often misclassified without predictive information. This suggests that predictive information can aid decision-making during uncertainty, e.g., on terrains that have a high chance of being misclassified. This work, therefore, contributes new insights into the nuances of HRL and the future development of exoskeletons to facilitate safe transitioning and traversing across different walking environments.
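The GVF component can be illustrated with a tabular TD(0) sketch on a synthetic ten-phase gait cycle: for each phase, the GVF learns the discounted sum of a future binary sensor signal (the cumulant). The cycle, the cumulant, and the learning constants are invented for illustration and are much simpler than the wearable-sensor GVFs in the paper:

```python
gamma, alpha = 0.9, 0.1
n_states = 10
v = [0.0] * n_states                        # GVF prediction per gait phase

def cumulant(s):
    """Toy sensor signal: 'on' during the last three phases of the cycle."""
    return 1.0 if s >= 7 else 0.0

for _ in range(2000):                       # repeated passes over the gait cycle
    for s in range(n_states):
        s_next = (s + 1) % n_states         # deterministic phase progression
        c = cumulant(s_next)                # signal observed at the next step
        td = c + gamma * v[s_next] - v[s]   # TD(0) error for the GVF
        v[s] += alpha * td                  # prediction update
```

Phase 6 ends up with the largest prediction, since all three "on" phases lie immediately ahead of it; feeding such anticipatory values into a policy network is the "predictive information" the abstract refers to.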
Submitted 22 July, 2025;
originally announced July 2025.
-
Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions
Authors:
Keita Teranishi,
Harshitha Menon,
William F. Godoy,
Prasanna Balaprakash,
David Bau,
Tal Ben-Nun,
Abhinav Bhatele,
Franz Franchetti,
Michael Franusich,
Todd Gamblin,
Giorgis Georgakoudis,
Tom Goldstein,
Arjun Guha,
Steven Hahn,
Costin Iancu,
Zheming Jin,
Terry Jones,
Tze Meng Low,
Het Mankad,
Narasinga Rao Miniskar,
Mohammad Alaul Haque Monil,
Daniel Nichols,
Konstantinos Parasyris,
Swaroop Pophale,
Pedro Valero-Lara
, et al. (3 additional authors not shown)
Abstract:
We discuss the challenges and propose research directions for using AI to revolutionize the development of high-performance computing (HPC) software. AI technologies, in particular large language models, have transformed every aspect of software development. For its part, HPC software is recognized as a highly specialized scientific field of its own. We discuss the challenges associated with leveraging state-of-the-art AI technologies to develop such a unique and niche class of software and outline our research directions in the two US Department of Energy-funded projects for advancing HPC Software via AI: Ellora and Durban.
Submitted 12 May, 2025;
originally announced May 2025.
-
$π_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
Authors:
Physical Intelligence,
Kevin Black,
Noah Brown,
James Darpinian,
Karan Dhabalia,
Danny Driess,
Adnan Esmail,
Michael Equi,
Chelsea Finn,
Niccolo Fusai,
Manuel Y. Galliker,
Dibya Ghosh,
Lachy Groom,
Karol Hausman,
Brian Ichter,
Szymon Jakubczak,
Tim Jones,
Liyiming Ke,
Devin LeBlanc,
Sergey Levine,
Adrian Li-Bell,
Mohith Mothukuri,
Suraj Nair,
Karl Pertsch,
Allen Z. Ren
, et al. (11 additional authors not shown)
Abstract:
In order for robots to be useful, they must perform practically relevant tasks in the real world, outside of the lab. While vision-language-action (VLA) models have demonstrated impressive results for end-to-end robot control, it remains an open question how far such models can generalize in the wild. We describe $π_{0.5}$, a new model based on $π_{0}$ that uses co-training on heterogeneous tasks to enable broad generalization. $π_{0.5}$ uses data from multiple robots, high-level semantic prediction, web data, and other sources to enable broadly generalizable real-world robotic manipulation. Our system uses a combination of co-training and hybrid multi-modal examples that combine image observations, language commands, object detections, semantic subtask prediction, and low-level actions. Our experiments show that this kind of knowledge transfer is essential for effective generalization, and we demonstrate for the first time that an end-to-end learning-enabled robotic system can perform long-horizon and dexterous manipulation skills, such as cleaning a kitchen or bedroom, in entirely new homes.
Submitted 22 April, 2025;
originally announced April 2025.
-
FireGuard: A Generalized Microarchitecture for Fine-Grained Security Analysis on OoO Superscalar Cores
Authors:
Zhe Jiang,
Sam Ainsworth,
Timothy Jones
Abstract:
High-performance security guarantees rely on hardware support. Generic programmable support for fine-grained instruction analysis has gained broad interest in the literature as a fundamental building block for the security of future processors. Yet, implementation in real out-of-order (OoO) superscalar processors presents tough challenges that cannot be explored in highly abstract simulators. We detail the challenges of implementing complex programmable pathways without critical paths or contention. We then introduce FireGuard, the first implementation of fine-grained instruction analysis on a real OoO superscalar processor. We establish an end-to-end system, including microarchitecture, SoC, ISA and programming model. Experiments show that our solution simultaneously ensures both security and performance of the system, with parallel scalability. We examine the feasibility of building FireGuard into modern SoCs -- Apple's M1-Pro, Huawei's Kirin-960, and Intel's i7-12700F -- where it introduces less than 1% additional silicon area. FireGuard's source code repository: https://github.com/SEU-ACAL/reproduce-FireGuard-DAC-25.
Submitted 2 April, 2025;
originally announced April 2025.
-
MEEK: Re-thinking Heterogeneous Parallel Error Detection Architecture for Real-World OoO Superscalar Processors
Authors:
Zhe Jiang,
Minli Liao,
Sam Ainsworth,
Dean You,
Timothy Jones
Abstract:
Heterogeneous parallel error detection is an approach to achieving fault-tolerant processors, leveraging multiple power-efficient cores to re-execute software originally run on a high-performance core. Yet, its complex components, gathering data cross-chip from many parts of the core, raise questions of how to build it into commodity cores without heavy design invasion and extensive re-engineering.
We build the first full-RTL design, MEEK, into an open-source SoC, from microarchitecture and ISA to the OS and programming model. We identify and solve bottlenecks and bugs overlooked in previous work, and demonstrate that MEEK offers microsecond-level detection capacity with affordable overheads. By trading off architectural functionalities across codesigned hardware-software layers, MEEK features only light changes to a mature out-of-order superscalar core, simple coordinating software layers, and a few lines of operating-system code. MEEK's source code is available at: https://github.com/SEU-ACAL/reproduce-MEEK-DAC-25.
Submitted 2 April, 2025;
originally announced April 2025.
-
Large Scale Multi-Task Bayesian Optimization with Large Language Models
Authors:
Yimeng Zeng,
Natalie Maus,
Haydn Thomas Jones,
Jeffrey Tao,
Fangping Wan,
Marcelo Der Torossian Torres,
Cesar de la Fuente-Nunez,
Ryan Marcus,
Osbert Bastani,
Jacob R. Gardner
Abstract:
In multi-task Bayesian optimization, the goal is to leverage experience from optimizing existing tasks to improve the efficiency of optimizing new ones. While approaches using multi-task Gaussian processes or deep kernel transfer exist, the performance improvement is marginal when scaling beyond a moderate number of tasks. We introduce a novel approach leveraging large language models (LLMs) to learn from, and improve upon, previous optimization trajectories, scaling to approximately 1500 distinct tasks. Specifically, we propose a feedback loop in which an LLM is fine-tuned on the high-quality solutions to specific tasks found by Bayesian optimization (BO). This LLM is then used to generate initialization points for future BO searches for new tasks. The trajectories of these new searches provide additional training data for fine-tuning the LLM, completing the loop. We evaluate our method on two distinct domains: database query optimization and antimicrobial peptide design. Results demonstrate that our approach creates a positive feedback loop, where the LLM's generated initializations gradually improve, leading to better optimization performance. As this feedback loop continues, we find that the LLM is eventually able to generate solutions to new tasks in just a few shots that are better than the solutions produced from scratch by Bayesian optimization, while simultaneously requiring significantly fewer oracle calls.
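The structure of this feedback loop can be sketched in miniature. The following is a hypothetical illustration, not the authors' code: the fine-tuned LLM is replaced by a simple archive that proposes initializations near previously found solutions, and Bayesian optimization by a local random search, with all names invented for the sketch.

```python
import random

def optimize(f, inits, budget=200, rng=None):
    """Minimize f over [-5, 5] starting from the given initializations
    (a crude stand-in for the BO search in the paper)."""
    rng = rng or random.Random(0)
    pool = list(inits) + [rng.uniform(-5, 5) for _ in range(4)]
    best = min(pool, key=f)
    for _ in range(budget):
        cand = best + rng.gauss(0, 0.5)  # local perturbation step
        if f(cand) < f(best):
            best = cand
    return best

class ProposalModel:
    """Stand-in for the fine-tuned LLM: remembers good solutions and
    proposes initializations near them for new, related tasks."""
    def __init__(self):
        self.archive = []
    def fine_tune(self, solution):
        self.archive.append(solution)
    def propose(self, rng):
        if not self.archive:
            return rng.uniform(-5, 5)
        return rng.choice(self.archive) + rng.gauss(0, 0.2)

rng = random.Random(42)
model = ProposalModel()
# A family of related tasks: shifted quadratics with nearby optima.
tasks = [lambda x, c=c: (x - c) ** 2 for c in (2.0, 2.2, 1.9, 2.1)]
for f in tasks:
    init = model.propose(rng)            # LLM-style initialization
    best = optimize(f, [init], rng=rng)  # search from that point
    model.fine_tune(best)                # close the loop with the result
print(len(model.archive))  # 4: one archived solution per task
```

Each pass through the loop both consumes proposals and enriches the archive, which is the mechanism the abstract credits for the gradual improvement of generated initializations.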
Submitted 12 June, 2025; v1 submitted 11 March, 2025;
originally announced March 2025.
-
A Solver-Aided Hierarchical Language for LLM-Driven CAD Design
Authors:
Benjamin T. Jones,
Felix Hähnlein,
Zihan Zhang,
Maaz Ahmad,
Vladimir Kim,
Adriana Schulz
Abstract:
Large language models (LLMs) have been enormously successful in solving a wide variety of structured and unstructured generative tasks, but they struggle to generate procedural geometry in Computer Aided Design (CAD). These difficulties arise from an inability to do spatial reasoning and the necessity to guide a model through complex, long range planning to generate complex geometry. We enable generative CAD Design with LLMs through the introduction of a solver-aided, hierarchical domain specific language (DSL) called AIDL, which offloads the spatial reasoning requirements to a geometric constraint solver. Additionally, we show that in the few-shot regime, AIDL outperforms even a language with in-training data (OpenSCAD), both in terms of generating visual results closer to the prompt and creating objects that are easier to post-process and reason about.
Submitted 13 February, 2025;
originally announced February 2025.
-
Covering Multiple Objectives with a Small Set of Solutions Using Bayesian Optimization
Authors:
Natalie Maus,
Kyurae Kim,
Yimeng Zeng,
Haydn Thomas Jones,
Fangping Wan,
Marcelo Der Torossian Torres,
Cesar de la Fuente-Nunez,
Jacob R. Gardner
Abstract:
In multi-objective black-box optimization, the goal is typically to find solutions that optimize a set of $T$ black-box objective functions, $f_1, \ldots f_T$, simultaneously. Traditional approaches often seek a single Pareto-optimal set that balances trade-offs among all objectives. In contrast, we consider a problem setting that departs from this paradigm: finding a small set of $K < T$ solutions, that collectively "cover" the $T$ objectives. A set of solutions is defined as "covering" if, for each objective $f_1, \ldots f_T$, there is at least one good solution. A motivating example for this problem setting occurs in drug design. For example, we may have $T$ pathogens and aim to identify a set of $K < T$ antibiotics such that at least one antibiotic can be used to treat each pathogen. This problem, known as coverage optimization, has yet to be tackled with the Bayesian optimization (BO) framework. To fill this void, we develop Multi-Objective Coverage Bayesian Optimization (MOCOBO), a BO algorithm for solving coverage optimization. Our approach is based on a new acquisition function reminiscent of expected improvement in the vanilla BO setup. We demonstrate the performance of our method on high-dimensional black-box optimization tasks, including applications in peptide and molecular design. Results show that the coverage of the $K < T$ solutions found by MOCOBO matches or nearly matches the coverage of $T$ solutions obtained by optimizing each objective individually. Furthermore, in in vitro experiments, the peptides found by MOCOBO exhibited high potency against drug-resistant pathogens, further demonstrating the potential of MOCOBO for drug discovery. All of our code is publicly available at the following link: https://github.com/nataliemaus/mocobo.
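The set-level "covering" criterion can be made concrete with a toy sketch. Note that the greedy selection over a fixed candidate pool below only illustrates the coverage objective itself, not MOCOBO's BO acquisition function; all names and numbers are hypothetical.

```python
def coverage(solutions, objectives):
    # Coverage of a solution set: for each objective, take the best value
    # any chosen solution achieves, then sum across objectives.
    return sum(max(f(x) for x in solutions) for f in objectives)

def greedy_cover(candidates, objectives, k):
    # Greedily add the candidate that most improves the coverage score.
    chosen = []
    for _ in range(k):
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: coverage(chosen + [c], objectives))
        chosen.append(best)
    return chosen

# T = 3 maximization objectives, K = 2 solutions: no single x does well on
# all three, but two solutions together "cover" every objective.
objectives = [
    lambda x: -(x - 0.0) ** 2,  # satisfied by x near 0
    lambda x: -(x - 0.2) ** 2,  # satisfied by x near 0.2
    lambda x: -(x - 5.0) ** 2,  # satisfied by x near 5
]
candidates = [0.1, 5.0, 10.0]
print(greedy_cover(candidates, objectives, k=2))  # → [0.1, 5.0]
```

Here 0.1 alone covers the first two objectives and 5.0 the third, mirroring the antibiotics example: at least one chosen solution is good for each pathogen. The greedy loop is only a simple heuristic stand-in for the acquisition-driven search MOCOBO performs.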
Submitted 27 October, 2025; v1 submitted 31 January, 2025;
originally announced January 2025.
-
Efficient quantum-enhanced classical simulation for patches of quantum landscapes
Authors:
Sacha Lerch,
Ricard Puig,
Manuel S. Rudolph,
Armando Angrisani,
Tyson Jones,
M. Cerezo,
Supanut Thanasilp,
Zoë Holmes
Abstract:
Understanding the capabilities of classical simulation methods is key to identifying where quantum computers are advantageous. Not only does this ensure that quantum computers are used only where necessary, but also one can potentially identify subroutines that can be offloaded onto a classical device. In this work, we show that it is always possible to generate a classical surrogate of a sub-region (dubbed a "patch") of an expectation landscape produced by a parameterized quantum circuit. That is, we provide a quantum-enhanced classical algorithm which, after simple measurements on a quantum device, allows one to classically simulate approximate expectation values of a subregion of a landscape. We provide time and sample complexity guarantees for a range of families of circuits of interest, and further numerically demonstrate our simulation algorithms on an exactly verifiable simulation of a Hamiltonian variational ansatz and long-time dynamics simulation on a 127-qubit heavy-hex topology.
Submitted 29 November, 2024;
originally announced November 2024.
-
$π_0$: A Vision-Language-Action Flow Model for General Robot Control
Authors:
Kevin Black,
Noah Brown,
Danny Driess,
Adnan Esmail,
Michael Equi,
Chelsea Finn,
Niccolo Fusai,
Lachy Groom,
Karol Hausman,
Brian Ichter,
Szymon Jakubczak,
Tim Jones,
Liyiming Ke,
Sergey Levine,
Adrian Li-Bell,
Mohith Mothukuri,
Suraj Nair,
Karl Pertsch,
Lucy Xiaoyang Shi,
James Tanner,
Quan Vuong,
Anna Walling,
Haohuan Wang,
Ury Zhilinsky
Abstract:
Robot learning holds tremendous promise to unlock the full potential of flexible, general, and dexterous robot systems, as well as to address some of the deepest questions in artificial intelligence. However, bringing robot learning to the level of generality required for effective real-world systems faces major obstacles in terms of data, generalization, and robustness. In this paper, we discuss how generalist robot policies (i.e., robot foundation models) can address these challenges, and how we can design effective generalist robot policies for complex and highly dexterous tasks. We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge. We then discuss how this model can be trained on a large and diverse dataset from multiple dexterous robot platforms, including single-arm robots, dual-arm robots, and mobile manipulators. We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people and from a high-level VLM policy, and its ability to acquire new skills via fine-tuning. Our results cover a wide variety of tasks, such as laundry folding, table cleaning, and assembling boxes.
Submitted 8 January, 2026; v1 submitted 31 October, 2024;
originally announced October 2024.
-
Modularity in Transformers: Investigating Neuron Separability & Specialization
Authors:
Nicholas Pochinkov,
Thomas Jones,
Mohammed Rashidur Rahman
Abstract:
Transformer models are increasingly prevalent in various applications, yet our understanding of their internal workings remains limited. This paper investigates the modularity and task specialization of neurons within transformer architectures, focusing on both vision (ViT) and language (Mistral 7B) models. Using a combination of selective pruning and MoEfication clustering techniques, we analyze the overlap and specialization of neurons across different tasks and data subsets. Our findings reveal evidence of task-specific neuron clusters, with varying degrees of overlap between related tasks. We observe that neuron importance patterns persist to some extent even in randomly initialized models, suggesting an inherent structure that training refines. Additionally, we find that neuron clusters identified through MoEfication correspond more strongly to task-specific neurons in earlier and later layers of the models. This work contributes to a more nuanced understanding of transformer internals and offers insights into potential avenues for improving model interpretability and efficiency.
Submitted 30 August, 2024;
originally announced August 2024.
-
Self-deployable contracting-cord metamaterials with tunable mechanical properties
Authors:
Wenzhong Yan,
Talmage Jones,
Christopher L. Jawetz,
Ryan H. Lee,
Jonathan B. Hopkins,
Ankur Mehta
Abstract:
Recent advances in active materials and fabrication techniques have enabled the production of cyclically self-deployable metamaterials with an expanded functionality space. However, designing metamaterials that possess continuously tunable mechanical properties after self-deployment remains a challenge, notwithstanding its importance. Inspired by push puppets, we introduce an efficient design strategy to create reversibly self-deployable metamaterials with continuously tunable post-deployment stiffness and damping. Our metamaterial comprises contracting actuators threaded through beads with matching conical concavo-convex interfaces in networked chains. The slack network conforms to arbitrary shapes, but when actuated, it self-assembles into a preprogrammed configuration with beads gathered together. Further contraction of the actuators can dynamically tune the assembly's mechanical properties through the beads' particle jamming, while maintaining the overall structure with minimal change. We show that, after deployment, such metamaterials exhibit pronounced tunability in bending-dominated configurations: they can become more than 35 times stiffer and change their damping capability by over 50%. Through systematic analysis, we find that the beads' conical angle can introduce geometric nonlinearity, which has a major effect on the self-deployability and tunability of the metamaterial. Our work provides routes towards reversibly self-deployable, lightweight, and tunable metamaterials, with potential applications in soft robotics, reconfigurable architectures, and space engineering.
Submitted 8 July, 2024;
originally announced July 2024.
-
A Wearable Resistance Devices Motor Learning Effects in Exercise
Authors:
Eugenio Frias-Miranda,
Hong-Anh Nguyen,
Jeremy Hampton,
Trenner Jones,
Benjamin Spotts,
Matthew Cochran,
Deva Chan,
Laura H Blumenschein
Abstract:
The integration of technology into exercise regimens has emerged as a strategy to enhance normal human capabilities and return human motor function after injury or illness by enhancing motor learning and retention. Much research has focused on how active devices, whether confined to a lab or made into a wearable format, can apply forces at set times and conditions to optimize the process of learning. However, the focus on active force production often forces devices to either be confined to simple movements or interventions. As such, in this paper, we investigate how passive device behaviors can contribute to the process of motor learning by themselves. Our approach involves using a wearable resistance (WR) device, which is outfitted with elastic bands, to apply a force field that changes in response to a person's movements while performing exercises. We develop a method to measure the produced forces from the device without impeding the function and we characterize the device's force generation abilities. We then present a study assessing the impact of the WR device on motor learning of proper squat form compared to visual or no feedback. Biometrics such as knee and hip angles were used to monitor and assess subject performance. Our findings indicate that the force fields produced while training with the WR device can improve performance in full-body exercises similarly to a more direct visual feedback mechanism, though the improvement is not consistent across all performance metrics. Through our research, we contribute important insights into the application of passive wearable resistance technology in practical exercise settings.
Submitted 23 May, 2024;
originally announced May 2024.
-
Speech foundation models in healthcare: Effect of layer selection on pathological speech feature prediction
Authors:
Daniela A. Wiepert,
Rene L. Utianski,
Joseph R. Duffy,
John L. Stricker,
Leland R. Barnard,
David T. Jones,
Hugo Botha
Abstract:
Accurately extracting clinical information from speech is critical to the diagnosis and treatment of many neurological conditions. As such, there is interest in leveraging AI for automatic, objective assessments of clinical speech to facilitate diagnosis and treatment of speech disorders. We explore transfer learning using foundation models, focusing on the impact of layer selection for the downstream task of predicting pathological speech features. We find that selecting an optimal layer can greatly improve performance (~15.8% increase in balanced accuracy per feature as compared to worst layer, ~13.6% increase as compared to final layer), though the best layer varies by predicted feature and does not always generalize well to unseen data. A learned weighted sum offers comparable performance to the average best layer in-distribution (only ~1.2% lower) and had strong generalization for out-of-distribution data (only 1.5% lower than the average best layer).
Submitted 21 June, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations
Authors:
Francieli Boito,
Jim Brandt,
Valeria Cardellini,
Philip Carns,
Florina M. Ciorba,
Hilary Egan,
Ahmed Eleliemy,
Ann Gentile,
Thomas Gruber,
Jeff Hanson,
Utz-Uwe Haus,
Kevin Huck,
Thomas Ilsche,
Thomas Jakobsche,
Terry Jones,
Sven Karlsson,
Abdullah Mueen,
Michael Ott,
Tapasya Patki,
Ivy Peng,
Krishnan Raghavan,
Stephen Simms,
Kathleen Shoga,
Michael Showerman,
Devesh Tiwari
, et al. (2 additional authors not shown)
Abstract:
Many High Performance Computing (HPC) facilities have developed and deployed frameworks in support of continuous monitoring and operational data analytics (MODA) to help improve efficiency and throughput. Because of the complexity and scale of systems and workflows and the need for low-latency response to address dynamic circumstances, automated feedback and response have the potential to be more effective than current human-in-the-loop approaches which are laborious and error prone. Progress has been limited, however, by factors such as the lack of infrastructure and feedback hooks, and successful deployment is often site- and case-specific. In this position paper we report on the outcomes and plans from a recent Dagstuhl Seminar, seeking to carve a path for community progress in the development of autonomous feedback loops for MODA, based on the established formalism of similar (MAPE-K) loops in autonomous computing and self-adaptive systems. By defining and developing such loops for significant cases experienced across HPC sites, we seek to extract commonalities and develop conventions that will facilitate interoperability and interchangeability with system hardware, software, and applications across different sites, and will motivate vendors and others to provide telemetry interfaces and feedback hooks to enable community development and pervasive deployment of MODA autonomy loops.
Submitted 30 January, 2024;
originally announced January 2024.
-
Detecting Speech Abnormalities with a Perceiver-based Sequence Classifier that Leverages a Universal Speech Model
Authors:
Hagen Soltau,
Izhak Shafran,
Alex Ottenwess,
Joseph R. Duffy,
Rene L. Utianski,
Leland R. Barnard,
John L. Stricker,
Daniela Wiepert,
David T. Jones,
Hugo Botha
Abstract:
We propose a Perceiver-based sequence classifier to detect abnormalities in speech reflective of several neurological disorders. We combine this classifier with a Universal Speech Model (USM) that is trained (unsupervised) on 12 million hours of diverse audio recordings. Our model compresses long sequences into a small set of class-specific latent representations and a factorized projection is used to predict different attributes of the disordered input speech. The benefit of our approach is that it allows us to model different regions of the input for different classes and is at the same time data efficient. We evaluated the proposed model extensively on a curated corpus from the Mayo Clinic. Our model outperforms standard transformer (80.9%) and perceiver (81.8%) models and achieves an average accuracy of 83.1%. With limited task-specific data, we find that pretraining is important and surprisingly pretraining with the unrelated automatic speech recognition (ASR) task is also beneficial. Encodings from the middle layers provide a mix of both acoustic and phonetic information and achieve best prediction results compared to just using the final layer encodings (83.1% vs. 79.6%). The results are promising and with further refinements may help clinicians detect speech abnormalities without needing access to highly specialized speech-language pathologists.
Submitted 16 October, 2023;
originally announced October 2023.
-
Zero-shot CAD Program Re-Parameterization for Interactive Manipulation
Authors:
Milin Kodnongbua,
Benjamin T. Jones,
Maaz Bin Safeer Ahmad,
Vladimir G. Kim,
Adriana Schulz
Abstract:
Parametric CAD models encode entire families of shapes that should, in principle, be easy for designers to explore. However, in practice, parametric CAD models can be difficult to manipulate due to implicit semantic constraints among parameter values. Finding and enforcing these semantic constraints solely from geometry or programmatic shape representations is not possible because these constraints ultimately reflect design intent. They are informed by the designer's experience and semantics in the real world. To address this challenge, we introduce a zero-shot pipeline that leverages pre-trained large language and image model to infer meaningful space of variations for a shape. We then re-parameterize a new constrained parametric CAD program that captures these variations, enabling effortless exploration of the design space along meaningful design axes.
Submitted 5 June, 2023;
originally announced June 2023.
-
SparCA: Sparse Compressed Agglomeration for Feature Extraction and Dimensionality Reduction
Authors:
Leland Barnard,
Farwa Ali,
Hugo Botha,
David T. Jones
Abstract:
The most effective dimensionality reduction procedures produce interpretable features from the raw input space while also providing good performance for downstream supervised learning tasks. For many methods, this requires optimizing one or more hyperparameters for a specific task, which can limit generalizability. In this study we propose sparse compressed agglomeration (SparCA), a novel dimensionality reduction procedure that involves a multistep hierarchical feature grouping, compression, and feature selection process. We demonstrate the characteristics and performance of the SparCA method across heterogeneous synthetic and real-world datasets, including images, natural language, and single cell gene expression data. Our results show that SparCA is applicable to a wide range of data types, produces highly interpretable features, and shows compelling performance on downstream supervised learning tasks without the need for hyperparameter tuning.
Submitted 26 January, 2023;
originally announced February 2023.
-
Effects of Backtracking on PageRank
Authors:
Cory Glover,
Tyler Jones,
Mark Kempton,
Alice Oveson
Abstract:
In this paper, we consider three variations on standard PageRank: Non-backtracking PageRank, $μ$-PageRank, and $\infty$-PageRank, all of which alter the standard formula by adjusting the likelihood of backtracking in the algorithm's random walk. We show that in the case of regular and bipartite biregular graphs, standard PageRank and its variants are equivalent. We also compare each centrality measure and investigate their clustering capabilities.
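For reference, the standard PageRank that these variants modify can be computed by power iteration; the non-backtracking variant additionally forbids the random walk from immediately returning along the edge it just traversed. A minimal sketch of the baseline (not the authors' code), illustrating why regular graphs are a degenerate case:

```python
def pagerank(adj, alpha=0.85, iters=100):
    """Standard PageRank by power iteration over an adjacency list."""
    n = len(adj)
    r = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1 - alpha) / n] * n          # teleportation mass
        for u, nbrs in enumerate(adj):
            share = alpha * r[u] / len(nbrs)  # mass passed along each edge
            for v in nbrs:
                nxt[v] += share
        r = nxt
    return r

# On a regular graph the uniform vector is stationary, so every node gets
# the same score; this is consistent with the abstract's claim that the
# backtracking-adjusted variants coincide with standard PageRank there.
# Example: the 4-cycle, a 2-regular graph.
cycle = [[1, 3], [0, 2], [1, 3], [0, 2]]
print(pagerank(cycle))  # ≈ [0.25, 0.25, 0.25, 0.25]
```

On irregular graphs the variants can diverge from this baseline, since forbidding backtracking changes the transition probabilities out of low-degree nodes.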
Submitted 9 February, 2026; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Self-Supervised Representation Learning for CAD
Authors:
Benjamin T. Jones,
Michael Hu,
Vladimir G. Kim,
Adriana Schulz
Abstract:
The design of man-made objects is dominated by computer aided design (CAD) tools. Assisting design with data-driven machine learning methods is hampered by lack of labeled data in CAD's native format: the parametric boundary representation (B-Rep). Several data sets of mechanical parts in B-Rep format have recently been released for machine learning research. However, large-scale databases are largely unlabeled, and labeled datasets are small. Additionally, task-specific label sets are rare, and costly to annotate. This work proposes to leverage unlabeled CAD geometry on supervised learning tasks. We learn a novel, hybrid implicit/explicit surface representation for B-Rep geometry, and show that this pre-training significantly improves few-shot learning performance and also achieves state-of-the-art performance on several existing B-Rep benchmarks.
Submitted 19 October, 2022;
originally announced October 2022.
-
Risk of re-identification for shared clinical speech recordings
Authors:
Daniela A. Wiepert,
Bradley A. Malin,
Joseph R. Duffy,
Rene L. Utianski,
John L. Stricker,
David T. Jones,
Hugo Botha
Abstract:
Large, curated datasets are required to leverage speech-based tools in healthcare. These are costly to produce, resulting in increased interest in data sharing. As speech can potentially identify speakers (i.e., voiceprints), sharing recordings raises privacy concerns. We examine the re-identification risk for speech recordings, without reference to demographic or metadata, using a state-of-the-art speaker recognition system. We demonstrate that the risk is inversely related to the number of comparisons an adversary must consider, i.e., the search space. Risk is high for a small search space but drops as the search space grows (precision $> 0.85$ for $< 1\times10^{6}$ comparisons, precision $< 0.5$ for $> 3\times10^{6}$ comparisons). Next, we show that the nature of a speech recording influences re-identification risk, with non-connected speech (e.g., vowel prolongation) being harder to identify. Our findings suggest that speaker recognition systems can be used to re-identify participants in specific circumstances, but in practice, the re-identification risk appears low.
Submitted 21 August, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Mates2Motion: Learning How Mechanical CAD Assemblies Work
Authors:
James Noeckel,
Benjamin T. Jones,
Karl Willis,
Brian Curless,
Adriana Schulz
Abstract:
We describe our work on inferring the degrees of freedom between mated parts in mechanical assemblies using deep learning on CAD representations. We train our model using a large dataset of real-world mechanical assemblies consisting of CAD parts and mates joining them together. We present methods for re-defining these mates to make them better reflect the motion of the assembly, as well as narrowing down the possible axes of motion. We also conduct a user study to create a motion-annotated test set with more reliable labels.
Submitted 4 May, 2023; v1 submitted 2 August, 2022;
originally announced August 2022.
-
Spectral Power Profile Optimization of Field-Deployed WDM Network by Remote Link Modeling
Authors:
Rasmus T. Jones,
Kyle R. H. Bottrill,
Natsupa Taengnoi,
Periklis Petropoulos,
Metodi P. Yankov
Abstract:
A digital twin model of a multi-node WDM network is obtained from a single access point. The model is used to predict and optimize the transmit power profile for each link in the network, and up to 2.2 dB of margin improvement is obtained w.r.t. unoptimized transmission.
Submitted 4 July, 2022;
originally announced July 2022.
-
Battery Cloud with Advanced Algorithms
Authors:
Xiaojun Li,
David Jauernig,
Mengzhu Gao,
Trevor Jones
Abstract:
A Battery Cloud or cloud battery management system leverages the cloud computational power and data storage to improve battery safety, performance, and economy. This work will present the Battery Cloud that collects measured battery data from electric vehicles and energy storage systems. Advanced algorithms are applied to improve battery performance. Using remote vehicle data, we train and validate an artificial neural network to estimate pack SOC during vehicle charging. The strategy is then tested on vehicles. Furthermore, high accuracy and onboard battery state of health estimation methods for electric vehicles are developed based on the differential voltage (DVA) and incremental capacity analysis (ICA). Using cycling data from battery cells at various temperatures, we extract the charging cycles and calculate the DVA and ICA curves, from which multiple features are extracted, analyzed, and eventually used to estimate the state of health. For battery safety, a data-driven thermal anomaly detection method is developed. The method can detect unforeseen anomalies such as thermal runaways at the very early stage. With the further development of the internet of things, more and more battery data will be available. Potential applications of battery cloud also include areas such as battery manufacture, recycling, and electric vehicle battery swap.
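The DVA and ICA features mentioned above derive from derivatives of the charging curve. As an illustrative sketch only (function and variable names are invented here, not the paper's code), an incremental-capacity curve dQ/dV might be approximated from logged charging data like this:

```python
import numpy as np

def incremental_capacity(voltage, capacity, bins=200):
    """Approximate the incremental-capacity (ICA) curve dQ/dV from one
    charging segment; DVA (dV/dQ) is simply its reciprocal.

    voltage, capacity: 1-D arrays logged during charging.
    Returns bin-centre voltages and the dQ/dV estimate.
    """
    voltage = np.asarray(voltage, dtype=float)
    capacity = np.asarray(capacity, dtype=float)
    # Sort by voltage so the finite difference is well defined.
    order = np.argsort(voltage)
    v, q = voltage[order], capacity[order]
    # Resample onto a uniform voltage grid before differentiating,
    # which suppresses sensor noise in the raw samples.
    grid = np.linspace(v[0], v[-1], bins)
    q_grid = np.interp(grid, v, q)
    return grid, np.gradient(q_grid, grid)
```

Peaks in the resulting curve are the kind of features that can be tracked across cycles to estimate state of health.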
Submitted 12 May, 2022; v1 submitted 7 March, 2022;
originally announced March 2022.
-
Local Latent Space Bayesian Optimization over Structured Inputs
Authors:
Natalie Maus,
Haydn T. Jones,
Juston S. Moore,
Matt J. Kusner,
John Bradshaw,
Jacob R. Gardner
Abstract:
Bayesian optimization over the latent spaces of deep autoencoder models (DAEs) has recently emerged as a promising new approach for optimizing challenging black-box functions over structured, discrete, hard-to-enumerate search spaces (e.g., molecules). Here the DAE dramatically simplifies the search space by mapping inputs into a continuous latent space where familiar Bayesian optimization tools can be more readily applied. Despite this simplification, the latent space typically remains high-dimensional. Thus, even with a well-suited latent space, these approaches do not necessarily provide a complete solution, but may rather shift the structured optimization problem to a high-dimensional one. In this paper, we propose LOL-BO, which adapts the notion of trust regions explored in recent work on high-dimensional Bayesian optimization to the structured setting. By reformulating the encoder to function as both an encoder for the DAE globally and as a deep kernel for the surrogate model within a trust region, we better align the notion of local optimization in the latent space with local optimization in the input space. LOL-BO achieves as much as 20 times improvement over state-of-the-art latent space Bayesian optimization methods across six real-world benchmarks, demonstrating that improvement in optimization strategies is as important as developing better DAE models.
Submitted 22 February, 2023; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Online Application Guidance for Heterogeneous Memory Systems
Authors:
M. Ben Olson,
Brandon Kammerdiener,
Kshitij A. Doshi,
Terry Jones,
Michael R. Jantz
Abstract:
Many high-end and next-generation computing systems incorporate alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as higher bandwidth with lower capacity or vice versa, they are typically packaged alongside conventional SDRAM in a heterogeneous memory architecture. To utilize the different types of memory efficiently, new data management strategies are needed to match application usage to the best available memory technology. However, current proposals for managing heterogeneous memories are limited because they either: 1) do not consider high-level application behavior when assigning data to different types of memory, or 2) require separate program execution (with a representative input) to collect information about how the application uses memory resources.
This work presents a toolset for addressing the limitations of existing approaches for managing complex memories. It extends the application runtime layer with automated monitoring and management routines that assign application data to the best tier of memory based on previous usage, without any need for source code modification or a separate profiling run. It evaluates this approach on a state-of-the-art server platform with both conventional DDR4 SDRAM and non-volatile Intel Optane DC memory, using both memory-intensive high performance computing (HPC) applications as well as standard benchmarks. Overall, the results show that this approach improves program performance significantly compared to a standard unguided approach across a variety of workloads and system configurations. Additionally, we show that this approach achieves similar performance as a comparable offline profiling-based approach after a short startup period, without requiring separate program execution or offline analysis steps.
Submitted 5 October, 2021;
originally announced October 2021.
-
Scalable Community Detection in Massive Networks Using Aggregated Relational Data
Authors:
Timothy Jones,
Owen G. Ward,
Yiran Jiang,
John Paisley,
Tian Zheng
Abstract:
The mixed membership stochastic blockmodel (MMSB) is a popular Bayesian network model for community detection. Fitting such large Bayesian network models quickly becomes computationally infeasible when the number of nodes grows into hundreds of thousands and millions. In this paper we propose a novel mini-batch strategy based on aggregated relational data that leverages nodal information to fit MMSB to massive networks. We describe a scalable inference method that can utilize nodal information that often accompanies real-world networks. Conditioning on this extra information leads to a model that admits a parallel stochastic variational inference algorithm, utilizing stochastic gradients of the bipartite graph formed from aggregated network ties between node subpopulations. We apply our method to a citation network with over two million nodes and 25 million edges, capturing explainable structure in this network. Our method recovers parameters and achieves better convergence on simulated networks generated according to the MMSB.
Submitted 23 May, 2024; v1 submitted 22 July, 2021;
originally announced August 2021.
-
Reinforcement Learning based Disease Progression Model for Alzheimer's Disease
Authors:
Krishnakant V. Saboo,
Anirudh Choudhary,
Yurui Cao,
Gregory A. Worrell,
David T. Jones,
Ravishankar K. Iyer
Abstract:
We model Alzheimer's disease (AD) progression by combining differential equations (DEs) and reinforcement learning (RL) with domain knowledge. DEs provide relationships between some, but not all, factors relevant to AD. We assume that the missing relationships must satisfy general criteria about the working of the brain, e.g., maximizing cognition while minimizing the cost of supporting cognition. This allows us to extract the missing relationships by using RL to optimize an objective (reward) function that captures the above criteria. We use our model consisting of DEs (as a simulator) and the trained RL agent to predict individualized 10-year AD progression using baseline (year 0) features on synthetic and real data. The model was comparable to or better than state-of-the-art learning-based models at predicting 10-year cognition trajectories. Our interpretable model demonstrated, and provided insights into, "recovery/compensatory" processes that mitigate the effect of AD, even though those processes were not explicitly encoded in the model. Our framework combines DEs with RL for modelling AD progression and has broad applicability for understanding other neurological disorders.
Submitted 2 November, 2021; v1 submitted 30 June, 2021;
originally announced June 2021.
-
Long-Range Time-Synchronisation Methods in LoRaWAN-based IoT
Authors:
Timothy Jones,
Khondokar Fida Hasan
Abstract:
LoRa (Long Range) is an LPWAN (low-power wide-area network) protocol in the IoT family that focuses on long-range communication of up to 14 km, albeit with delay-inherent transmissions. Three IoT-based time-synchronisation methodologies are analysed, and their efficacy measured, through a systematic critical literature review: a GNSS-based method, an off-the-shelf GPS hardware resampling method, and the LongShoT method, in the contexts of vehicular ad-hoc networks (VANET), wireless sensor networks (WSN), and long-range wide-area networks (LoRaWAN), respectively. Although two of the three methods are not LoRaWAN-specific, the findings obtained from the research are applied to the context of LoRa in the proposed methodology. A methodology for selecting a time-synchronisation method specifically for LoRa is posited, in which synchronisation objective, energy consumption and cost, scenario and security analysis, application requirements, microcontroller requirements, and transceiver requirements are each taken into consideration, followed by a fine-grained approach to selecting a particular time-sync method. The resulting methodology has implications for research, where practitioners may adopt this literature review as a baseline understanding of time-synchronisation methods and obstacles for LoRa, and for practice, where developers of LoRaWAN applications may adapt the methods analysed within.
Submitted 22 June, 2021;
originally announced June 2021.
-
Data-driven Thermal Anomaly Detection for Batteries using Unsupervised Shape Clustering
Authors:
Xiaojun Li,
Jianwei Li,
Ali Abdollahi,
Trevor Jones
Abstract:
For electric vehicle (EV) and energy storage (ES) batteries, thermal runaway is a critical issue as it can lead to uncontrollable fires or even explosions. Thermal anomaly detection can identify problematic battery packs that may eventually undergo thermal runaway. However, there are common challenges like data unavailability, environment and configuration variations, and battery aging. We propose a data-driven method to detect battery thermal anomalies based on comparing shape similarity between thermal measurements. Based on their shapes, the measurements are continuously grouped into different clusters. Anomaly is detected by monitoring deviations within the clusters. Unlike model-based or other data-driven methods, the proposed method is robust to data loss and requires minimal reference data for different pack configurations. As the initial experimental results show, the method not only can be more accurate than the onboard BMS but can also detect unforeseen anomalies at an early stage.
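The shape-similarity idea can be illustrated with a toy sketch (invented names and thresholds, not the authors' implementation): curves are normalised to remove offset and scale, and a measurement is flagged when its typical distance to its peers is large, i.e. it fits no cluster the other measurements form.

```python
import numpy as np

def shape_distance(a, b):
    """RMS distance between two curves after removing offset and scale,
    so only their shapes are compared."""
    def normalise(x):
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / (x.std() + 1e-12)
    return float(np.linalg.norm(normalise(a) - normalise(b)) / np.sqrt(len(a)))

def flag_anomalies(curves, threshold=0.5):
    """Flag a curve when its median shape-distance to its peers exceeds
    the threshold; the median tolerates a minority of deviating peers."""
    flags = []
    for i, ci in enumerate(curves):
        d = np.median([shape_distance(ci, cj)
                       for j, cj in enumerate(curves) if j != i])
        flags.append(bool(d > threshold))
    return flags
```

Because the curves are normalised before comparison, an offset sensor or a rescaled signal still lands in the same cluster, which is what makes such a scheme tolerant of configuration variation.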
Submitted 19 May, 2021; v1 submitted 15 March, 2021;
originally announced March 2021.
-
Scalable Blocking for Very Large Databases
Authors:
Andrew Borthwick,
Stephen Ash,
Bin Pang,
Shehzad Qureshi,
Timothy Jones
Abstract:
In the field of database deduplication, the goal is to find approximately matching records within a database. Blocking is a typical stage in this process that involves cheaply finding candidate pairs of records that are potential matches for further processing. We present here Hashed Dynamic Blocking, a new approach to blocking designed to address datasets larger than those studied in most prior work. Hashed Dynamic Blocking (HDB) extends Dynamic Blocking, which leverages the insight that rare matching values and rare intersections of values are predictive of a matching relationship. We also present a novel use of Locality Sensitive Hashing (LSH) to build blocking key values for huge databases with a convenient configuration to control the trade-off between precision and recall. HDB achieves massive scale by minimizing data movement, using compact block representation, and greedily pruning ineffective candidate blocks using a Count-min Sketch approximate counting data structure. We benchmark the algorithm by focusing on real-world datasets in excess of one million rows, demonstrating that the algorithm displays linear time complexity scaling in this range. Furthermore, we execute HDB on a 530 million row industrial dataset, detecting 68 billion candidate pairs in less than three hours at a cost of $307 on a major cloud service.
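The Count-min Sketch used for pruning is a standard structure; a minimal illustrative implementation (not the paper's code) shows the property such pruning relies on: estimated counts may overcount because of hash collisions, but never undercount.

```python
import hashlib

class CountMinSketch:
    """Approximate counter: estimates may overcount, never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent-ish hash per row, derived from a row prefix.
        digest = hashlib.md5(f"{row}:{key}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # The minimum across rows limits hash-collision inflation.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.depth))

# Pruning idea: a candidate block whose estimated size is huge cannot
# contain the rare value intersections that predict matches, so skip it.
sketch = CountMinSketch()
for block_key in ["smith|ny"] * 5 + ["rare-block"]:
    sketch.add(block_key)
assert sketch.estimate("smith|ny") >= 5  # never an underestimate
```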
Submitted 19 August, 2020;
originally announced August 2020.
-
The gem5 Simulator: Version 20.0+
Authors:
Jason Lowe-Power,
Abdul Mutaal Ahmad,
Ayaz Akram,
Mohammad Alian,
Rico Amslinger,
Matteo Andreozzi,
Adrià Armejach,
Nils Asmussen,
Brad Beckmann,
Srikant Bharadwaj,
Gabe Black,
Gedare Bloom,
Bobby R. Bruce,
Daniel Rodrigues Carvalho,
Jeronimo Castrillon,
Lizhong Chen,
Nicolas Derumigny,
Stephan Diestelhorst,
Wendy Elsasser,
Carlos Escuin,
Marjan Fariborz,
Amin Farmahini-Farahani,
Pouya Fotouhi,
Ryan Gambord,
Jayneel Gandhi
, et al. (53 additional authors not shown)
Abstract:
The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern computer hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, Arm, and RISC-V. The gem5 simulator has been under active development over the last nine years since the original gem5 release. In this time, there have been over 7500 commits to the codebase from over 250 unique contributors, which have improved the simulator by adding new features, fixing bugs, and increasing code quality. In this paper, we give an overview of gem5's usage and features, describe the current state of the gem5 simulator, and enumerate the major changes since the initial release of gem5. We also discuss how the gem5 simulator has transitioned to a formal governance model to enable continued improvement and community support for the next 20 years of computer architecture research.
Submitted 29 September, 2020; v1 submitted 6 July, 2020;
originally announced July 2020.
-
A Coefficient of Determination for Probabilistic Topic Models
Authors:
Tommy Jones
Abstract:
This research proposes a new (old) metric for evaluating goodness of fit in topic models, the coefficient of determination, or $R^2$. Within the context of topic modeling, $R^2$ has the same interpretation that it does when used in a broader class of statistical models. Reporting $R^2$ with topic models addresses two current problems in topic modeling: a lack of standard cross-contextual evaluation metrics for topic modeling and ease of communication with lay audiences. The author proposes that $R^2$ should be reported as a standard metric when constructing topic models.
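By analogy with regression, a coefficient of determination for a topic model can be sketched as follows. This is a hedged illustration of the idea, with invented function names; the paper's exact definition of fitted values may differ.

```python
import numpy as np

def topic_model_r2(dtm, theta, phi):
    """R^2 for a topic model, by analogy with regression.

    dtm:   documents x vocabulary matrix of observed word counts
    theta: documents x topics matrix, P(topic | document)
    phi:   topics x vocabulary matrix, P(word | topic)
    """
    # Fitted value for a document = its expected word-count vector
    # under the model, scaled by the document's length.
    n_words = dtm.sum(axis=1, keepdims=True)
    fitted = n_words * (theta @ phi)
    ss_res = np.sum((dtm - fitted) ** 2)
    # Total sum of squares around the mean document vector.
    ss_tot = np.sum((dtm - dtm.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot
```

As in regression, the value is 1 when the model reproduces the observed counts exactly and falls toward 0 as the model explains no more than the mean document does.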
Submitted 25 November, 2019; v1 submitted 19 November, 2019;
originally announced November 2019.
-
MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State
Authors:
Sam Ainsworth,
Timothy M. Jones
Abstract:
The disclosure of the Spectre speculative-execution attacks in January 2018 has left a severe vulnerability that systems are still struggling with how to patch. The solutions that currently exist tend to have incomplete coverage, perform badly, or have highly undesirable edge cases that cause application domains to break.
MuonTrap allows processors to continue to speculate, avoiding significant reductions in performance, without impacting security. We instead prevent the propagation of any state based on speculative execution, by placing the results of speculative cache accesses into a small, fast L0 filter cache, that is non-inclusive, non-exclusive with the rest of the cache hierarchy. This isolates all parts of the system that can't be quickly cleared on any change in threat domain.
MuonTrap uses these speculative filter caches, which are cleared on context and protection-domain switches, along with a series of extensions to the cache coherence protocol and prefetcher. This renders systems immune to cross-domain information leakage via Spectre and a host of similar attacks based on speculative execution, with low performance impact and few changes to the CPU design.
Submitted 28 April, 2020; v1 submitted 19 November, 2019;
originally announced November 2019.
-
Organization of machine learning based product development as per ISO 26262 and ISO/PAS 21448
Authors:
Krystian Radlak,
Michał Szczepankiewicz,
Tim Jones,
Piotr Serwa
Abstract:
Machine learning (ML) algorithms generate a continuous stream of success stories from various domains and enable many novel applications in safety-critical systems. With the advent of autonomous driving, ML algorithms are being used in the automotive domain, where the applicable functional safety standard is ISO 26262. However, requirements and recommendations provided by ISO 26262 do not cover specific properties of machine learning algorithms. Therefore, specific aspects of ML (e.g., dataset requirements, performance evaluation metrics, lack of interpretability) must be addressed within some work products, which collect documentation resulting from one or more associated requirements and recommendations of ISO 26262. In this paper, we propose how key technical aspects and supporting processes related to the development of ML-based systems can be organized according to ISO 26262 phases, sub-phases, and work products. We follow the same approach as in the ISO/PAS 21448 standard, which complements ISO 26262, in order to account for edge cases that can lead to hazards not directly caused by system failure, but resulting from functional insufficiencies of the intended functionality or from reasonably foreseeable misuse by persons.
Submitted 6 January, 2021; v1 submitted 7 October, 2019;
originally announced October 2019.
-
End-to-end Learning for GMI Optimized Geometric Constellation Shape
Authors:
Rasmus T. Jones,
Metodi P. Yankov,
Darko Zibar
Abstract:
Autoencoder-based geometric shaping is proposed that includes optimizing bit mappings. Up to 0.2 bits/QAM symbol gain in GMI is achieved for a variety of data rates and in the presence of transceiver impairments. The gains can be harvested with standard binary FEC at no cost w.r.t. conventional BICM.
Submitted 19 July, 2019;
originally announced July 2019.
-
On Verifying Timed Hyperproperties
Authors:
Hsi-Ming Ho,
Ruoyu Zhou,
Timothy M. Jones
Abstract:
We study the satisfiability and model-checking problems for timed hyperproperties specified with HyperMTL, a timed extension of HyperLTL. Depending on whether interleaving of events in different traces is allowed, two possible semantics can be defined for timed hyperproperties: asynchronous and synchronous. While the satisfiability problem can be decided similarly to HyperLTL regardless of the choice of semantics, we show that the model-checking problem, unless the specification is alternation-free, is undecidable even when very restricted timing constraints are allowed. On the positive side, we show that model checking HyperMTL with quantifier alternations is possible under certain conditions in the synchronous semantics, or when there is a fixed bound on the length of the time domain.
Submitted 24 December, 2018;
originally announced December 2018.
-
Anomaly Detection in Paleoclimate Records using Permutation Entropy
Authors:
Joshua Garland,
Tyler R. Jones,
Michael Neuder,
Valerie Morris,
James W. C. White,
Elizabeth Bradley
Abstract:
Permutation entropy techniques can be useful in identifying anomalies in paleoclimate data records, including noise, outliers, and post-processing issues. We demonstrate this using weighted and unweighted permutation entropy of water-isotope records in a deep polar ice core. In one region of these isotope records, our previous calculations revealed an abrupt change in the complexity of the traces: specifically, in the amount of new information that appeared at every time step. We conjectured that this effect was due to noise introduced by an older laboratory instrument. In this paper, we validate that conjecture by re-analyzing a section of the ice core using a more-advanced version of the laboratory instrument. The anomalous noise levels are absent from the permutation entropy traces of the new data. In other sections of the core, we show that permutation entropy techniques can be used to identify anomalies in the raw data that are not associated with climatic or glaciological processes, but rather effects occurring during field work, laboratory analysis, or data post-processing. These examples make it clear that permutation entropy is a useful forensic tool for identifying sections of data that require targeted re-analysis---and can even be useful in guiding that analysis.
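For reference, unweighted permutation entropy of a series is the Shannon entropy of its ordinal-pattern distribution; the weighted variant additionally weights each window by its variance. A minimal sketch of the unweighted form (illustrative only, not the authors' code):

```python
import math
from collections import Counter

def permutation_entropy(series, order=3, delay=1):
    """Normalised permutation entropy of a 1-D series: 0 for a fully
    predictable ordinal structure, 1 when all order! ordinal patterns
    are equally likely."""
    patterns = Counter()
    span = (order - 1) * delay
    for i in range(len(series) - span):
        window = series[i:i + span + 1:delay]
        # Ordinal pattern = indices of the window sorted by value.
        patterns[tuple(sorted(range(order), key=window.__getitem__))] += 1
    total = sum(patterns.values())
    h = -sum((c / total) * math.log2(c / total) for c in patterns.values())
    return h / math.log2(math.factorial(order))
```

A sliding calculation of this quantity over a record is what reveals the abrupt complexity changes the abstract describes: anomalous noise pushes the value toward 1, since noise makes every ordinal pattern nearly equally likely.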
Submitted 29 November, 2018; v1 submitted 3 November, 2018;
originally announced November 2018.