Showing 1–50 of 61 results for author: Jones, T

  1. arXiv:2603.08721  [pdf, ps, other]

    cs.AR cs.LG cs.SE

    KernelCraft: Benchmarking for Agentic Close-to-Metal Kernel Generation on Emerging Hardware

    Authors: Jiayi Nie, Haoran Wu, Yao Lai, Zeyu Cao, Cheng Zhang, Binglei Lou, Erwei Wang, Jianyi Cheng, Timothy M. Jones, Robert Mullins, Rika Antonova, Yiren Zhao

    Abstract: New AI accelerators with novel instruction set architectures (ISAs) often require developers to manually craft low-level kernels -- a time-consuming, laborious, and error-prone process that cannot scale across diverse hardware targets. This prevents emerging hardware platforms from reaching the market efficiently. While prior LLM-based code generation has shown promise in mature GPU ecosystems, it…

    Submitted 10 February, 2026; originally announced March 2026.

  2. arXiv:2603.01570  [pdf, ps, other]

    cs.DB cs.LG

    Adversarial Query Synthesis via Bayesian Optimization

    Authors: Jeffrey Tao, Yimeng Zeng, Haydn Thomas Jones, Natalie Maus, Osbert Bastani, Jacob R. Gardner, Ryan Marcus

    Abstract: Benchmark workloads are extremely important to the database management research community, especially as more machine learning components are integrated into database systems. Here, we propose a Bayesian optimization technique to automatically search for difficult benchmark queries, significantly reducing the amount of manual effort usually required. In preliminary experiments, we show that our ap…

    Submitted 2 March, 2026; originally announced March 2026.

  3. arXiv:2601.22382  [pdf, ps, other]

    cs.LG

    Purely Agentic Black-Box Optimization for Biological Design

    Authors: Natalie Maus, Yimeng Zeng, Haydn Thomas Jones, Yining Huang, Gaurav Ng Goel, Alden Rose, Kyurae Kim, Hyun-Su Lee, Marcelo Der Torossian Torres, Fangping Wan, Cesar de la Fuente-Nunez, Mark Yatskar, Osbert Bastani, Jacob R. Gardner

    Abstract: Many key challenges in biological design, such as small-molecule drug discovery, antimicrobial peptide development, and protein engineering, can be framed as black-box optimization over vast, complex structured spaces. Existing methods rely mainly on raw structural data and struggle to exploit the rich scientific literature. While large language models (LLMs) have been added to these pipelines, they…

    Submitted 29 January, 2026; originally announced January 2026.

  4. arXiv:2601.15399  [pdf, ps, other]

    cs.LG

    Attention-Informed Surrogates for Navigating Power-Performance Trade-offs in HPC

    Authors: Ashna Nawar Ahmed, Banooqa Banday, Terry Jones, Tanzima Z. Islam

    Abstract: High-Performance Computing (HPC) schedulers must balance user performance with facility-wide resource constraints. The task boils down to selecting the optimal number of nodes for a given job. We present a surrogate-assisted multi-objective Bayesian optimization (MOBO) framework to automate this complex decision. Our core hypothesis is that surrogate models informed by attention-based embeddings o…

    Submitted 21 January, 2026; originally announced January 2026.

    Comments: 13 pages, 6 figures. Published in MLForSys workshop in NeurIPS 2025. Link: https://openreview.net/forum?id=R0Vc9lnDd5

  5. arXiv:2511.14759  [pdf, ps, other]

    cs.LG cs.RO

    $π^{*}_{0.6}$: a VLA That Learns From Experience

    Authors: Physical Intelligence, Ali Amin, Raichelle Aniceto, Ashwin Balakrishna, Kevin Black, Ken Conley, Grace Connors, James Darpinian, Karan Dhabalia, Jared DiCarlo, Danny Driess, Michael Equi, Adnan Esmail, Yunhao Fang, Chelsea Finn, Catherine Glossop, Thomas Godden, Ivan Goryachev, Lachy Groom, Hunter Hancock, Karol Hausman, Gashon Hussein, Brian Ichter, Szymon Jakubczak, Rowan Jen, et al. (31 additional authors not shown)

    Abstract: We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL). We present a general-purpose method, RL with Experience and Corrections via Advantage-conditioned Policies (RECAP), that provides for RL training of VLAs via advantage conditioning. Our method incorporates heterogeneous data into the self-improvement process, including demon…

    Submitted 18 November, 2025; v1 submitted 18 November, 2025; originally announced November 2025.

  6. arXiv:2509.21713  [pdf]

    cs.CY cs.AI

    Developing Strategies to Increase Capacity in AI Education

    Authors: Noah Q. Cowit, Sri Yash Tadimalla, Stephanie T. Jones, Mary Lou Maher, Tracy Camp, Enrico Pontelli

    Abstract: Many institutions are currently grappling with teaching artificial intelligence (AI) in the face of growing demand and relevance in our world. The Computing Research Association (CRA) has conducted 32 moderated virtual roundtable discussions of 202 experts committed to improving AI education. These discussions slot into four focus areas: AI Knowledge Areas and Pedagogy, Infrastructure Challenges i…

    Submitted 25 September, 2025; originally announced September 2025.

    Comments: This is a 40-page report prepared by the CRA, based on 32 virtual roundtable discussions with 202 experts from varied backgrounds committed to developing AI education

  7. arXiv:2509.09505  [pdf, ps, other]

    cs.AR

    Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference

    Authors: Haoran Wu, Can Xiao, Jiayi Nie, Xuan Guo, Binglei Lou, Jeffrey T. H. Wong, Zhiwen Mo, Cheng Zhang, Przemyslaw Forys, Wayne Luk, Hongxiang Fan, Jianyi Cheng, Timothy M. Jones, Rika Antonova, Robert Mullins, Aaron Zhao

    Abstract: LLMs now form the backbone of AI agents for a diverse array of applications, including tool use, command-line agents, and web or computer use agents. These agentic LLM inference tasks are fundamentally different from chatbot-focused inference -- they often have much larger context lengths to capture complex, prolonged inputs, such as entire webpage DOMs or complicated tool call trajectories. This,…

    Submitted 24 September, 2025; v1 submitted 11 September, 2025; originally announced September 2025.

  8. arXiv:2509.05448  [pdf, ps, other]

    cs.CE cs.AI

    Newton to Einstein: Axiom-Based Discovery via Game Design

    Authors: Pingchuan Ma, Benjamin Tod Jones, Tsun-Hsuan Wang, Minghao Guo, Michal Piotr Lipiec, Chuang Gan, Wojciech Matusik

    Abstract: This position paper argues that machine learning for scientific discovery should shift from inductive pattern recognition to axiom-based reasoning. We propose a game design framework in which scientific inquiry is recast as a rule-evolving system: agents operate within environments governed by axioms and modify them to explain outlier observations. Unlike conventional ML approaches that operate wi…

    Submitted 5 September, 2025; originally announced September 2025.

  9. arXiv:2508.20016  [pdf, ps, other]

    cs.DC cs.AI cs.ET eess.SY

    HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling

    Authors: Matthias Maiterth, Wesley H. Brewer, Jaya S. Kuruvella, Arunavo Dey, Tanzima Z. Islam, Kevin Menear, Dmitry Duplyakin, Rashadul Kabir, Tapasya Patki, Terry Jones, Feiyi Wang

    Abstract: Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis, or simulators, which do not model associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter c…

    Submitted 27 August, 2025; v1 submitted 27 August, 2025; originally announced August 2025.

  10. arXiv:2508.10899  [pdf, ps, other]

    cs.LG

    A Dataset for Distilling Knowledge Priors from Literature for Therapeutic Design

    Authors: Haydn Thomas Jones, Natalie Maus, Josh Magnus Ludan, Maggie Ziyu Huan, Jiaming Liang, Marcelo Der Torossian Torres, Jiatao Liang, Zachary Ives, Yoseph Barash, Cesar de la Fuente-Nunez, Jacob R. Gardner, Mark Yatskar

    Abstract: AI-driven discovery can greatly reduce design time and enhance new therapeutics' effectiveness. Models using simulators explore broad design spaces but risk violating implicit constraints due to a lack of experimental priors. For example, in a new analysis we performed on a diverse set of models on the GuacaMol benchmark using supervised classifiers, over 60% of molecules proposed had high probab…

    Submitted 11 September, 2025; v1 submitted 14 August, 2025; originally announced August 2025.

  11. arXiv:2508.08492  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Momentum Point-Perplexity Mechanics in Large Language Models

    Authors: Lorenzo Tomaz, Judd Rosenblatt, Thomas Berry Jones, Diogo Schwerz de Lucena

    Abstract: We take a physics-based approach to studying how the internal hidden states of large language models change from token to token during inference. Across 20 open-source transformer models (135M-3B parameters), we find that a quantity combining the rate of change in hidden states and the model's next-token certainty, analogous to energy in physics, remains nearly constant. Random-weight models conse…

    Submitted 11 August, 2025; originally announced August 2025.

  12. arXiv:2507.22943  [pdf]

    cs.CL stat.ME

    A chart review process aided by natural language processing and multi-wave adaptive sampling to expedite validation of code-based algorithms for large database studies

    Authors: Shirley V Wang, Georg Hahn, Sushama Kattinakere Sreedhara, Mufaddal Mahesri, Haritha S. Pillai, Rajendra Aldis, Joyce Lii, Sarah K. Dutcher, Rhoda Eniafe, Jamal T. Jones, Keewan Kim, Jiwei He, Hana Lee, Sengwee Toh, Rishi J Desai, Jie Yang

    Abstract: Background: One of the ways to enhance analyses conducted with large claims databases is by validating the measurement characteristics of code-based algorithms used to identify health outcomes or other key study parameters of interest. These metrics can be used in quantitative bias analyses to assess the robustness of results for an inferential study given potential bias from outcome misclassifica…

    Submitted 25 July, 2025; originally announced July 2025.

  13. arXiv:2507.16983  [pdf, ps, other]

    cs.LG cs.RO

    Hierarchical Reinforcement Learning Framework for Adaptive Walking Control Using General Value Functions of Lower-Limb Sensor Signals

    Authors: Sonny T. Jones, Grange M. Simpson, Patrick M. Pilarski, Ashley N. Dalrymple

    Abstract: Rehabilitation technology is a natural setting to study the shared learning and decision-making of human and machine agents. In this work, we explore the use of Hierarchical Reinforcement Learning (HRL) to develop adaptive control strategies for lower-limb exoskeletons, aiming to enhance mobility and autonomy for individuals with motor impairments. Inspired by prominent models of biological sensor…

    Submitted 22 July, 2025; originally announced July 2025.

    Comments: 5 pages, 3 figures, accepted at the 6th Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM2025), June 11-14, 2025

  14. arXiv:2505.08135  [pdf, ps, other]

    cs.SE cs.AI cs.DC cs.PF

    Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions

    Authors: Keita Teranishi, Harshitha Menon, William F. Godoy, Prasanna Balaprakash, David Bau, Tal Ben-Nun, Abhinav Bhatele, Franz Franchetti, Michael Franusich, Todd Gamblin, Giorgis Georgakoudis, Tom Goldstein, Arjun Guha, Steven Hahn, Costin Iancu, Zheming Jin, Terry Jones, Tze Meng Low, Het Mankad, Narasinga Rao Miniskar, Mohammad Alaul Haque Monil, Daniel Nichols, Konstantinos Parasyris, Swaroop Pophale, Pedro Valero-Lara, et al. (3 additional authors not shown)

    Abstract: We discuss the challenges and propose research directions for using AI to revolutionize the development of high-performance computing (HPC) software. AI technologies, in particular large language models, have transformed every aspect of software development. For its part, HPC software is recognized as a highly specialized scientific field of its own. We discuss the challenges associated with lever…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 12 pages, 1 Figure, Accepted at "The 1st International Workshop on Foundational Large Language Models Advances for HPC" LLM4HPC to be held in conjunction with ISC High Performance 2025

    Journal ref: In: Neuwirth, S., Paul, A.K., Weinzierl, T., Carson, E.C. (eds) High Performance Computing. ISC High Performance 2025. Lecture Notes in Computer Science, vol 16091. Springer, Cham

  15. arXiv:2504.16054  [pdf, other]

    cs.LG cs.RO

    $π_{0.5}$: a Vision-Language-Action Model with Open-World Generalization

    Authors: Physical Intelligence, Kevin Black, Noah Brown, James Darpinian, Karan Dhabalia, Danny Driess, Adnan Esmail, Michael Equi, Chelsea Finn, Niccolo Fusai, Manuel Y. Galliker, Dibya Ghosh, Lachy Groom, Karol Hausman, Brian Ichter, Szymon Jakubczak, Tim Jones, Liyiming Ke, Devin LeBlanc, Sergey Levine, Adrian Li-Bell, Mohith Mothukuri, Suraj Nair, Karl Pertsch, Allen Z. Ren, et al. (11 additional authors not shown)

    Abstract: In order for robots to be useful, they must perform practically relevant tasks in the real world, outside of the lab. While vision-language-action (VLA) models have demonstrated impressive results for end-to-end robot control, it remains an open question how far such models can generalize in the wild. We describe $π_{0.5}$, a new model based on $π_{0}$ that uses co-training on heterogeneous tasks…

    Submitted 22 April, 2025; originally announced April 2025.

  16. arXiv:2504.01380  [pdf, other]

    cs.CR cs.AR

    FireGuard: A Generalized Microarchitecture for Fine-Grained Security Analysis on OoO Superscalar Cores

    Authors: Zhe Jiang, Sam Ainsworth, Timothy Jones

    Abstract: High-performance security guarantees rely on hardware support. Generic programmable support for fine-grained instruction analysis has gained broad interest in the literature as a fundamental building block for the security of future processors. Yet, implementation in real out-of-order (OoO) superscalar processors presents tough challenges that cannot be explored in highly abstract simulators. We d…

    Submitted 2 April, 2025; originally announced April 2025.

  17. arXiv:2504.01347  [pdf, other]

    cs.AR

    MEEK: Re-thinking Heterogeneous Parallel Error Detection Architecture for Real-World OoO Superscalar Processors

    Authors: Zhe Jiang, Minli Liao, Sam Ainsworth, Dean You, Timothy Jones

    Abstract: Heterogeneous parallel error detection is an approach to achieving fault-tolerant processors, leveraging multiple power-efficient cores to re-execute software originally run on a high-performance core. Yet, its complex components, gathering data cross-chip from many parts of the core, raise questions of how to build it into commodity cores without heavy design invasion and extensive re-engineering…

    Submitted 2 April, 2025; originally announced April 2025.

  18. arXiv:2503.08131  [pdf, ps, other]

    cs.LG

    Large Scale Multi-Task Bayesian Optimization with Large Language Models

    Authors: Yimeng Zeng, Natalie Maus, Haydn Thomas Jones, Jeffrey Tao, Fangping Wan, Marcelo Der Torossian Torres, Cesar de la Fuente-Nunez, Ryan Marcus, Osbert Bastani, Jacob R. Gardner

    Abstract: In multi-task Bayesian optimization, the goal is to leverage experience from optimizing existing tasks to improve the efficiency of optimizing new ones. While approaches using multi-task Gaussian processes or deep kernel transfer exist, the performance improvement is marginal when scaling beyond a moderate number of tasks. We introduce a novel approach leveraging large language models (LLMs) to le…

    Submitted 12 June, 2025; v1 submitted 11 March, 2025; originally announced March 2025.

  19. arXiv:2502.09819  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.PL

    A Solver-Aided Hierarchical Language for LLM-Driven CAD Design

    Authors: Benjamin T. Jones, Felix Hähnlein, Zihan Zhang, Maaz Ahmad, Vladimir Kim, Adriana Schulz

    Abstract: Large language models (LLMs) have been enormously successful in solving a wide variety of structured and unstructured generative tasks, but they struggle to generate procedural geometry in Computer Aided Design (CAD). These difficulties arise from an inability to do spatial reasoning and the necessity to guide a model through complex, long range planning to generate complex geometry. We enable gen…

    Submitted 13 February, 2025; originally announced February 2025.

  20. arXiv:2501.19342  [pdf, ps, other]

    cs.LG

    Covering Multiple Objectives with a Small Set of Solutions Using Bayesian Optimization

    Authors: Natalie Maus, Kyurae Kim, Yimeng Zeng, Haydn Thomas Jones, Fangping Wan, Marcelo Der Torossian Torres, Cesar de la Fuente-Nunez, Jacob R. Gardner

    Abstract: In multi-objective black-box optimization, the goal is typically to find solutions that optimize a set of $T$ black-box objective functions, $f_1, \ldots, f_T$, simultaneously. Traditional approaches often seek a single Pareto-optimal set that balances trade-offs among all objectives. In contrast, we consider a problem setting that departs from this paradigm: finding a small set of $K < T$ solution…

    Submitted 27 October, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

  21. arXiv:2411.19896  [pdf, other]

    quant-ph cs.LG stat.ML

    Efficient quantum-enhanced classical simulation for patches of quantum landscapes

    Authors: Sacha Lerch, Ricard Puig, Manuel S. Rudolph, Armando Angrisani, Tyson Jones, M. Cerezo, Supanut Thanasilp, Zoë Holmes

    Abstract: Understanding the capabilities of classical simulation methods is key to identifying where quantum computers are advantageous. Not only does this ensure that quantum computers are used only where necessary, but also one can potentially identify subroutines that can be offloaded onto a classical device. In this work, we show that it is always possible to generate a classical surrogate of a sub-regi…

    Submitted 29 November, 2024; originally announced November 2024.

    Comments: 10 + 47 pages, 4 figures

    Report number: LA-UR: LA-UR-24-3269

  22. arXiv:2410.24164  [pdf, ps, other]

    cs.LG cs.RO

    $π_0$: A Vision-Language-Action Flow Model for General Robot Control

    Authors: Kevin Black, Noah Brown, Danny Driess, Adnan Esmail, Michael Equi, Chelsea Finn, Niccolo Fusai, Lachy Groom, Karol Hausman, Brian Ichter, Szymon Jakubczak, Tim Jones, Liyiming Ke, Sergey Levine, Adrian Li-Bell, Mohith Mothukuri, Suraj Nair, Karl Pertsch, Lucy Xiaoyang Shi, James Tanner, Quan Vuong, Anna Walling, Haohuan Wang, Ury Zhilinsky

    Abstract: Robot learning holds tremendous promise to unlock the full potential of flexible, general, and dexterous robot systems, as well as to address some of the deepest questions in artificial intelligence. However, bringing robot learning to the level of generality required for effective real-world systems faces major obstacles in terms of data, generalization, and robustness. In this paper, we discuss…

    Submitted 8 January, 2026; v1 submitted 31 October, 2024; originally announced October 2024.

    Comments: See project website for videos: https://physicalintelligence.company/blog/pi0. Published in RSS 2025

  23. arXiv:2408.17324  [pdf, other]

    cs.LG cs.AI cs.CL

    Modularity in Transformers: Investigating Neuron Separability & Specialization

    Authors: Nicholas Pochinkov, Thomas Jones, Mohammed Rashidur Rahman

    Abstract: Transformer models are increasingly prevalent in various applications, yet our understanding of their internal workings remains limited. This paper investigates the modularity and task specialization of neurons within transformer architectures, focusing on both vision (ViT) and language (Mistral 7B) models. Using a combination of selective pruning and MoEfication clustering techniques, we analyze…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: 11 pages, 6 figures

    MSC Class: 68T07 (Primary) 68Q32; 68T05 (Secondary) ACM Class: I.2.4; I.2.6; I.2.7

  24. arXiv:2407.06362  [pdf, other]

    cs.RO physics.app-ph

    Self-deployable contracting-cord metamaterials with tunable mechanical properties

    Authors: Wenzhong Yan, Talmage Jones, Christopher L. Jawetz, Ryan H. Lee, Jonathan B. Hopkins, Ankur Mehta

    Abstract: Recent advances in active materials and fabrication techniques have enabled the production of cyclically self-deployable metamaterials with an expanded functionality space. However, designing metamaterials that possess continuously tunable mechanical properties after self-deployment remains a challenge, notwithstanding its importance. Inspired by push puppets, we introduce an efficient design stra…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 6 figures

    Journal ref: Materials Horizons (2024)

  25. arXiv:2405.15113  [pdf, other]

    cs.RO

    A Wearable Resistance Devices Motor Learning Effects in Exercise

    Authors: Eugenio Frias-Miranda, Hong-Anh Nguyen, Jeremy Hampton, Trenner Jones, Benjamin Spotts, Matthew Cochran, Deva Chan, Laura H Blumenschein

    Abstract: The integration of technology into exercise regimens has emerged as a strategy to enhance normal human capabilities and return human motor function after injury or illness by enhancing motor learning and retention. Much research has focused on how active devices, whether confined to a lab or made into a wearable format, can apply forces at set times and conditions to optimize the process of learni…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 8 pages, 9 figures, To be published in IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob) 2024

  26. arXiv:2402.01796  [pdf, other]

    eess.AS cs.CL cs.LG

    Speech foundation models in healthcare: Effect of layer selection on pathological speech feature prediction

    Authors: Daniela A. Wiepert, Rene L. Utianski, Joseph R. Duffy, John L. Stricker, Leland R. Barnard, David T. Jones, Hugo Botha

    Abstract: Accurately extracting clinical information from speech is critical to the diagnosis and treatment of many neurological conditions. As such, there is interest in leveraging AI for automatic, objective assessments of clinical speech to facilitate diagnosis and treatment of speech disorders. We explore transfer learning using foundation models, focusing on the impact of layer selection for the downst…

    Submitted 21 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted to INTERSPEECH 2024

  27. arXiv:2401.16971  [pdf, other]

    cs.DC

    Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations

    Authors: Francieli Boito, Jim Brandt, Valeria Cardellini, Philip Carns, Florina M. Ciorba, Hilary Egan, Ahmed Eleliemy, Ann Gentile, Thomas Gruber, Jeff Hanson, Utz-Uwe Haus, Kevin Huck, Thomas Ilsche, Thomas Jakobsche, Terry Jones, Sven Karlsson, Abdullah Mueen, Michael Ott, Tapasya Patki, Ivy Peng, Krishnan Raghavan, Stephen Simms, Kathleen Shoga, Michael Showerman, Devesh Tiwari, et al. (2 additional authors not shown)

    Abstract: Many High Performance Computing (HPC) facilities have developed and deployed frameworks in support of continuous monitoring and operational data analytics (MODA) to help improve efficiency and throughput. Because of the complexity and scale of systems and workflows and the need for low-latency response to address dynamic circumstances, automated feedback and response have the potential to be more…

    Submitted 30 January, 2024; originally announced January 2024.

  28. arXiv:2310.13010  [pdf, other]

    eess.AS cs.AI

    Detecting Speech Abnormalities with a Perceiver-based Sequence Classifier that Leverages a Universal Speech Model

    Authors: Hagen Soltau, Izhak Shafran, Alex Ottenwess, Joseph R. JR Duffy, Rene L. Utianski, Leland R. Barnard, John L. Stricker, Daniela Wiepert, David T. Jones, Hugo Botha

    Abstract: We propose a Perceiver-based sequence classifier to detect abnormalities in speech reflective of several neurological disorders. We combine this classifier with a Universal Speech Model (USM) that is trained (unsupervised) on 12 million hours of diverse audio recordings. Our model compresses long sequences into a small set of class-specific latent representations and a factorized projection is use…

    Submitted 16 October, 2023; originally announced October 2023.

    Journal ref: Proc. ASRU, 2023

  29. arXiv:2306.03217  [pdf, other]

    cs.GR

    Zero-shot CAD Program Re-Parameterization for Interactive Manipulation

    Authors: Milin Kodnongbua, Benjamin T. Jones, Maaz Bin Safeer Ahmad, Vladimir G. Kim, Adriana Schulz

    Abstract: Parametric CAD models encode entire families of shapes that should, in principle, be easy for designers to explore. However, in practice, parametric CAD models can be difficult to manipulate due to implicit semantic constraints among parameter values. Finding and enforcing these semantic constraints solely from geometry or programmatic shape representations is not possible because these constraint…

    Submitted 5 June, 2023; originally announced June 2023.

  30. arXiv:2302.10776  [pdf]

    cs.LG stat.AP

    SparCA: Sparse Compressed Agglomeration for Feature Extraction and Dimensionality Reduction

    Authors: Leland Barnard, Farwa Ali, Hugo Botha, David T. Jones

    Abstract: The most effective dimensionality reduction procedures produce interpretable features from the raw input space while also providing good performance for downstream supervised learning tasks. For many methods, this requires optimizing one or more hyperparameters for a specific task, which can limit generalizability. In this study we propose sparse compressed agglomeration (SparCA), a novel dimensio…

    Submitted 26 January, 2023; originally announced February 2023.

    Comments: 17 pages, 5 figures, 3 tables

  31. arXiv:2211.13353  [pdf, ps, other]

    cs.SI math.CO

    Effects of Backtracking on PageRank

    Authors: Cory Glover, Tyler Jones, Mark Kempton, Alice Oveson

    Abstract: In this paper, we consider three variations on standard PageRank: Non-backtracking PageRank, $μ$-PageRank, and $\infty$-PageRank, all of which alter the standard formula by adjusting the likelihood of backtracking in the algorithm's random walk. We show that in the case of regular and bipartite biregular graphs, standard PageRank and its variants are equivalent. We also compare each centrality mea…

    Submitted 9 February, 2026; v1 submitted 23 November, 2022; originally announced November 2022.

    MSC Class: 05C85; 05C82; 05C50

  32. arXiv:2210.10807  [pdf, other]

    cs.CV cs.GR

    Self-Supervised Representation Learning for CAD

    Authors: Benjamin T. Jones, Michael Hu, Vladimir G. Kim, Adriana Schulz

    Abstract: The design of man-made objects is dominated by computer aided design (CAD) tools. Assisting design with data-driven machine learning methods is hampered by lack of labeled data in CAD's native format; the parametric boundary representation (B-Rep). Several data sets of mechanical parts in B-Rep format have recently been released for machine learning research. However, large scale databases are lar…

    Submitted 19 October, 2022; originally announced October 2022.

  33. arXiv:2210.09975  [pdf]

    eess.AS cs.CR cs.LG cs.SD

    Risk of re-identification for shared clinical speech recordings

    Authors: Daniela A. Wiepert, Bradley A. Malin, Joseph R. Duffy, Rene L. Utianski, John L. Stricker, David T. Jones, Hugo Botha

    Abstract: Large, curated datasets are required to leverage speech-based tools in healthcare. These are costly to produce, resulting in increased interest in data sharing. As speech can potentially identify speakers (i.e., voiceprints), sharing recordings raises privacy concerns. We examine the re-identification risk for speech recordings, without reference to demographic or metadata, using a state-of-the-ar…

    Submitted 21 August, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 24 pages, 6 figures

  34. arXiv:2208.01779  [pdf, other]

    cs.CV

    Mates2Motion: Learning How Mechanical CAD Assemblies Work

    Authors: James Noeckel, Benjamin T. Jones, Karl Willis, Brian Curless, Adriana Schulz

    Abstract: We describe our work on inferring the degrees of freedom between mated parts in mechanical assemblies using deep learning on CAD representations. We train our model using a large dataset of real-world mechanical assemblies consisting of CAD parts and mates joining them together. We present methods for re-defining these mates to make them better reflect the motion of the assembly, as well as narrow…

    Submitted 4 May, 2023; v1 submitted 2 August, 2022; originally announced August 2022.

    Comments: Contains 5 pages, 2 figures. Presented at the ICML 2022 Workshop on Machine Learning in Computational Design

  35. arXiv:2207.01336  [pdf, ps, other]

    cs.NI cs.LG

    Spectral Power Profile Optimization of Field-Deployed WDM Network by Remote Link Modeling

    Authors: Rasmus T. Jones, Kyle R. H. Bottrill, Natsupa Taengnoi, Periklis Petropoulos, Metodi P. Yankov

    Abstract: A digital twin model of a multi-node WDM network is obtained from a single access point. The model is used to predict and optimize the transmit power profile for each link in the network, and margin improvements of up to 2.2 dB are obtained w.r.t. unoptimized transmission.

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: accepted, European Conference on Optical Communications, ECOC 2022

  36. arXiv:2203.03737  [pdf, other]

    eess.SY cs.AI

    Battery Cloud with Advanced Algorithms

    Authors: Xiaojun Li, David Jauernig, Mengzhu Gao, Trevor Jones

    Abstract: A Battery Cloud or cloud battery management system leverages the cloud computational power and data storage to improve battery safety, performance, and economy. This work will present the Battery Cloud that collects measured battery data from electric vehicles and energy storage systems. Advanced algorithms are applied to improve battery performance. Using remote vehicle data, we train and validat…

    Submitted 12 May, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

  37. arXiv:2201.11872  [pdf, other]

    cs.LG stat.ML

    Local Latent Space Bayesian Optimization over Structured Inputs

    Authors: Natalie Maus, Haydn T. Jones, Juston S. Moore, Matt J. Kusner, John Bradshaw, Jacob R. Gardner

    Abstract: Bayesian optimization over the latent spaces of deep autoencoder models (DAEs) has recently emerged as a promising new approach for optimizing challenging black-box functions over structured, discrete, hard-to-enumerate search spaces (e.g., molecules). Here the DAE dramatically simplifies the search space by mapping inputs into a continuous latent space where familiar Bayesian optimization tools c…

    Submitted 22 February, 2023; v1 submitted 27 January, 2022; originally announced January 2022.

  38. arXiv:2110.02150  [pdf, other]

    cs.PF

    Online Application Guidance for Heterogeneous Memory Systems

    Authors: M. Ben Olson, Brandon Kammerdiener, Kshitij A. Doshi, Terry Jones, Michael R. Jantz

    Abstract: Many high-end and next-generation computing systems have incorporated alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as higher bandwidth with lower capacity or vice versa, they are typically packaged alongside conventional SDRAM in a heterogeneous memory architecture. To utilize t…

    Submitted 5 October, 2021; originally announced October 2021.

  39. arXiv:2108.01727  [pdf, other]

    cs.SI

    Scalable Community Detection in Massive Networks Using Aggregated Relational Data

    Authors: Timothy Jones, Owen G. Ward, Yiran Jiang, John Paisley, Tian Zheng

    Abstract: The mixed membership stochastic blockmodel (MMSB) is a popular Bayesian network model for community detection. Fitting such large Bayesian network models quickly becomes computationally infeasible when the number of nodes grows into hundreds of thousands and millions. In this paper we propose a novel mini-batch strategy based on aggregated relational data that leverages nodal information to fit MM…

    Submitted 23 May, 2024; v1 submitted 22 July, 2021; originally announced August 2021.

  40. arXiv:2106.16187  [pdf, other]

    cs.LG

    Reinforcement Learning based Disease Progression Model for Alzheimer's Disease

    Authors: Krishnakant V. Saboo, Anirudh Choudhary, Yurui Cao, Gregory A. Worrell, David T. Jones, Ravishankar K. Iyer

    Abstract: We model Alzheimer's disease (AD) progression by combining differential equations (DEs) and reinforcement learning (RL) with domain knowledge. DEs provide relationships between some, but not all, factors relevant to AD. We assume that the missing relationships must satisfy general criteria about the working of the brain, e.g., maximizing cognition while minimizing the cost of supporting cognit…

    Submitted 2 November, 2021; v1 submitted 30 June, 2021; originally announced June 2021.

    Comments: 10 pages main text, 3 page references, 11 page appendix

  41. arXiv:2106.12117  [pdf]

    cs.NI

    Long-Range Time-Synchronisation Methods in LoRaWAN-based IoT

    Authors: Timothy Jones, Khondokar Fida Hasan

    Abstract: LoRa (Long-Range) is an LPWAN (low-power wide-area network) protocol that is part of the IoT family that focusses on long-range communication of up to 14 km, albeit with delay-inherent transmissions. Three IoT-based time synchronisation methodologies are analysed, and their efficacy measured through a systematic critical literature review. These include a GNSS-based method, an off-the-shelf GPS har…

    Submitted 22 June, 2021; originally announced June 2021.

    Comments: 23 Pages

  42. Data-driven Thermal Anomaly Detection for Batteries using Unsupervised Shape Clustering

    Authors: Xiaojun Li, Jianwei Li, Ali Abdollahi, Trevor Jones

    Abstract: For electric vehicles (EV) and energy storage (ES) batteries, thermal runaway is a critical issue as it can lead to uncontrollable fires or even explosions. Thermal anomaly detection can identify problematic battery packs that may eventually undergo thermal runaway. However, there are common challenges like data unavailability, environment and configuration variations, and battery aging. We propos…

    Submitted 19 May, 2021; v1 submitted 15 March, 2021; originally announced March 2021.

    Comments: 6 pages

    Journal ref: 2021 IEEE 30th International Symposium on Industrial Electronics (ISIE), 2021, pp. 1-6

  43. arXiv:2008.08285  [pdf, other]

    cs.DB cs.DC cs.DS

    Scalable Blocking for Very Large Databases

    Authors: Andrew Borthwick, Stephen Ash, Bin Pang, Shehzad Qureshi, Timothy Jones

    Abstract: In the field of database deduplication, the goal is to find approximately matching records within a database. Blocking is a typical stage in this process that involves cheaply finding candidate pairs of records that are potential matches for further processing. We present here Hashed Dynamic Blocking, a new approach to blocking designed to address datasets larger than those studied in most prior w…

    Submitted 19 August, 2020; originally announced August 2020.

  44. arXiv:2007.03152  [pdf, other]

    cs.AR

    The gem5 Simulator: Version 20.0+

    Authors: Jason Lowe-Power, Abdul Mutaal Ahmad, Ayaz Akram, Mohammad Alian, Rico Amslinger, Matteo Andreozzi, Adrià Armejach, Nils Asmussen, Brad Beckmann, Srikant Bharadwaj, Gabe Black, Gedare Bloom, Bobby R. Bruce, Daniel Rodrigues Carvalho, Jeronimo Castrillon, Lizhong Chen, Nicolas Derumigny, Stephan Diestelhorst, Wendy Elsasser, Carlos Escuin, Marjan Fariborz, Amin Farmahini-Farahani, Pouya Fotouhi, Ryan Gambord, Jayneel Gandhi, et al. (53 additional authors not shown)

    Abstract: The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern computer hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, Arm, and RISC-V. The gem5 si…

    Submitted 29 September, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: Source, comments, and feedback: https://github.com/darchr/gem5-20-paper

  45. arXiv:1911.11061  [pdf, other]

    cs.IR cs.LG stat.ML

    A Coefficient of Determination for Probabilistic Topic Models

    Authors: Tommy Jones

    Abstract: This research proposes a new (old) metric for evaluating goodness of fit in topic models, the coefficient of determination, or $R^2$. Within the context of topic modeling, $R^2$ has the same interpretation that it does when used in a broader class of statistical models. Reporting $R^2$ with topic models addresses two current problems in topic modeling: a lack of standard cross-contextual evaluatio…

    Submitted 25 November, 2019; v1 submitted 19 November, 2019; originally announced November 2019.

  46. MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State

    Authors: Sam Ainsworth, Timothy M. Jones

    Abstract: The disclosure of the Spectre speculative-execution attacks in January 2018 has left a severe vulnerability that systems are still struggling with how to patch. The solutions that currently exist tend to have incomplete coverage, perform badly, or have highly undesirable edge cases that cause application domains to break. MuonTrap allows processors to continue to speculate, avoiding significant…

    Submitted 28 April, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

  47. Organization of machine learning based product development as per ISO 26262 and ISO/PAS 21448

    Authors: Krystian Radlak, Michał Szczepankiewicz, Tim Jones, Piotr Serwa

    Abstract: Machine learning (ML) algorithms generate a continuous stream of success stories from various domains and enable many novel applications in safety-critical systems. With the advent of autonomous driving, ML algorithms are being used in the automotive domain, where the applicable functional safety standard is ISO 26262. However, requirements and recommendations provided by ISO 26262 do not cover sp…

    Submitted 6 January, 2021; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: 10 pages, 2 figures

  48. arXiv:1907.08535  [pdf, other]

    cs.IT eess.SP stat.ML

    End-to-end Learning for GMI Optimized Geometric Constellation Shape

    Authors: Rasmus T. Jones, Metodi P. Yankov, Darko Zibar

    Abstract: Autoencoder-based geometric shaping is proposed that includes optimizing bit mappings. Up to 0.2 bits/QAM symbol gain in GMI is achieved for a variety of data rates and in the presence of transceiver impairments. The gains can be harvested with standard binary FEC at no cost w.r.t. conventional BICM.

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: submitted to ECOC 2019

  49. arXiv:1812.10005  [pdf, ps, other]

    cs.LO

    On Verifying Timed Hyperproperties

    Authors: Hsi-Ming Ho, Ruoyu Zhou, Timothy M. Jones

    Abstract: We study the satisfiability and model-checking problems for timed hyperproperties specified with HyperMTL, a timed extension of HyperLTL. Depending on whether interleaving of events in different traces is allowed, two possible semantics can be defined for timed hyperproperties: asynchronous and synchronous. While the satisfiability problem can be decided similarly to HyperLTL regardless of the cho…

    Submitted 24 December, 2018; originally announced December 2018.

  50. arXiv:1811.01272  [pdf, other]

    physics.data-an cs.IT physics.ao-ph

    Anomaly Detection in Paleoclimate Records using Permutation Entropy

    Authors: Joshua Garland, Tyler R. Jones, Michael Neuder, Valerie Morris, James W. C. White, Elizabeth Bradley

    Abstract: Permutation entropy techniques can be useful in identifying anomalies in paleoclimate data records, including noise, outliers, and post-processing issues. We demonstrate this using weighted and unweighted permutation entropy of water-isotope records in a deep polar ice core. In one region of these isotope records, our previous calculations revealed an abrupt change in the complexity of the traces:…

    Submitted 29 November, 2018; v1 submitted 3 November, 2018; originally announced November 2018.

    Comments: 15 pages, 7 figures

    Journal ref: Entropy 2018, 20(12), 931