Showing 1–50 of 90 results for author: Balaprakash, P

Searching in archive cs.
  1. arXiv:2604.05068  [pdf, ps, other]

    cs.LG

    Towards Scaling Law Analysis For Spatiotemporal Weather Data

    Authors: Alexander Kiefer, Prasanna Balaprakash, Xiao Wang

    Abstract: Compute-optimal scaling laws are relatively well studied for NLP and CV, where objectives are typically single-step and targets are comparatively homogeneous. Weather forecasting is harder to characterize in the same framework: autoregressive rollouts compound errors over long horizons, outputs couple many physical channels with disparate scales and predictability, and globally pooled test metrics…

    Submitted 6 April, 2026; originally announced April 2026.

    Comments: 9 pages, 6 figures, High Performance Computing for Imaging 2026

  2. arXiv:2511.10810  [pdf, ps, other]

    cs.AI

    HARNESS: Human-Agent Risk Navigation and Event Safety System for Proactive Hazard Forecasting in High-Risk DOE Environments

    Authors: Ran Elgedawy, Sanjay Das, Ethan Seefried, Gavin Wiggins, Ryan Burchfield, Dana Hewit, Sudarshan Srinivasan, Todd Thomas, Prasanna Balaprakash, Tirthankar Ghosal

    Abstract: Operational safety at mission-critical work sites is a top priority given the complex and hazardous nature of daily tasks. This paper presents the Human-Agent Risk Navigation and Event Safety System (HARNESS), a modular AI framework designed to forecast hazardous events and analyze operational risks in U.S. Department of Energy (DOE) environments. HARNESS integrates Large Language Models (LLMs) wi…

    Submitted 13 November, 2025; originally announced November 2025.

  3. arXiv:2511.04956  [pdf, ps, other]

    cs.AI cs.CL

    ORCHID: Orchestrated Retrieval-Augmented Classification with Human-in-the-Loop Intelligent Decision-Making for High-Risk Property

    Authors: Maria Mahbub, Vanessa Lama, Sanjay Das, Brian Starks, Christopher Polchek, Saffell Silvers, Lauren Deck, Prasanna Balaprakash, Tirthankar Ghosal

    Abstract: High-Risk Property (HRP) classification is critical at U.S. Department of Energy (DOE) sites, where inventories include sensitive and often dual-use equipment. Compliance must track evolving rules designated by various export control policies to make transparent and auditable decisions. Traditional expert-only workflows are time-consuming, backlog-prone, and struggle to keep pace with shifting reg…

    Submitted 6 November, 2025; originally announced November 2025.

  4. arXiv:2510.25137  [pdf, ps, other]

    cs.CY cs.MA

    The Iceberg Index: Measuring Skills-centered Exposure in the AI Economy

    Authors: Ayush Chopra, Santanu Bhattacharya, DeAndrea Salvador, Ayan Paul, Teddy Wright, Aditi Garg, Feroz Ahmad, Alice C. Schwarze, Ramesh Raskar, Prasanna Balaprakash

    Abstract: Artificial Intelligence is reshaping America's …

    Submitted 26 November, 2025; v1 submitted 28 October, 2025; originally announced October 2025.

    Comments: iceberg.mit.edu

  5. arXiv:2510.03413  [pdf, ps, other]

    cs.CE cs.AI

    Report of the 2025 Workshop on Next-Generation Ecosystems for Scientific Computing: Harnessing Community, Software, and AI for Cross-Disciplinary Team Science

    Authors: Lois Curfman McInnes, Dorian Arnold, Prasanna Balaprakash, Mike Bernhardt, Beth Cerny, Anshu Dubey, Roscoe Giles, Denice Ward Hood, Mary Ann Leung, Vanessa Lopez-Marrero, Paul Messina, Olivia B. Newton, Chris Oehmen, Stefan M. Wild, Jim Willenbring, Lou Woodley, Tony Baylis, David E. Bernholdt, Chris Camano, Johannah Cohoon, Charles Ferenbaugh, Stephen M. Fiore, Sandra Gesing, Diego Gomez-Zara, James Howison , et al. (18 additional authors not shown)

    Abstract: This report summarizes insights from the 2025 Workshop on Next-Generation Ecosystems for Scientific Computing: Harnessing Community, Software, and AI for Cross-Disciplinary Team Science, which convened more than 40 experts from national laboratories, academia, industry, and community organizations to chart a path toward more powerful, sustainable, and collaborative scientific software ecosystems.…

    Submitted 7 October, 2025; v1 submitted 3 October, 2025; originally announced October 2025.

    Comments: 38 pages, 6 figures

    Report number: ANL-25/47 MSC Class: 68T01; 68U01; 97M10 ACM Class: I.6.0; I.2.0; G.4; D.0

  6. arXiv:2510.00133  [pdf, ps, other]

    cs.LG

    Large Language Models Inference Engines based on Spiking Neural Networks

    Authors: Adarsha Balaji, Sandeep Madireddy, Prasanna Balaprakash

    Abstract: Foundational models based on the transformer architecture are currently the state-of-the-art in general language modeling, as well as in scientific areas such as material science and climate. However, training and deploying these models is computationally challenging as the time and space complexity has a quadratic relation to the input sequence length. Several efforts exploring efficient computat…

    Submitted 14 October, 2025; v1 submitted 30 September, 2025; originally announced October 2025.

  7. arXiv:2509.13978  [pdf, ps, other]

    cs.DC cs.AI cs.DB

    LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology

    Authors: Renan Souza, Timothy Poteet, Brian Etz, Daniel Rosendo, Amal Gueroudji, Woong Shin, Prasanna Balaprakash, Rafael Ferreira da Silva

    Abstract: Modern scientific discovery increasingly relies on workflows that process data across the Edge, Cloud, and High Performance Computing (HPC) continuum. Comprehensive and in-depth analyses of these data are critical for hypothesis validation, anomaly detection, reproducibility, and impactful findings. Although workflow provenance techniques support such analyses, at large scale, the provenance data…

    Submitted 23 September, 2025; v1 submitted 17 September, 2025; originally announced September 2025.

    Comments: Paper accepted in the proceedings of the Supercomputing Conference (SC). Cite it as Renan Souza, Timothy Poteet, Brian Etz, Daniel Rosendo, Amal Gueroudji, Woong Shin, Prasanna Balaprakash, and Rafael Ferreira da Silva. LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology. In WORKS at the ACM/IEEE International Conference on Supercomputing, 2025

    MSC Class: 68M14; 68M20; 68T07 ACM Class: C.2.4; D.1.3; I.2.0

  8. arXiv:2509.10378  [pdf, ps, other]

    hep-lat cs.LG

    Matrix-free Neural Preconditioner for the Dirac Operator in Lattice Gauge Theory

    Authors: Yixuan Sun, Srinivas Eswar, Yin Lin, William Detmold, Phiala Shanahan, Xiaoye Li, Yang Liu, Prasanna Balaprakash

    Abstract: Linear systems arise in generating samples and in calculating observables in lattice quantum chromodynamics (QCD). Solving the Hermitian positive definite systems, which are sparse but ill-conditioned, involves using iterative methods, such as Conjugate Gradient (CG), which are time-consuming and computationally expensive. Preconditioners can effectively accelerate this process, with the state-of-…

    Submitted 12 September, 2025; originally announced September 2025.

  9. arXiv:2509.09915  [pdf, ps, other]

    cs.AI cs.DC

    The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science

    Authors: Woong Shin, Renan Souza, Daniel Rosendo, Frédéric Suter, Feiyi Wang, Prasanna Balaprakash, Rafael Ferreira da Silva

    Abstract: Modern scientific discovery increasingly requires coordinating distributed facilities and heterogeneous resources, forcing researchers to act as manual workflow coordinators rather than scientists. Advances in AI leading to AI agents show exciting new opportunities that can accelerate scientific discovery by providing intelligence as a component in the ecosystem. However, it is unclear how this ne…

    Submitted 11 September, 2025; originally announced September 2025.

  10. arXiv:2508.16489  [pdf, ps, other]

    physics.ao-ph cs.LG

    Ensembles of Neural Surrogates for Parametric Sensitivity in Ocean Modeling

    Authors: Yixuan Sun, Romain Egele, Sri Hari Krishna Narayanan, Luke Van Roekel, Carmelo Gonzales, Steven Brus, Balu Nadiga, Sandeep Madireddy, Prasanna Balaprakash

    Abstract: Accurate simulations of the oceans are crucial in understanding the Earth system. Despite their efficiency, simulations at lower resolutions must rely on various uncertain parameterizations to account for unresolved processes. However, model sensitivity to parameterizations is difficult to quantify, making it challenging to tune these parameterizations to reproduce observations. Deep learning surr…

    Submitted 26 August, 2025; v1 submitted 22 August, 2025; originally announced August 2025.

    Comments: 12 pages, 7 figures

  11. arXiv:2508.02866  [pdf, ps, other]

    cs.DC cs.DB

    PROV-AGENT: Unified Provenance for Tracking AI Agent Interactions in Agentic Workflows

    Authors: Renan Souza, Amal Gueroudji, Stephen DeWitt, Daniel Rosendo, Tirthankar Ghosal, Robert Ross, Prasanna Balaprakash, Rafael Ferreira da Silva

    Abstract: Large Language Models (LLMs) and other foundation models are increasingly used as the core of AI agents. In agentic workflows, these agents plan tasks, interact with humans and peers, and influence scientific outcomes across federated and heterogeneous environments. However, agents can hallucinate or reason incorrectly, propagating errors when one agent's output becomes another's input. Thus, assu…

    Submitted 20 August, 2025; v1 submitted 4 August, 2025; originally announced August 2025.

    Comments: Paper accepted for publication in the Proceedings of the 2025 IEEE 21st International Conference on e-Science. Cite it as: R. Souza, A. Gueroudji, S. DeWitt, D. Rosendo, T. Ghosal, R. Ross, P. Balaprakash, R. F. da Silva, "PROV-AGENT: Unified Provenance for Tracking AI Agent Interactions in Agentic Workflows," IEEE International Conference on e-Science, Chicago, IL, USA, 2025

    MSC Class: 68T42; 68T30; 68P20; 68Q85; 68M14; ACM Class: D.2.12; H.2.4; I.2.11; C.2.4; H.3.4

  12. arXiv:2506.21788  [pdf, ps, other]

    cs.LG cond-mat.mtrl-sci cs.AI physics.atm-clus

    Multi-task parallelism for robust pre-training of graph foundation models on multi-source, multi-fidelity atomistic modeling data

    Authors: Massimiliano Lupo Pasini, Jong Youl Choi, Pei Zhang, Kshitij Mehta, Rylie Weaver, Ashwin M. Aji, Karl W. Schulz, Jorda Polo, Prasanna Balaprakash

    Abstract: Graph foundation models using graph neural networks promise sustainable, efficient atomistic modeling. To tackle challenges of processing multi-source, multi-fidelity data during pre-training, recent studies employ multi-task learning, in which shared message passing layers initially process input atomistic structures regardless of source, then route them to multiple decoding heads that predict da…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 15 pages, 4 figures, 2 tables

    MSC Class: 68T07; 68T09 ACM Class: I.2; I.2.5; I.2.11

  13. arXiv:2506.21411  [pdf, ps, other]

    cs.LG

    Distributed Cross-Channel Hierarchical Aggregation for Foundation Models

    Authors: Aristeidis Tsaris, Isaac Lyngaas, John Lagregren, Mohamed Wahib, Larry York, Prasanna Balaprakash, Dan Lu, Feiyi Wang, Xiao Wang

    Abstract: Vision-based scientific foundation models hold significant promise for advancing scientific discovery and innovation. This potential stems from their ability to aggregate images from diverse sources such as varying physical groundings or data acquisition systems and to learn spatio-temporal correlations using transformer architectures. However, tokenizing and aggregating images can be compute-inte…

    Submitted 26 June, 2025; originally announced June 2025.

  14. arXiv:2506.19863  [pdf, ps, other]

    physics.comp-ph cs.AI

    Exploring the Capabilities of the Frontier Large Language Models for Nuclear Energy Research

    Authors: Ahmed Almeldein, Mohammed Alnaggar, Rick Archibald, Tom Beck, Arpan Biswas, Rike Bostelmann, Wes Brewer, Chris Bryan, Christopher Calle, Cihangir Celik, Rajni Chahal, Jong Youl Choi, Arindam Chowdhury, Mark Cianciosa, Franklin Curtis, Gregory Davidson, Sebastian De Pascuale, Lisa Fassino, Ana Gainaru, Yashika Ghai, Luke Gibson, Qian Gong, Christopher Greulich, Scott Greenwood, Cory Hauck , et al. (25 additional authors not shown)

    Abstract: The AI for Nuclear Energy workshop at Oak Ridge National Laboratory evaluated the potential of Large Language Models (LLMs) to accelerate fusion and fission research. Fourteen interdisciplinary teams explored diverse nuclear science challenges using ChatGPT, Gemini, Claude, and other AI models over a single day. Applications ranged from developing foundation models for fusion reactor control to au…

    Submitted 26 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

  15. arXiv:2506.02025  [pdf, ps, other]

    cs.DC cs.AI

    Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling

    Authors: Prachi Jadhav, Hongwei Jin, Ewa Deelman, Prasanna Balaprakash

    Abstract: High-Performance Computing (HPC) job scheduling involves balancing conflicting objectives such as minimizing makespan, reducing wait times, optimizing resource use, and ensuring fairness. Traditional methods, including heuristic-based, e.g., First-Come-First-Served (FCFS) and Shortest Job First (SJF), or intensive optimization techniques, often lack adaptability to dynamic workloads and, more impo…

    Submitted 3 September, 2025; v1 submitted 29 May, 2025; originally announced June 2025.

    Comments: 10 pages, 6 figures, work under review

  16. arXiv:2505.08135  [pdf, ps, other]

    cs.SE cs.AI cs.DC cs.PF

    Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions

    Authors: Keita Teranishi, Harshitha Menon, William F. Godoy, Prasanna Balaprakash, David Bau, Tal Ben-Nun, Abhinav Bhatele, Franz Franchetti, Michael Franusich, Todd Gamblin, Giorgis Georgakoudis, Tom Goldstein, Arjun Guha, Steven Hahn, Costin Iancu, Zheming Jin, Terry Jones, Tze Meng Low, Het Mankad, Narasinga Rao Miniskar, Mohammad Alaul Haque Monil, Daniel Nichols, Konstantinos Parasyris, Swaroop Pophale, Pedro Valero-Lara , et al. (3 additional authors not shown)

    Abstract: We discuss the challenges and propose research directions for using AI to revolutionize the development of high-performance computing (HPC) software. AI technologies, in particular large language models, have transformed every aspect of software development. For its part, HPC software is recognized as a highly specialized scientific field of its own. We discuss the challenges associated with lever…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 12 pages, 1 Figure, Accepted at "The 1st International Workshop on Foundational Large Language Models Advances for HPC" LLM4HPC to be held in conjunction with ISC High Performance 2025

    Journal ref: In: Neuwirth, S., Paul, A.K., Weinzierl, T., Carson, E.C. (eds) High Performance Computing. ISC High Performance 2025. Lecture Notes in Computer Science, vol 16091. Springer, Cham

  17. arXiv:2505.04802  [pdf, ps, other]

    cs.LG astro-ph.EP cs.AI cs.DC physics.ao-ph

    ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling

    Authors: Xiao Wang, Jong-Youl Choi, Takuya Kurihaya, Isaac Lyngaas, Hong-Jun Yoon, Xi Xiao, David Pugmire, Ming Fan, Nasik M. Nafi, Aristeidis Tsaris, Ashwin M. Aji, Maliha Hossain, Mohamed Wahib, Dali Wang, Peter Thornton, Prasanna Balaprakash, Moetasim Ashfaq, Dan Lu

    Abstract: Sparse observations and coarse-resolution climate models limit effective regional decision-making, underscoring the need for robust downscaling. However, existing AI methods struggle with generalization across variables and geographies and are constrained by the quadratic complexity of Vision Transformer (ViT) self-attention. We introduce ORBIT-2, a scalable foundation model for global, hyper-reso…

    Submitted 1 September, 2025; v1 submitted 7 May, 2025; originally announced May 2025.

  18. arXiv:2504.08112  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    Scaling Laws of Graph Neural Networks for Atomistic Materials Modeling

    Authors: Chaojian Li, Zhifan Ye, Massimiliano Lupo Pasini, Jong Youl Choi, Cheng Wan, Yingyan Celine Lin, Prasanna Balaprakash

    Abstract: Atomistic materials modeling is a critical task with wide-ranging applications, from drug discovery to materials science, where accurate predictions of the target material property can lead to significant advancements in scientific discovery. Graph Neural Networks (GNNs) represent the state-of-the-art approach for modeling atomistic material data thanks to their capacity to capture complex relatio…

    Submitted 10 April, 2025; originally announced April 2025.

    Comments: Accepted by DAC'25

  19. arXiv:2502.11657  [pdf, ps, other]

    physics.plasm-ph cs.LG

    How does ion temperature gradient turbulence depend on magnetic geometry? Insights from data and machine learning

    Authors: Matt Landreman, Jong Youl Choi, Caio Alves, Prasanna Balaprakash, R. Michael Churchill, Rory Conlin, Gareth Roberg-Clark

    Abstract: Magnetic geometry has a significant effect on the level of turbulent transport in fusion plasmas. Here, we model and analyze this dependence using multiple machine learning methods and a dataset of > 200,000 nonlinear simulations of ion-temperature-gradient turbulence in diverse non-axisymmetric geometries. The dataset is generated using a large collection of both optimized and randomly generated…

    Submitted 3 June, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: Updated version that was accepted by Journal of Plasma Physics, with three new figures

    Journal ref: J. Plasma Phys. 91 (2025) E120

  20. arXiv:2412.08776  [pdf, other]

    cs.LG stat.ML

    Bayesian optimized deep ensemble for uncertainty quantification of deep neural networks: a system safety case study on sodium fast reactor thermal stratification modeling

    Authors: Zaid Abulawi, Rui Hu, Prasanna Balaprakash, Yang Liu

    Abstract: Accurate predictions and uncertainty quantification (UQ) are essential for decision-making in risk-sensitive fields such as system safety modeling. Deep ensembles (DEs) are efficient and scalable methods for UQ in Deep Neural Networks (DNNs); however, their performance is limited when constructed by simply retraining the same DNN multiple times with randomly sampled initializations. To overcome th…

    Submitted 11 December, 2024; originally announced December 2024.

  21. arXiv:2410.15120  [pdf]

    cs.LG cond-mat.mtrl-sci

    Generalizable Prediction Model of Molten Salt Mixture Density with Chemistry-Informed Transfer Learning

    Authors: Julian Barra, Shayan Shahbazi, Anthony Birri, Rajni Chahal, Ibrahim Isah, Muhammad Nouman Anwar, Tyler Starkus, Prasanna Balaprakash, Stephen Lam

    Abstract: Optimally designing molten salt applications requires knowledge of their thermophysical properties, but existing databases are incomplete, and experiments are challenging. Ideal mixing and Redlich-Kister models are computationally cheap but lack either accuracy or generality. To address this, a transfer learning approach using deep neural networks (DNNs) is proposed, combining Redlich-Kister model…

    Submitted 19 October, 2024; originally announced October 2024.

    Comments: Manuscript contains 25 pages (including references and other information), 4 figures, and 3 tables. To be submitted to ACS Journal of Chemical Theory and Computation

  22. arXiv:2407.17545  [pdf, other]

    cs.SE cs.AI cs.CL

    Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

    Authors: Hongwei Jin, George Papadimitriou, Krishnan Raghavan, Pawel Zuk, Prasanna Balaprakash, Cong Wang, Anirban Mandal, Ewa Deelman

    Abstract: Anomaly detection in computational workflows is critical for ensuring system reliability and security. However, traditional rule-based methods struggle to detect novel anomalies. This paper leverages large language models (LLMs) for workflow anomaly detection by exploiting their ability to learn complex data patterns. Two approaches are investigated: 1) supervised fine-tuning (SFT), where pre-trai…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 12 pages, 14 figures, paper is accepted by SC'24, source code, see: https://github.com/PoSeiDon-Workflows/LLM_AD

  23. arXiv:2406.12909  [pdf, other]

    cs.LG physics.comp-ph

    Scalable Training of Trustworthy and Energy-Efficient Predictive Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN

    Authors: Massimiliano Lupo Pasini, Jong Youl Choi, Kshitij Mehta, Pei Zhang, David Rogers, Jonghyun Bae, Khaled Z. Ibrahim, Ashwin M. Aji, Karl W. Schulz, Jorda Polo, Prasanna Balaprakash

    Abstract: We present our work on developing and training scalable, trustworthy, and energy-efficient predictive graph foundation models (GFMs) using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural network (GNN) computations in both training scale and data diversity. It abstracts over message passing algorithms, allowing both reproduct…

    Submitted 1 November, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 51 pages, 32 figures

    MSC Class: 68T07; 68T09 ACM Class: C.2.4; I.2.11

  24. arXiv:2405.15780  [pdf, other]

    cs.CV cs.LG

    Sequence Length Scaling in Vision Transformers for Scientific Images on Frontier

    Authors: Aristeidis Tsaris, Chengming Zhang, Xiao Wang, Junqi Yin, Siyan Liu, Moetasim Ashfaq, Ming Fan, Jong Youl Choi, Mohamed Wahib, Dan Lu, Prasanna Balaprakash, Feiyi Wang

    Abstract: Vision Transformers (ViTs) are pivotal for foundational models in scientific imagery, including Earth science applications, due to their capability to process large sequence lengths. While transformers for text have inspired scaling sequence lengths in ViTs, adapting these to ViTs introduces unique challenges. We develop distributed sequence parallelism for ViTs, enabling them to handle up to…

    Submitted 17 April, 2024; originally announced May 2024.

  25. arXiv:2405.10389  [pdf, other]

    eess.SY cs.LG

    Physics-Informed Heterogeneous Graph Neural Networks for DC Blocker Placement

    Authors: Hongwei Jin, Prasanna Balaprakash, Allen Zou, Pieter Ghysels, Aditi S. Krishnapriyan, Adam Mate, Arthur Barnes, Russell Bent

    Abstract: The threat of geomagnetic disturbances (GMDs) to the reliable operation of the bulk energy system has spurred the development of effective strategies for mitigating their impacts. One such approach involves placing transformer neutral blocking devices, which interrupt the path of geomagnetically induced currents (GICs) to limit their impact. The high cost of these devices and the sparsity of trans…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Paper is accepted by PSCC 2024

  26. arXiv:2405.06133  [pdf, other]

    cs.DC

    Advancing Anomaly Detection in Computational Workflows with Active Learning

    Authors: Krishnan Raghavan, George Papadimitriou, Hongwei Jin, Anirban Mandal, Mariam Kiran, Prasanna Balaprakash, Ewa Deelman

    Abstract: A computational workflow, also known as workflow, consists of tasks that are executed in a certain order to attain a specific computational campaign. Computational workflows are commonly employed in science domains, such as physics, chemistry, genomics, to complete large-scale experiments in distributed and heterogeneous computing environments. However, running computations at such a large scale m…

    Submitted 9 May, 2024; originally announced May 2024.

  27. arXiv:2404.14712  [pdf, other]

    physics.ao-ph cs.AI cs.DC eess.IV physics.geo-ph

    ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability

    Authors: Xiao Wang, Siyan Liu, Aristeidis Tsaris, Jong-Youl Choi, Ashwin Aji, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash

    Abstract: Earth system predictability is challenged by the complexity of environmental dynamics and the multitude of variables involved. Current AI foundation models, although advanced by leveraging large and heterogeneous data, are often constrained by their size and data integration, limiting their effectiveness in addressing the full range of Earth system prediction challenges. To overcome these limitati…

    Submitted 19 August, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  28. arXiv:2404.10689  [pdf, other]

    cs.LG eess.SP

    Network architecture search of X-ray based scientific applications

    Authors: Adarsha Balaji, Ramyad Hadidi, Gregory Kollmer, Mohammed E. Fouda, Prasanna Balaprakash

    Abstract: X-ray and electron diffraction-based microscopy use Bragg peak detection and ptychography to perform 3-D imaging at an atomic resolution. Typically, these techniques are implemented using computationally complex tasks such as a Pseudo-Voigt function or solving a complex inverse problem. Recently, the use of deep neural networks has improved the existing state-of-the-art approaches. However, the de…

    Submitted 16 April, 2024; originally announced April 2024.

  29. arXiv:2404.09703  [pdf, other]

    cs.LG stat.ML

    AI Competitions and Benchmarks: Dataset Development

    Authors: Romain Egele, Julio C. S. Jacques Junior, Jan N. van Rijn, Isabelle Guyon, Xavier Baró, Albert Clapés, Prasanna Balaprakash, Sergio Escalera, Thomas Moeslund, Jun Wan

    Abstract: Machine learning is now used in many applications thanks to its ability to predict, generate, or discover patterns from large quantities of data. However, the process of collecting and transforming data for practical use is intricate. Even in today's digital era, where substantial data is generated daily, it is uncommon for it to be readily usable; most often, it necessitates meticulous manual dat…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Preprint version of the 3rd Chapter of the book: Competitions and Benchmarks, the science behind the contests (https://sites.google.com/chalearn.org/book/home)

  30. arXiv:2404.05768  [pdf, other]

    cs.LG physics.ao-ph stat.ML

    Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach

    Authors: Yixuan Sun, Ololade Sowunmi, Romain Egele, Sri Hari Krishna Narayanan, Luke Van Roekel, Prasanna Balaprakash

    Abstract: Training an effective deep learning model to learn ocean processes involves careful choices of various hyperparameters. We leverage the advanced search algorithms for multiobjective optimization in DeepHyper, a scalable hyperparameter optimization software, to streamline the development of neural networks tailored for ocean modeling. The focus is on optimizing Fourier neural operators (FNOs), a da…

    Submitted 10 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

  31. arXiv:2404.04111  [pdf, other]

    cs.LG

    The Unreasonable Effectiveness Of Early Discarding After One Epoch In Neural Network Hyperparameter Optimization

    Authors: Romain Egele, Felix Mohr, Tom Viering, Prasanna Balaprakash

    Abstract: To reach high performance with deep learning, hyperparameter optimization (HPO) is essential. This process is usually time-consuming due to costly evaluations of neural networks. Early discarding techniques limit the resources granted to unpromising candidates by observing the empirical learning curves and canceling neural network training as soon as the lack of competitiveness of a candidate beco…

    Submitted 5 April, 2024; originally announced April 2024.

  32. arXiv:2402.09222  [pdf, other]

    cs.PF

    Integrating ytopt and libEnsemble to Autotune OpenMC

    Authors: Xingfu Wu, John R. Tramm, Jeffrey Larson, John-Luke Navarro, Prasanna Balaprakash, Brice Videau, Michael Kruse, Paul Hovland, Valerie Taylor, Mary Hall

    Abstract: ytopt is a Python machine-learning-based autotuning software package developed within the ECP PROTEAS-TUNE project. The ytopt software adopts an asynchronous search framework that consists of sampling a small number of input parameter configurations and progressively fitting a surrogate model over the input-output space until exhausting the user-defined maximum number of evaluations or the wall-cl…

    Submitted 17 September, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  33. Transfer-Learning-Based Autotuning Using Gaussian Copula

    Authors: Thomas Randall, Jaehoon Koo, Brice Videau, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall, Rong Ge, Prasanna Balaprakash

    Abstract: As diverse high-performance computing (HPC) systems are built, many opportunities arise for applications to solve larger problems than ever before. Given the significantly increased complexity of these HPC systems and application tuning, empirical performance tuning, such as autotuning, has emerged as a promising approach in recent years. Despite its effectiveness, autotuning is often a computatio…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: 13 pages, 5 figures, 7 tables, the definitive version of this work is published in the Proceedings of the ACM International Conference on Supercomputing 2023, available at https://dl.acm.org/doi/10.1145/3577193.3593712

    ACM Class: I.2.4; G.3; D.2.8

    Journal ref: Proceedings of the 37th International Conference on Supercomputing (2023) 37-49

  34. arXiv:2312.12705  [pdf, other]

    cs.DC cs.AI

    Optimizing Distributed Training on Frontier for Large Language Models

    Authors: Sajal Dash, Isaac Lyngaas, Junqi Yin, Xiao Wang, Romain Egele, Guojing Cong, Feiyi Wang, Prasanna Balaprakash

    Abstract: Large language models (LLMs) have demonstrated remarkable success as foundational models, benefiting various downstream applications through fine-tuning. Recent studies on loss scaling have demonstrated the superior performance of larger LLMs compared to their smaller counterparts. Nevertheless, training LLMs with billions of parameters poses significant challenges and requires considerable comput…

    Submitted 21 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Edited the abstract to better communicate the scope of the work

  35. arXiv:2310.04610  [pdf, other]

    cs.AI cs.LG

    DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

    Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri , et al. (67 additional authors not shown)

    Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique…

    Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  36. arXiv:2310.01247  [pdf, other]

    cs.LG cs.DC

    Self-supervised Learning for Anomaly Detection in Computational Workflows

    Authors: Hongwei Jin, Krishnan Raghavan, George Papadimitriou, Cong Wang, Anirban Mandal, Ewa Deelman, Prasanna Balaprakash

    Abstract: Anomaly detection is the task of identifying abnormal behavior of a system. Anomaly detection in computational workflows is of special interest because of its wide implications in various domains such as cybersecurity, finance, and social networks. However, anomaly detection in computational workflows (often modeled as graphs) is a relatively unexplored problem and poses distinct challenges. For i…

    Submitted 2 October, 2023; originally announced October 2023.

  37. arXiv:2309.14936  [pdf, other]

    cs.LG cs.DC

    Parallel Multi-Objective Hyperparameter Optimization with Uniform Normalization and Bounded Objectives

    Authors: Romain Egele, Tyler Chang, Yixuan Sun, Venkatram Vishwanath, Prasanna Balaprakash

    Abstract: Machine learning (ML) methods offer a wide range of configurable hyperparameters that have a significant influence on their performance. While accuracy is a commonly used performance objective, in many settings, it is not sufficient. Optimizing the ML models with respect to multiple objectives such as accuracy, confidence, fairness, calibration, privacy, latency, and memory consumption is becoming…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Preprint with appendices
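    The core idea the title refers to, normalizing objectives onto a common scale before combining them, can be illustrated with a generic min-max scheme. This is a hedged sketch, not the paper's exact normalization; the function names and the weighted-sum scalarization are illustrative assumptions:

    ```python
    def normalize_objectives(points):
        """Min-max normalize each objective across candidate points to [0, 1],
        so objectives with disparate scales (e.g., error rate vs. latency in ms)
        contribute comparably to a combined score."""
        n_obj = len(points[0])
        lo = [min(p[j] for p in points) for j in range(n_obj)]
        hi = [max(p[j] for p in points) for j in range(n_obj)]
        span = [h - l if h > l else 1.0 for l, h in zip(lo, hi)]  # avoid div by zero
        return [[(p[j] - lo[j]) / span[j] for j in range(n_obj)] for p in points]

    def scalarize(norm_points, weights):
        """Weighted sum of normalized objectives (all assumed to be minimized)."""
        total = sum(weights)
        return [sum(w * v for w, v in zip(weights, p)) / total for p in norm_points]

    # Two candidate configurations evaluated on (error, latency_ms):
    points = [[0.10, 250.0], [0.05, 900.0]]
    scores = scalarize(normalize_objectives(points), [0.5, 0.5])
    ```

    Without normalization, the latency column (hundreds of ms) would dominate the error column (fractions), which is exactly the pathology uniform normalization avoids.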

  38. arXiv:2309.07103  [pdf, other]

    cs.SE cs.AI cs.DC cs.PL

    Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

    Authors: Pedro Valero-Lara, Alexis Huante, Mustafa Al Lail, William F. Godoy, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

    Abstract: We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We built upon our previous wor…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: Accepted at LCPC 2023, The 36th International Workshop on Languages and Compilers for Parallel Computing http://www.lcpcworkshop.org/LCPC23/ . 13 pages, 5 figures, 1 table
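    For context, AXPY and GEMV are among the simplest BLAS-style kernels the study asks the models to generate. The reference definitions below are plain-Python sketches of those standard operations, not code taken from the paper's prompts or outputs:

    ```python
    def axpy(a, x, y):
        """AXPY: y <- a*x + y, elementwise over vectors x and y."""
        return [a * xi + yi for xi, yi in zip(x, y)]

    def gemv(A, x):
        """GEMV: matrix-vector product y <- A @ x."""
        return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

    y = axpy(2.0, [1.0, 2.0], [3.0, 4.0])            # [5.0, 8.0]
    z = gemv([[1.0, 0.0], [0.0, 1.0]], [7.0, 9.0])   # [7.0, 9.0]
    ```

    The evaluated languages express the same arithmetic with different parallelization constructs (OpenMP pragmas, CUDA thread indexing, numpy vectorization), which is what makes these kernels a convenient cross-model benchmark.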

  39. arXiv:2308.04539  [pdf, other]

    cs.LG cs.AI cs.NE

    Improving Performance in Continual Learning Tasks using Bio-Inspired Architectures

    Authors: Sandeep Madireddy, Angel Yanguas-Gil, Prasanna Balaprakash

    Abstract: The ability to learn continuously from an incoming data stream without catastrophic forgetting is critical to designing intelligent systems. Many approaches to continual learning rely on stochastic gradient descent and its variants that employ global error updates, and hence need to adopt strategies such as memory buffers or replay to circumvent its stability, greed, and short-term memory limitati…

    Submitted 8 August, 2023; originally announced August 2023.

  40. arXiv:2307.15422  [pdf, other]

    cs.LG

    Is One Epoch All You Need For Multi-Fidelity Hyperparameter Optimization?

    Authors: Romain Egele, Isabelle Guyon, Yixuan Sun, Prasanna Balaprakash

    Abstract: Hyperparameter optimization (HPO) is crucial for fine-tuning machine learning models but can be computationally expensive. To reduce costs, Multi-fidelity HPO (MF-HPO) leverages intermediate accuracy levels in the learning process and discards low-performing models early on. We compared various representative MF-HPO methods against a simple baseline on classical benchmark data. The baseline involv…

    Submitted 26 September, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: 5 pages, with extended appendices

  41. arXiv:2307.10438  [pdf]

    cs.LG physics.chem-ph q-bio.BM

    Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search

    Authors: Shengli Jiang, Shiyi Qin, Reid C. Van Lehn, Prasanna Balaprakash, Victor M. Zavala

    Abstract: Graph Neural Networks (GNNs) have emerged as a prominent class of data-driven methods for molecular property prediction. However, a key limitation of typical GNN models is their inability to quantify uncertainties in the predictions. This capability is crucial for ensuring the trustworthy use and deployment of models in downstream tasks. To that end, we introduce AutoGNNUQ, an automated uncertaint…

    Submitted 28 June, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

  42. arXiv:2306.15121  [pdf, other]

    cs.AI cs.ET cs.PL

    Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

    Authors: William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

    Abstract: We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SyCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offl…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted at the Sixteenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2023 to be held in conjunction with ICPP 2023: The 52nd International Conference on Parallel Processing. 10 pages, 6 figures, 5 tables

  43. arXiv:2306.09930  [pdf, other]

    cs.DC

    Flow-Bench: A Dataset for Computational Workflow Anomaly Detection

    Authors: George Papadimitriou, Hongwei Jin, Cong Wang, Rajiv Mayani, Krishnan Raghavan, Anirban Mandal, Prasanna Balaprakash, Ewa Deelman

    Abstract: A computational workflow, also known as workflow, consists of tasks that must be executed in a specific order to attain a specific goal. Often, in fields such as biology, chemistry, physics, and data science, among others, these workflows are complex and are executed in large-scale, distributed, and heterogeneous computing environments prone to failures and performance degradation. Therefore, anom…

    Submitted 13 June, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Work under review, updated with more workflow data

  44. arXiv:2305.12030  [pdf, other]

    cs.LG cs.AI math.OC

    Learning Continually on a Sequence of Graphs -- The Dynamical System Way

    Authors: Krishnan Raghavan, Prasanna Balaprakash

    Abstract: Continual learning (CL) is a field concerned with learning a series of inter-related tasks, with the tasks typically defined in the sense of either regression or classification. In recent years, CL has been studied extensively when these tasks are defined using Euclidean data -- data, such as images, that can be described by a set of vectors in an n-dimensional real space. However, the literature is…

    Submitted 19 May, 2023; originally announced May 2023.

  45. arXiv:2303.16869  [pdf, other]

    cs.CE cs.LG math.NA

    Application of probabilistic modeling and automated machine learning framework for high-dimensional stress field

    Authors: Lele Luan, Nesar Ramachandra, Sandipp Krishnan Ravi, Anindya Bhaduri, Piyush Pandita, Prasanna Balaprakash, Mihai Anitescu, Changjie Sun, Liping Wang

    Abstract: Modern computational methods, involving highly sophisticated mathematical formulations, enable several tasks like modeling complex physical phenomena, predicting key properties and design optimization. The higher fidelity in these computer models makes it computationally intensive to query them hundreds of times for optimization and one usually relies on a simplified model albeit at the cost of l…

    Submitted 11 April, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 17 pages, 16 figures, IDETC Conference Submission

  46. arXiv:2303.16245  [pdf, other]

    cs.DC cs.LG cs.PF

    ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales

    Authors: Xingfu Wu, Prasanna Balaprakash, Michael Kruse, Jaehoon Koo, Brice Videau, Paul Hovland, Valerie Taylor, Brad Geltz, Siddhartha Jana, Mary Hall

    Abstract: As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints has become critical and challenging. We propose a low-overhead autotuning framework to autotune performance and energy for various hybrid MPI/OpenMP scientific applications at large scales and to explore the tradeoffs between application r…

    Submitted 28 March, 2023; originally announced March 2023.

    Journal ref: to be published in CUG 2023

  47. arXiv:2302.09748  [pdf, other]

    cs.LG math.DS

    Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles

    Authors: Romit Maulik, Romain Egele, Krishnan Raghavan, Prasanna Balaprakash

    Abstract: Classical problems in computational physics such as data-driven forecasting and signal reconstruction from sparse sensors have recently seen an explosion in deep neural network (DNN) based algorithmic approaches. However, most DNN models do not provide uncertainty estimates, which are crucial for establishing the trustworthiness of these techniques in downstream decision making tasks and scenarios…

    Submitted 19 February, 2023; originally announced February 2023.
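    The standard way an ensemble of models (here, architectures found by neural architecture search) yields uncertainty estimates is to treat the spread across member predictions as the uncertainty. The sketch below is the generic deep-ensemble recipe, not necessarily the exact decomposition used in the paper:

    ```python
    import statistics

    def ensemble_prediction(member_preds):
        """Combine predictions from an ensemble of models: the per-point mean is
        the point forecast, and the variance across members estimates the
        (epistemic) uncertainty. member_preds is a list of per-model prediction
        lists over the same inputs."""
        n_points = len(member_preds[0])
        means, variances = [], []
        for i in range(n_points):
            vals = [pred[i] for pred in member_preds]
            means.append(statistics.fmean(vals))
            variances.append(statistics.pvariance(vals))
        return means, variances

    # Three ensemble members forecasting the same two points:
    means, variances = ensemble_prediction([[1.0, 2.0], [1.2, 2.0], [0.8, 2.0]])
    ```

    Points where the members disagree (the first one above) receive a nonzero variance, flagging low-confidence forecasts; points where they agree get variance zero.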

  48. arXiv:2302.01887  [pdf, other]

    cs.LG

    Analyzing the impact of climate change on critical infrastructure from the scientific literature: A weakly supervised NLP approach

    Authors: Tanwi Mallick, Joshua David Bergerson, Duane R. Verner, John K Hutchison, Leslie-Anne Levy, Prasanna Balaprakash

    Abstract: Natural language processing (NLP) is a promising approach for analyzing large volumes of climate-change and infrastructure-related scientific literature. However, best-in-practice NLP techniques require large collections of relevant documents (corpus). Furthermore, NLP techniques using machine learning and deep learning techniques require labels grouping the articles based on user-defined criteria…

    Submitted 5 February, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

  49. arXiv:2210.04083  [pdf, other]

    cs.LG cs.AI

    Unified Probabilistic Neural Architecture and Weight Ensembling Improves Model Robustness

    Authors: Sumegha Premchandar, Sandeep Madireddy, Sanket Jantre, Prasanna Balaprakash

    Abstract: Robust machine learning models with accurately calibrated uncertainties are crucial for safety-critical applications. Probabilistic machine learning and especially the Bayesian formalism provide a systematic framework to incorporate robustness through the distributional estimates and reason about uncertainty. Recent works have shown that approximate inference approaches that take the weight space…

    Submitted 8 October, 2022; originally announced October 2022.

  50. HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization

    Authors: Matthieu Dorier, Romain Egele, Prasanna Balaprakash, Jaehoon Koo, Sandeep Madireddy, Srinivasan Ramesh, Allen D. Malony, Rob Ross

    Abstract: Distributed data storage services tailored to specific applications have grown popular in the high-performance computing (HPC) community as a way to address I/O and storage challenges. These services offer a variety of specific interfaces, semantics, and data representations. They also expose many tuning parameters, making it difficult for their users to find the best configuration for a given wor…

    Submitted 3 October, 2022; originally announced October 2022.

    Comments: Accepted at IEEE Cluster 2022