Showing 1–16 of 16 results for author: Godoy, W F

  1. arXiv:2509.21039  [pdf, ps, other]

    cs.DC cs.CE cs.ET cs.PL

    Mojo: MLIR-Based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem

    Authors: William F. Godoy, Tatiana Melnichenko, Pedro Valero-Lara, Wael Elwasif, Philip Fackler, Rafael Ferreira Da Silva, Keita Teranishi, Jeffrey S. Vetter

    Abstract: We explore the performance and portability of the novel Mojo language for scientific computing workloads on GPUs. As the first language based on LLVM's Multi-Level Intermediate Representation (MLIR) compiler infrastructure, Mojo aims to close performance and productivity gaps by combining Python's interoperability and CUDA-like syntax for compile-time portable GPU programming. We target four s…

    Submitted 25 September, 2025; originally announced September 2025.

    Comments: Accepted at the IEEE/ACM SC25 Conference WACCPD Workshop. The International Conference for High Performance Computing, Networking, Storage, and Analysis, St. Louis, MO, Nov 16-21, 2025. 15 pages, 7 figures. WFG and TM contributed equally

  2. arXiv:2505.08135  [pdf, ps, other]

    cs.SE cs.AI cs.DC cs.PF

    Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions

    Authors: Keita Teranishi, Harshitha Menon, William F. Godoy, Prasanna Balaprakash, David Bau, Tal Ben-Nun, Abhinav Bhatele, Franz Franchetti, Michael Franusich, Todd Gamblin, Giorgis Georgakoudis, Tom Goldstein, Arjun Guha, Steven Hahn, Costin Iancu, Zheming Jin, Terry Jones, Tze Meng Low, Het Mankad, Narasinga Rao Miniskar, Mohammad Alaul Haque Monil, Daniel Nichols, Konstantinos Parasyris, Swaroop Pophale, Pedro Valero-Lara, et al. (3 additional authors not shown)

    Abstract: We discuss the challenges and propose research directions for using AI to revolutionize the development of high-performance computing (HPC) software. AI technologies, in particular large language models, have transformed every aspect of software development. For its part, HPC software is recognized as a highly specialized scientific field of its own. We discuss the challenges associated with lever…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 12 pages, 1 Figure, Accepted at "The 1st International Workshop on Foundational Large Language Models Advances for HPC" LLM4HPC to be held in conjunction with ISC High Performance 2025

    Journal ref: In: Neuwirth, S., Paul, A.K., Weinzierl, T., Carson, E.C. (eds) High Performance Computing. ISC High Performance 2025. Lecture Notes in Computer Science, vol 16091. Springer, Cham

  3. Characterizing GPU Energy Usage in Exascale-Ready Portable Science Applications

    Authors: William F. Godoy, Oscar Hernandez, Paul R. C. Kent, Maria Patrou, Kazi Asifuzzaman, Narasinga Rao Miniskar, Pedro Valero-Lara, Jeffrey S. Vetter, Matthew D. Sinclair, Jason Lowe-Power, Bobby R. Bruce

    Abstract: We characterize the GPU energy usage of two widely adopted exascale-ready applications representing two classes of particle and mesh solvers: (i) QMCPACK, a quantum Monte Carlo package, and (ii) AMReXCastro, an adaptive mesh astrophysical code. We analyze power, temperature, utilization, and energy traces from double-/single (mixed)-precision benchmarks on NVIDIA's A100 and H100 and AMD's MI250X G…

    Submitted 26 November, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

    Comments: 13 pages, 8 figures, 3 tables. Accepted at the Energy Efficiency with Sustainable Performance: Techniques, Tools, and Best Practices, EESP Workshop, in conjunction with ISC High Performance 2025

    Journal ref: In: Neuwirth, S., Paul, A.K., Weinzierl, T., Carson, E.C. (eds) High Performance Computing. ISC High Performance 2025. Lecture Notes in Computer Science, vol 16091. Springer, Cham

  4. Julia as a unifying end-to-end workflow language on the Frontier exascale system

    Authors: William F. Godoy, Pedro Valero-Lara, Caira Anderson, Katrina W. Lee, Ana Gainaru, Rafael Ferreira da Silva, Jeffrey S. Vetter

    Abstract: We evaluate Julia as a single language and ecosystem paradigm powered by LLVM to develop workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy's first exascale supercomputer. We evaluate the performance, scaling, and trade-offs of (i) the computational…

    Submitted 27 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 11 pages, 8 figures, accepted at the 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23), IEEE/ACM The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC23

  5. arXiv:2309.07103  [pdf, other]

    cs.SE cs.AI cs.DC cs.PL

    Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

    Authors: Pedro Valero-Lara, Alexis Huante, Mustafa Al Lail, William F. Godoy, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

    Abstract: We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We built upon our previous wor…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: Accepted at LCPC 2023, The 36th International Workshop on Languages and Compilers for Parallel Computing http://www.lcpcworkshop.org/LCPC23/ . 13 pages, 5 figures, 1 table

  6. arXiv:2307.11502  [pdf, other]

    cs.SE cs.DC physics.comp-ph

    Software engineering to sustain a high-performance computing scientific application: QMCPACK

    Authors: William F. Godoy, Steven E. Hahn, Michael M. Walsh, Philip W. Fackler, Jaron T. Krogel, Peter W. Doak, Paul R. C. Kent, Alfredo A. Correa, Ye Luo, Mark Dewing

    Abstract: We provide an overview of the software engineering efforts and their impact in QMCPACK, a production-level ab-initio Quantum Monte Carlo open-source code targeting high-performance computing (HPC) systems. Aspects included are: (i) strategic expansion of continuous integration (CI) targeting CPUs, using GitHub Actions runners, and NVIDIA and AMD GPUs in pre-exascale systems, using self-hosted hard…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: Accepted at the first US-RSE Conference, USRSE2023, https://us-rse.org/usrse23/, 8 pages, 3 figures, 4 tables

  7. arXiv:2306.15121  [pdf, other]

    cs.AI cs.ET cs.PL

    Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

    Authors: William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

    Abstract: We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SyCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offl…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted at the Sixteenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2023 to be held in conjunction with ICPP 2023: The 52nd International Conference on Parallel Processing. 10 pages, 6 figures, 5 tables

  8. Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes

    Authors: William F. Godoy, Pedro Valero-Lara, T. Elise Dettling, Christian Trefftz, Ian Jorquera, Thomas Sheehy, Ross G. Miller, Marc Gonzalez-Tallada, Jeffrey S. Vetter, Valentin Churavy

    Abstract: We explore the performance and portability of the high-level programming models: the LLVM-based Julia and Python/Numba, and Kokkos on high-performance computing (HPC) nodes: AMD Epyc CPUs and MI250X graphical processing units (GPUs) on Frontier's test bed Crusher system and Ampere's Arm-based CPUs and NVIDIA's A100 GPUs on the Wombat system at the Oak Ridge Leadership Computing Facilities. We comp…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: Accepted at the 28th HIPS workshop, held in conjunction with IPDPS 2023. 10 pages, 9 figures

  9. Giving RSEs a Larger Stage through the Better Scientific Software Fellowship

    Authors: William F. Godoy, Ritu Arora, Keith Beattie, David E. Bernholdt, Sarah E. Bratt, Daniel S. Katz, Ignacio Laguna, Amiya K. Maji, Addi Malviya Thakur, Rafael M. Mudafort, Nitin Sukhija, Damian Rouson, Cindy Rubio-González, Karan Vahi

    Abstract: The Better Scientific Software Fellowship (BSSwF) was launched in 2018 to foster and promote practices, processes, and tools to improve developer productivity and software sustainability of scientific codes. BSSwF's vision is to grow the community with practitioners, leaders, mentors, and consultants to increase the visibility of scientific software production and sustainability. Over the last fiv…

    Submitted 14 November, 2022; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: submitted to Computing in Science & Engineering (CiSE), Special Issue on the Future of Research Software Engineers in the US

  10. arXiv:2211.02740  [pdf, other]

    cs.DC

    Bridging HPC Communities through the Julia Programming Language

    Authors: Valentin Churavy, William F Godoy, Carsten Bauer, Hendrik Ranocha, Michael Schlottke-Lakemper, Ludovic Räss, Johannes Blaschke, Mosè Giordano, Erik Schnetter, Samuel Omlin, Jeffrey S. Vetter, Alan Edelman

    Abstract: The Julia programming language has evolved into a modern alternative to fill existing gaps in scientific computing and data science applications. Julia leverages a unified and coordinated single-language and ecosystem paradigm and has a proven track record of achieving high performance without sacrificing user productivity. These aspects make Julia a viable alternative to high-performance computin…

    Submitted 10 November, 2022; v1 submitted 4 November, 2022; originally announced November 2022.

    Comments: 20 pages; improved image quality

  11. arXiv:2209.02610  [pdf, ps, other]

    cs.SE

    A perspective to navigate the National Laboratory environment for RSE career growth

    Authors: William F Godoy

    Abstract: This paper shares a perspective for the research software engineering (RSE) community to navigate the National Laboratory landscape. The RSE role is a recent concept that led to organizational challenges to place and evaluate their impact, costs and benefits. The premise is that RSEs are a natural fit into the current landscape and can use traditional career growth strategies in science: publicati…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: 2 pages, paper presented at the RSE-HPC workshop https://us-rse.org/rse-hpc-2022/ , part of Supercomputing 2022 https://sc22.supercomputing.org/

  12. Modeling pre-Exascale AMR Parallel I/O Workloads via Proxy Applications

    Authors: William F Godoy, Jenna Delozier, Gregory R Watson

    Abstract: The present work investigates the modeling of pre-exascale input/output (I/O) workloads of Adaptive Mesh Refinement (AMR) simulations through a simple proxy application. We collect data from the AMReX Castro framework running on the Summit supercomputer for a wide range of scales and mesh partitions for the hydrodynamic Sedov case as a baseline to provide sufficient coverage to the formulated prox…

    Submitted 31 May, 2022; originally announced June 2022.

    Comments: 10 pages, 11 figures, accepted at Seventeenth International Workshop on Automatic Performance Tuning, iWAPT2022, held in conjunction with IEEE IPDPS 2022

  13. A Survey on Sustainable Software Ecosystems to Support Experimental and Observational Science at Oak Ridge National Laboratory

    Authors: David E Bernholdt, Mathieu Doucet, William F Godoy, Addi Malviya-Thakur, Gregory R Watson

    Abstract: In the search for a sustainable approach for software ecosystems that supports experimental and observational science (EOS) across Oak Ridge National Laboratory (ORNL), we conducted a survey to understand the current and future landscape of EOS software and data. This paper describes the survey design we used to identify significant areas of interest, gaps, and potential opportunities, followed by…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: 14 pages, no figures, only tables

    Journal ref: ICCS 2022, SE4Science Workshop

  14. Efficient loading of reduced data ensembles produced at ORNL SNS/HFIR neutron time-of-flight facilities

    Authors: William F Godoy, Andrei T Savici, Steven E Hahn, Peter F Peterson

    Abstract: We present algorithmic improvements to the loading operations of certain reduced data ensembles produced from neutron scattering experiments at Oak Ridge National Laboratory (ORNL) facilities. Ensembles from multiple measurements are required to cover a wide range of the phase space of a sample material of interest. They are stored using the standard NeXus schema on individual HDF5 files. This mak…

    Submitted 30 November, 2021; originally announced December 2021.

    Comments: 7 pages, 6 figures, 4 tables, The Second International Workshop on Big Data Reduction held with 2021 IEEE International Conference on Big Data

  15. Transitioning from file-based HPC workflows to streaming data pipelines with openPMD and ADIOS2

    Authors: Franz Poeschel, Juncheng E, William F. Godoy, Norbert Podhorszki, Scott Klasky, Greg Eisenhauer, Philip E. Davis, Lipeng Wan, Ana Gainaru, Junmin Gu, Fabian Koller, René Widera, Michael Bussmann, Axel Huebl

    Abstract: This paper aims to create a transition path from file-based IO to streaming-based workflows for scientific applications in an HPC environment. By using the openPMD-api, traditional workflows limited by filesystem bottlenecks can be overcome and flexibly extended for in situ analysis. The openPMD-api is a library for the description of scientific data according to the Open Standard for Particle-Mes…

    Submitted 19 January, 2022; v1 submitted 13 July, 2021; originally announced July 2021.

    Comments: 18 pages, 9 figures, SMC2021, supplementary material at https://zenodo.org/record/4906276

  16. Efficient Data Management in Neutron Scattering Data Reduction Workflows at ORNL

    Authors: William F Godoy, Peter F Peterson, Steven E Hahn, Jay J Billings

    Abstract: Oak Ridge National Laboratory (ORNL) experimental neutron science facilities produce 1.2 TB a day of raw event-based data that is stored using the standard metadata-rich NeXus schema built on top of the HDF5 file format. Performance of several data reduction workflows is largely determined by the amount of time spent on the loading and processing algorithms in Mantid, an open-source data analysis…

    Submitted 5 January, 2021; originally announced January 2021.

    Comments: 7 pages, 4 figures, International Workshop on Big Data Reduction held with 2020 IEEE International Conference on Big Data