
The ALCF welcomed over 100 researchers to Argonne for the 2025 INCITE GPU Hackathon. (Image by Argonne National Laboratory)
The ALCF hosted its fifth GPU Hackathon to help attendees improve HPC and AI application performance on the facility’s computing resources.
The Argonne Leadership Computing Facility (ALCF) recently hosted its annual hackathon, bringing together researchers and developers to boost the performance of scientific codes on the facility’s Aurora and Polaris supercomputers. Held at the U.S. Department of Energy’s (DOE) Argonne National Laboratory, the hackathon paired attendees with ALCF staff mentors to scale and optimize the high-performance computing (HPC) and artificial intelligence (AI) workloads behind research proposals seeking access to ALCF systems.
Participating teams took advantage of the hands-on event to fine-tune their applications and demonstrate computational readiness for submissions to DOE’s INCITE (Innovative and Novel Computational Impact on Theory and Experiment) and ALCC (ASCR Leadership Computing Challenge) allocation programs; ASCR is the Advanced Scientific Computing Research program of DOE’s Office of Science. The hackathon also welcomed current INCITE and ALCC teams to work with staff experts to advance their projects on ALCF systems. The ALCF is a DOE Office of Science user facility at Argonne.
“Our goals are not limited to supporting existing teams, but also to recruit new research groups to leverage our HPC resources to advance their science,” says ALCF computational scientist Yasaman Ghadar, who organized this year’s event. “Over the course of the three-week program, experts will dedicate their time to each team, helping move projects forward in ways that could otherwise take months. The end result will be more science enabled on Argonne’s Aurora and Polaris systems.”
The majority of hackathon participants—22 out of 24 teams—were working toward proposals for the 2026 INCITE call.
This year’s event covered a wide range of topics, including onboarding attendees to Aurora, the TAU performance analysis toolkit, and Aurora’s Distributed Asynchronous Object Storage (DAOS) system.
After three years of attending the ALCF INCITE GPU Hackathon, Abraham Flores, a postdoctoral researcher with the Quantum Monte Carlo (QMC) group at Washington University in St. Louis, came back with the goal of getting his team’s code ready to take full advantage of Aurora for large-scale science runs. His group, which shares a large INCITE allocation, is using advanced quantum simulations to explore nuclear systems involved in the carbon-nitrogen-oxygen cycle, the process that helps power stars like our sun.
“My group studies nuclear physics from first principles, starting from our most realistic model of the interaction between nucleons (neutrons and protons) for solving the many-body Schrödinger equation,” says Flores. “With the significant code development and the power of Aurora, we are seeking to push our QMC simulations to systems never computed before in our framework.”
At the hackathon, the team homed in on a small but critical chunk of their code: just 11 lines, but lines responsible for over half of their total compute time. They pulled those lines into a standalone test program, ran real-world data through it, and started optimizing. The result was a 200x speedup on Aurora over the original serial version, thanks in part to a full code rewrite from Fortran 77 to modern Fortran 2023.
“When we began this work over three years ago, we theorized that GPUs should be able to devour the loop inside the 11 lines of code,” Flores says. “For us, this means the new science we wish to carry out is a very real possibility.”
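The article doesn’t reproduce the team’s Fortran kernel, but the workflow it describes, isolating a hot loop in a standalone driver, feeding it representative data, and letting the GPU chew through the iterations in parallel, is a standard optimization pattern. Below is a minimal, hypothetical C++ sketch of that pattern using OpenMP target offload, the same offload model supported on Aurora’s Intel GPUs; the kernel, names, and data are invented stand-ins, not the QMC group’s actual code.

```cpp
// Hypothetical standalone harness for a hot loop, illustrating the
// extract-and-offload workflow described above. The QMC team's actual
// kernel is Fortran; this reduction is an invented stand-in.
#include <cstdio>
#include <vector>

double pair_sum(const std::vector<double>& a, const std::vector<double>& b) {
    const double* pa = a.data();
    const double* pb = b.data();
    const int n = static_cast<int>(a.size());
    double total = 0.0;
    // Map the inputs to the device, run the loop in parallel on the GPU,
    // and reduce the partial results back into a single scalar.
    #pragma omp target teams distribute parallel for \
        map(to: pa[0:n], pb[0:n]) reduction(+: total)
    for (int i = 0; i < n; ++i) {
        total += pa[i] * pb[i];
    }
    return total;
}

int main() {
    // Drive the kernel with representative data, mirroring the team's
    // approach of running real-world inputs through the extracted loop.
    std::vector<double> a(1 << 20, 1.5), b(1 << 20, 2.0);
    std::printf("result = %f\n", pair_sum(a, b));
    return 0;
}
```

Timed against a plain serial version of the same loop, a harness like this lets a team attribute a measured speedup to the offloaded region alone.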
Flores is already working the improvements back into the full codebase and gearing up for large-scale runs on Aurora.
Jason Stock, a postdoctoral researcher in Argonne’s Environmental Science Division, participated in the hackathon to accelerate his team’s efforts in data-driven Earth system modeling. With a background in computer science and a focus on generative machine learning, Stock is leading efforts to use probabilistic diffusion models for weather forecasts that stretch weeks to months ahead—a promising alternative to slower, traditional simulations.
“These models are costly, albeit faster than their numerical counterparts, and I was hoping to improve their inference speeds with a recent family of learning algorithms known as consistency modeling,” says Stock.
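The article doesn’t spell out the method, but the core idea of consistency modeling, as introduced by Song et al. in 2023, is to learn a single function that maps any noisy point on a diffusion trajectory directly back to the trajectory’s clean endpoint, so a forecast that would normally require many iterative denoising steps can be generated in one or a few network evaluations. A minimal statement of the defining property, with noise levels ranging over [ε, T]:

```latex
% Consistency property: every point on the same probability-flow ODE
% trajectory x_t maps to the trajectory's clean endpoint, anchored by a
% boundary condition at the smallest noise level \epsilon.
f_\theta(x_t, t) = f_\theta(x_{t'}, t')
    \quad \text{for all } t, t' \in [\epsilon, T],
\qquad
f_\theta(x_\epsilon, \epsilon) = x_\epsilon
```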
During the week, Stock and his team got consistency models up and running with real data, teamed up with experts to build a new training dataset, and found several ways to make their workflows more scalable on Aurora.
Stock also picked up hands-on experience with tools like DAOS and got a better feel for how job scheduling works at scale. “I had learned a bit more about the internal systems, including DAOS and the clusters' scheduling behavior,” he says. Stock is planning to fold what he learned at the hackathon into his team’s ongoing projects as they push to scale up.
Getting the most out of Aurora was the goal for Hammad Farooq, a doctoral candidate in bioinformatics at the University of Illinois Chicago. Farooq and his team attended the hackathon to further their work building high-resolution 3D models of genome folding inside individual cells. Their efforts could shed light on how gene regulation and non-coding genetic variants influence disease.
“These models aim to uncover the regulatory structure of the genome and explore how genetic variations, particularly in non-coding regions, can impact chromatin organization and cellular behavior,” says Farooq. “My goal was to fully leverage Aurora’s extensive computational resources to further advance my research.”
At the hackathon, Farooq focused on getting his team’s C++ code running smoothly on Aurora’s Intel GPUs using OpenMP target offload.
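The article doesn’t detail Farooq’s kernels, but a common early step when moving a C++ code onto Aurora’s GPUs with OpenMP target offload is to keep large arrays resident on the device across repeated kernel launches so each iteration doesn’t pay host-to-device transfer costs. The hypothetical sketch below shows that pattern; the array sizes and the update step are invented for illustration, not the team’s genome-folding model.

```cpp
// Hypothetical sketch: keeping state resident on the GPU across many
// iterations with an OpenMP target data region, so only the first and
// last transfers cross the host-device boundary.
#include <cstdio>
#include <vector>

int main() {
    const int n = 1000000;   // number of modeled loci (made-up size)
    const int steps = 1000;  // optimization iterations (made-up count)
    std::vector<double> x(n, 0.0), force(n, 0.01);
    double* px = x.data();
    double* pf = force.data();

    // Map both arrays to the device once; they stay resident for all steps.
    #pragma omp target data map(tofrom: px[0:n]) map(to: pf[0:n])
    {
        for (int s = 0; s < steps; ++s) {
            // Each launch reuses device-resident data: no per-step copies.
            #pragma omp target teams distribute parallel for
            for (int i = 0; i < n; ++i) {
                px[i] += pf[i];  // placeholder update, not the real model
            }
        }
    }
    std::printf("x[0] after %d steps: %f\n", steps, x[0]);
    return 0;
}
```

With Intel’s oneAPI compiler, a sketch like this builds with icpx -fiopenmp -fopenmp-targets=spir64, the toolchain that targets the Intel GPUs in Aurora.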
“A key highlight of the hackathon was achieving our project goals with invaluable guidance from mentors and ALCF and Intel domain experts who were available in person,” says Farooq. “I gained practical experience in using Parsl to manage job submissions on Aurora and learned how to optimize DAOS for high-performance I/O.”
Since the hackathon, Farooq has been scaling his jobs to 2,000 nodes using Parsl and is planning a full switch to DAOS to speed up data handling even more. He and his team are staying connected with the ALCF community and signed up for more training to keep pushing their research forward.
Talks from the event are available for viewing on the ALCF’s YouTube channel.