Argonne’s Aurora supercomputer breaks exascale barrier

Aurora provides the research community with advanced capabilities for large-scale projects involving simulations, artificial intelligence and data analysis. (Image by Argonne National Laboratory)

The Aurora supercomputer at the U.S. Department of Energy’s (DOE) Argonne National Laboratory has officially surpassed the exascale threshold, measuring over a quintillion calculations per second on the new Top500 list. The results were announced today at the ISC High Performance 2024 conference in Hamburg, Germany.

In its latest submission to the semi-annual list of the world’s most powerful supercomputers, Aurora registered 1.012 exaflops using 9,230 nodes, only 87 percent of the system’s 10,624 nodes. After making its Top500 debut in November 2023, the Argonne system retained its spot as the second fastest supercomputer and joined Frontier, at DOE’s Oak Ridge National Laboratory, as the world’s second exascale machine.

Argonne's Rick Stevens (second from left) accepts the certificates for Aurora's performance results at the ISC High Performance 2024 conference in Hamburg, Germany.

Aurora also earned the top spot in a measure of AI performance, achieving 10.6 exaflops on 9,500 nodes on the HPL-MxP mixed-precision benchmark.

“We’re thrilled to see Aurora join the exascale club,” said Michael Papka, director of the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science user facility at Argonne. “I’m extremely proud of the Aurora team’s ongoing efforts to get the system up and running for the research community. We can’t wait to see what the full system will be capable of.”

As one of the world’s fastest supercomputers, Aurora gives scientists a powerful new tool for conducting research involving simulation, AI, and data analysis. The state-of-the-art system will not only enable breakthroughs in science and engineering but also spur new advances in technology and bolster the nation's innovation infrastructure.

“Aurora is fundamentally transforming how we do science for our country,” Argonne Laboratory Director Paul Kearns said. “It will accelerate scientific discovery by combining high-performance computing and artificial intelligence to fight climate change, develop life-saving medical treatments, create new materials, understand the universe, and so much more.”

“Aurora’s hardware excels at tackling both traditional scientific computing problems and AI-powered research,” added Rick Stevens, Argonne’s associate lab director for Computing, Environment and Life Sciences. “As AI continues to reshape the scientific landscape, Aurora gives us a platform to develop new tools and approaches that will significantly accelerate the pace of research.”

Built by Intel and Hewlett Packard Enterprise (HPE), Aurora’s first-of-its-kind architecture includes new technologies being deployed at an unprecedented scale. The supercomputer’s 63,744 graphics processing units (GPUs) make it the world’s largest GPU-powered system yet. It also has more endpoints in its interconnect technology than any other system to date.

The Aurora installation team, which includes staff from Argonne, Intel, and HPE, continues to work through system validation, verification, and scale-up activities. Their work has included addressing the hardware and software issues that have emerged as the massive system nears full-scale operation.

“Hitting exascale is a huge milestone, but enabling groundbreaking science is the ultimate goal,” said Susan Coghlan, ALCF project director for Aurora. “The new performance numbers, along with some promising runs from our early science teams, give us a glimpse of what will be possible with Aurora.” 

In addition to the new Top500 and HPL-MxP numbers, Argonne submitted results to other high-performance computing (HPC) benchmarks. On the Graph500 list, a measure of performance on data-intensive applications, Aurora registered 24,250 GTEPS (giga traversed edges per second) using only 4,096 nodes. The Argonne system also debuted in third place on the High-Performance Conjugate Gradients (HPCG) benchmark list, achieving 5,612.6 teraflops with 4,096 nodes. Finally, Aurora’s storage system, DAOS, retained the top spot on the IO500 production list, a semi-annual ranking of HPC storage performance.

Teams participating in the ALCF’s Aurora Early Science Program and DOE’s Exascale Computing Project have been preparing to run their science projects on Aurora for the past several years. The teams have used Aurora and the Sunspot test and development system (outfitted with the same architecture) to scale and optimize codes for their initial science campaigns, demonstrating strong early performance gains. A few of the early science projects currently underway include: 

Speeding up drug discovery

Researchers are developing AI workflows that harness Aurora’s exascale computing power to sift through vast databases of chemical compounds in search of promising medicines to treat cancer and other diseases. Initially, the team was able to screen 11 billion drug molecules per hour using 128 nodes of Aurora. Doubling the node count to 256 demonstrated linear scaling, enabling the team to screen 22 billion molecules per hour. As it continues to scale up its approach, the team aims to screen 1 trillion candidates per hour using the full machine.
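Those figures are consistent with a simple linear extrapolation. As a rough sanity check, here is a minimal Python sketch, assuming perfect linear scaling from the measured 128-node rate; the full-system projection is illustrative, not a measured result:

```python
# Back-of-envelope check of the screening throughput reported above,
# assuming throughput scales linearly with node count, as the
# 128-node and 256-node measurements suggest. The full-system
# extrapolation is illustrative, not a measured result.

MEASURED_NODES = 128        # nodes used in the initial run
MEASURED_RATE = 11e9        # molecules screened per hour at 128 nodes
FULL_SYSTEM_NODES = 10_624  # Aurora's total node count

def projected_rate(nodes: int) -> float:
    """Projected molecules screened per hour, assuming perfect linear scaling."""
    return MEASURED_RATE * nodes / MEASURED_NODES

print(f"256 nodes:    {projected_rate(256):.3g} molecules/hour")               # 2.2e+10, matches the 22 billion figure
print(f"10,624 nodes: {projected_rate(FULL_SYSTEM_NODES):.3g} molecules/hour") # ~9.1e+11, near the 1-trillion-per-hour goal
```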

Modeling the cosmos

Simulating the evolution of the universe requires immense computing power. With Aurora, researchers have a powerful tool for increasing the scale and complexity of their cosmological models, which will provide new insights into the structure and dynamics of the universe. In initial runs, an early science team used approximately 2,000 Aurora nodes to produce simulations and visualizations of the large-scale structure of the universe. These efforts have shown excellent single-GPU performance and demonstrated near-perfect scaling that should extend to the full machine. The team’s exascale simulations are expected to play a pivotal role in validating and refining our understanding of cosmological evolution.
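For context, “near-perfect scaling” is typically quantified as parallel efficiency: measured speedup divided by ideal speedup. The sketch below shows the standard calculation; the node counts and timings are hypothetical placeholders, not measured Aurora results:

```python
# Parallel efficiency quantifies how well a code uses added nodes:
# values close to 1.0 mean close-to-perfect scaling.
# The node counts and timings below are hypothetical placeholders,
# not measured Aurora results.

def strong_scaling_efficiency(base_nodes: int, base_time: float,
                              nodes: int, time: float) -> float:
    """Efficiency for a fixed problem size: ideal time shrinks as 1/nodes."""
    return (base_time * base_nodes) / (time * nodes)

# Hypothetical pair of runs: 4x the nodes yields a ~3.9x speedup.
eff = strong_scaling_efficiency(base_nodes=500, base_time=100.0,
                                nodes=2000, time=25.6)
print(f"parallel efficiency: {eff:.1%}")  # 97.7%, i.e. close to perfect scaling
```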

Mapping the brain

To advance research aimed at mapping neurons in the brain and their tens of thousands of connections, scientists are using Aurora to develop deep learning models for connectomic reconstruction. Based on their early runs, the team anticipates being able to reconstruct segments of the brain using datasets that are 1,000 times larger than those used in their initial computations. These computing techniques are paving the way to scale up from mapping cubic millimeters of brain tissue today to mapping the entire cubic centimeter of a mouse brain, a 1,000-fold increase in volume, on Aurora and future supercomputers.

Read more on Aurora application performance 
