The process begins with massive-data-producing electron microscopy images of the brain samples that have been sliced. The images of the slices are stitched, then reassembled to create a 3D volume, which is itself segmented, to figure out where the neurons and synapses are located.
The segmentation step relies on an artificial intelligence technique called a convolutional neural network; in this case, a flood-filling network developed by Google for the reconstruction of neural circuits from electron microscopy images of the brain. While it has demonstrated better performance than past approaches, the technique also comes with a higher computational cost.
“We have scaled that process and we’ve scaled thousands of nodes on the ALCF’s Theta supercomputer,” says Ferrier. “When you go to do the segmentation on that large volume, you have to distribute the data on the computer. And after you’ve run your inference on the data, you have to put it back together. Finally, it all has to be analyzed.
“So, the problem of executing a brain connectome is an exascale problem,” she adds. Without the power of an Aurora, the tasks could not be accomplished. With the ability to handle such massive amounts of data, they will be able to answer questions like what happens when we learn, what happens when you have a degenerative disease, how does the brain age?
Questions for which we have been seeking answers for millennia.
One machine to bind them all
Whether it’s the quest to develop a connectome or understand key flow physics to develop more efficient wind turbine blades, the merging and flourishing of data and artificial intelligence techniques and advanced computational resources is only possible because of the exponential and deliberate development of high-performance computing and data delivery systems.
“Supercomputer architectures are being structured to make them more amenable to dealing with large amounts of data and facilitating learning, in addition to traditional simulations,” says Venkat Vishwanath, ALCF data sciences lead. “And we are fitting these machines with massive conduits that allow us to stream large amounts of data from the outside world, like the Large Hadron Collider at CERN and our own Advanced Photon Source (APS), and enable data-driven models.”
Many current architectures still require the transfer of data from computer to computer, from one machine, the sole function of which is simulation, to another that excels in data analysis and/or machine learning.
Within the last few years, Argonne and the ALCF have made a solid investment in high-performance computing that gets them closer to a fully integrated machine. The process accelerated in 2017, with the introduction of the Cray XC40 system, Theta, which is capable of combining traditional simulation runs and machine learning techniques.
In 2020, with the benefit of the Coronavirus Aid, Relief and Economic Security (CARES) Act funding, Theta was augmented with 24 NVIDIA DGX A100 nodes, increasing the performance of the system by more than 6 petaflops and bringing GPU-enabled acceleration to the Theta workloads.
Arriving in 2021 will be ALCF’s newest machine, Polaris — a CPU/GPU hybrid resource that provides the opportunity for new and existing users to continue to prepare and scale their codes, and ultimately their science, on a resource that will look very much like future exascale systems. Polaris will provide substantial new compute capabilities for the facility and greatly expand its support for data and learning workloads. The system will be fully integrated with the 200-petabyte file system ALCF deployed in 2020, with increased data sharing support.
The ALCF will further drive simulation, data and learning to a new level in the near future, when they unveil one of the nation’s first exascale machines, Aurora. While it can perform a billion billion calculations per second, its main advantage may be its ability to conduct and converge simulation, data analysis, and machine learning under one hood. The end result will allow researchers to approach new types of, as well as much larger, problems and reduce time to solution.
“Aurora will change the game,” says the ALCF’s Papka. “We’re working with vendors Intel and HPE to assure that we can support science through this confluence of simulation, data and learning all on day one of Aurora’s deployment.”
Whether by Darwin or Turing, whether with chalkboard or graph paper, some of the world’s great scientific innovations were the product of one or several determined individuals who well understood the weight of applying balanced and varied approaches to support — or refute — a hypothesis.
Because current innovation is driven by collaboration among colleagues and between disciplines, the potential for discovery through the pragmatic application of new computational resources, coupled with unrestrained data flow, staggers the imagination.
The ALCF and APS are DOE Office of Science User Facilities.
The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines. Supported by the U.S. Department of Energy’s (DOE’s) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.
About the Advanced Photon Source
The U. S. Department of Energy Office of Science’s Advanced Photon Source (APS) at Argonne National Laboratory is one of the world’s most productive X-ray light source facilities. The APS provides high-brightness X-ray beams to a diverse community of researchers in materials science, chemistry, condensed matter physics, the life and environmental sciences, and applied research. These X-rays are ideally suited for explorations of materials and biological structures; elemental distribution; chemical, magnetic, electronic states; and a wide range of technologically important engineering systems from batteries to fuel injector sprays, all of which are the foundations of our nation’s economic, technological, and physical well-being. Each year, more than 5,000 researchers use the APS to produce over 2,000 publications detailing impactful discoveries, and solve more vital biological protein structures than users of any other X-ray light source research facility. APS scientists and engineers innovate technology that is at the heart of advancing accelerator and light-source operations. This includes the insertion devices that produce extreme-brightness X-rays prized by researchers, lenses that focus the X-rays down to a few nanometers, instrumentation that maximizes the way the X-rays interact with samples being studied, and software that gathers and manages the massive quantity of data resulting from discovery research at the APS.
This research used resources of the Advanced Photon Source, a U.S. DOE Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357.
Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.
The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.