Virtually reconstructing brains with exascale computing

A subset of neurons reconstructed on Aurora using the FFN convolutional neural network, from a sample of human brain tissue, based on electron microscopy images collected at Harvard. Image: Lichtman Lab, Harvard University

As part of the Aurora Early Science Program, scientists are using AI and exascale computing power to advance connectomics research.

The structure of the human brain is enormously complex and not well understood. Its 80 billion neurons, each connected to as many as 10,000 other neurons, support activities from sustaining vital life processes to defining who we are as individuals. With access to ultra-high-resolution images of brain tissue, researchers can leverage computer vision and machine learning techniques deployed at exascale to identify brain structure and function at the sub-cellular level.

Known as connectomics, the goal of this research is to understand how individual neurons connect to one another to function holistically. Neuroscientists and computational scientists work together to create connectomes, detailed maps of brains composed neuron by neuron.

“What we’re trying to do is reconstruct the shape and connectivity of neurons,” said Thomas Uram, data sciences and workflows team lead at the Argonne Leadership Computing Facility (ALCF).

The advances of the connectomics project stand to benefit researchers working in other disciplines as well.

“The work done to prepare this project for exascale will broadly aid exascale system users. For example, the electron microscopy algorithms under development promise extensive application to x-ray data, especially with the upgrade to Argonne’s Advanced Photon Source,” said Uram, who is working on a large-scale connectomics project that uses ALCF resources. The ALCF and Advanced Photon Source are U.S. Department of Energy (DOE) Office of Science user facilities at DOE’s Argonne National Laboratory.

Extensive technological demands

Crucial to the computational work of Uram’s connectomics project—co-led by Argonne computer scientist Nicola Ferrier—is Aurora, the ALCF’s new Intel-Hewlett Packard Enterprise exascale system. The project has been supported under the ALCF’s Aurora Early Science Program (ESP) to prepare codes for the architecture and scale of the system.

“Any effort to reconstruct significant brain structure—as massive a volume of data as exists—requires an enormous amount of compute time, so our research necessarily depends on exascale computing,” Uram explained.

The work leverages innovations in imaging (especially using electron microscopy), supercomputing, and artificial intelligence (AI) to improve our understanding of how the brain's neurons are arranged and connected.

“Using these technologies at this scale is possible today thanks to the power of ALCF computing resources,” Uram said. “The techniques developed to study neural structure have helped ensure that computing would scale from cubic millimeters of brain tissue at the start, to a cubic centimeter of mouse brain beyond that, and to larger volumes of human brain tissue in the future. As imaging technology advances, computing will need to achieve high performance on post-exascale machines to avoid becoming a bottleneck.”

“Connectomics stresses many boundaries: high-throughput electron microscopy technology operating at nanometer resolution; tens of thousands of images, each with tens of gigapixels; accuracy sufficient to capture minuscule synaptic detail; computer vision methods to align corresponding structures across large images; and deep learning networks that can trace narrow axons and dendrites over large distances,” he continued, offering a glimpse of the scope of the project.

Multiple applications contribute to the 3D reconstruction of neurons; the most demanding of them perform image alignment and segmentation.

“The data that we’re working with now are actually human brain data,” Uram said. “The data come from a collaborative effort with Harvard University researchers, who have been pioneering fast parallel electron microscopy.” The tissue samples require significant preparation for connectomic analysis.

Alignment and segmentation

“The computational part of our work comes in after the brain sample tissue is very thinly sliced and imaged on the electron microscope; each slice is imaged in tiles,” Uram said. “After we’ve obtained a large number of tiles from the electron microscope, we match them up so that their features correspond, and we do that across all of the tiles in the section. We refer to the tile assembly process as stitching.”
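Conceptually, stitching amounts to estimating the offset that makes features in the overlap between neighboring tiles line up. The Python sketch below illustrates the idea with a simple feature-matching approach built on OpenCV; it is a simplified stand-in rather than the project’s actual stitching code, and the function name is hypothetical.

# Minimal sketch of feature-based tile stitching: detect keypoints in two
# neighboring tiles and estimate the translation that makes their features
# correspond. Illustrative only; a production pipeline is far more involved.
import cv2
import numpy as np

def estimate_tile_offset(tile_a, tile_b):
    """Estimate the (dx, dy) translation aligning tile_b to tile_a."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, desc_a = orb.detectAndCompute(tile_a, None)
    kp_b, desc_b = orb.detectAndCompute(tile_b, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(desc_a, desc_b), key=lambda m: m.distance)

    # Let the strongest matches vote on a single translation; the median
    # is robust to a few bad correspondences.
    shifts = np.array([
        np.subtract(kp_a[m.queryIdx].pt, kp_b[m.trainIdx].pt)
        for m in matches[:50]
    ])
    return np.median(shifts, axis=0)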

Before the 3D shape of neurons can be reconstructed, the 2D profiles of objects must be aligned between neighboring images in an image stack. Image misalignment can occur when tissue samples are cut into thin sections, or during imaging on the electron microscope. The Finite-Element Assisted Brain Assembly System (FEABAS) application—developed by collaborators at Harvard—employs template- and feature-matching techniques for coarse and fine-grained alignment, using a network-of-springs approach to produce optimal linear and locally non-linear transformations that align the 2D image content between sections.
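The network-of-springs idea can be pictured as a mesh of control points: springs between neighboring points resist deformation, matched features pull points toward their targets, and relaxing the system yields a smooth warp. The toy Python sketch below illustrates that relaxation; FEABAS’s actual solver is considerably more sophisticated, and all names here are illustrative.

# Toy network-of-springs relaxation (not FEABAS's actual solver): mesh
# vertices are pulled toward feature-match targets while springs to their
# neighbors resist deformation, yielding a smooth non-linear warp.
import numpy as np

def relax_spring_mesh(vertices, neighbors, targets, target_weight=0.5,
                      iterations=200, step=0.1):
    """vertices: (N, 2) mesh positions; neighbors: list of index lists
    (assumed symmetric); targets: (N, 2) matched positions, NaN where
    no feature match exists."""
    rest = {  # rest length of each spring, from the undeformed mesh
        (i, j): np.linalg.norm(vertices[i] - vertices[j])
        for i, nbrs in enumerate(neighbors) for j in nbrs
    }
    v = vertices.astype(float).copy()
    for _ in range(iterations):
        force = np.zeros_like(v)
        for i, nbrs in enumerate(neighbors):
            for j in nbrs:
                d = v[j] - v[i]
                length = np.linalg.norm(d) + 1e-9
                # Hooke's law: push/pull toward the spring's rest length.
                force[i] += (length - rest[(i, j)]) * d / length
        matched = ~np.isnan(targets[:, 0])
        force[matched] += target_weight * (targets[matched] - v[matched])
        v += step * force
    return v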

“Once we have a full section reconstructed, we examine the neighboring sections to make sure that they correspond and align them as necessary,” Uram said. “We must align them with a high degree of precision: the lateral resolution from the microscope is four nanometers. Accurately tracing fine structure in this data depends heavily on the quality of the alignment of the neuron content between neighboring images.”
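One common way to estimate the residual shift between neighboring sections is FFT-based phase correlation, which can recover sub-pixel offsets. The snippet below shows the idea using scikit-image; it is illustrative only and not necessarily the method the team uses.

# Illustrative cross-section alignment using FFT-based phase correlation;
# the shift is estimated to sub-pixel precision and then applied.
from skimage.registration import phase_cross_correlation
from scipy.ndimage import shift as nd_shift

def align_sections(section_a, section_b, upsample=10):
    """Estimate and apply the shift that registers section_b to section_a."""
    offset, error, _ = phase_cross_correlation(
        section_a, section_b, upsample_factor=upsample)
    return nd_shift(section_b, offset), offset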

Tracing neurons with machine learning

After stitching and alignment are complete, the connectomics researchers use various AI methods to speed up data processing and analysis.

“We use machine learning to find and trace objects—neurons, in effect—within the stack of images we’ve built,” Uram explained. “Without machine learning, a person has to sit down and manually trace the neurons, which significantly limits the amount of data that we’re able to capture.”

Analyzing the stitched and aligned image stack, a convolutional neural network model trained to identify neuron bodies and membranes reconstructs the 3D shapes of neurons. The Flood-Filling Network (FFN) code, developed at Google and adapted to run on ALCF systems, traces individual neurons over long distances, enabling analysis at the synaptic level.
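The flood-filling approach can be sketched in simplified form: starting from a seed voxel, the network repeatedly predicts object membership within a small field of view, and the field of view then moves toward newly confident frontier voxels. The Python sketch below captures that loop conceptually; it is not the actual FFN API, and `model` is a hypothetical stand-in for the trained network.

# Simplified sketch of the flood-filling idea (not the actual Google FFN
# API): from a seed voxel, repeatedly apply a model to a small field of
# view (FOV), fold its predictions into a running object map, and move
# the FOV toward newly confident positions. Boundary checks omitted.
import numpy as np
from collections import deque

def flood_fill_segment(volume, seed, model, fov=(33, 33, 33), threshold=0.9):
    """model(image_patch, object_patch) -> updated object probabilities."""
    object_map = np.zeros(volume.shape, dtype=np.float32)
    object_map[seed] = 0.95          # seed the object at the start voxel
    visited = {seed}
    queue = deque([seed])
    half = tuple(f // 2 for f in fov)
    while queue:
        z, y, x = queue.popleft()
        sl = tuple(slice(c - h, c + h + 1) for c, h in zip((z, y, x), half))
        # Run the network on this FOV and merge its prediction.
        object_map[sl] = model(volume[sl], object_map[sl])
        # Move the FOV to neighboring positions where the model has just
        # become confident (FFN's real movement policy is more refined).
        for dz, dy, dx in [(1,0,0),(-1,0,0),(0,1,0),(0,-1,0),(0,0,1),(0,0,-1)]:
            nxt = (z + dz*half[0], y + dy*half[1], x + dx*half[2])
            if nxt not in visited and object_map[nxt] > threshold:
                visited.add(nxt)
                queue.append(nxt)
    return object_map > threshold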

Deep learning models for connectomic reconstruction have been trained on Aurora using as many as 512 nodes, demonstrating performance increases of up to 40 percent throughout the project’s lifetime.
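Training at that node count typically relies on data parallelism: each rank processes a shard of the data, and gradients are averaged across ranks after every step. The sketch below shows the general pattern using PyTorch’s DistributedDataParallel; the project’s actual software stack on Aurora may differ, so treat this as an assumption-laden illustration.

# Generic data-parallel training sketch (PyTorch DDP), launched with
# torchrun or mpiexec, one process per device. The project's actual
# framework and backend on Aurora may differ.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def train(model, dataset, epochs=1):
    dist.init_process_group(backend="gloo")  # backend depends on the system
    model = DDP(model)
    sampler = DistributedSampler(dataset)          # shards data per rank
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for epoch in range(epochs):
        sampler.set_epoch(epoch)                   # reshuffle each epoch
        for image, label in loader:
            opt.zero_grad()
            loss = loss_fn(model(image), label)
            loss.backward()                        # gradients all-reduced here
            opt.step()
    dist.destroy_process_group()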

“Collaborating with Intel, we have been working to run our model in a variety of configurations on Aurora and other Argonne systems,” Uram said. “That work has been highly productive, and Intel has been very helpful in learning how to run the model efficiently, both for training the model and using the trained model for segmentation.”

Once the team has produced a trained model, they use it to produce a segmentation of the larger volume of brain sample tissue. The larger volume vastly exceeds the scale of the training data—physically, it is on the order of a cubic millimeter of tissue, which at the desired imaging resolution represents approximately a petabyte of data—making segmentation the task for which the computing power of Aurora is most necessary.

Reconstructions with these models have been run on Aurora using as many as 1,024 nodes (with multiple inference processes for each graphics processing unit) to produce a segmentation of a teravoxel of data. Projecting from these runs to the full machine, the researchers anticipate soon being able to segment a petavoxel dataset within a few days on Aurora.
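At that scale, inference becomes an embarrassingly parallel fan-out: the volume is cut into overlapping subvolumes, each rank segments its share, and the overlaps let neighboring results be stitched back together afterward. The hypothetical mpi4py sketch below illustrates the pattern; `segment_chunk` stands in for the real inference step, and all names are illustrative.

# Hypothetical sketch of fanning subvolume inference out over MPI ranks.
# Overlapping chunks make it possible to reconcile segment identities
# across chunk boundaries in a later stitching pass.
from mpi4py import MPI
import itertools

def chunk_origins(shape, chunk=512, overlap=32):
    """Yield the origin of each overlapping chunk in a 3D volume."""
    step = chunk - overlap
    axes = [range(0, s, step) for s in shape]
    yield from itertools.product(*axes)

def run_inference(volume, segment_chunk, chunk=512, overlap=32):
    comm = MPI.COMM_WORLD
    origins = list(chunk_origins(volume.shape, chunk, overlap))
    results = {}
    # Round-robin assignment: rank r takes chunks r, r+size, r+2*size, ...
    for origin in origins[comm.rank::comm.size]:
        sl = tuple(slice(o, o + chunk) for o in origin)
        results[origin] = segment_chunk(volume[sl])
    return results  # merged across ranks in a later stitching pass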