ALCF Data Science Program selects projects for 2021-2022

The ALCF Data Science Program supports data-intensive research projects that require the scale and capabilities of DOE's leadership-class computing resources.

The ALCF Data Science Program has awarded computing time and resources to four projects that will use novel AI and data science techniques to pursue data-driven discoveries.

The Argonne Leadership Computing Facility (ALCF) recently awarded computing time and resources to three new projects and one renewed project for 2021-2022, through its ALCF Data Science Program (ADSP).

Launched in 2016, the ADSP enables big data and artificial intelligence (AI) research that requires DOE’s leadership-class computing resources. The forward-looking allocation program is designed to explore and improve computational methods for data-driven discoveries across scientific disciplines. It also focuses on scaling the underlying data science technologies to fully utilize DOE supercomputers. The ALCF is a U.S. Department of Energy (DOE) Office of Science User Facility at DOE’s Argonne National Laboratory.

The new projects — which aim to accelerate autonomous molecular design, data analysis in neutrino experiments, and sky survey discovery — extract science from a range of unique data sources. The project selected for renewal will address challenges in fast, high-resolution X-ray imaging at the Advanced Photon Source (APS), a DOE Office of Science User Facility located at Argonne. Each project will employ leadership-class systems and infrastructure to develop and advance data science techniques, with novel approaches to machine learning, deep learning, and other cutting-edge AI methods.

“This year’s ADSP awards advance the use of artificial intelligence on ALCF systems beyond standalone networks to multi-network workflows integrated in scientific analysis chains,” said Taylor Childers, ALCF research scientist and co-lead of the ADSP this year. “In addition, unsupervised techniques are targeting our upcoming system Polaris, which is ideal for deep learning applications and will serve as a testbed for our future exascale supercomputer, Aurora.”

ADSP awards are for two years and are renewed on an annual basis.

New ADSP projects

Autonomous Molecular Design for Redox Flow Batteries
Principal investigator (PI): Logan Ward, Argonne National Laboratory

Redox flow batteries can easily be scaled up to store large amounts of energy, making them a promising technology for electrical grid storage. The batteries work by storing energy in large tanks of electrolyte solutions, but they are currently limited by the performance of available electrolyte materials. With tens of millions of potential candidate molecules, scientists need an improved method to speed the discovery of optimal materials for redox flow batteries. The goal of this project is to build an autonomous AI application for supercomputers that can select and perform the simulation and machine learning tasks needed to identify better-performing molecules. Achieving this goal will require scaling individual tasks, such as computing material properties and training AI models, and then combining them into a cohesive application that will remove humans from the materials design process.

Machine Learning for Data Reconstruction to Accelerate Physics Discoveries in Accelerator-Based Neutrino Oscillation Experiments
PI: Marco Del Tutto, Fermi National Accelerator Laboratory (Fermilab)

The liquid argon time projection chamber (LArTPC) is an imaging detector that can record charged particle trajectories at sub-millimeter spatial resolution. It allows scientists to measure neutrino interactions with high precision, making it the detector of choice for current and future accelerator neutrino experiments, including Fermilab’s Short-Baseline Neutrino Program and Deep Underground Neutrino Experiment. A major goal of this project is to accelerate the analysis workflow in LArTPC experiments by orders of magnitude by deploying the first machine learning-based full reconstruction chain on a high-performance computing (HPC) system. The optimization of a traditional data reconstruction pipeline in these experiments is done “by hand,” and can take months to years each time researchers need to reprocess the whole dataset. The team’s goal is to reduce this process to hours using the ALCF’s upcoming Polaris system. This effort will accelerate the analysis pipeline, perhaps even enabling a full physics analysis online, allowing for more frequent and deeper inference of physics insights from experimental data.

Learning Optimal Image Representations for Current and Future Sky Surveys
PI: George Stein, Lawrence Berkeley National Laboratory

Sky surveys are the largest data generators in astronomy, imaging vast numbers of galaxies at high resolutions. To date, machine learning investigations of sky-survey data have yielded many high-impact results, including the detection of numerous strongly gravitationally lensed systems and the classification of millions of galaxies. However, existing methods used in astrophysics suffer from the standard limitations of supervised learning: they require extensive compute resources and development time to target a single objective, and their performance is limited by the small amount of labeled data available for training. With this ADSP project, the team will use their recently developed self-supervised learning framework to extract meaningful representations from galaxy images in the Dark Energy Camera Legacy Survey dataset, providing a scalable, data-driven approach capable of learning from unlabeled data. The team’s work aims to serve the broader community by accelerating sky survey discoveries following the release of image representations, trained models, and software. Researchers will be able to simply download the low-dimensional representations of galaxies to perform scientific analysis, or use the team’s pre-trained model and quickly fine-tune it to carry out a specific task.

Renewed ADSP project

Dynamic Compressed Sensing for Real-Time Tomographic Reconstruction
PI: Robert Hovden, University of Michigan

Using electron and X-ray tomography to perform 3D characterization of materials at the nano- and mesoscale is important to the development of a wide range of applications, including solar cells and semiconductor devices. To overcome experimental limitations and improve image quality for materials characterization, researchers are leveraging recent advancements in tomographic reconstruction algorithms, such as compressed sensing methods, to provide superior 3D resolution. In the first year of this ADSP project, researchers developed a dynamic tomography framework that uses compressed sensing algorithms to perform in-situ reconstruction while new data is being collected. In year two, the team will continue to conduct comprehensive simulations for real-time electron tomography and develop reconstruction methods for through-focal tomography, an approach that enhances resolution by combining images captured at different levels of focus. They will experimentally demonstrate the reconstruction workflow and methods on commercial scanning transmission electron microscopes and the ptychographic tomography instruments at the APS. By integrating their tool with an open-source 3D visualization and tomography software package, the team’s techniques will be accessible to a wide range of researchers and enable new material characterizations in academia and industry.


The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines. Supported by the U.S. Department of Energy’s (DOE’s) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.

About the Advanced Photon Source

The U.S. Department of Energy Office of Science’s Advanced Photon Source (APS) at Argonne National Laboratory is one of the world’s most productive X-ray light source facilities. The APS provides high-brightness X-ray beams to a diverse community of researchers in materials science, chemistry, condensed matter physics, the life and environmental sciences, and applied research. These X-rays are ideally suited for explorations of materials and biological structures; elemental distribution; chemical, magnetic, and electronic states; and a wide range of technologically important engineering systems from batteries to fuel injector sprays, all of which are the foundations of our nation’s economic, technological, and physical well-being. Each year, more than 5,000 researchers use the APS to produce over 2,000 publications detailing impactful discoveries, and solve more vital biological protein structures than users of any other X-ray light source research facility. APS scientists and engineers innovate technology that is at the heart of advancing accelerator and light-source operations. This includes the insertion devices that produce extreme-brightness X-rays prized by researchers, lenses that focus the X-rays down to a few nanometers, instrumentation that maximizes the way the X-rays interact with samples being studied, and software that gathers and manages the massive quantity of data resulting from discovery research at the APS.

This research used resources of the Advanced Photon Source, a U.S. DOE Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.