Data-Driven Molecular Engineering of Advanced Functional Materials

PI Jacqueline Cole, University of Cambridge
Co-PI Alvaro Vázquez‐Mayagoitia, Argonne National Laboratory
Project Summary

This project will develop data-driven materials-by-design capabilities to accelerate the discovery of new materials for photovoltaic and quantum optical sensing applications. The team will achieve its goal by exploiting the latest advances in materials database auto-generation tools and data-mining, which harness artificial intelligence and machine learning.

Project Description

The world needs new materials to stimulate the chemical industry in key sectors of our economy: environment and sustainability, information storage, and optical sensing. Yet nearly all functional materials are still discovered by ‘trial-and-error,’ and the resulting lack of predictability creates a major bottleneck to technological innovation. The emerging field of data-driven molecular engineering offers a prospective solution to this problem. It enables systematic molecular design and engineering strategies to be encoded into algorithms that search through massive chemical datasets and, coupled with computational workflows, discover a material that suits a bespoke application. Such data-science approaches to materials discovery are only just becoming possible, given recent advances in artificial intelligence, rapid growth in high-performance computing, and changes in government legislation that regulates open access to scientific data. The U.S. government promotes this approach via the Materials Genome Initiative, which aims to reduce the average 20-year ‘molecule-to-market’ timeframe in industry.

This ALCC project will develop data-driven materials-by-design capabilities to accelerate the discovery of new materials for photovoltaic and quantum optical sensing applications. This goal lies at the heart of all six Basic Energy Sciences subprograms of the U.S. Department of Energy, cutting across the fields of chemical sciences, materials sciences and engineering, computational materials, and energy innovation, and it draws on world-leading computational and experimental user facilities at Argonne National Laboratory. The ALCC project’s technical and scientific objectives are also central to five of the six initiatives of special priority for the DOE Office of Science, as identified by the Office of the Under Secretary for Science: advanced and sustainable energy, artificial intelligence and machine learning, high-performance computing, large-scale scientific instrumentation, and quantum information science.

The team will achieve its goal by exploiting the latest advances in materials database auto-generation tools and data-mining, which harness artificial intelligence and machine learning. These tools have been developed by the Molecular Engineering group at the University of Cambridge, and they have facilitated the group’s award-winning materials discovery capabilities through rational molecular design and experimental validation. This ALCC project will extend a partnership between the Molecular Engineering group at Cambridge and Argonne National Laboratory. The two have collaborated in computation and data science at the Argonne Leadership Computing Facility (ALCF) since 2016, and experimentally at the Advanced Photon Source (APS) and the Center for Nanoscale Materials (CNM) since 2013. The team’s computational and data-science program builds upon the success of their two Argonne Data Science Project (ADSP) program allocations (2016-18, 2018-19), which have empowered the team’s data-driven discovery of new light-harvesting dyes for photovoltaic applications and enabled them to build the world’s largest open-access database of UV/vis absorption spectra. The collaborations with APS and CNM have facilitated the majority of the recent material discoveries in the Molecular Engineering group. The role of the APS and the CNM in this ALCC project will be to experimentally validate the team’s materials predictions on new photovoltaic chromophores and quantum optical sensors.

Allocations