Scientists use reinforcement learning to train quantum algorithm

This visualization of the search path was obtained through reinforcement learning (RL) on a test problem with two parameters. Leveraging past experience in solving similar problems, the RL method moves quickly toward the solution of an unseen but similar problem. (Image: Prasanna Balaprakash, Argonne National Laboratory)

Scientists are investigating how to equip quantum computers with artificial intelligence and machine learning approaches.

Recent advancements in quantum computing have driven the scientific community’s quest to solve a certain class of complex problems for which quantum computers would be better suited than traditional supercomputers. To improve the efficiency with which quantum computers can solve these problems, scientists are investigating the use of artificial intelligence approaches. 

In a new study, scientists at the U.S. Department of Energy’s (DOE) Argonne National Laboratory have developed a new algorithm based on reinforcement learning to find the optimal parameters for the Quantum Approximate Optimization Algorithm (QAOA), which allows a quantum computer to solve certain combinatorial problems such as those that arise in materials design, chemistry and wireless communications.

“Combinatorial optimization problems are those for which the solution space gets exponentially larger as you expand the number of decision variables,” said Prasanna Balaprakash, a computer scientist with Argonne’s Mathematics and Computer Science division and the Argonne Leadership Computing Facility, a DOE Office of Science User Facility. “In one traditional example, you can find the shortest route for a salesman who needs to visit a few cities once by enumerating all possible routes, but given a couple thousand cities, the number of possible routes far exceeds the number of stars in the universe; even the fastest supercomputers cannot find the shortest route in a reasonable time.”
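To make that growth concrete, here is a minimal Python sketch of the salesman example. It assumes round trips that return to the starting city and distances that are the same in both directions; the function names and the four-city distance table are illustrative and do not come from the study.

```python
import itertools
import math

def count_tours(n_cities: int) -> int:
    """Distinct round trips through n cities (fixed start, direction ignored)."""
    if n_cities < 3:
        return 1
    return math.factorial(n_cities - 1) // 2

def log10_tours(n_cities: int) -> float:
    """Base-10 logarithm of count_tours, usable when the count itself is huge."""
    # lgamma(n) is ln((n - 1)!), so this is log10((n - 1)! / 2).
    return (math.lgamma(n_cities) - math.log(2)) / math.log(10)

def shortest_tour(dist):
    """Brute force: try every ordering of the cities after city 0."""
    n = len(dist)
    best_len, best_tour = math.inf, None
    for perm in itertools.permutations(range(1, n)):
        tour = (0,) + perm
        # Sum the legs of the trip, including the return to the start.
        length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

# Hypothetical example: four cities on a square, sides 1 and diagonals 2.
SQUARE = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
```

Under these assumptions, `count_tours(10)` already gives 181,440 round trips, and `log10_tours(2000)` shows that the count for 2,000 cities has more than 5,000 digits, which is why enumeration only works for a handful of cities.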

Energy landscapes of sample graphs from different classes. (Image: Prasanna Balaprakash, Argonne National Laboratory)

A recent development, QAOA is considered one of the leading candidates for demonstrating the advantage of quantum computers. It is a hybrid quantum-classical algorithm: a quantum computer prepares and measures a parameterized quantum state, while a classical computer adjusts the parameters, and together they approximately solve combinatorial optimization problems.
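The hybrid loop can be sketched in a few lines of Python on a tiny Max-Cut instance (splitting a graph's nodes into two groups so that as many edges as possible cross between them). This is a toy under stated assumptions: the quantum step is simulated classically for a single QAOA layer, and a coarse grid search stands in for the classical optimizer. It is not the reinforcement learning method from the study, and all function names and parameter ranges are illustrative.

```python
import cmath
import itertools
import math

def maxcut_costs(n, edges):
    """Cut size for every bitstring z, where bit u of z is node u's group."""
    return [sum(((z >> u) & 1) != ((z >> v) & 1) for u, v in edges)
            for z in range(2 ** n)]

def qaoa_expectation(n, edges, gamma, beta):
    """Quantum step (simulated): expected cut size of a one-layer QAOA state."""
    costs = maxcut_costs(n, edges)
    dim = 2 ** n
    # Start in the uniform superposition over all bitstrings.
    state = [complex(1 / math.sqrt(dim))] * dim
    # Cost layer: rotate the phase of each bitstring by gamma times its cut size.
    state = [amp * cmath.exp(-1j * gamma * c) for amp, c in zip(state, costs)]
    # Mixer layer: apply exp(-i * beta * X) to every qubit.
    cos_b, sin_b = math.cos(beta), math.sin(beta)
    for q in range(n):
        mask = 1 << q
        for z in range(dim):
            if not z & mask:
                a0, a1 = state[z], state[z | mask]
                state[z] = cos_b * a0 - 1j * sin_b * a1
                state[z | mask] = cos_b * a1 - 1j * sin_b * a0
    # Average the cut size over the measurement probabilities.
    return sum(abs(amp) ** 2 * c for amp, c in zip(state, costs))

def grid_search(n, edges, steps=24):
    """Classical step: keep the (gamma, beta) grid point with the best expectation."""
    gammas = [math.pi * i / steps for i in range(steps)]
    betas = [(math.pi / 2) * j / steps for j in range(steps)]
    best = max(itertools.product(gammas, betas),
               key=lambda gb: qaoa_expectation(n, edges, gb[0], gb[1]))
    return best, qaoa_expectation(n, edges, best[0], best[1])
```

For a graph with two nodes joined by one edge, `grid_search(2, [(0, 1)])` finds parameters whose expected cut size is 1, the best possible. The grid grows quickly with the number of layers, which is the kind of cost a learned parameter-setting strategy aims to avoid.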

The new algorithm developed at Argonne learns how to configure QAOA through a feedback mechanism. A distinctive feature of the proposed algorithm is that it can be trained on smaller problem instances, and the trained model can then adapt QAOA to larger problem instances. “It’s a bit like having a self-driving car in traffic,” Balaprakash said. “The algorithm can detect when it needs to make adjustments in the ‘dials’ it uses to do the computation.”

QAOA could bring significant benefits for solving combinatorial problems that arise in 5G wireless communications. According to Balaprakash, a scientific problem called Max-Cut can be used to model how different wireless devices talk to each other at the same time with minimum interference between them. Solving such problems at scale is challenging, yet it is important for optimal wireless spectrum management.
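For readers unfamiliar with Max-Cut, the following Python sketch solves a tiny instance by brute force. It assumes an unweighted graph; reading the nodes as devices and the edges as pairs that would interfere if placed in the same group is a hypothetical framing for illustration, not the model used in the study.

```python
import itertools

def cut_value(edges, assignment):
    """Number of edges whose endpoints are in different groups."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

def max_cut_brute_force(n_nodes, edges):
    """Try all 2**n_nodes ways to split the nodes into groups 0 and 1."""
    best_value, best_assignment = -1, None
    for bits in itertools.product((0, 1), repeat=n_nodes):
        value = cut_value(edges, bits)
        if value > best_value:
            best_value, best_assignment = value, bits
    return best_value, best_assignment

# Hypothetical example: four devices arranged in a ring.
RING = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

On the four-node ring, alternating the groups separates all four edges. Because the loop visits every one of the 2^n assignments, this approach stops being practical well before the problem sizes that matter for spectrum management.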

Using machine learning to optimize the quantum algorithm involves training it with “rewards” and “penalties” depending on how well it performs, said Sami Khairy, a study author and graduate student at the Illinois Institute of Technology. “It’s an iterative procedure that allows us to improve how the computation is running,” he said. “It learns a better way to assign new parameters, and we want to assign good parameters as fast as possible.”

One of the big advantages of this kind of machine learning is the ability to generalize what has been learned across a broader class of problem instances, Khairy explained. “We’ve designed an optimization algorithm that works for several instances,” he said. “In previous studies, it was as if we were training one driver to drive one kind of car; here, we have the ability to train our driver to adapt to many different kinds of cars, in real time.”

A paper based on the team’s work, “Learning to Optimize Variational Quantum Circuits to Solve Combinatorial Problems,” was presented at the Artificial Intelligence (AI) conference AAAI-20 in February. The team includes Prasanna Balaprakash and Yuri Alexeev from Argonne, Sami Khairy from the Illinois Institute of Technology, Ruslan Shaydulin from Clemson University, and Lukasz Cincio from Los Alamos National Laboratory.


The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines. Supported by the U.S. Department of Energy’s (DOE’s) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation's first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America's scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy's Office of Science.

The U.S. Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit