The slowdown of Moore’s Law and the growth of compute-intensive workloads such as artificial intelligence have pushed the development of high-performance computing (HPC) towards accelerator-based systems. Among supercomputers in the Top500 list, the share of accelerator FLOPs has grown from 20% in 2010 to 76% in 2018. As part of this trend, many libraries and application codes for solving partial differential equations (PDEs) have been ported and adapted to run on accelerator hardware such as graphics processing units (GPUs). Consequently, it has become imperative for PDE-constrained optimization algorithms to run on, and exploit the capabilities of, the same hardware used by the underlying PDE solvers.
In the present work, we begin an investigation into the use of quasi-Newton (QN) methods on GPUs. QN approximations are among the most popular gradient-based kernels for solving large-scale nonlinear systems of equations, and are widely used both in continuous optimization and in the solution of PDEs. We implement both matrix-free and compact dense representations of popular QN methods in PETSc/TAO, and leverage PETSc data structure abstractions to profile QN performance on both CPUs and GPUs using MPI and ViennaCL backends.
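To make the term "matrix-free" concrete, the following is a minimal sketch of the standard two-loop recursion for limited-memory BFGS, one common QN method, written in plain Python. It is an illustration only and not the PETSc/TAO implementation; the function names `dot` and `lbfgs_apply`, the list-based vectors, and the choice of initial scaling are assumptions made for this example. The approximate inverse Hessian is never formed: it is applied to a vector using only dot products and vector updates on the stored pairs.

```python
def dot(a, b):
    """Euclidean inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_apply(grad, s_list, y_list):
    """Apply the L-BFGS inverse-Hessian approximation to `grad`.

    s_list[i] = x_{i+1} - x_i and y_list[i] = g_{i+1} - g_i, stored oldest
    first. Each pair is assumed to satisfy the curvature condition y.s > 0.
    (Hypothetical helper for illustration; not a PETSc/TAO routine.)
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_list, y_list)]

    # First loop: sweep from the newest pair to the oldest.
    alphas = []
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        alphas.append(alpha)  # stored newest first

    # Initial approximation H0 = gamma * I (assumed scaling; identity if empty).
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]

    # Second loop: sweep from the oldest pair to the newest.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r
```

Every operation in this sketch is a dot product or a scaled vector addition. A compact dense representation would instead collect the stored pairs into tall matrices and express the same update through matrix-vector products.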