Intro to AI Series: Parallel Training Methods for AI

Sam Foreman, ALCF
From February 6 through March 26, 2024, the ALCF will host an 8-part weekly virtual training series to teach undergraduate and graduate students the fundamentals of using world-class supercomputers to advance the use of AI for research.

Intro to AI Series: Session 6

We present modern parallelism techniques and discuss how they can be used to distribute the training of large models across many GPUs.


Sam Foreman is a Computational Scientist with a background in high energy physics, currently working as a postdoc in the ALCF. He is generally interested in the application of machine learning to computational problems in physics, particularly within the context of high-performance computing. Sam's current research focuses on using deep generative modeling to help build better sampling algorithms for simulations in lattice gauge theory.