Distributed Training

Corey Adams, Argonne National Laboratory
Huihuo Zheng, Argonne National Laboratory
Webinar Beginner

Trainees will be introduced to distributed deep learning methods that use multiple GPUs to reduce time-to-insight when training AI models. Specific tools include Horovod and PyTorch DistributedDataParallel.

Day and Time: January 27, 3-5 p.m. US CT

This session is part of the ALCF AI for Science Training Series.

About the Speakers

Corey Adams is an assistant computer scientist at the Argonne Leadership Computing Facility. Originally a high-energy physicist working on neutrino physics problems, he now applies deep learning and machine learning techniques to science problems, including neutrino physics, on high-performance supercomputers. He has experience in classification, segmentation, and sparse convolutional neural networks, as well as running machine learning training at scale.

Huihuo Zheng is a computer scientist at the Argonne Leadership Computing Facility. His areas of interest are first-principles simulations of condensed matter systems, excited state properties of materials, strongly correlated electronic systems, and high-performance computing.