Distributed Training

Huihuo Zheng, Argonne National Laboratory
Webinar | Beginner

Trainees will be introduced to distributed deep learning methods that leverage multiple GPUs to reduce time to insight when training AI models. Specific tools covered include Horovod and PyTorch's DistributedDataParallel.
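To give a flavor of the material, below is a minimal data-parallel training sketch using PyTorch's DistributedDataParallel. It is illustrative only, not taken from the webinar: the model, data, and hyperparameters are placeholders, and it assumes a torchrun launch with one process per GPU (torchrun sets the RANK, WORLD_SIZE, and LOCAL_RANK environment variables). Horovod follows a similar pattern, with hvd.init() and a DistributedOptimizer wrapping the local optimizer.

```python
# Minimal DistributedDataParallel sketch (placeholder model and data).
# Assumes a torchrun launch, e.g.: torchrun --nproc_per_node=4 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # One process per GPU; NCCL is the usual backend for GPU training.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(128, 10).cuda(local_rank)   # placeholder model
    model = DDP(model, device_ids=[local_rank])          # wrap for gradient averaging
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.CrossEntropyLoss()

    for step in range(10):                                # placeholder training loop
        x = torch.randn(32, 128, device=local_rank)      # each rank works on its own batch
        y = torch.randint(0, 10, (32,), device=local_rank)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()                                   # gradients are all-reduced across ranks
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Each rank processes a different slice of the data, and the backward pass averages gradients across all ranks, so the effective batch size grows with the number of GPUs.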

Day and Time: January 27, 3-5 p.m. US CT

This session is part of the ALCF AI for Science Training Series.

About the Speakers

Corey Adams is an assistant computer scientist at the Argonne Leadership Computing Facility. Originally a high-energy physicist working on neutrino physics problems, he now applies deep learning and machine learning techniques to science problems, including neutrino physics, on high-performance supercomputers. He has experience in classification, segmentation, and sparse convolutional neural networks, as well as in running machine learning training at scale.

Huihuo Zheng is a computer scientist at the Argonne Leadership Computing Facility. His areas of interest are first-principles simulations of condensed matter systems, excited state properties of materials, strongly correlated electronic systems, and high-performance computing.