Towards Exascale: A Sparse Communications (SpComm) Tool for Distributed SpMM on Multi-GPU Nodes

Mert Hidayetoglu, University of Illinois at Urbana-Champaign
Webinar
Theta

Description: Multi-GPU hierarchical communication network design has become commonplace in modern exascale supercomputers. For instance, Summit (the last petascale system) implements it as a three-level hierarchy: first, high-bandwidth NVLinks within each set of three GPUs; second, links across sockets inside each node; and third, a slower InfiniBand network across nodes. The varying bandwidths across these levels open up a slew of problems, such as optimal data placement and communication for computation, the number of hops, and the amount of data amplification. To this end, our research focuses on solutions that provide efficient and optimized communications for sparse matrix computations at scale. Specifically, this talk will discuss a new tool that explores the sparse communications (SpComm) governed by the application and finds optimization opportunities by reorganizing data movements and computations within the hierarchical network. The proposed tool targets rapidly growing, complex problems involving scientific, machine learning, and graph analytics workloads with unstructured sparsity patterns. Application-driven TB-scale sparse matrix multiplication (SpMM) benchmarks on ThetaGPU and Summit show that the proposed SpComm optimizations reduce communication volume across nodes (the slow level) by up to 60% by introducing additional sets of local (intra-node) communications. These optimizations have already accelerated inverse multiple-scattering imaging (IPDPS'18), petascale X-ray imaging (SC19, SC20), sparse deep neural network inference (HPEC'20), and graph neural network training (MLSys'22). We are in the process of unifying these optimizations in the SpComm tool so that application developers can use them for science production on the upcoming exascale platforms.
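
The trade-off described above, less inter-node traffic in exchange for extra intra-node traffic, can be illustrated with a toy model. The sketch below is not the SpComm tool or its algorithm. It is a minimal Python illustration under stated assumptions: the dense operand of the SpMM is split into equal row blocks across GPUs, each GPU needs the rows selected by the nonzero columns of its sparse block, and volume is counted in rows. The function name `comm_volumes` and its parameters are hypothetical.

```python
def comm_volumes(needed, gpus_per_node, rows_per_gpu):
    """Count communication volume (in rows) for two schemes.

    needed[p] is the set of dense-operand row indices that GPU p reads.
    Row r is assumed to be owned by GPU r // rows_per_gpu, and GPU p is
    assumed to sit on node p // gpus_per_node (illustrative layout only).
    Returns (direct_inter, hier_inter, hier_intra_extra).
    """
    def node_of_row(r):
        return (r // rows_per_gpu) // gpus_per_node

    num_nodes = len(needed) // gpus_per_node
    direct_inter = 0      # every GPU fetches its own off-node rows
    hier_inter = 0        # each node fetches each off-node row only once
    hier_intra_extra = 0  # local forwards that replace duplicate fetches

    for n in range(num_nodes):
        gpus = range(n * gpus_per_node, (n + 1) * gpus_per_node)
        # Off-node rows needed by each GPU of this node.
        remote = [{r for r in needed[p] if node_of_row(r) != n} for p in gpus]
        per_gpu_total = sum(len(s) for s in remote)
        unique_rows = len(set().union(*remote))

        direct_inter += per_gpu_total
        hier_inter += unique_rows
        # A row shared by k GPUs crosses the network once, then is
        # forwarded k - 1 times over the fast intra-node links.
        hier_intra_extra += per_gpu_total - unique_rows

    return direct_inter, hier_inter, hier_intra_extra


# Toy example: 4 GPUs, 2 per node, 2 rows owned by each GPU (8 rows total).
needed = [{0, 4, 5}, {2, 4, 6}, {0, 1, 4}, {1, 5, 7}]
result = comm_volumes(needed, gpus_per_node=2, rows_per_gpu=2)
print(result)  # (7, 5, 2)
```

In this toy case the inter-node volume drops from 7 rows to 5, and the 2 rows saved reappear as intra-node forwards. The actual savings depend on how much the GPUs within a node overlap in the rows they need, which is governed by the sparsity pattern of the application.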

Bio: Mert Hidayetoglu is a PhD candidate at ECE Illinois whose research lies at the intersection of large-scale applications, high-performance computing, and software systems. His dissertation addresses the optimization of unstructured data processing within complex hierarchical memory and communication architectures. Mert's work won the best paper award at SC20 and was named the Graph Challenge champion at HPEC'20, and he was awarded the 2021 ACM/IEEE-CS George Michael Memorial HPC Fellowship.
