The Message Passing Interface (MPI) is a widely used parallel programming model in High-Performance Computing (HPC). Recent MPI standards support Remote Memory Access (RMA), which has been adopted by many HPC applications since its introduction. Given MPI's portability, there is a great opportunity to leverage MPI RMA as a backend for shared-memory-style programming models such as the Partitioned Global Address Space (PGAS). However, the semantic differences between MPI and PGAS pose a unique challenge in fusing the two distinct models. In this work, we investigate the performance of a popular PGAS implementation that uses MPI RMA as its communication interface. Through extensive experimental analysis, we identify the performance bottlenecks and pitfalls of using MPI RMA for PGAS. Based on our findings, we also explore a suite of optimization methods to address these issues.
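For readers unfamiliar with MPI RMA, the short C sketch below (illustrative only, not drawn from the talk) shows the style of one-sided communication that a PGAS runtime typically builds on: each process exposes a memory window, and remote data is written with MPI_Put inside a passive-target lock epoch. The window size, target rank, and stored value are hypothetical choices made for this example.

    /* Illustrative MPI RMA sketch (not from the talk): one-sided put
     * under passive-target synchronization. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Each process exposes one integer through an RMA window. */
        int *baseptr;
        MPI_Win win;
        MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL,
                         MPI_COMM_WORLD, &baseptr, &win);
        *baseptr = rank;

        /* Passive-target epoch: rank 0 writes into rank 1's window
         * without the target's participation, much like a PGAS put. */
        if (rank == 0 && nprocs > 1) {
            int value = 42;  /* hypothetical payload */
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1 /* target rank */, 0, win);
            MPI_Put(&value, 1, MPI_INT, 1 /* target rank */,
                    0 /* target displacement */, 1, MPI_INT, win);
            MPI_Win_unlock(1, win);
        }

        MPI_Barrier(MPI_COMM_WORLD);
        /* Local load after the barrier; visibility assumes the MPI-3
         * unified memory model, common on cache-coherent systems. */
        if (rank == 1)
            printf("rank 1 window now holds %d\n", *baseptr);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Bridging the gap between such window/epoch semantics and the flat, implicitly synchronized global address space that PGAS programs expect is precisely where the performance issues examined in this talk arise.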
Bio: Huansong Fu is a PhD candidate in computer science at Florida State University (FSU) and a predoctoral appointee at Argonne National Laboratory. At FSU, he works with his advisor, Prof. Weikuan Yu, in the Computer Architecture and SysTems Research Lab. At Argonne, he works with his supervisor, Dr. Min Si, in the Programming Models and Runtime Systems (PMRS) group. His work involves building and optimizing distributed data analytics systems, distributed in-memory storage, parallel programming models, and general high-performance computing capabilities. Prior to joining FSU, he earned a master's degree in computer science from Auburn University in 2015 and a bachelor's degree in information security from the University of Electronic Science and Technology of China (UESTC) in 2011.