Performance Evaluation and Analysis Consortium End Station

PI Leonid Oliker, Lawrence Berkeley National Laboratory

Figure: Allgather implementations on 32K BG/P cores, highlighting the bucket algorithm from UIUC.

Project Description

To facilitate further understanding of Leadership Class systems, this proposal focuses on five goals: (1) develop new programming models and runtime systems for emerging and future-generation leadership computing platforms that exploit thread-level parallelism and potential architectural heterogeneity; (2) update and extend the performance evaluation of all systems using suites of standard and custom microbenchmarks, kernels, and application benchmarks; (3) continue to port performance tools and performance middleware to the BG/Q and XK7, make them available to high-end computing users, and further develop them to support the scale and unique modes of parallelism of the Leadership Class systems; (4) validate and modify performance prediction technologies to improve their utility for production runs on the Leadership Class systems; and (5) analyze and help optimize current or candidate Leadership Class application codes and, where appropriate, develop new parallel algorithms.