Abstract:
The I/O subsystems of extreme-scale computing systems are becoming more complex as they stratify into tiers optimized for different balances of performance and capacity. Consequently, obtaining a coherent picture of the workload that reflects everything from application-level I/O to back-end file system performance is becoming significantly more difficult. To continue to understand how I/O workloads are evolving to utilize emerging hierarchical architectures, LBNL and ANL are collaborating to develop a next-generation I/O characterization framework that collects and aligns data from all components of these tiered architectures to provide a holistic understanding of I/O performance.
Using NERSC's Cori system as context, this talk will describe the challenges in understanding application I/O performance in the presence of flash-based burst buffers. Early results from holistic I/O analysis will be presented to demonstrate how aligning performance data from multiple sources across the I/O subsystem can provide a complete picture of performance bottlenecks on Cori's burst buffer.
Bio:
Glenn K. Lockwood is a performance engineer in NERSC's Advanced Technologies Group specializing in I/O characterization and data-intensive computing. Prior to joining ATG, he was a computational scientist at the San Diego Supercomputer Center, where he led workload characterization efforts for future systems and supported genomics workloads on SDSC's flash-based HPC systems. Glenn holds a B.S. in ceramic engineering and a Ph.D. in materials science from Rutgers University.