Introduction to HDF5 for HPC Data Models, Analysis, and Performance


HDF5 is a data model, file format, and I/O library that has become a de facto standard in HPC for achieving scalable I/O and for storing and managing big data from computer modeling, large physics experiments, and observations. This talk offers a comprehensive overview of HDF5 for anyone who works with big data in an HPC environment. The talk consists of two parts. Part I introduces the HDF5 data model and the APIs for organizing data and performing I/O. Part II focuses on advanced HDF5 features such as parallel I/O and gives an overview of parallel HDF5 tuning techniques, including collective metadata I/O, data aggregation, asynchronous I/O, parallel compression, and other new HDF5 features that help utilize HPC storage to its fullest potential.
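As a taste of the HDF5 data model the talk introduces, the sketch below uses the h5py Python bindings (an assumption for illustration; the talk itself covers HDF5's native APIs) to show the core objects: a file, a group that organizes data hierarchically, a dataset holding an array, and an attribute attaching metadata to the dataset.

```python
# Minimal sketch of the HDF5 data model, assuming the h5py bindings.
import os
import tempfile

import h5py
import numpy as np


def write_and_read(path):
    """Write a small dataset with an attribute, then read both back."""
    data = np.arange(12, dtype=np.float64).reshape(3, 4)

    with h5py.File(path, "w") as f:
        grp = f.create_group("simulation")           # groups nest like directories
        dset = grp.create_dataset("temperature", data=data)
        dset.attrs["units"] = "K"                    # attributes carry metadata

    with h5py.File(path, "r") as f:
        dset = f["simulation/temperature"]           # path-style object lookup
        return dset[...], dset.attrs["units"]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.h5")
    arr, units = write_and_read(path)
    print(arr.shape, units)
```

The same hierarchy-of-groups-and-datasets model underlies the C and Fortran APIs; the parallel features covered in Part II (collective metadata I/O, parallel compression, and so on) layer MPI-based access on top of this identical file layout.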

Scot Breitenfeld