Exascale Computing Challenges and Round-Off Error Management in Exascale Applications

Philippe Ricoux (Total SA)
Georges Oppenheim (Paris-Sud University)
Vincent Baudoui (Postdoc)
Seminar

Future exascale computers will open up new perspectives in numerical simulation, but several challenges must still be tackled before reaching 10^18 floating-point operations per second. One of them is managing the errors that can occur in such systems in order to guarantee the accuracy of results.

Total SA, the major French oil and gas company sponsoring this study, is actively involved in research on exascale computing through projects such as the "European Exascale Software Initiative" (EESI2). We will introduce Total SA's needs in massively parallel numerical solvers along with the work planned by EESI2. We will also mention the directions discussed at the "Big Data and Extreme-scale Computing" workshop that just took place in Charleston, SC.

We will then focus on validating simulation results through the study of round-off error propagation in massively parallel applications. Existing solutions for managing round-off errors in sequential codes do not necessarily extend to large scale. We will discuss the limits of known error bounds in light of the particular features of exascale applications, and present possible strategies for identifying the sensitive sections of a code as part of future research work.
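
As a small illustration of why round-off behavior can differ between sequential and parallel codes, the following Python sketch sums the same values with different numbers of partial sums, loosely mimicking a parallel reduction whose grouping depends on the process count. It is not taken from the talk: the function names `naive_sum` and `chunked_sum` and the input data are hypothetical and chosen only to make the effect visible in IEEE 754 double precision.

```python
import math

def naive_sum(values):
    """Left-to-right floating-point accumulation, as in a plain sequential loop."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, n_chunks):
    """Sum each chunk separately, then combine the partial sums.

    This only mimics the grouping of a parallel reduction; the chunk
    layout is an assumption made for illustration.
    """
    size = math.ceil(len(values) / n_chunks)
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials)

# Hypothetical data mixing very large and small magnitudes.
data = [1e16, 1.0, -1e16, 1.0]

# math.fsum returns a correctly rounded sum and serves as the reference.
reference = math.fsum(data)

for n_chunks in (1, 2, 4):
    result = chunked_sum(data, n_chunks)
    print(f"chunks={n_chunks}  sum={result}  error={abs(result - reference)}")
```

The chunked results are not all expected to agree with one another or with the reference, because floating-point addition is not associative. The explicit loop is used instead of the built-in `sum` because recent Python versions apply compensated summation to floats, which would hide the effect.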

Short bios: Philippe Ricoux is one of the leaders of numerical simulation at Total SA and also leads the "European Exascale Software Initiative". Georges Oppenheim is a professor of mathematics at Paris-Sud University. Vincent Baudoui is a postdoctoral fellow at Argonne funded by Total SA.