In this session, we will cover how Data Parallel Python can be used to develop high-performance code for ALCF's upcoming Aurora supercomputer. The talk will introduce numba-dppy and show how to write data-parallel code inside numba.jit-decorated functions and offload it to a SYCL device. We will also show how to write an explicit kernel using the @numba_dppy.kernel decorator. Numba-dppy is packaged as part of Intel Distribution for Python*, which is included with the Intel oneAPI AI Analytics Toolkit.
The talk will also cover dpctl, a companion library intended to make it easier to write native Python extensions based on DPC++. Dpctl provides Python bindings for the DPC++ runtime classes, an API to manage devices, and wrappers for the Unified Shared Memory (USM) allocators, which enable the creation of Python objects that use SYCL USM for data allocation.
For use cases, we will use pairwise distance, Black-Scholes, and K-Means to demonstrate numba-dppy on both CPU and GPU, and we will practice with live sample code on the Intel DevCloud and/or Argonne's Joint Laboratory for System Evaluation (JLSE).
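As a rough illustration of why Black-Scholes suits data-parallel offload, here is a minimal plain-Python sketch of the standard closed-form pricing formula. It is not the session's sample code and does not use numba-dppy or dpctl; the function and parameter names (norm_cdf, black_scholes, price_all, spot, strike, years, rate, vol) are chosen here for illustration only. The point is that each option is priced independently of the others, so the loop in price_all is the kind of element-wise work that could be mapped to one work-item per option.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, years, rate, vol):
    """Return (call, put) prices for one European option, no dividends."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discounted_strike = strike * math.exp(-rate * years)
    call = spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    put = discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


def price_all(spots, strikes, years, rate, vol):
    """Price every option independently; no iteration depends on another."""
    return [
        black_scholes(s, k, t, rate, vol)
        for s, k, t in zip(spots, strikes, years)
    ]


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    results = price_all([100.0, 90.0], [100.0, 100.0], [1.0, 0.5], 0.05, 0.2)
    for call, put in results:
        print(f"call={call:.4f} put={put:.4f}")
```

Because no iteration reads another iteration's result, the same arithmetic could in principle be written once per element and launched across a device, which is the pattern the explicit-kernel approach described above targets.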
About the Speaker
Praveen Kundurthy is a Developer Evangelist at Intel with over 15 years of experience in software development and optimization on Intel platforms. In his current role, he works with universities and developers, evangelizing AI and oneAPI and helping them understand the underlying concepts. He has expertise in the C++, C#, and Python programming languages, and over the past few years at Intel he has worked on topics spanning artificial intelligence, storage technologies, gaming, virtual reality, and Android. Praveen has a master's degree in Computer Engineering from Mississippi State University.