Quick Start: Using Apache Spark for Large-Scale Data Processing

ALCF Dev Session

This interactive webinar focuses on using Apache Spark, a framework for parallel data processing, on ALCF computing resources. It will present a brief tutorial on Apache Spark, provide instructions for running the framework on ALCF systems, discuss the unique characteristics of the Theta supercomputer, and recommend a few tuning parameters for improving performance.

Xiao-Yong Jin, Argonne National Laboratory