Join us on Wednesday, December 3, 2025, for a webinar showcasing the Argonne Leadership Computing Facility (ALCF) Inference Service, which provides cloud-like access to diverse AI models, including Large Language Models (LLMs), on existing high-performance computing (HPC) clusters. We will demonstrate how to integrate the Inference Service within scientific applications and share examples of interacting with our chat interface and API. The talk will cover our experience deploying and optimizing inference endpoints on the Sophia cluster, leveraging Globus Compute and frameworks like vLLM to support a broad range of models, including science foundation models and open-weight models such as gpt-oss, Meta Llama, and the Mistral family. We will also highlight our latest integration with Metis, a SambaNova SN40L cluster highly optimized for inference. Topics will include key technical advancements, such as efficient model loading, batch processing for large-scale inference, and authentication via Globus Auth, as well as challenges like resource contention, payload limitations, and fault tolerance. We will also present performance metrics and practical applications. The ALCF Inference Service enables researchers to run secure, scalable AI inference on Argonne’s HPC systems, offering accessibility, scalability, privacy, and performance tailored to scientific workflows to accelerate data-driven discovery.
Benoit Côté is a Software Developer in the Data Services and Workflows team at the Argonne Leadership Computing Facility. His work revolves around designing and hosting automated workflows and user-facing services for scientific applications. He earned a PhD in Physics from Université Laval (Canada) in 2015. He held postdoctoral appointments at the University of Victoria (Canada), Michigan State University, and the Konkoly Observatory (Hungary), where he combined galaxy formation and nuclear astrophysics to study the origin of the elements and isotopes in the Universe. He became a permanent research staff member at the Konkoly Observatory in 2019, but decided to move back to North America during the pandemic. He then held a remote postdoctoral appointment, contributing to the development of nucleosynthesis data software, before joining Argonne in 2022.