A Multi-Level, Multi-Scale Visual Analytics Approach to Assessment of Multifidelity HPC Systems | Publications | 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Thorough Characterization and Analysis of Large Transformer Model Training At-Scale | Publications | Proceedings of the ACM on Measurement and Analysis of Computing Systems
Cross-Feature Transfer Learning for Efficient Tensor Program Generation | Publications | Applied Sciences
Centimani: Enabling Fast AI Accelerator Selection for DNN Training with a Novel Performance Predictor | Publications | USENIX ATC'24: Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference
Toward a Holistic Performance Evaluation of Large Language Models Across Diverse AI Accelerators | Publications | 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
V2684603: Interface Resolved Simulation of Two-Phase Flow Within a 360° Steam Separator Geometry | Publications | 77th Annual Meeting of the APS Division of Fluid Dynamics
Efficient Distributed Continual Learning for Steering Experiments in Real-Time | Publications | Future Generation Computer Systems
Efficient Data-Parallel Continual Learning with Asynchronous Distributed Rehearsal Buffers | Publications | 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Bricks: A High-Performance Portability Layer for Computations on Block-Structured Grids | Publications | The International Journal of High Performance Computing Applications