VTune is an advanced profiling tool that helps optimize code for Intel architectures. It collects key profiling data and presents its findings through an interface that simplifies interpretation and helps Aurora users focus on the most effective software optimizations, from computation and threading to memory and storage. Key features include:
- CPU and GPU Analysis: Tune the entire application’s performance, not just the accelerated portion.
- Optimize Offload: Tune offload performance on Intel Ponte Vecchio (PVC) GPUs.
- Multilingual: Profile Data Parallel C++ (DPC++), C, C++, Fortran, OpenCL, Python, or any combination.
- Profile Threading, Memory, Persistent Memory, and Storage: Access a wealth of analysis types to identify a wide variety of performance issues.
- System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code.
- Power: Optimize performance while avoiding power- and thermal-related throttling.
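
The analysis types listed above are usually selected on the VTune command line with `-collect`. The sketch below is a minimal illustration and not an Aurora-specific recipe. It assembles a `vtune` command for a chosen analysis type and runs it only if `vtune` is found on `PATH`. The application path `./my_app` and the result directory names are hypothetical placeholders. `hotspots`, `gpu-offload`, and `gpu-hotspots` are standard VTune analysis types, but the set available depends on the installed version.

```python
import shutil
import subprocess


def build_vtune_command(analysis, result_dir, app_args):
    """Return the argv list for a VTune command-line collection.

    analysis   -- a VTune analysis type, e.g. "hotspots" or "gpu-hotspots"
    result_dir -- directory where VTune writes the result (hypothetical name)
    app_args   -- the application and its arguments, as a list of strings
    """
    return ["vtune", "-collect", analysis, "-result-dir", result_dir, "--", *app_args]


def run_collection(analysis, result_dir, app_args):
    """Run the collection if vtune is installed; otherwise print the command."""
    cmd = build_vtune_command(analysis, result_dir, app_args)
    if shutil.which("vtune") is None:
        print("vtune not found on PATH; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # ./my_app is a placeholder for the application being profiled.
    for analysis in ("hotspots", "gpu-offload", "gpu-hotspots"):
        print(" ".join(build_vtune_command(analysis, f"vtune_{analysis}", ["./my_app"])))
```

Writing each analysis to its own result directory keeps the collections separate, so they can be opened or reported on individually afterwards.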
Reference
- Find Answers Faster with Timeline Filtering
- Intel VTune GPU Profiling