Extracting optimal performance from your application requires efficient use of your system's microarchitecture and of parallelism at the core, socket, and node levels. This webinar shows how to do just that with two profiling tools from Intel, Intel VTune Amplifier and Intel Advisor, with a focus on using them on Cray supercomputers such as ALCF's Theta.
After participating in this webinar, you will be able to answer questions such as:
- Is my code getting vectorized? What is my vector efficiency?
- Am I using memory wisely? Is this kernel/function DRAM-bound or cache-bound? What is my cache hit ratio?
- Am I using the full system? Are all cores doing work? What is preventing further scaling?
- Which routines do I need to optimize and in what way?