ISC'13

June 16-20, 2013

Leipzig, Germany

Session Details

 
Name: Tutorial 02
Node-Level Performance Engineering
 
Time: Sunday, June 16, 2013
9:00 AM - 6:00 PM
 
Room:   Seminar Room 8 (SR 8)
CCL - Congress Center Leipzig
 
Breaks: 11:00 AM - 11:30 AM Coffee Break
1:00 PM - 2:00 PM Lunch
4:00 PM - 4:30 PM Coffee Break
 
Presenter(s):   Georg Hager, Erlangen Regional Computing Center
  Jan Treibig, Erlangen Regional Computing Center
  Gerhard Wellein, Erlangen Regional Computing Center
 
Abstract:   This tutorial covers performance engineering approaches at the compute node level. "Performance engineering" as we define it is more than employing tools to identify hotspots and bottlenecks: it is about developing a thorough understanding of the interactions between software and hardware. This process must start at the core, socket, and node level, where the code that does the actual computational work is executed. Once the architectural requirements of a code are understood and correlated with performance measurements, the potential benefit of optimizations can often be predicted.
We start with an overview of modern processor and node architectures, including accelerators. Typical bottlenecks such as instruction throughput and data transfer are identified using kernel benchmarks and put into architectural context. The impact of optimizations such as SIMD vectorization, data layout transformations, and cache blocking is shown, and different aspects of a "holistic" node-level performance modeling and engineering strategy are demonstrated. Using the LIKWID multicore tool suite, we show the importance of topology awareness, affinity enforcement, and hardware performance metrics. The latter support the performance engineering process by supplying information that can validate or falsify performance models.
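
To illustrate what validating or falsifying a performance model against measurements can look like, here is a minimal sketch of a simple bandwidth-based upper bound (a roofline-style estimate). This is not necessarily the model used in the tutorial. The function names, the 20% tolerance, and all hardware numbers are hypothetical and chosen only for illustration.

```python
def predicted_gflops(peak_gflops, mem_bandwidth_gbs, flops_per_iter, bytes_per_iter):
    """Upper performance bound for a loop kernel, in GFlop/s.

    The kernel is limited either by the core's peak arithmetic throughput
    or by the rate at which memory can deliver the data it needs.
    """
    intensity = flops_per_iter / bytes_per_iter  # flops per byte of memory traffic
    return min(peak_gflops, mem_bandwidth_gbs * intensity)


def model_is_consistent(measured_gflops, predicted, tolerance=0.2):
    """True if the measurement lies within the (assumed) tolerance of the prediction."""
    return abs(measured_gflops - predicted) <= tolerance * predicted


# Example kernel: a[i] = b[i] + c[i] * d[i] in double precision.
# 2 flops per iteration; 3 loads + 1 store = 32 bytes (write-allocate ignored).
# Hypothetical machine: 100 GFlop/s peak, 40 GB/s memory bandwidth.
bound = predicted_gflops(peak_gflops=100.0, mem_bandwidth_gbs=40.0,
                         flops_per_iter=2, bytes_per_iter=32)
print(bound)                            # bandwidth-limited bound in GFlop/s
print(model_is_consistent(2.3, bound))  # hypothetical measurement near the bound
print(model_is_consistent(1.0, bound))  # hypothetical measurement far below it
```

A measurement far below the bound suggests that the model's assumptions (here, that memory bandwidth is the only bottleneck) do not hold and that another bottleneck should be investigated.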

Content Level
25% Introductory, 50% Intermediate, 25% Advanced

Prerequisites
Some knowledge of MPI and OpenMP, and basic knowledge of typical processor and node architectures (cores, caches, sockets).
 
 
The program is subject to change.