The Trusted Leader in High Performance Computing



Course Length: 4 days
Price: Contact Us

Current Schedule:

This course is available ON DEMAND - please contact us about scheduling training.

To register for this course, please Contact Us
To see a list of all courses, please go to the Complete Schedule

Learn how to write and optimize applications that fully leverage the multi-core processing capabilities of NVIDIA GPUs. Developed in partnership with NVIDIA, this comprehensive course provides attendees with a solid understanding of GPU programming, GPU architectures, CUDA libraries and tools, debugging, optimization, and profiling, along with an introduction to OpenACC. The course is taught by Acceleware developers who bring real-world experience to the classroom. Students benefit from hands-on exercises and progressive lectures, individual laptops for student use, and small class sizes to maximize learning.

Course Outline

Day One: Introduction to GPU Programming and GPU Architectures

  • Overview of GPU computing
  • Data-parallel architectures and the GPU programming model
  • GPU memory model & thread cooperation
  • Hands-on exercises: GPU memory management, simple CUDA kernels, shared memory, and constant memory

Day Two: Advanced GPU Programming

  • Asynchronous operations
  • Advanced CUDA features
  • Libraries
  • Debugging GPU programs
  • Hands-on exercises: Asynchronous operations, CUDA features, experience with cuFFT, cuBLAS, and Thrust, debugging

Day Three: Introduction to Optimizations

  • Introduction to optimization
  • Resource management, latency and occupancy
  • Memory performance optimizations
  • Profiling applications
  • Hands-on exercises: Arithmetic optimizations, occupancy calculator, profiling and memory access patterns

Day Four: Case Studies and OpenACC

  • CUDA compiler and user-defined libraries
  • OpenACC
  • Hands-on exercises: Case study exercise and OpenACC
  • Case study: Finite difference stencil algorithm or Monte Carlo simulations

To get the most out of the course, we recommend that attendees have a background in C/C++ and be familiar with the following concepts:

  • Pointers and pointers to pointers (*, **)
  • Taking the address of a variable (&)
  • Writing functions, for loops, if/else statements
  • Printing to standard output (printf, cout)
  • Memory allocation and deallocation
  • Arrays and indexing
  • Structures
  • General debugging