Fundamentals of Accelerated Computing with CUDA Python

Europe/Paris
Jean Lascoux (Ecole Polytechnique - Aile 0 - Ground Floor)

Ecole Polytechnique Route de Saclay Palaiseau
Laurent Series (Ecole Polytechnique), Yannick FITAMANT (CNRS)
Description

This workshop from the NVIDIA Deep Learning Institute (DLI), led by a certified instructor, teaches you the fundamental tools and techniques for running GPU-accelerated Python applications using CUDA GPUs and the Numba compiler. You'll work through dozens of hands-on coding exercises and, at the end of the training, implement a new workflow to accelerate a fully functional linear algebra program originally designed for CPUs, observing impressive performance gains. After the workshop ends, you will have additional resources to help you create new GPU-accelerated applications on your own.

Audience: Researchers, engineers, PhD students, and postdocs

Duration: 8 hours

Date: June 5, 2024, from 8:30 a.m. to 5:30 p.m.

Location: Ecole Polytechnique / Aile 0 / Jean Lascoux conference room

Maximum number of participants: 20

Price: Free

Prerequisites:

  • hardware: Laptop computer capable of running the latest version of Chrome or Firefox. Each participant will be provided with dedicated access to a fully configured, GPU-accelerated workstation in the cloud.
  • knowledge: Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations. NumPy competency, including the use of ndarrays and ufuncs. No previous knowledge of CUDA programming is required.

Learning objectives:

By the end of the workshop, you will understand the fundamental tools and techniques for GPU-accelerated Python applications with CUDA and Numba, and be able to:

  • GPU-accelerate NumPy ufuncs with a few lines of code.
  • Configure code parallelization using the CUDA thread hierarchy.
  • Write custom CUDA device kernels for maximum performance and flexibility.
  • Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.
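
To make the thread-hierarchy objective more concrete, here is a minimal sketch in plain Python and NumPy (no GPU or Numba required) of the index arithmetic a CUDA kernel typically relies on: each thread derives a global index from its block index, the block size, and its thread index, and the launch uses enough blocks to cover the data. The helpers `blocks_needed` and `emulate_launch` are hypothetical names invented for this illustration; they are not part of Numba or of the workshop material. In an actual Numba kernel, the same global index is usually obtained with `cuda.grid(1)`.

```python
import numpy as np


def blocks_needed(n_items, threads_per_block):
    """Ceiling division: smallest block count whose threads cover n_items."""
    return (n_items + threads_per_block - 1) // threads_per_block


def emulate_launch(kernel, blocks_per_grid, threads_per_block, *args):
    """Run `kernel` once per (block, thread) pair, sequentially on the CPU.

    A real GPU runs these invocations in parallel; this loop only
    reproduces the indexing, not the performance. (Hypothetical helper.)
    """
    for block_idx in range(blocks_per_grid):
        for thread_idx in range(threads_per_block):
            kernel(block_idx, threads_per_block, thread_idx, *args)


def add_kernel(block_idx, block_dim, thread_idx, x, y, out):
    # Global index, as in blockIdx.x * blockDim.x + threadIdx.x
    i = block_idx * block_dim + thread_idx
    # Guard: the last block may contain threads past the end of the data
    if i < out.size:
        out[i] = x[i] + y[i]


n = 1000
threads_per_block = 128
blocks_per_grid = blocks_needed(n, threads_per_block)

x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.zeros_like(x)
emulate_launch(add_kernel, blocks_per_grid, threads_per_block, x, y, out)
```

With 1000 elements and 128 threads per block, the launch needs 8 blocks (1024 threads), so the bounds check makes the last 24 threads do nothing.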

The workshop language is English.

Speaker: François Courteille (NVIDIA-certified instructor)

Registration
    • 1
      Introduction - Introduction to CUDA Python with Numba

      Meet the instructor.
      Create an account at https://learn.nvidia.com/join
      --
      Begin working with the Numba compiler and CUDA programming in Python.
      Use Numba decorators to GPU-accelerate numerical Python functions.
      Optimize host-to-device and device-to-host memory transfers.

      Speaker: François Courteille
    • 10:00
      Coffee Break
    • 2
      Introduction to CUDA Python with Numba

      Begin working with the Numba compiler and CUDA programming in Python.
      Use Numba decorators to GPU-accelerate numerical Python functions.
      Optimize host-to-device and device-to-host memory transfers.

      Speaker: François Courteille
    • 12:30
      Lunch Break
    • 3
      Custom CUDA Kernels in Python with Numba

      Learn CUDA’s parallel thread hierarchy and how to extend parallel program possibilities.
      Launch massively parallel custom CUDA kernels on the GPU.
      Utilize CUDA atomic operations to avoid race conditions during parallel execution.

      Speaker: François Courteille
    • 15:15
      Coffee Break
    • 4
      Multidimensional Grids and Shared Memory for CUDA Python with Numba

      Learn multidimensional grid creation and how to work in parallel on 2D matrices.
      Leverage on-device shared memory to promote memory coalescing while reshaping 2D matrices.

      Speaker: François Courteille
    • 5
      Final Review

      Review key learnings and wrap up questions.
      Take the workshop survey.