In this session, we will explore two transformative Python technologies—Numba and Dask—that empower researchers to bridge the gap between Python’s flexibility and the performance demands of supercomputing environments. These tools unlock new possibilities for accelerating computationally intensive tasks and scaling workflows across clusters.
Session Outline:
- Supercharging Python with Numba: Just-in-Time (JIT) Compilation
Learn how Numba dynamically compiles performance-critical Python functions into optimized machine code, bypassing Python’s interpreter overhead. We’ll demonstrate how a few simple decorators can accelerate numerical and scientific code to near C/Fortran speeds while maintaining Python’s readability and interactivity (a minimal JIT sketch follows this outline).
- Parallelism in Python: Threads, Processes, and the Global Interpreter Lock (GIL)
Dive into Python’s concurrency model, including the challenges posed by the GIL for multi-threaded programs. Discover how Numba’s nogil mode enables true multi-threading for CPU-bound tasks and how Dask leverages multiprocessing to parallelize workflows across all available cores on a single node (see the threading sketch after this outline).
- Distributed Computing with Dask: Scaling Beyond a Single Machine
Extend your computations to multi-node HPC clusters using Dask’s distributed arrays and DataFrames. We’ll showcase how Dask scales familiar NumPy and pandas workflows to datasets larger than memory and across thousands of cores, all while managing task scheduling, load balancing, and fault tolerance (a lazy Dask array sketch appears below the outline).
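As a preview of the Numba segment, here is a minimal sketch of the decorator-based workflow; the function, data, and sizes are illustrative rather than taken from the session materials:

```python
import numpy as np
from numba import njit

@njit  # compile to machine code on the first call, bypassing the interpreter
def var_loop(x):
    # Two-pass variance written as explicit loops: slow in pure Python,
    # near C speed once Numba compiles it.
    n = x.size
    mean = 0.0
    for i in range(n):
        mean += x[i]
    mean /= n
    acc = 0.0
    for i in range(n):
        acc += (x[i] - mean) ** 2
    return acc / n

x = np.random.default_rng(0).standard_normal(1_000_000)
var_loop(x)         # first call triggers compilation
print(var_loop(x))  # later calls run at compiled speed
```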
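For the concurrency segment, this sketch shows how nogil=True lets an ordinary thread pool achieve real CPU parallelism; the chunking scheme, helper names, and thread count are assumptions for illustration:

```python
import numpy as np
from numba import njit
from concurrent.futures import ThreadPoolExecutor

@njit(nogil=True)  # the compiled body releases the GIL while it runs
def partial_sum_sq(x, start, stop):
    acc = 0.0
    for i in range(start, stop):
        acc += x[i] * x[i]
    return acc

def threaded_sum_sq(x, n_threads=4):
    # Split the index range into equal chunks; because the kernel drops
    # the GIL, the threads run truly in parallel on this CPU-bound task.
    bounds = np.linspace(0, x.size, n_threads + 1).astype(np.int64)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = [pool.submit(partial_sum_sq, x, bounds[i], bounds[i + 1])
                 for i in range(n_threads)]
        return sum(p.result() for p in parts)

x = np.random.default_rng(1).standard_normal(10_000_000)
partial_sum_sq(x, 0, 1)  # warm-up call compiles before the threads launch
print(threaded_sum_sq(x))
```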
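And for the Dask segment, a minimal single-machine sketch of the lazy array API; on a cluster the same code scales out once a dask.distributed client is connected (the array and chunk sizes below are illustrative):

```python
import dask.array as da

# A 40,000 x 40,000 float64 array (~12.8 GB) that never has to fit in
# memory at once: Dask splits it into 2,000 x 2,000 chunks and records
# a task graph instead of computing anything eagerly.
x = da.random.random((40_000, 40_000), chunks=(2_000, 2_000))
col_means = x.mean(axis=0)    # still lazy: this just extends the graph
result = col_means.compute()  # the scheduler runs chunks in parallel
print(result[:5])

# On an HPC cluster the same workflow scales out by connecting a client
# to a running scheduler first (the address here is illustrative):
#   from dask.distributed import Client
#   client = Client("tcp://scheduler:8786")
```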
Who Should Attend?
This tutorial is designed for scientists, engineers, and developers working with computationally intensive workloads. Familiarity with Python basics is helpful but not required; attendees will leave with practical skills to optimize and scale Python code in HPC environments.