Distributed Parallel Computing With Python (Comet)

Presented on May 14, 2019, 11:00am PDT by Andrea Zonca, Ph.D.

This webinar provides an introduction to distributed computing with Python. We will show how to modify a standard Python script to use multiple CPU cores, first with the concurrent.futures module from the Python standard library and then with the dask package. We will then leverage dask workers running on other machines to distribute a data processing task and monitor its execution through the live dask dashboard. You will learn the difference between threads and processes, how the Global Interpreter Lock works, and the principles of distributed computing. All material will be available as Jupyter Notebooks.
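As a rough illustration of the first step described above, the sketch below parallelizes a serial loop with concurrent.futures. It is not taken from the webinar material: the count_primes task and the run_serial and run_parallel helpers are hypothetical names chosen for this example. It assumes CPython, where threads share the Global Interpreter Lock and so do not speed up CPU-bound pure-Python code, while separate processes each run their own interpreter and can use multiple cores.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """Hypothetical CPU-bound toy task: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        # n is prime if no integer from 2 up to sqrt(n) divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_serial(limits):
    """Baseline: process each input one after the other in a single process."""
    return [count_primes(limit) for limit in limits]


def run_parallel(limits, executor_cls=ProcessPoolExecutor, max_workers=2):
    """Same work, distributed over a pool of workers.

    ProcessPoolExecutor starts separate processes, so CPU-bound work is not
    serialized by the GIL. Passing ThreadPoolExecutor instead keeps everything
    in one process, which is mainly useful for I/O-bound tasks.
    """
    with executor_cls(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the inputs
        return list(executor.map(count_primes, limits))


if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]
    # Processes and threads must both agree with the serial baseline
    assert run_parallel(limits) == run_serial(limits)
    assert run_parallel(limits, executor_cls=ThreadPoolExecutor) == run_serial(limits)
```

Both executor classes expose the same interface, so switching between threads and processes is a one-argument change. That makes it easy to time each variant against the serial baseline on your own machine.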

Related Training Material: Webinar Recording | Interactive Video Tutorial Slides | GitHub Repo (examples)

 

About the Instructor:

Andrea Zonca, Ph.D. (SDSC | Senior Computational Scientist)

Andrea Zonca has a background in cosmology and has worked on analyzing Cosmic Microwave Background data from the Planck satellite. To manage and analyze large datasets, he developed expertise in parallel programming in Python and C++. At SDSC he helps research groups in any field of science port their data analysis pipelines to XSEDE supercomputers. Andrea is also a certified Software Carpentry instructor.

For more training info see:  Training for Advanced Computing Users