Events

SDSC at SC25

Join SDSC in booth 217 for the annual International Conference for High Performance Computing, Networking, Storage and Analysis.

St. Louis, MO

Advanced HPC-CI Webinar Series: R for HPC

We will focus on (1) R package management and user environment configuration on the HPC cluster and (2) understanding R parallelism on HPC systems, such as using the "parallel" package and a few related packages to parallelize R programs and improve their performance.

Remote event

COMPLECS: Parallel Computing Concepts

A brief introduction to fundamental concepts in parallel computing. Topics include threads, processes, Amdahl’s Law, benchmarking, and factors that limit scalability. No programming experience needed.

Remote event
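
As a taste of the Amdahl's Law topic listed for the session above, here is a minimal Python sketch of the standard formula, speedup = 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel and N is the number of workers. The function name and the sample values are illustrative assumptions, not material from the session.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup predicted by Amdahl's Law.

    parallel_fraction: share of the runtime that can be parallelized (0.0 to 1.0).
    workers: number of parallel workers (threads, processes, or nodes), at least 1.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical program in which 90% of the runtime parallelizes.
    for n in (1, 2, 4, 8, 16, 1024):
        print(f"{n:5d} workers: {amdahl_speedup(0.9, n):.2f}x")
```

With 90% of the work parallelizable, the predicted speedup stays below 10x no matter how many workers are added, because the serial 10% sets the ceiling. This is one of the factors that limit scalability.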

Intermediate Linux

Linux command-line interface (CLI) skills are essential for advanced cyberinfrastructure (CI). This session covers the filesystem hierarchy, permissions, links, wildcards, finding files, environment variables, modules, config files, aliases, history, and Bash scripting tips.

Remote event

Batch Computing: Working with the Linux Scheduler

A brief introduction to the Linux scheduler: how to interact with it and how to run your research workloads on your personal computer, a shared workstation, or even a high-performance computing system.

Remote event

COMPLECS: Linux Tools for Text Processing

An overview of commonly used Linux tools for searching and manipulating text. We progress from the simplest tools, such as head, tail, cut, and paste, to more complex tools such as grep, awk, and sed.

Remote event

Architecting Reproducible Science: A Practical Path Beyond the Notebook

Jupyter Notebooks are great for exploration, but terrible for reproducibility and scaling. This talk shows how to migrate notebook code into a structured Python package that can run anywhere—locally, on HPC clusters, or in automated pipelines. We’ll connect these practices to early-stage MLOps concepts, illustrating how packaging is the foundation for reproducible scientific computing.

Remote event

COMPLECS: Data Transfer

Efficiently transferring data is a critical part of building research workflows, whether working with experimental or simulated data on local or high-performance computing (HPC) systems. Here, we introduce key concepts and command-line tools for data transfer, including how to verify data integrity, use compression, and select appropriate transfer methods based on data size, location, and organization.

Remote event
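
To illustrate the data-integrity idea mentioned in the Data Transfer description above, the following Python sketch compares SHA-256 checksums of a source file and its copy. It is only an assumption about one common approach, using the standard library; the function names and the temporary files are hypothetical, and the session itself focuses on command-line tools.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source, destination):
    """True when both files have the same SHA-256 digest."""
    return sha256_of_file(source) == sha256_of_file(destination)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        source = Path(workdir) / "results.dat"      # hypothetical file name
        destination = Path(workdir) / "results.copy"
        source.write_bytes(b"simulated output\n" * 1000)
        shutil.copy(source, destination)             # stands in for a real transfer
        print("intact:", transfer_is_intact(source, destination))
```

In practice, the checksum of the source is usually computed before the transfer and compared against one computed on the destination system afterward.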

Interactive Computing

Interactive high-performance computing (HPC) involves real-time user inputs that result in actions being performed on HPC compute nodes. This session presents an overview of interactive computing tools and methods.

Remote event

Linux Shell Scripting

Shell scripting improves productivity and reduces errors in HPC workflows by automating tasks like data processing, backups, and system monitoring. This session builds on basic Linux command-line skills to teach Bash scripting syntax, constructs, and best practices for effective automation.

Remote event

Fine Tuning Large Language Models (LLMs) with Domain Specific Datasets

Large language models (LLMs) are trained on massive, publicly available text datasets containing trillions of tokens. However, these models do not necessarily possess sufficient subject matter expertise and often struggle to provide meaningful responses to specialized prompts. Therefore, fine-tuning LLMs with domain-specific datasets – extracted from documents, articles, and other sources – is crucial. In this presentation, we will demonstrate fine-tuning a few smaller LLMs with the LoRA approach, using instruct datasets on Gaudi hardware.

Remote event

COMPLECS: Getting Started with HPC - 2-Day Workshop

SAVE THE DATE - This two-day virtual workshop extends the COMPLECS webinar series with more in-depth discussion and complementary hands-on experience. It is designed to help users strengthen their skills in using high-performance computing (HPC) systems.

Remote event

Data Storage and File Systems

High-performance computing (HPC) systems use various specialized storage and file systems, each with different performance levels and best-use scenarios. Here, we explain how to properly use these systems for your research. We also introduce common filesystems, their architectures, appropriate I/O patterns, and key Linux tools for managing storage usage, backups, and file permissions.

Remote event

From Atoms to Algorithms: GPU Acceleration of Molecular Dynamics, DFT, and QM/MM Simulations

This talk highlights advances in GPU-accelerated molecular simulations and the hardware that enables them. We will begin with an overview of modern GPU architectures, supercomputers, and programming models that leverage massive parallelism for scientific computing. We will then discuss GPU-optimized molecular dynamics simulations with Amber and quantum chemistry calculations with QUICK, a massively parallel Hartree-Fock and DFT program that performs all major computations on GPUs. Together, these tools enable efficient QM/MM molecular dynamics simulations across diverse GPU platforms. Advances in software and GPU technology are thus transforming the scale and accuracy of simulations for complex chemical and biological systems.

Remote event

Using Regular Expressions with Linux Tools

Essentials of using regular expressions (regexes) with the Linux tools grep, awk, and sed. Topics include quantifiers, wildcards, grouping, alternation, word boundaries, lazy and greedy matching, and regex flavors. Attendees should at least be familiar with grep.

Remote event
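
Two of the topics named in the regular expressions description above, greedy versus lazy matching and word boundaries with alternation, can be sketched briefly. This example uses Python's re module, which is an assumption made for illustration: the session itself covers grep, awk, and sed, and lazy quantifiers such as `.*?` belong to Perl-style flavors rather than the POSIX flavors those tools use by default. The sample strings and function names are hypothetical.

```python
import re

SAMPLE = 'key="alpha" other="beta"'


def quoted_greedy(text):
    """Greedy: .* grabs as much as possible, up to the LAST closing quote."""
    return re.findall(r'"(.*)"', text)


def quoted_lazy(text):
    """Lazy: .*? grabs as little as possible, stopping at the NEXT quote."""
    return re.findall(r'"(.*?)"', text)


def whole_words(text):
    """Alternation inside a group, anchored by word boundaries on both sides."""
    return re.findall(r"\b(?:cat|dog)\b", text)


if __name__ == "__main__":
    print(quoted_greedy(SAMPLE))  # ['alpha" other="beta']
    print(quoted_lazy(SAMPLE))    # ['alpha', 'beta']
    print(whole_words("cat catalog dog hotdog"))  # ['cat', 'dog']
```

The greedy pattern returns one long match spanning both quoted values, while the lazy pattern returns each value separately. The word-boundary pattern skips "catalog" and "hotdog" because the match would begin or end inside a longer word.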

Code Migration

This session introduces key strategies for transitioning computations to HPC systems, including using pre-installed software, compiling code with appropriate tools, and setting up Python, R, and conda environments. It also covers workflow management and introduces containerized solutions with Singularity, featuring hands-on practice on SDSC resources.

Remote event