SDSC at SC25
Join SDSC in booth 217 for the annual International Conference for High Performance Computing, Networking, Storage and Analysis.
St. Louis, MO
How to build and run your high-throughput and many-task computing workflows on high-performance computing systems using the Slurm Workload Manager.
Remote event
We will focus on (1) R package management and user environment configuration on the HPC cluster and (2) R parallelism on HPC systems, including how to use the "parallel" package and a few related packages to parallelize R programs and improve their performance.
Remote event
A brief introduction to fundamental concepts in parallel computing. Topics include threads, processes, Amdahl’s Law, benchmarking, and factors that limit scalability. No programming experience needed.
Remote event
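As a small illustration of the Amdahl's Law topic in the parallel computing introduction above, the following Python sketch computes the theoretical speedup of a program when only part of it can run in parallel. The function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen for this example; they are not taken from the session materials.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup under Amdahl's Law.

    parallel_fraction: share of the runtime that can be parallelized (0 to 1).
    workers: number of processors or threads applied to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical program in which 95% of the runtime is parallelizable.
    for n in (1, 2, 8, 64, 1024):
        print(f"{n:>5} workers: {amdahl_speedup(0.95, n):6.2f}x")
```

With a 5% serial fraction the speedup can never reach 20x, however many workers are added, which is one of the factors that limit scalability.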
Linux command-line interface (CLI) skills are essential for advanced cyberinfrastructure (CI). This session covers the filesystem hierarchy, permissions, links, wildcards, finding files, environment variables, modules, config files, aliases, history, and Bash scripting tips.
Remote event
This talk will start with an overview of ICICLE (Intelligent CyberInfrastructure with Computational Learning in the Environment), an NSF AI Institute, to address these challenges.
Remote event
A brief introduction to the Linux scheduler, how to interact with it, and how to run your research workloads on your personal computer, a shared workstation, or even a high-performance computing system.
Remote event
An overview of commonly used Linux tools for searching and manipulating text. We progress from the simplest tools, such as head, tail, cut, and paste, to more complex tools such as grep, awk, and sed.
Remote event
Jupyter Notebooks are great for exploration, but terrible for reproducibility and scaling. This talk shows how to migrate notebook code into a structured Python package that can run anywhere—locally, on HPC clusters, or in automated pipelines. We’ll connect these practices to early-stage MLOps concepts, illustrating how packaging is the foundation for reproducible scientific computing.
Remote event
Efficiently transferring data is a critical part of building research workflows, whether working with experimental or simulated data on local or high-performance computing (HPC) systems. Here, we introduce key concepts and command-line tools for data transfer, including how to verify data integrity, use compression, and select appropriate transfer methods based on data size, location, and organization.
Remote event
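To illustrate the data integrity verification mentioned in the data transfer session above, here is a minimal Python sketch that compares SHA-256 checksums of a source file and its transferred copy. The session itself covers command-line tools; the function names `sha256sum` and `same_content` are hypothetical helpers written for this example, not part of the session materials.

```python
import hashlib


def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks.

    Reading in fixed-size chunks keeps memory use flat for large files.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source: str, copy: str) -> bool:
    """True if both files have identical SHA-256 digests."""
    return sha256sum(source) == sha256sum(copy)
```

A typical use would be to compute the digest before the transfer, compute it again on the destination system, and compare the two strings.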
Interactive high-performance computing (HPC) involves real-time user inputs that result in actions being performed on HPC compute nodes. This session presents an overview of interactive computing tools and methods.
Remote event
Shell scripting improves productivity and reduces errors in HPC workflows by automating tasks like data processing, backups, and system monitoring. This session builds on basic Linux command-line skills to teach Bash scripting syntax, constructs, and best practices for effective automation.
Remote event
Large language models (LLMs) are trained on massive, publicly available text datasets containing trillions of tokens. However, these models do not necessarily possess sufficient subject matter expertise and often struggle to provide meaningful responses to specialized prompts. Therefore, fine-tuning LLMs with domain-specific datasets – extracted from documents, articles, and other sources – is crucial. In this presentation, we will demonstrate fine-tuning a few smaller LLMs with the LoRA approach, using instruct datasets on Gaudi hardware.
Remote event
SAVE THE DATE - This two-day virtual workshop extends the COMPLECS webinar series with more in-depth discussion and complementary hands-on experience. It is designed to help users strengthen their skills in using high-performance computing (HPC) systems.
Remote event
High-performance computing (HPC) systems use various specialized storage and file systems, each with different performance levels and best-use scenarios. Here, we explain how to properly use these systems for your research. We also introduce common filesystems, their architectures, appropriate I/O patterns, and key Linux tools for managing storage usage, backups, and file permissions.
Remote event
This talk highlights advances in GPU-accelerated molecular simulations and the hardware that enables them. We will begin with an overview of modern GPU architectures, supercomputers, and programming models that leverage massive parallelism for scientific computing. We will then discuss GPU-optimized molecular dynamics simulations with Amber and quantum chemistry calculations with QUICK, a massively parallel Hartree-Fock and DFT program that performs all major computations on GPUs. Together, these tools enable efficient QM/MM molecular dynamics simulations across diverse GPU platforms. Advances in software and GPU technology are thus transforming the scale and accuracy of simulations for complex chemical and biological systems.
Remote event
Essentials of using regular expressions (regexes) with the Linux tools grep, awk, and sed. Topics include quantifiers, wildcards, grouping, alternation, word boundaries, lazy and greedy matching, and regex flavors. Attendees should at least be familiar with grep.
Remote event
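As a brief illustration of the greedy and lazy matching and word boundary topics in the regular expressions session above, the sketch below uses Python's `re` module as a stand-in. This is an assumption made for the example: the session itself uses grep, awk, and sed, whose regex flavors differ from Python's, and lazy quantifiers in particular are not available in every flavor. The sample strings are invented.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: ".+" takes as much as it can, so the match runs from the
# first "<" to the last ">" in the string.
greedy = re.findall(r"<.+>", text)

# Lazy: ".+?" takes as little as it can, so each tag matches separately.
lazy = re.findall(r"<.+?>", text)

# Word boundaries: "\b" keeps "cat" from matching inside "concat".
words = re.findall(r"\bcat\b", "cat concat cat.")

if __name__ == "__main__":
    print(greedy)
    print(lazy)
    print(words)
```

The greedy pattern returns the whole string as a single match, while the lazy pattern returns the four tags individually.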
This session introduces key strategies for transitioning computations to HPC systems, including using pre-installed software, compiling code with appropriate tools, and setting up Python, R, and conda environments. It also covers workflow management and introduces containerized solutions with Singularity, featuring hands-on practice on SDSC resources.
Remote event