SDSC Participates in HPC Education Project to Produce New Cyber Training Catalog
Published April 24, 2025
Computational Data Scientist Mary Thomas of the San Diego Supercomputer Center (SDSC), part of the UC San Diego School of Computing, Information and Data Sciences, has been participating in a U.S. multi-institutional initiative that streamlines discovery and collaboration through an innovative cyber training catalog. In an effort to enhance access to and sharing of cyber training resources, the High Performance Computing-Education (HPC-ED) project has integrated an intuitive interface, structured metadata and active community engagement.
“Our HPC-ED team is creating a system that allows training material owners to share their resources while maintaining ownership,” Thomas said. “Organizations can also enrich their local portals with shared materials, ultimately broadening the reach of valuable educational tools.”
The work has been published in the Journal of Computational Science Education. In addition to Thomas and Jiesen Zhang at SDSC, co-authors are HPC-ED project team members Susan Mehringer, Richard Knepper and Zilu Wang of the Center for Advanced Computing at Cornell University; Katharine Cahill of the New Jersey Institute of Technology; Charlie Dey of the Texas Advanced Computing Center; Brian Guilfoos of the Ohio Supercomputer Center; David Joiner of Kean University; John-Paul Navarro of Argonne National Laboratory; and Jeaime Powell of Omnibond Systems.
Thomas explained that a key strength of the HPC-ED project is leveraging the flexible and well-established Globus Search framework. She said that this architecture allows the team to offer a scalable solution for a wide range of users – from individual educators to large organizations seeking to enhance their cyber training offerings.
“With insights gained from our pilot phase, we are now evolving to implement new features designed to better meet community needs,” Thomas said. “Enhanced tools and interfaces will cater to diverse preferences, such as enabling filtered search result downloads and integrating Jupyter Notebooks for streamlined resource sharing.”
Zhang, a software engineering intern at SDSC, said that this experience has broadened his understanding of the importance of efficient metadata. “We recently completed our pilot phase, which focused on prototyping the HPC-ED catalog, defining metadata structures, providing documentation and beginning the process of sharing and discovering materials,” Zhang explained. “Community feedback was gathered through surveys, presentations, tutorials and hackathons – enabling us to assess user needs and refine the platform accordingly.”
Future development of HPC-ED will incorporate large language models (LLMs) to generate more accurate metadata and content recommendations. The team will also focus on moving the catalog into full-scale production, expanding its user base and refining its tools to improve usability.
According to Thomas, by fostering collaboration and leveraging cutting-edge technologies, HPC-ED is well on its way to revolutionizing how cyber training materials are shared and discovered, ensuring that critical knowledge reaches those who need it most.
“Creating a federated repository for training materials supports the ability of computational scientists to learn and extends the offerings available through local research computing and data teams,” said Knepper, director of the Cornell Center for Advanced Computing (CAC) and HPC-ED principal investigator.
Mehringer, first author of the paper and an associate director at the CAC, concurred. “Building a well-defined catalog achieves two important goals: supporting researchers by enabling improved material discovery and reducing training development duplication by simplifying sharing,” she said.
“HPC-ED: Building a Federated Repository and Increasing Access through CyberTraining” is supported by the U.S. National Science Foundation (grant no. OAC-2320977).