SDSC

HPC/CI Internships:

HPC Training interns work closely with SDSC staff and the HPC Training Working Group to develop, maintain, and disseminate training materials. Example responsibilities include:

  • Test and maintain Jupyter notebooks that run on Expanse and other HPC systems.
  • Maintain and develop GitHub repositories for training materials.
  • Integrate training events into training repositories and archives.
  • Develop scripts to automate creation of training pages and interactive video pages.
  • Create interactive training pages for events and modules.
  • Develop scripts to gather training metrics (e.g., downloads, clones, site visits).
  • Develop and test related applications, tools, and web pages.

Internships:

  • HPC Training Catalog LLM Developer (open 02/02/26 - 02/15/26)
    • General job description:
      • Develop and evaluate an LLM-powered assistant to help users discover, navigate, and apply HPC training materials (e.g., Expanse, NRP), including best practices, reproducible workflows, and troubleshooting guidance.
      • Design and implement retrieval-augmented generation (RAG) pipelines over SDSC training content, including slides, notebooks, documentation, videos, and catalog metadata, with source attribution and citation.
      • Build ingestion, indexing, and embedding workflows to support an AI-powered training catalog; iterate on prompts, evaluation methods, and user experience to improve accuracy and usefulness.
      • Integrate LLM capabilities with existing training websites and GitHub-hosted repositories supporting SDSC and NSF-funded training programs.
      • Collaborate with SDSC staff and training teams; contribute code, documentation, and experiments through GitHub.
    • Internship start/end dates: ongoing; opportunities available during summer and fall.
    • Required skills include: AI/ML fundamentals; hands-on experience with large language models (LLMs) and retrieval-augmented generation (RAG); high-performance computing; Python; Jupyter notebooks; Git/GitHub; Linux/Unix.
    • Preferred skills include: familiarity with knowledge-graph-based RAG approaches; vector databases and embedding models; LLM evaluation techniques; prompt engineering; basic web technologies (HTML/CSS/JavaScript); familiarity with research computing environments.
  • Project Website Developer (closed)
    • General job description:
    • Internship start/end dates: ongoing; opportunities available during summer and fall.
    • Required skills include: HTML5; experience developing websites using GitHub-hosted Jekyll themes; advanced Git/GitHub.
    • Preferred skills include: HTML/CSS/JavaScript; Liquid templating; Ruby and Bundler; experience with Apache, Python, YouTube, databases, metadata/search, JSON, and related web technologies; familiarity with HPC/CI.
  • Interactive Video Developer (closed)
    • General job description:
    • Internship start/end dates: ongoing; opportunities available during summer and fall.
    • Required skills include: experience with Apache, Python, YouTube videos, databases, metadata/search, JSON, and related web technologies.
    • Preferred skills include: Neo4j/knowledge graphs; fundamentals of AI/ML and LLMs.
  • General Internship Opportunity (closed)
    • General job description:
      • Responsibilities vary by project and applicant interests. Example tasks include:
        • Create and manage GitHub repositories and materials (code, presentations, videos, etc.).
        • Support the Training Materials and Document system:
          • Integrate training events into training repositories/archives.
          • Develop scripts to pull data from the Cascade asset factory DB for documents and reporting.
          • Automate collection of repository materials (e.g., using Git submodules).
        • Test training materials, including tutorials and notebooks.
        • Assist in creating project/training websites and outreach materials to improve user success.
        • Develop scripts to gather training metrics (e.g., downloads, clones, site visits).
        • Work with datasets, taxonomies, ontologies, and knowledge graphs to describe training resources.
    • Internship start/end dates: ongoing; opportunities available during summer and fall.
    • Required skills include: experience with Apache, Python, YouTube videos, databases, metadata/search, JSON, and related web technologies.

Qualifications Required for the Job:

  • Undergraduate or graduate student working toward a Bachelor of Science and/or Engineering degree, preferably in one of the following fields: Computer Science, Computer Engineering, Applied Mathematics, Mathematics, or Electrical Engineering.
  • Experience with Unix/Linux, GitHub, Slack, and Discord is required.
  • For web-based projects, knowledge of web technologies (e.g., Jekyll, HTML5, CSS, JavaScript, REST) and programming (Python, pgi/cgi, Jupyter Notebooks) is required.
  • Experience and/or education in software development with at least one of the key scientific programming languages, such as C, C++, or Fortran.

Additional Qualifications:

  • Experience with AI/ML, taxonomies, ontologies, and knowledge graphs is desirable.
  • Experience with parallel programming models such as CUDA, MPI, or OpenMP is desirable.
  • Responsible, self-motivated, and able to work independently and meet project deadlines.
  • Must be able to work at least 6 hours per week during the academic year and 20-40 hours per week in summer.
  • Must be able to communicate effectively and work on both local and remote teams.
  • Must have paid UCSD Fall 2023 student services fees or have Readmit status to work during summer.

For more information please contact:

Mary Thomas (mpthomas at ucsd.edu)
Please use the following in your SUBJECT line:
      SUBJECT: SDSC HPC-Train Internship Inquiry: YourFirstName YourLastName