
High-Performance Computing

For almost 30 years, SDSC has led the way in deploying and supporting cutting-edge high performance computing systems for a wide range of users, from the campus to the national research community. From the earliest Cray systems to today’s data-intensive systems, SDSC has focused on providing innovative architectures designed to keep pace with the changing needs of science and engineering.

Whether you're a researcher looking to expand computing beyond your lab or a business seeking a competitive advantage, SDSC's HPC experts will guide you in selecting the right resource, reducing your time to solution and taking your science to the next level.

Take a look at what SDSC has to offer and let us help you discover your computing potential.


Comet – HPC for the 99 Percent

Comet, a new petascale supercomputer designed to transform advanced scientific computing by expanding access and capacity among traditional as well as non-traditional research domains, will soon be taking shape at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego.

Comet will be capable of an overall peak performance of two petaflops, or two quadrillion operations per second. Comet will join SDSC's Gordon supercomputer as another key resource within the NSF XSEDE (Extreme Science and Engineering Discovery Environment) program, which comprises the most advanced collection of integrated digital resources and services in the world.

Researchers can apply for time on Comet and other resources via XSEDE. Comet production startup is scheduled for early 2015, to be followed by a formal launch event in the spring.

Gateway to Discovery

Comet was designed to provide a solution for the emerging research requirements often referred to as the 'long tail' of science: the idea that a large number of modestly sized, computationally based research projects collectively represent a tremendous amount of research, scientific impact, and advancement.

"One of the key strategies for Comet is to support modest-scale users across the entire spectrum of NSF communities, while also welcoming research communities that are not typically users of more traditional HPC systems, such as genomics, the social sciences, and economics," said SDSC Director Michael Norman.

Key Features of Comet

  • Dell-integrated cluster based on the Intel® Xeon® Processor E5-2600 v3 family (two processors per node and 12 cores per processor, running at 2.5 GHz)
  • Estimated overall peak performance of two petaflops – two quadrillion operations per second (see the sketch following this list for how this figure follows from the node specifications)
  • Designed to optimize capacity for modest-scale jobs: Each 72-node rack (1,728 cores) features full bisection InfiniBand FDR interconnect from Mellanox, with a 4:1 bisection interconnect across the racks. Total node count is 1,944 or 46,656 cores.
  • Total 253 TB DDR4 RAM and 620 TB of flash memory.
  • Each compute node will have 128 GB (gigabytes) of traditional DRAM, and 320 GB of local flash memory.
  • Four large-memory nodes (1.5 TB of memory each), plus 36 NVIDIA GPU nodes to accommodate applications such as visualization, molecular dynamics simulations, or de novo genome assembly.
  • 7.6 PB of Lustre-based high-performance storage; plus 6 PB of durable storage for data reliability
  • 100 Gbps connectivity to Internet2 and ESNet, allowing users to rapidly move data to SDSC for analysis and data sharing and return data to their institutions for local use
  • Comet is the first XSEDE production system to support high-performance Single Root I/O Virtualization at the multi-node cluster level.
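
As a rough consistency check on the two-petaflop figure, here is a minimal back-of-the-envelope sketch using only the node count, core count, and clock rate listed above. The 16 double-precision floating-point operations per cycle per core assumed for AVX2/FMA-capable Xeon E5 v3 processors is an assumption, and the GPU and large-memory nodes are left out of the estimate.

```python
# Back-of-the-envelope estimate of Comet's peak performance from the
# figures listed above. The 16 FLOPs/cycle/core value (double precision,
# AVX2 with fused multiply-add) is an assumption about the E5-2600 v3
# parts; GPU and large-memory nodes are ignored here.
nodes = 1944              # standard compute nodes (46,656 cores total)
cores_per_node = 24       # 2 sockets x 12 cores
clock_hz = 2.5e9          # 2.5 GHz
flops_per_cycle = 16      # assumed double-precision rate per core

peak = nodes * cores_per_node * clock_hz * flops_per_cycle
print(f"Estimated peak: {peak / 1e15:.2f} PFLOP/s")   # roughly 1.9 PFLOP/s
```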

Build-out Begins

Comet will be the successor to SDSC's Trestles cluster, which will be decommissioned when Comet comes online. "Comet will have all of the features that made Trestles popular with users, with much greater capacity, while providing ease-of-access and minimal wait times to appeal to a broader base of researchers," said SDSC Deputy Director Richard Moore, a co-PI of the Comet project.

Secret Sauce

By the summer of 2015, Comet will be the first XSEDE production system to support high-performance virtualization at the multi-node cluster level. Comet's use of Single Root I/O Virtualization (SR-IOV) means researchers can use their own software environment, as they do with cloud computing, but can achieve the high performance they expect from a supercomputer.
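
For readers unfamiliar with SR-IOV, the sketch below shows one generic way to check, on any Linux host, whether a PCI device (such as an InfiniBand adapter) exposes SR-IOV virtual functions, using the kernel's standard sysfs attributes. It is purely illustrative and not specific to Comet's configuration.

```python
# Minimal sketch: list PCI devices on a Linux host that advertise SR-IOV
# support, using the kernel's standard sysfs attributes. Illustrative
# only; not specific to Comet's InfiniBand setup.
from pathlib import Path

for dev in Path("/sys/bus/pci/devices").iterdir():
    total_vfs = dev / "sriov_totalvfs"   # present only on SR-IOV-capable devices
    if total_vfs.exists():
        enabled = (dev / "sriov_numvfs").read_text().strip()
        print(f"{dev.name}: {enabled} of {total_vfs.read_text().strip()} "
              f"virtual functions enabled")
```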

"We are pioneering the area of virtualized clusters, specifically with SR-IOV," said Philip Papadopoulos, SDSC Chief Technical Officer. "This will allow virtual sub-clusters to run applications over InfiniBand at near-native speeds, representing a huge step forward in HPC virtualization. In fact the new 'secret sauce' in Comet is virtualization for customized software stacks, which will lower the entry barrier for a wide range of researchers."

"The variety of hardware and support for complex, customized software environments will be of particular benefit to Science Gateway developers," said Nancy Wilkins-Diehr, co-PI of the XSEDE program and SDSC’s associate director. "We now have more than 30 such Science Gateways running on XSEDE, each designed to address the computational needs of a particular community such as computational chemistry, atmospheric science or the social sciences."

For additional information please visit the Comet section of User Support.

Gordon – Meeting the Demands of Data-Intensive Computing

The era of big data has arrived.  With exponentially growing volumes of data generated by large-scale simulations and scientific instruments, the computing capability of traditional FLOPS-based (Floating Point Operations per Second) systems may no longer be sufficient for many research inquiries.  More than FLOPS, researchers require systems that can move huge amounts of data from disk to processor at rates, and in volumes, an order of magnitude or more beyond the capabilities of most resources available today.

SDSC's Gordon was introduced in 2012 as one of the 50 fastest supercomputers in the world and is the first system built specifically for the challenges of data-intensive computing. It is also the first HPC system to use massive amounts of flash-based memory: Gordon contains 300 TB of flash storage, similar to that found in smaller devices such as cellphones and laptops, but with greater performance and durability.

Gordon also deploys large memory “supernodes” based on ScaleMP’s vSMP Foundation software. The standard supernode has approximately 1 TB of DRAM, but larger memory configurations can be deployed as needed.

These features make Gordon an ideal platform for tackling data-intensive problems, and a well-balanced resource between speed and large memory capabilities. Gordon is designed to help advance science in domains such as genomics, graph analysis, computational chemistry, structural mechanics, image processing, geophysics, and data mining applications. The system’s supernodes are ideal for users with serial or threaded applications that require significantly more memory than is available on a single node of most other HPC systems, while Gordon’s flash-based I/O nodes may offer significant performance improvement for applications that exhibit random access data patterns or require fast access to significant amounts of scratch space.
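
To make the point about random access patterns concrete, here is a small, hedged sketch that times sequential versus random 4 KiB reads on a throwaway file. The path and file size are hypothetical, and on a file this small the operating system's page cache will mask much of the difference, so a real benchmark would use direct I/O or a working set larger than memory; the gap between the two patterns is what flash-based scratch narrows relative to spinning disk.

```python
# Rough sketch: time sequential vs. random 4 KiB reads on a scratch file.
# The path and size are hypothetical; the OS page cache will hide much of
# the difference on a file this small, so a real benchmark would use
# direct I/O or a dataset larger than memory.
import os, random, time

path = "/tmp/io_pattern_demo.bin"          # stand-in for a scratch file
block, nblocks = 4096, 25_000              # ~100 MB test file

with open(path, "wb") as f:
    f.write(os.urandom(block) * nblocks)

def timed_reads(offsets):
    with open(path, "rb") as f:
        start = time.perf_counter()
        for off in offsets:
            f.seek(off)
            f.read(block)
    return time.perf_counter() - start

sequential = [i * block for i in range(nblocks)]
shuffled = random.sample(sequential, len(sequential))

print(f"sequential reads: {timed_reads(sequential):.2f} s")
print(f"random reads:     {timed_reads(shuffled):.2f} s")
os.remove(path)
```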

Gordon is connected to SDSC's recently upgraded Data Oasis storage system (see below), providing researchers with a complete array of compute and storage resources. Allocations on Gordon are available through the NSF XSEDE program. Dedicated I/O node allocations are processed through the XSEDE Start-Up allocation request, which can typically be turned around within a couple of weeks.

The table below provides a brief technical summary of Gordon. For additional information please visit the Gordon section of User Support.


Gordon

Performance

341 Tflop/s peak
560,000 IOPS

Compute Nodes

1,024 dual-socket nodes with 2.6 GHz Intel Xeon E5 (Sandy Bridge) processors; 16 cores/node; 64 GB 1333 MHz RAM per node (64 TB total);
80 GB Intel SSD per node

Flash-based I/O nodes

64 Intel Westmere nodes; dual socket; 12 cores/node; 48 GB DDR3 1333 MHz memory per node.

4.8 TB Intel 710 SSD/node (300 TB total)

Interconnect

Dual Rail, QDR, 3D torus of switches

Lustre-based parallel file systems

4 PB; aggregate: 100 GB/s; Gordon scratch: 50 GB/s

 

Trestles – High Productivity and Science Gateway Workhorse

Trestles is designed to enable modest-scale and gateway researchers to be as computationally productive as possible. Since entering production in early 2011, Trestles has attracted researchers from diverse areas who need access to a fully supported supercomputer with shorter turnaround times than has been typical for most systems. To respond to user requirements for more flexible access modes, Trestles offers long run times (up to two weeks); pre-emptive, on-demand queues for applications that require urgent access because of unpredictable natural or man-made events with societal impact; and user-settable reservations for researchers who need predictable access for their workflows. Trestles, which like Gordon features flash-based memory, was inspired by the idea that tailoring a system to the majority of user jobs, rather than to a handful of researchers running jobs at core counts in the thousands, would reward users with high throughput and scientific productivity.

Trestles is recognized as the leading science gateway platform in the NSF/XSEDE portfolio, with more than 650 users per month running jobs through the popular CIPRES portal alone. Trestles users span a wide range of domains, including phylogenetic research, computational chemistry, materials science, geographic information analysis, high-impact storm event prediction, biophysics, astronomy, cosmology, and gravitational physics. Users appreciate the rapid turnaround times for their modest-scale computational runs.

As with Gordon, allocations on Trestles are available through the NSF XSEDE program. Individual jobs are restricted to a maximum of 1,024 cores, and annual awards are limited to a total of 1.5 million SUs, with exceptions made for gateway applications. Startup allocations of 50,000 SUs or less are also available via XSEDE. Like Gordon, Trestles is connected to SDSC's Data Oasis storage system.
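
To give a feel for what those limits mean in practice, the arithmetic below assumes the common XSEDE convention that one SU corresponds to one core-hour; that equivalence is an assumption here, not a statement of Trestles' exact charging policy.

```python
# Rough illustration of the Trestles allocation limits quoted above,
# assuming 1 SU = 1 core-hour (an assumption, not Trestles' official
# charging policy).
max_cores_per_job = 1024
max_walltime_hours = 14 * 24        # the two-week run-time limit
annual_award_limit = 1_500_000      # SUs

cost_of_one_max_job = max_cores_per_job * max_walltime_hours
print(f"One maximum-size, maximum-length job: {cost_of_one_max_job:,} SUs")
print(f"Fraction of a full annual award: {cost_of_one_max_job / annual_award_limit:.0%}")
```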

The table below provides a brief technical summary of Trestles. For additional information please visit the Trestles section of User Support.


Trestles

Performance

100 Tflop/s

Compute Nodes

324 quad-socket nodes with 2.4 GHz AMD Magny-Cours processors; 32 cores/node; 64 GB 1333 MHz RAM per node (20 TB total);
120 GB Intel SSD per node

Interconnect

Fat Tree; QDR

Lustre-based parallel file system

800 TB; 20 GB/s

 

Triton Shared Computing Cluster

Triton Shared Computing Cluster (TSCC) is a new computational cluster for research computing available through UC San Diego's RCI program. Designed as a turnkey, high performance computing resource, it features flexible usage and business models and professional system administration. Unlike traditional clusters, TSCC is a collaborative system wherein the majority of nodes are purchased and shared by the cluster users, known as condo owners. In addition to the participant-contributed condo nodes, TSCC has a collection of hotel nodes which are available to condo owners and to other researchers on a rental basis. The condo and hotel configurations contain both standard two-socket nodes and GPU nodes. The hotel configuration also features eight 512GB large-memory nodes.

The table below provides a brief technical summary of TSCC.


Triton Shared Computing Cluster (See full spec)

Performance

Condo/hotel model provides a wide array of node configurations

General Compute Nodes

Dual-socket, 8-core, 2.6GHz Intel Xeon E5-2670 (Sandy Bridge)
GPU and High-Memory PDAF nodes also available

Interconnect

10GbE (QDR InfiniBand optional)

Lustre-based parallel file system

Access to Data Oasis

 

Data Oasis – Integrated, High-Performance Parallel File System

At the heart of SDSC's high performance computing systems is the high-performance, scalable Data Oasis parallel file system.  Data Oasis is designed to meet the needs of high-performance and data-intensive computing: a central 10GbE core infrastructure that is integrated with all of SDSC's compute and network resources; scalable storage units that can be easily expanded and modified as newer drive technology becomes available; a network design that allows for improved performance, scalability, and redundancy; and an open source design philosophy that leverages community resources and expertise.  The central network design also provides a seamless gateway for those who wish to use SDSC's Cloud Storage environment.

SDSC recently completed upgrades to this Lustre-based system, which currently has four petabytes (PB) of capacity and sustained transfer rates of up to 100 GB/s to handle the data-intensive needs of Gordon, Trestles, and TSCC. Data Oasis has 64 storage building blocks, which constitute the system's Object Storage Servers (OSSs).  Each of these is an I/O powerhouse in its own right: with dual-core Westmere processors, 36 high-speed SAS drives, and two dual-port 10GbE network cards, each OSS delivers sustained rates of over 2 GB/s to remote clients. Data Oasis' capacity and bandwidth are expandable with additional OSSs, at commodity pricing levels.
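
The aggregate figure is easy to sanity-check from the per-server numbers above; the short sketch below does just that. Because Lustre stripes a file's data across multiple OSSs, a well-striped parallel workload can approach the aggregate rate rather than being limited to a single server's 2 GB/s.

```python
# Simple consistency check on the Data Oasis figures quoted above:
# 64 object storage servers (OSSs), each sustaining a bit over 2 GB/s
# to remote clients. Lustre stripes file data across OSSs, so parallel,
# well-striped workloads can approach the aggregate rate.
n_oss = 64
per_oss_gb_per_s = 2.0                      # sustained rate per OSS

aggregate = n_oss * per_oss_gb_per_s
print(f"Sum of per-OSS rates: {aggregate:.0f} GB/s "
      f"(consistent with the ~100 GB/s sustained system figure)")
```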

Data Oasis' backbone network architecture uses a pair of large Arista 7508 10Gb/s Ethernet switches for dual-path reliability and performance.  These switches form the high-performance network hub for SDSC's parallel, network, and cloud-based storage systems, with more than 450 active 10GbE connections and capacity for a total of 768.

XSEDE

SDSC is a founding partner of the National Science Foundation’s XSEDE program, the follow-on to TeraGrid. XSEDE integrates high-performance computers, data services and expertise across some of the nation's most powerful supercomputer centers, creating the world's most comprehensive distributed cyberinfrastructure for open scientific research.

Contact us to learn more about SDSC's HPC resources
consult@sdsc.edu