SDSC: San Diego Supercomputer Center
Established: November 14, 1985
Web site: www.sdsc.edu
Leadership: Michael L. Norman, interim director
HPC and Storage Resources
Gordon – Meeting the Demands of Data-Intensive Computing
Gordon is SDSC’s newest HPC resource and the first system built specifically for the challenges of data-intensive computing. Debuting in early 2012 as one of the 50 fastest supercomputers in the world, Gordon is the first HPC system to use massive amounts of flash-based memory. Gordon contains 300TB (terabytes) of flash-based storage, similar to that found in smaller devices such as cellphones and laptops, but with greater performance and durability. Gordon also deploys large memory “supernodes” based on ScaleMP’s vSMP Foundation software. The standard supernode has approximately 1TB of DRAM, but larger memory configurations can be deployed as needed.
Balancing speed with large memory capability, Gordon is an ideal platform for tackling data-intensive problems, and is designed to help advance science in domains such as genomics, graph analysis, computational chemistry, structural mechanics, image processing, geophysics, and data mining. The system’s supernodes are ideal for users with serial or threaded applications that require significantly more memory than is available on a single node of most other HPC systems, while Gordon’s flash-based I/O nodes may offer significant performance improvements for applications that exhibit random access data patterns or require fast access to large amounts of scratch space.
Gordon is connected to SDSC’s recently upgraded Data Oasis storage system (see below), providing researchers with a complete array of compute and storage resources. Allocations on Gordon are available through the National Science Foundation’s XSEDE program.
Trestles – High Productivity and Science Gateway Workhorse
Trestles is designed to enable modest-scale and gateway researchers to be as computationally productive as possible. Since entering production in early 2011, Trestles has attracted researchers from diverse areas who need access to a fully supported supercomputer with shorter turnaround times than have been typical for most systems. To respond to user requirements for more flexible access modes, Trestles features long run times (up to two weeks); pre-emptive, on-demand queues for applications that require urgent access because of unpredictable natural or man-made events with societal impact; and user-settable reservations for researchers who need predictable access for their workflows. Trestles, which like Gordon features flash-based memory, was inspired by the idea that tailoring a system for the majority of user jobs, rather than for a handful of researchers who run jobs at thousands of cores, would reward users with high throughput and scientific productivity.
With a theoretical peak performance of 100 TFlop/s, Trestles is recognized as the leading science gateway platform in the NSF/XSEDE portfolio, with more than 650 users per month running through the popular CIPRES portal alone. Trestles users span a wide range of domains, including phylogenetic research, computational chemistry, materials science, geographic information analysis, high-impact storm event prediction, biophysics, astronomy, cosmology, and gravitational physics. As with Gordon, allocations on Trestles are available through the NSF’s XSEDE program. Individual jobs are restricted to a maximum of 1,024 cores, and annual awards are limited to a total of 1.5 million SUs, with exceptions made for gateway applications. Startup allocations of 50,000 SUs or less are also available via XSEDE. Like Gordon, Trestles is connected to SDSC’s Data Oasis storage system.
Triton Shared Computing Cluster
Designed as a turnkey, high-performance computing resource, the Triton Shared Computing Cluster (TSCC) features flexible usage and business models and professional system administration. Unlike traditional clusters, TSCC is a collaborative system wherein the majority of nodes are purchased and shared by cluster users, known as condo owners. In addition to the participant-contributed condo nodes, TSCC has a collection of hotel nodes which are available to condo owners and to other researchers on a rental basis. The condo and hotel configurations contain both standard two-socket nodes and GPU nodes. The hotel configuration also features eight 512GB large-memory nodes.
Data Oasis – SDSC’s Integrated, High-Performance Parallel File System
At the heart of SDSC’s high performance computing systems is the high-performance, scalable Data Oasis parallel file system. Data Oasis is configured to meet the needs of high-performance and data-intensive computing: a central 10GbE core infrastructure that is integrated with all of SDSC’s compute and network resources; scalable storage units that can be easily expanded and modified as newer drive technology becomes available; a network design that allows for improved performance, scalability, and redundancy; and an open source design philosophy that leverages community resources and expertise. The central network design also provides a seamless gateway for those who wish to use SDSC’s Cloud Storage environment.
SDSC’s Lustre-based Data Oasis backbone network architecture uses a pair of large Arista 7508 10Gb/s Ethernet switches for dual-path reliability and performance. With four petabytes (PB) of current capacity and sustained transfer rates of up to 100GB/s to handle the data-intensive needs of Gordon, Trestles, and TSCC, Data Oasis comprises 64 storage building blocks that constitute the system’s Object Storage Servers (OSSs). Each OSS is an I/O powerhouse in its own right: with dual-core Westmere processors, 36 high-speed SAS drives, and two dual-port 10GbE network cards, each delivers sustained rates of over 2GB/s to remote clients. Data Oasis’ capacity and bandwidth are expandable with additional OSSs, at commodity pricing levels.
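The bandwidth figures above can be cross-checked with a quick back-of-the-envelope calculation: 64 OSSs each sustaining at least 2GB/s yields 128GB/s of raw server bandwidth, comfortably above the 100GB/s aggregate sustained rate quoted for the system. A minimal sketch, with all numbers taken from the description above:

```python
# Back-of-the-envelope check of the Data Oasis bandwidth figures.
NUM_OSS = 64                  # storage building blocks / Object Storage Servers
GBPS_PER_OSS = 2              # sustained GB/s per OSS (a lower bound)
SYSTEM_SUSTAINED_GBPS = 100   # quoted aggregate sustained rate

aggregate = NUM_OSS * GBPS_PER_OSS   # 128 GB/s of raw OSS bandwidth
assert aggregate >= SYSTEM_SUSTAINED_GBPS
print(f"{NUM_OSS} OSSs x {GBPS_PER_OSS} GB/s = {aggregate} GB/s raw bandwidth")
```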
Sierra and Lima
SDSC is also a resource partner in the FutureGrid XSEDE project, which provides a high-performance grid test bed that allows scientists to collaboratively develop and test innovative approaches to parallel, grid, and cloud computing. SDSC hosts a 7TF machine called Sierra and a smaller 1.3TF machine called Lima. Both resources share 96TB of raw storage and are connected to the FutureGrid network via a 10Gb/s link. Sierra provides cloud environments (such as OpenStack and Nimbus) and also supports HPC applications running on the "bare metal." Lima provides local SSD storage for I/O-intensive applications. Users can run stand-alone or distributed experiments on Sierra and Lima as well as on other FutureGrid machines.
The anatomy of a byte:
- Byte: A unit of computer information equal to one typed character.
- Megabyte: A million bytes; equal in size to a short novel.
- Gigabyte: A billion bytes; equal to the information contained in a stack of books almost three stories high.
- Terabyte: A trillion bytes; about equal to the information printed on paper made from 50,000 trees.
- Petabyte: A quadrillion bytes. It would take 1,900 years to listen to a petabyte's worth of songs – if you had a large enough MP3 player.
- Exabyte: One quintillion bytes; every word ever spoken by humans could be stored on five exabytes.
- Zettabyte: One sextillion bytes; enough data to fill a stack of DVDs reaching halfway to Mars.
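The scale of these units, and the petabyte listening-time claim, can be verified with simple arithmetic. The sketch below assumes a typical MP3 song of about 4MB that runs about 4 minutes (assumed figures, not from the list above):

```python
# Decimal byte prefixes from the list above (powers of 1,000).
MEGABYTE = 10**6
PETABYTE = 10**15

# Assumed: ~4MB and ~4 minutes per MP3 song.
songs = PETABYTE // (4 * MEGABYTE)    # 250,000,000 songs in a petabyte
minutes = songs * 4
years = minutes / (60 * 24 * 365.25)  # continuous, round-the-clock listening
print(f"{songs:,} songs, about {years:,.0f} years of listening")
```

With those assumptions the estimate lands at roughly 1,900 years, matching the figure in the list.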
Rating a supercomputer's performance:
- Megaflops: A million floating point operations per second. The original Cray-1 supercomputer was capable of 80 megaflops.
- Gigaflops: A billion floating point operations per second. Today's personal computers are capable of gigaflops performance.
- Teraflops: A trillion (10^12) floating point operations per second. Most of today's supercomputers are capable of teraflops performance.
- Petaflops: A quadrillion (10^15) floating point operations per second. The latest supercomputer barrier to be broken; the fastest systems can now achieve about 2.5 petaflops.
- Exaflops: A quintillion (10^18) floating point operations per second, and the new frontier for supercomputers, provided we can make exascale supercomputers 100 to 1,000 times as energy-efficient as today's fastest machines.
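Because each prefix steps up by a factor of 1,000, cross-era comparisons are easy to compute. For example, a 2.5-petaflops machine is over 30 million times faster than the original 80-megaflops Cray-1 (both figures from the list above):

```python
MEGAFLOPS = 10**6
PETAFLOPS = 10**15

cray_1 = 80 * MEGAFLOPS      # original Cray-1, per the list above
fastest = 2.5 * PETAFLOPS    # today's fastest systems, per the list above

speedup = fastest / cray_1   # 31,250,000x
print(f"2.5 petaflops is {speedup:,.0f} times the Cray-1's 80 megaflops")
```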
Some common uses for supercomputers: