Press Archive

HPSS Collaboration Announces File System with Infinite Capacity

Published 11/17/2005

Courtesy HPCwire

Today at the SC'05 high performance computing conference, the HPSS Collaboration announced the capability to combine the IBM General Parallel File System (GPFS) with the HPSS Collaboration's High Performance Storage System (HPSS) to provide a high performance file system with virtually infinite capacity.

This capability is being demonstrated in the Lawrence Berkeley National Laboratory and IBM booths at SC'05 and will be available for production use early in 2006. The first users will be the San Diego Supercomputer Center (SDSC) and the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC) at LBNL.

GPFS and HPSS are both cluster storage software for the high performance computing community. GPFS is a shared-disk file system that in some deployments supports more than 2,000 nodes, or computational elements, in a high performance computing cluster. HPSS supports both disk and tape, moving less-used data to tape while keeping current data on disk. Both GPFS and HPSS are well established as leading applications for high capacity and high data rates, due in part to their cluster architectures, and both currently support the AIX and Linux operating systems in use at SDSC and NERSC.

With the capability announced at SC'05, users can now set up GPFS and HPSS to form a unified hierarchical storage and disk backup system that automatically migrates data between GPFS disk space and HPSS-managed tape libraries.
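The automatic migration described above is the core idea of hierarchical storage management: files that sit idle on the fast disk tier are staged out to the tape archive tier. The sketch below illustrates that policy in simplified form; the function name, data layout, and 30-day threshold are illustrative assumptions, not the actual GPFS/HPSS policy interface.

```python
import time

# Hypothetical, simplified sketch of an age-based migration policy:
# files idle longer than a threshold on the disk tier (GPFS-like) are
# selected for migration to the tape tier (HPSS-like).
MIGRATE_AFTER_SECONDS = 30 * 24 * 3600  # e.g. migrate files idle > 30 days

def plan_migration(files, now):
    """Return names of files whose last access is older than the threshold.

    `files` maps file name -> last-access timestamp (seconds since epoch).
    """
    return [name for name, atime in files.items()
            if now - atime > MIGRATE_AFTER_SECONDS]

# Example: a recently used file stays on disk; a stale file is staged to tape.
now = time.time()
files = {"results.dat": now - 3600,            # accessed an hour ago
         "old_run.tar": now - 90 * 24 * 3600}  # idle for 90 days
print(plan_migration(files, now))  # → ['old_run.tar']
```

In a real deployment the policy engine would also consider file size and pool occupancy, and recall files from tape transparently on access; this sketch shows only the selection step.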

"This demonstration is a very important milestone in creating highly effective global file systems that support many types of systems," said Bill Kramer, general conference chair of SC0 and general manager of the NERSC Center. "The growth of data-intensive applications and analytics across all science disciplines demands new approaches, and having HPSS integrated with GPFS will provide highly scaleable solutions for many types of problems."

Bob Coyne of IBM, the industry co-chair of the HPSS executive committee, said, "There are at least 10 institutions at SC'05 that are both HPSS and GPFS users, many with over a petabyte of data, who have expressed interest in this capability. HPSS/GPFS will not only serve these existing users but will be an important step in simplifying the storage tools of the largest supercomputer centers and making them available to research institutions, universities and commercial users."

"Globally accessible data is becoming the most important part of Grid computing. The immense quantity of information demands full vertical integration from a transparent user interface via a high performance file system to an enormously capable archival manager," said Phil Andrews of the San Diego Supercomputer Center. "The integration of HPSS and GPFS close the gap between the long-term archival storage and the ultra high performance user access mechanisms."

GPFS is an IBM product. HPSS is developed by an ongoing collaboration of Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Sandia National Laboratories, Oak Ridge National Laboratory and IBM. HPSS is licensed to other users and supported by IBM under an agreement between IBM and the United States Department of Energy.