Protein Data Bank Archives 50,000th Molecule Structure

Worldwide Research Archive Doubles in Size Since 2004

Published 04/09/2008

For Immediate Release

An adrenergic receptor (PDB ID 2rh1) as illustrated in the RCSB PDB's Molecule of the Month feature. This April edition is the 100th installment in the series, another RCSB PDB milestone. This structure is an important starting point for the design of drugs to fight heart disease, allergies, and mental disease. Through the PDB, the structure is now available to researchers around the world. Credit: RCSB PDB

Media Contacts:
Jan Zverina
SDSC Communications
858 534-5111

Warren R. Froelich
SDSC Communications
858 822-3622

Christine Zardecki
RCSB Protein Data Bank/Rutgers

The Protein Data Bank this month reached a significant milestone in its 37-year history as the 50,000th molecule structure was released into its archive, joining other structures vital to pharmacology, bioinformatics, and education.

With its origins in a handwritten petition circulated at a scientific meeting, the PDB is the single worldwide repository for the three-dimensional structures of large biological molecules, including proteins and nucleic acids. This freely available online library allows biological researchers and students to study, store, and share molecular information on a global scale. Officially founded in 1971 with seven structures at Brookhaven National Laboratory, the archive is currently managed by a consortium called the worldwide Protein Data Bank (wwPDB).

Today, the PDB archive receives approximately 25 new experimentally determined structures from scientists each day, and more than 5 million files are downloaded from the archive every month. Users include structural biologists, computational biologists, biochemists, and molecular biologists in academia, government, and industry, as well as educators and students.

Hemoglobin (PDB ID 4hhb), one of the earliest structures deposited in the PDB archive. Today, there are dozens of structures of hemoglobin in the PDB, showing the process of oxygen binding and revealing the molecular details of sickle cell anemia. Credit: RCSB PDB

Notable structures in the archive include recent structures of the adrenergic receptor, which will revolutionize the discovery of drugs to fight heart disease, allergies, and numerous other diseases, and the many structures of enzymes from HIV, which have been pivotal in the design of new therapies to fight AIDS.

"Advances in science and technology have helped the archive grow by leaps and bounds in the last 10 years," said Dr. Helen M. Berman, director of the RCSB PDB and Board of Governors professor of chemistry and chemical biology at Rutgers University, noting that the size of the PDB has doubled in just the last three-and-a-half years.

"We are estimating that the PDB will not only double, but triple to 150,000 structures by 2014," said Dr. Philip E. Bourne, Associate Director of the RCSB PDB and professor of pharmacology at the UCSD Skaggs School of Pharmacy and Pharmaceutical Sciences.
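The two growth figures quoted above can be checked against each other with a short calculation. The Python sketch below is illustrative only and is not the RCSB PDB's own forecasting method: it assumes the archive keeps doubling at a constant rate of once every 3.5 years, starting from 50,000 structures, and all function and variable names are hypothetical.

```python
import math

# Figures quoted in the release.
CURRENT_STRUCTURES = 50_000   # archive size at the April 2008 milestone
DOUBLING_YEARS = 3.5          # "doubled in just the last three-and-a-half years"
TARGET_STRUCTURES = 150_000   # projected size "by 2014"


def annual_growth_factor(doubling_years: float) -> float:
    """Yearly multiplier implied by a constant doubling time (assumption)."""
    return 2 ** (1 / doubling_years)


def projected_size(start: float, years: float, doubling_years: float) -> float:
    """Archive size after `years` of constant exponential growth."""
    return start * 2 ** (years / doubling_years)


def years_to_reach(start: float, target: float, doubling_years: float) -> float:
    """Years needed to grow from `start` to `target` at the same rate."""
    return doubling_years * math.log2(target / start)


growth = annual_growth_factor(DOUBLING_YEARS)
years_needed = years_to_reach(CURRENT_STRUCTURES, TARGET_STRUCTURES, DOUBLING_YEARS)

print(f"Implied annual growth: {100 * (growth - 1):.1f}%")
print(f"Years to reach {TARGET_STRUCTURES:,} structures: {years_needed:.2f}")
```

Under that constant-rate assumption, the archive grows by roughly 22 percent a year and triples in about five and a half years, which is consistent with the estimate of 150,000 structures by 2014.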

The RCSB PDB is based at Rutgers University in New Jersey and at the San Diego Supercomputer Center (SDSC) and the Skaggs School of Pharmacy and Pharmaceutical Sciences, both at the University of California at San Diego. Bourne, a distinguished scientist with SDSC, has been leveraging the resources of the supercomputer center to create a highly uniform and robust process for archiving and providing access to the molecular structures.

Backbone structure of the infectious epsilon15 virus (PDB ID 3c5b), a recent addition to the PDB archive. Other viruses in the archive include the rhinovirus, one of the major causes of the common cold, and the influenza virus. Credit: RCSB PDB

The RCSB PDB is responsible for releasing PDB entries into the archive after they have been reviewed and annotated. At Rutgers, RCSB PDB members annotate structures and develop the sophisticated infrastructure needed to handle these complex data. The primary PDB FTP site is based at SDSC, which serves as the distribution point for PDB users. In addition to the SDSC site, there are failover sites at both the UCSD Skaggs School and Rutgers University to ensure constant access.

In addition to a comprehensive website and database that lets users search, analyze, and visualize the structures of biological macromolecules and their relationships to sequence, function, and disease, the RCSB PDB features a Molecule of the Month series, which recently published its 100th installment. Proteins, one of the main building blocks for living organisms, come in a variety of shapes, with the form of a protein corresponding to its function. The structures housed in the PDB demonstrate great diversity in size, complexity, and function, including:

  • Insulin, the protein deficient in diabetic patients
  • p53 tumor suppressor, a protein often implicated in cancer
  • Anthrax toxin, the disease-causing protein made by anthrax
  • Amyloid peptide, a protein implicated in Alzheimer's disease

The RCSB PDB is supported by funds from the National Science Foundation, the National Institute of General Medical Sciences, the Department of Energy's Office of Science, the National Library of Medicine, the National Cancer Institute, the National Center for Research Resources, the National Institute of Biomedical Imaging and Bioengineering, the National Institute of Neurological Disorders and Stroke, and the National Institute of Diabetes & Digestive & Kidney Diseases.

Related Links

RCSB Protein Data Bank:
San Diego Supercomputer Center:
Rutgers, the State University of New Jersey:
National Science Foundation:
Worldwide Protein Data Bank: