Pharm 207/Bio 207 

Using Internet Resources in Molecular Biology - Lecture 8

Protein Structure Classification


 

Lecturer: Philip E. Bourne
Last Update: 13-Nov-2001

Table of Contents

Lecture Outline

  • How are protein structures classified?
  • What Web sites provide a classification?
    • If the structure is in the PDB refer to:
    • If the structure is not in the PDB (more on this next lecture)
  • Perform a classification using a protein kinase 
  • Understand this classification
  • Perform a classification of your pet protein


 


| TOC & Lecture Outline | Introduction | Reading Materials | Lecture Goals and Assignment | Examples


Introduction

Protein structures can be classified in a variety of interrelated ways - functional similarity, evolutionary similarity, and fold similarity. Here are a few rules to go by:

  1. Classical protein family classification is defined by function, which comes from experimental biochemistry.
  2. Functional similarity and family classification can be inferred from sequence similarity.
  3. If there is greater than 25% sequence identity, there will be some structural homology. For example, the A and B chains of hemoglobin have 45% sequence identity, yet apart from small inserts have an identical 3-D structure. You can see this as follows: Go to CE in a separate browser window. Select the option Align Two Chains. Hit the Calculate Alignment button. Review the sequence alignment. Hit the Press to Start Compare 3D button. Review the structure alignment.
  4. There is a Gaussian distribution of similar structures with a peak around 9% sequence identity. That is, structure similarity can exist that is not detectable by standard sequence comparison methods (more next lecture). Divergent evolution can go a long way and still retain the fold.
  5. Structural similarity alone implies some kind of energetically favorable arrangement, but not necessarily a functional relationship; it can also arise through convergent evolution.
  6. It is very difficult to distinguish convergent from divergent evolution.
  7. Much work has been done to classify proteins by structure.
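
Rules 3 and 4 both turn on percent sequence identity, so a minimal sketch of how that number can be computed from an existing pairwise alignment may help. The function names, the toy sequences, and the choice of denominator (columns where neither sequence has a gap) are assumptions made for illustration; alignment programs differ in how they count, so expect small differences from the values CE reports.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the denominator is the number of gap-free columns.
    Other conventions (shorter sequence length, full alignment length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def above_rule_of_thumb(identity_percent: float, threshold: float = 25.0) -> bool:
    """Apply the 25% rule of thumb from rule 3 (a heuristic, not a guarantee)."""
    return identity_percent > threshold


if __name__ == "__main__":
    # Toy sequences, not real hemoglobin chains.
    identity = percent_identity("MKV-LSPAD", "MKVHLTPEE")
    print(f"{identity:.1f}% identity, above threshold: {above_rule_of_thumb(identity)}")
```

A pair that falls below the threshold can still share a fold, as rule 4 points out; the check only says when sequence alone is likely to be enough.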

This lecture deals with protein classification by structure. The different classifications have been derived empirically by examining existing protein structures as found in the PDB. Details of the classification are covered in Principles of Protein Structure Course 1997 - Tertiary structure Part II. Students are encouraged to review this material.

A globular protein consists of one or more domains - discrete parts of the protein with their own hydrophobic core and function. Domains are classified according to their tertiary structure. The different types of tertiary structure will become apparent as we work with our protein kinase test case.





 

Reading Materials

  1. Richardson JS (1981) The Anatomy and Taxonomy of Protein Structure. Advances in Protein Chemistry 34:167-339.
  2. Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) scop: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology 247:536-540.
  3. Orengo CA, Pearl FM, Bray JE, Todd AE, Martin AC, Lo Conte L, Thornton JM (1999) The CATH Database provides insights into protein structure/function relationships. Nucleic Acids Research 27(1):275-279.
  4. Holm L, Sander C (1994) Searching protein structure databases has come of age. Proteins 19:165-173.



Goals

To learn enough about the structural classification of proteins to make a basic assignment and to understand existing assignments of protein structures that you encounter.
 




Lecture Assignment

Classify the 3D structure of your "pet protein" using SCOP and CATH. Write a description of what you have found. Using RasMol, produce an image that reflects the 3D classification of your protein. If time permits, provide images on your web page of two other proteins that belong to the same basic classification.
 




Example

Analysing the serine/threonine cAMP dependent protein kinase with SCOP

  1. Open another Web browser window and perform the subsequent steps there
  2. Go to the US mirror of the SCOP Site
  3. Enter the hierarchy at the top. Notice the basic classes of protein. The number in brackets indicates the number of sub-classifications at the next level - not the number of structures.
  4. Search for "kinase"; a list of hits for superfamily, family, fold and protein will appear.
  5. Select the first entry "Family: serine/threonin"
  6. Select "Protein: cAMP-dependent PK, catalytic subunit from pig (Sus scrofa)". This reveals the lineage as follows:
    1. Root: scop
    2. Class: Alpha and beta proteins (a+b)
       Mainly antiparallel beta sheets (segregated alpha and beta regions)
    3. Fold: Protein kinase-like (PK-like)
       consists of two alpha+beta domains, C-terminal domain is mostly alpha helical
    4. Superfamily: Protein kinase-like (PK-like)
       shares functional and structural similarities with the ATP-grasp fold and PIPK
    5. Family: Serine/threonin kinases
    6. Protein: cAMP-dependent PK, catalytic subunit
    7. Species: Pig (Sus scrofa)
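
The lineage above is an ordered path from the root of the hierarchy down to a species, so it can be held as a simple list of (level, name) pairs. The sketch below stores the lineage exactly as listed and counts how many leading levels two lineages share. The second lineage and its family label are hypothetical, made up only to illustrate the comparison; they are not copied from SCOP.

```python
from typing import List, Tuple

Lineage = List[Tuple[str, str]]  # ordered (level, name) pairs, root first

# Lineage as shown by SCOP for the cAMP-dependent protein kinase catalytic subunit.
PKA_LINEAGE: Lineage = [
    ("Root", "scop"),
    ("Class", "Alpha and beta proteins (a+b)"),
    ("Fold", "Protein kinase-like (PK-like)"),
    ("Superfamily", "Protein kinase-like (PK-like)"),
    ("Family", "Serine/threonin kinases"),
    ("Protein", "cAMP-dependent PK, catalytic subunit"),
    ("Species", "Pig (Sus scrofa)"),
]

# HYPOTHETICAL lineage for comparison only: same first four levels,
# different family. The label is illustrative, not taken from SCOP.
HYPOTHETICAL_TYR_LINEAGE: Lineage = PKA_LINEAGE[:4] + [
    ("Family", "Tyrosine kinases (hypothetical label)"),
]


def shared_depth(a: Lineage, b: Lineage) -> int:
    """Number of leading levels on which the two lineages agree."""
    depth = 0
    for node_a, node_b in zip(a, b):
        if node_a != node_b:
            break
        depth += 1
    return depth


def format_lineage(lineage: Lineage) -> str:
    """Indent each level one step deeper than its parent."""
    return "\n".join(
        f"{'  ' * i}{level}: {name}" for i, (level, name) in enumerate(lineage)
    )


if __name__ == "__main__":
    print(format_lineage(PKA_LINEAGE))
    depth = shared_depth(PKA_LINEAGE, HYPOTHETICAL_TYR_LINEAGE)
    print("Deepest shared level:", PKA_LINEAGE[depth - 1][0])
```

The deeper the shared level, the closer the relationship SCOP asserts between two entries.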

Thus we have established that this is a multi-domain protein containing both alpha helix and beta sheet, and that it has a distinct fold referred to as the PK catalytic core. It is one of a family of serine/threonine kinases (the name refers to the phosphorylation site on the substrate). Clicking on any one of these classifications lists the other members at that level. Thus it is quickly revealed that the other major class of kinases is the tyrosine kinases. In terms of known structures, there are several tyrosine kinases and a variety of distinct serine/threonine kinases which are activated in various ways.

Analysing the cAMP dependent protein kinase catalytic subunit with CATH

  1. Go to CATH
  2. Go to Information on CATH and read and understand the classification scheme
  3. Enter 1ATP as the PDB code in the Search box. This specific entry was revealed by SCOP.
  4. Review the features of each domain represented
  5. Go to PDBsum for a further review
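
CATH labels each domain with a dotted number whose fields correspond to Class, Architecture, Topology and Homologous superfamily. As a rough aid for reading those numbers, the sketch below splits such a code into its levels and names the class. The function name is an assumption, the class table lists the four classes as I understand the scheme, and the code strings in the usage lines are example inputs only; take the actual assignments for 1ATP from the CATH page itself.

```python
from typing import Dict, Union

CATH_LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")

# Top-level CATH classes (assumed here to be the four-class scheme).
CATH_CLASSES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


def parse_cath_code(code: str) -> Dict[str, Union[int, str]]:
    """Split a dotted CATH number into its named levels.

    Accepts partial codes such as "3" or "3.30" as well as full
    four-field codes.
    """
    parts = code.strip().split(".")
    if not 1 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a CATH-style code: {code!r}")
    numbers = [int(p) for p in parts]
    result: Dict[str, Union[int, str]] = dict(zip(CATH_LEVELS, numbers))
    result["Class name"] = CATH_CLASSES.get(numbers[0], "unknown")
    return result


if __name__ == "__main__":
    # Example inputs only; not asserted to be the assignments for 1ATP.
    for example in ("3.30.200.20", "1.10.510.10"):
        print(example, "->", parse_cath_code(example))
```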

Thus...

The cAMP dependent protein kinase catalytic subunit consists of two domains. This is shown relative to the primary sequence (from CATH):
 



 


There is a small N terminal domain which is alpha beta and which is responsible for nucleotide binding, and a larger C terminal domain which is mainly alpha and responsible for substrate binding. This is depicted on the 3-D structure:
 



 


The alpha beta domain is shown in red and the mainly alpha domain in blue. The domains are connected by an elongated strand and small helix shown in yellow. The CATH classification does not distinguish this interdomain connecting region. A complete walk through of this structure is available.
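
For the assignment, a domain coloring like the one described can be produced by giving RasMol a short script. The sketch below writes such a script from a list of residue ranges. The residue ranges are PLACEHOLDERS, not the CATH domain boundaries for this kinase, and must be replaced with the boundaries CATH reports. The RasMol commands used (`select`, `colour`, `ribbons`, `wireframe off`, `#` comments) are as I understand the RasMol script language, so check them against the RasMol manual for your version.

```python
from typing import List, Tuple

# (label, first residue, last residue, RasMol colour name)
Region = Tuple[str, int, int, str]

# PLACEHOLDER ranges for illustration only. These are NOT the real
# domain boundaries; replace them with the ranges given by CATH.
PLACEHOLDER_REGIONS: List[Region] = [
    ("alpha beta domain", 1, 100, "red"),
    ("interdomain connection", 101, 120, "yellow"),
    ("mainly alpha domain", 121, 300, "blue"),
]


def rasmol_script(regions: List[Region]) -> str:
    """Build a RasMol script that draws ribbons and colours each region."""
    lines = ["select all", "wireframe off", "ribbons"]
    for label, first, last, colour in regions:
        if first > last:
            raise ValueError(f"bad residue range for {label}: {first}-{last}")
        lines.append(f"# {label}")
        lines.append(f"select {first}-{last}")
        lines.append(f"colour {colour}")
    lines.append("select all")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script; save it to a file and run it from within RasMol.
    print(rasmol_script(PLACEHOLDER_REGIONS), end="")
```

Keeping the ranges in one list means the same script can be regenerated for the two other proteins suggested in the assignment by changing only the data.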