Pharm 207/Bio 207
Using Internet Resources in Molecular Biology
- Lecture 8
Protein Structure Classification
Lecture Outline
- How are protein structures classified?
- What Web sites provide a classification?
- If the structure is in the PDB refer to:
- If the structure is not in the PDB (more on this next lecture)
- Perform a classification using a protein kinase
- Understand this classification
- Perform a classification of your pet protein
| TOC & Lecture Outline | Introduction | Reading Materials | Lecture Goals and Assignment | Examples |
Introduction
Protein structures can be classified in a variety of interrelated ways - functional
similarity, evolutionary similarity, and fold similarity. Here are a few rules to go by:
- Classical protein family classification is defined by function, which is established by
experimental biochemistry.
- Functional similarity and family classification can be inferred from sequence similarity.
- If sequence identity is greater than 25%, there will be some structural similarity.
For example, the A and B chains of hemoglobin share 45% sequence identity yet, apart from
small insertions, have nearly identical 3-D structures. You can see this as follows: Go to CE
in a separate browser window. Select the option Align Two Chains. Hit the Calculate
Alignment button and review the sequence alignment. Then hit the Press to Start Compare 3D
button and review the structure alignment.
- There is a Gaussian distribution of similar structures with a peak around 9% sequence
identity. That is, structure similarity can exist that is not detectable by standard
sequence comparison methods (more next lecture). Divergent evolution can go a long way and
still retain the fold.
- Structural similarity alone implies some kind of energetically favorable arrangement, but
not necessarily a functional relationship; it may instead reflect convergent evolution.
- It is very difficult to distinguish convergent from divergent evolution.
- Much work has been done to classify proteins by structure.
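The percent sequence identity quoted in the rules above can be computed from any pairwise alignment. The following is a minimal sketch, not the method used by CE or any other server: the function name `percent_identity`, the gap character, and the choice of denominator (only columns where neither sequence has a gap) are assumptions, and different tools use different denominators. The two sequence fragments are invented for illustration and are not real hemoglobin data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the denominator is the number of gap-free columns. Other
    conventions (alignment length, shorter sequence length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a.upper() == b.upper():
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Rule-of-thumb threshold taken from the text above.
STRUCTURE_THRESHOLD = 25.0

# Toy aligned fragments (invented for illustration only).
seq_a = "VLSPADKTNV-KAAW"
seq_b = "VLTPEEKSAVTALW-"

pid = percent_identity(seq_a, seq_b)
print(f"{pid:.1f}% identity; above 25% rule of thumb: {pid > STRUCTURE_THRESHOLD}")
```

A value below the threshold does not rule out a shared fold, as the bullet on the 9% peak points out; it only means that sequence comparison alone is not enough to detect it.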
This lecture deals with protein classification by structure. The different
classifications have been derived empirically by examining existing protein structures as
found in the PDB. Details of the classification are covered in Principles of Protein
Structure Course 1997 - Tertiary structure Part II. Students are encouraged to review
this material.
A globular protein consists of one or more domains - discrete parts of the protein with
their own hydrophobic core and function. Domains are classified according to their
tertiary structure. The different types of tertiary structure will become apparent as we
work with our protein kinase test case.
Reading Materials
- Richardson JS (1981) The Anatomy and Taxonomy of Protein Structure. Advances in Protein
Chemistry 34:167-339.
- Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) scop: a structural
classification of proteins database for the investigation of sequences and
structures. Journal of Molecular Biology 247:536-540 [abstract]
- Orengo CA, Pearl FM, Bray JE, Todd AE, Martin AC, Lo Conte L, Thornton JM (1999) The
CATH Database provides insights into protein structure/function relationships. Nucleic
Acids Research 27(1):275-279 [abstract]
- Holm L, Sander C. Searching protein structure databases has come of age. Proteins
19:165-173 [HTML version]
Goals
To learn enough about the structural classification of proteins to make a
basic assignment and to understand existing assignments of protein structures that you
encounter.
Lecture Assignment
Classify the 3D structure of your "pet protein" using SCOP and
CATH. Write a description of what you have found. Using RasMol, produce an image that
reflects the 3D classification of your protein. If time permits, provide images on your
web page of two others that belong to the same basic classification.
Example
Analysing the serine/threonine cAMP dependent protein kinase with SCOP
- Open a separate browser window and perform the subsequent steps there
- Go to the US mirror of the SCOP Site
- Enter the hierarchy at the top. Notice the basic classes of protein. The number in
brackets indicates the number of sub-classifications at the next level - not the number of
structures.
- Search for "kinase". A list of hits for superfamily, family, fold and protein
will appear.
- Select the first entry "Family: serine/threonin"
- Select "Protein: cAMP-dependent PK, catalytic subunit from pig (Sus scrofa)".
This reveals the lineage as follows:
- Root: scop
- Class: Alpha and beta proteins (a+b)
Mainly antiparallel beta sheets (segregated alpha and beta regions)
- Fold: Protein kinase-like (PK-like)
Consists of two alpha+beta domains; the C-terminal domain is mostly alpha helical
- Superfamily: Protein kinase-like (PK-like)
Shares functional and structural similarities with the ATP-grasp fold and PIPK
- Family: Serine/threonin kinases
- Protein: cAMP-dependent PK, catalytic subunit
- Species: Pig (Sus scrofa)
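The lineage above is an ordered path from the root of the hierarchy down to a species, so two proteins can be compared by finding the deepest level at which their paths still agree. The sketch below illustrates that idea only; the names `Level`, `shared_depth` and `deepest_shared_rank` are hypothetical, nothing here queries SCOP, and the second lineage (a tyrosine kinase family placed under the same superfamily, with a made-up protein and species entry) is an assumption for illustration.

```python
from typing import List, NamedTuple, Optional


class Level(NamedTuple):
    rank: str   # e.g. "Class", "Fold"
    name: str   # label at that rank


# Lineage copied from the SCOP listing above.
PKA_LINEAGE: List[Level] = [
    Level("Root", "scop"),
    Level("Class", "Alpha and beta proteins (a+b)"),
    Level("Fold", "Protein kinase-like (PK-like)"),
    Level("Superfamily", "Protein kinase-like (PK-like)"),
    Level("Family", "Serine/threonin kinases"),
    Level("Protein", "cAMP-dependent PK, catalytic subunit"),
    Level("Species", "Pig (Sus scrofa)"),
]

# Hypothetical second lineage, for illustration only (not taken from SCOP).
EXAMPLE_TYR_LINEAGE: List[Level] = PKA_LINEAGE[:4] + [
    Level("Family", "Tyrosine kinases"),
    Level("Protein", "example tyrosine kinase"),
    Level("Species", "example species"),
]


def shared_depth(a: List[Level], b: List[Level]) -> int:
    """Number of leading levels on which two lineages agree."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def deepest_shared_rank(a: List[Level], b: List[Level]) -> Optional[str]:
    """Rank of the most specific level shared by both lineages, if any."""
    depth = shared_depth(a, b)
    return a[depth - 1].rank if depth else None


# Print the lineage as an indented tree, one level per line.
for indent, level in enumerate(PKA_LINEAGE):
    print("  " * indent + f"{level.rank}: {level.name}")

print("Deepest shared rank:", deepest_shared_rank(PKA_LINEAGE, EXAMPLE_TYR_LINEAGE))
```

Two proteins that share a fold but not a superfamily would stop agreeing one level earlier, which is the case where convergent and divergent evolution are hardest to tell apart.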
Thus we have established that this is a multi-domain protein containing both alpha
helix and beta sheet and that it has a distinct fold referred to as the PK catalytic core.
It is one of a family of serine/threonine kinases (the name refers to the phosphorylation
site on the substrate). Clicking on any one of these classifications lists the other members
at that level. Thus it is quickly revealed that the other major class of kinases is the
tyrosine kinases. In terms of known structures, there are several tyrosine kinases and a
variety of distinct serine/threonine kinases, which are activated in various ways.
Analysing the cAMP dependent protein kinase with CATH
- Go to CATH
- Go to Information on CATH and read and understand the classification scheme
- Enter 1ATP as the PDB code in the Search box. This specific entry was revealed by SCOP.
- Review the features of each domain represented
- Go to PDBsum for a further review
Thus:
The cAMP dependent protein kinase catalytic subunit consists of two domains. This is
shown relative to the primary sequence (from CATH):
There is a small N terminal domain which is alpha beta and which is responsible
for nucleotide binding and a larger C terminal domain which is mainly alpha and
responsible for substrate binding. This is depicted on the 3-D structure:
The alpha beta domain is shown in red and the mainly alpha domain in blue. The domains
are connected by an elongated strand and small helix shown in yellow. The CATH
classification does not draw a distinction for this interdomain connecting region. A
complete walk through of this structure is available.
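A domain colouring like the one described above can be reproduced in RasMol with a short script. The sketch below generates such a script from a list of residue ranges. The function name `rasmol_domain_script` and the output file name are hypothetical, and the residue ranges are placeholders only: they are not the CATH domain boundaries for 1ATP and should be replaced with the boundaries reported by CATH.

```python
from typing import List, Tuple

# (label, (first residue, last residue), RasMol colour name)
Domain = Tuple[str, Tuple[int, int], str]


def rasmol_domain_script(domains: List[Domain], background: str = "white") -> str:
    """Build a RasMol script that draws cartoons and colours each residue range."""
    lines = [
        f"background {background}",
        "select all",
        "wireframe off",
        "cartoons",
        "color grey",  # residues outside every listed range stay grey
    ]
    for label, (start, end), colour in domains:
        lines.append(f"# {label}")
        lines.append(f"select {start}-{end}")
        lines.append(f"color {colour}")
    lines.append("select all")
    return "\n".join(lines) + "\n"


# Placeholder ranges for illustration only (NOT the real 1ATP boundaries).
example_domains: List[Domain] = [
    ("N-terminal alpha beta domain", (1, 100), "red"),
    ("interdomain connecting region", (101, 110), "yellow"),
    ("C-terminal mainly alpha domain", (111, 300), "blue"),
]

script_text = rasmol_domain_script(example_domains)
with open("domains.ras", "w") as handle:
    handle.write(script_text)
print(script_text)
```

After loading the structure in RasMol, the generated file can be run from the RasMol command line with `script domains.ras`. Later ranges override earlier ones where they overlap, so the connecting region can also be listed last if it is defined as part of one of the domains.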