Pharm 207/Bio 207 Using Internet Resources in Molecular Biology - Lecture 5
Protein Secondary Structure Prediction
Lecturer: Phil Bourne
Protein secondary structure prediction is the determination of the regions of secondary structure in a protein, at the level of alpha helix, beta sheet, and random coil, from information present in the primary protein sequence. No comparable methodology exists to distinguish subtler secondary structure, for example the difference between an alpha helix and a 3₁₀ helix. This lecture will briefly review the history of the available methods and their associated success rates. Most of the time will be spent reviewing sites where protein secondary structure can be calculated, and then assessing the success rate of these sites using your pet protein.
| TOC & Lecture Outline | Introduction | Reading Materials | Lecture Goals and Assignment | Examples |
For those needing a refresher on the secondary structure of proteins, please refer to Section 8 of the Principles of Protein Structure Course.
One way of introducing protein secondary structure prediction is to look at it chronologically. The following introduction follows the paper by Eisenhaber, Persson and Argos (1995), which is required reading for this lecture.
1974 Chou and Fasman propose a statistical method based on the propensities of amino acids to adopt particular secondary structures, derived from the observed locations of those amino acids in 15 protein structures determined by X-ray diffraction. Clearly these statistics derive from the particular stereochemical and physicochemical properties of the amino acids. See, for example, glycine and proline. These statistics have been refined over the years by a number of authors (including Chou and Fasman themselves) using larger sets of proteins. Rather than analyzing each position in isolation, the propensity at a position is calculated as an average over the 5 or 6 residues surrounding it. On a larger set of 62 proteins the base method reports a success rate of 50%.
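The two ideas in this step, a per-residue propensity and a windowed average, can be sketched in a few lines of Python. This is a simplified illustration, not the published Chou-Fasman procedure (which adds further rules for starting and extending segments). The toy sequences, their helix/coil labels, the window width of 5 and the threshold of 1.0 are all assumptions made for the example, and the function names are hypothetical.

```python
from collections import Counter

# Made-up training data: sequences with a per-residue state string
# (H = helix, C = anything else). Not taken from any real structure.
TOY_SEQS = ["AAEELLGP", "GPGAELKA"]
TOY_SS = ["HHHHHHCC", "CCCHHHHC"]

def propensities(sequences, states, state):
    """Propensity of each residue type for `state`:
    (fraction of that residue type observed in `state`)
    divided by (fraction of all residues observed in `state`)."""
    total = Counter()
    in_state = Counter()
    for seq, ss in zip(sequences, states):
        for aa, s in zip(seq, ss):
            total[aa] += 1
            if s == state:
                in_state[aa] += 1
    n_all = sum(total.values())
    n_state = sum(in_state.values())
    if n_state == 0:
        raise ValueError("state never observed in the training data")
    overall = n_state / n_all
    return {aa: (in_state[aa] / total[aa]) / overall for aa in total}

def window_average(seq, table, width=5):
    """Average the propensity over a window centred on each position.
    Windows are truncated at the sequence ends; residues missing from
    the table are treated as neutral (1.0)."""
    half = width // 2
    scores = []
    for i in range(len(seq)):
        window = seq[max(0, i - half): i + half + 1]
        scores.append(sum(table.get(aa, 1.0) for aa in window) / len(window))
    return scores

def predict_helix(seq, table, width=5, threshold=1.0):
    """Mark a position 'H' when its windowed average exceeds the threshold."""
    return "".join("H" if s > threshold else "-"
                   for s in window_average(seq, table, width))

helix_table = propensities(TOY_SEQS, TOY_SS, "H")
print(predict_helix("AAEELLGP", helix_table))
```

A propensity above 1 means the residue type is found in that state more often than average, and below 1 less often. In this toy data glycine and proline never occur in a helix, so their helix propensity is 0.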
1978 Garnier improved the method by including statistically significant pair-wise interactions between residues as a determinant of the prediction. This improved the success rate to 62%.
1993 Levin improved the prediction level by using multiple sequence alignments. The reasoning is as follows. Conserved regions in a multiple sequence alignment provide a strong evolutionary indicator of a role in the function of the protein. Those regions are also likely to have conserved structure, including secondary structure, and they strengthen the prediction through their joint propensities. This improved the success rate to 69%.
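One simple way to picture "joint propensities" is to average, column by column, the propensities of all residues aligned at that position. The sketch below assumes exactly that. It is not necessarily the procedure used in the cited work, the helix propensity values are illustrative placeholders rather than a published table, and the three-sequence alignment is made up.

```python
# Illustrative placeholder helix propensities (assumption, not published values).
HYPOTHETICAL_HELIX = {"A": 1.4, "E": 1.5, "L": 1.2, "G": 0.6, "P": 0.6}

def column_average_propensity(alignment, table, gap="-"):
    """For each alignment column, average the propensity over the
    non-gap residues. Residues missing from the table and all-gap
    columns are treated as neutral (1.0)."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col in range(length):
        residues = [row[col] for row in alignment if row[col] != gap]
        if residues:
            total = sum(table.get(aa, 1.0) for aa in residues)
            scores.append(total / len(residues))
        else:
            scores.append(1.0)
    return scores

# Made-up alignment of three related sequences.
toy_alignment = ["AEL-G",
                 "AELPG",
                 "GEL-G"]
column_scores = column_average_propensity(toy_alignment, HYPOTHETICAL_HELIX)
print(column_scores)
```

Columns where every sequence carries a helix-favoring residue keep a high score, while a column in which the residues disagree is pulled toward neutral, so conserved regions dominate the prediction.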
1994 Rost and Sander combined neural networks with multiple sequence alignments. The idea of a neural net is to create a complex network of interconnected nodes, where progress from one node to the next depends on satisfying a weighted function that has been derived by training the net with data of known results, in this case protein sequences with known secondary structures. The success rate is 72%.
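As a rough sketch of how a sequence can be fed to such a network, the code below one-hot encodes a window of residues around each position and passes it through a single weighted layer with one output per state. Everything here is an assumption made for illustration: the window width of 13, the single layer, the softmax output and the random untrained weights. It is not the architecture of the cited method, and a real network would have its weights fitted on proteins of known secondary structure and would take alignment information as input.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "HEC"  # helix, strand, coil

def encode_window(seq, center, width=13):
    """One-hot encode the residues in a window centred on `center`.
    Positions beyond the sequence ends, and unknown residue letters,
    are encoded as all zeros."""
    half = width // 2
    vec = []
    for pos in range(center - half, center + half + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(seq):
            idx = AMINO_ACIDS.find(seq[pos])
            if idx >= 0:
                one_hot[idx] = 1.0
        vec.extend(one_hot)
    return vec

def softmax(values):
    """Turn raw scores into values that are positive and sum to 1."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(vec, weights, biases):
    """Single weighted layer: one weighted sum per state, then softmax."""
    sums = [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, biases)]
    return softmax(sums)

def predict(seq, weights, biases, width=13):
    """Assign each position the state with the highest output."""
    out = []
    for i in range(len(seq)):
        probs = forward(encode_window(seq, i, width), weights, biases)
        out.append(STATES[probs.index(max(probs))])
    return "".join(out)

# Untrained random weights, so the output is meaningless; training would
# adjust these numbers until the predictions match known structures.
WIDTH = 13
rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(WIDTH * len(AMINO_ACIDS))]
           for _ in STATES]
biases = [0.0] * len(STATES)
print(predict("ACDEFGHIKLMNPQ", weights, biases, WIDTH))
```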
Better predictions are not expected without introducing additional input parameters that have a demonstrated dependency on the primary sequence. This implies including observations of longer-range interactions and combinations of other properties.
F. Eisenhaber, B. Persson and P. Argos (1995) Critical Reviews in Biochemistry and Molecular Biology 30(1), 1-94. Relevant section will be available as a handout.
P.Y. Chou and G.D. Fasman (1974) Biochemistry 13, 211-222.
J. Garnier, D.J. Osguthorpe and B. Robson (1978) J. Mol. Biol. 120, 97-120.
B. Rost and C. Sander (1994) Proteins 19, 55-72.
The lecture assignment is to use several Internet resources that offer the various methodologies introduced above, and to compare their prediction results using the sequence of your pet protein. The resources are drawn from a more complete list found in the CMS resource.
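To compare the results from several sites you need a single number per prediction. A minimal sketch follows, assuming "success rate" means the percentage of residues whose predicted state (helix, strand or coil) matches the observed one, a measure often called Q3. The state strings and method names are made up for illustration; in practice the observed string would come from the experimentally determined structure of your protein.

```python
def q3(predicted, observed):
    """Percentage of positions where the predicted state equals the
    observed state. Both arguments are strings of equal length using
    the same one-letter state codes (for example H, E, C)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty sequences cannot be scored")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)

# Hypothetical observed states and two hypothetical server outputs.
observed = "HHHHCCEEEC"
predictions = {
    "method_a": "HHHCCCEEHC",
    "method_b": "HHHHCCCCCC",
}
for name, pred in predictions.items():
    print(name, q3(pred, observed))
```

Converting every server's output to the same three-letter alphabet before scoring keeps the comparison fair, since different sites may report states with different symbols.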
Here is an example approach.