Bioinformatics I / PHARM 201
Biological Data Representation and Analysis

Lecturer: Prof. Philip E. Bourne. Office Hours: 2-3 daily, 2111 SSPPS

TA: Stan Luban

Format: Two one-hour lectures and six hours of practicum per week

Time/Location: Wed. 12:30-2:00pm; Fri. 12:30-2:00pm, 2109 Skaggs School of Pharmacy & Pharm. Sci., 2nd Floor Conf. Room

Last Update November 30, 2007

Overview

Bioinformatics is driven by the need to understand complex biological systems for which data are accumulating at exponential or near-exponential rates. Such an understanding relies on the effective representation of these data and the ability to analyze them. This is a broad topic, so we focus on macromolecular structure data, which are suitably complex, to introduce the principles of formal data representation, reductionism, comparison, classification, visualization and biological inference. As such, the course also serves as an introduction to Structural Bioinformatics.

Assessment

Weekly assignments: On Friday of each week, students will receive a question paper on that week's work, which will be due at 5pm the following Wed. in class (50%).

Final Exam: Students will be assigned one or more papers that cover a significant amount of the material covered in the course. They will be expected to critique each paper based on what they have learned and propose the next set of experiments (50%).

Course Text

P.E. Bourne and H. Weissig (Eds.). Structural Bioinformatics. Wiley, 2002.

The text is on reserve in the Biomedical Sciences Library.

Printed copies of the slides in notes form will be distributed with each lecture.

Schedule

Topic & Date

Content

Workflow Overview of Course [slides]

Lecture 1: 09/28

Know Your Data - Principles of Protein Structure [slides]

To model and analyze biological data, the data must first be understood from a biological perspective. Goal: Refresh or achieve a better understanding of primary, secondary, tertiary and quaternary protein structure. Reading: Chapters 1 and 2.

Lecture 2: 10/03

Know Your Data - Principles of DNA and RNA Structure [slides]

To model and analyze biological data, the data must first be understood from a biological perspective. Goal: Refresh or achieve a better understanding of the features of DNA and RNA structure and their interactions with proteins. Reading: Chapter 3.

 

Lecture 3: 10/05

Know the Limitations of Your Data - Experimental Methods of Structure Determination [slides]

To use biological data effectively, it is necessary to understand the limitations of the experiments used to produce those data. Structure determination is a relatively quantitative science and good statistical measures exist. Goal: Explore the quantitative and qualitative measures of data quality with respect to data from X-ray crystallography, NMR and electron microscopy. Reading: Chapters 4-6.

 

Lecture 4: 10/10

Know How Best to Represent Your Data - Data Representation [slides]

Historically, the PDB format has been the lingua franca of structural bioinformatics, but it has serious flaws. These will be explored and understood in the context of its replacement, mmCIF. Goal: To understand why good data representation is important. Reading: Chapter 8 and The Macromolecular CIF Dictionary, Methods in Enzymology (1997) 277:571-590.
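
As a small illustration of the fixed-column style of the PDB format, the following Python sketch reads a single ATOM record by column position. It is not part of the course material: the Atom type, the parse_atom_record function and the sample record are assumptions made for illustration, and the column ranges follow the published PDB format description for ATOM/HETATM records. Because every field has a fixed width, limits such as a five-digit atom serial number and a one-character chain identifier are built into the format.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    """A few fields of a PDB ATOM/HETATM record (illustrative subset)."""
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_record(line: str) -> Atom:
    """Parse one fixed-column ATOM/HETATM line (hypothetical helper)."""
    if line[0:6] not in ("ATOM  ", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Made-up sample record; spacing matters because fields are positional.
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67"
atom = parse_atom_record(SAMPLE)
print(atom)
```

Shifting any field by one character changes the parse, which is one reason a self-describing, tagged representation such as mmCIF is easier to extend.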

 

Lecture 5: 10/12

Data Quality: The Annotation and Validation Process [slides]

Public databases provide rich sources of data for all aspects of bioinformatics study. Goal: To understand the quality of these data through annotation and validation practices using PDB and SwissProt as examples. Reading: The PDB Uniformity Project, NAR (2001) 29(1):214-218, and Assigning Biochemical Information in SwissProt.

Lecture 6: 10/17

More on Data Representation – The Gene Ontology [slides]

While a slight digression from structural bioinformatics, the Gene Ontology (GO) is having a profound impact on bioinformatics research. Goal: To understand the structure and use of GO. Reading: Creating the Gene Ontology Resource: Design and Implementation Genome Research (2001) 11:1425-1433
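
To give a feel for the structure of GO, the Python sketch below treats an ontology as a directed acyclic graph in which a term may have more than one parent, and collects all ancestors of a term. The term identifiers T0 to T4 and the ancestors function are hypothetical and chosen only for illustration; they are not real GO terms or part of any GO software.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: each term maps to its direct parents.
# T4 has two parents, which is what makes this a graph rather than a tree.
PARENTS: Dict[str, List[str]] = {
    "T0": [],            # root term
    "T1": ["T0"],
    "T2": ["T0"],
    "T3": ["T1"],
    "T4": ["T2", "T3"],
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every term reachable by following parent links from `term`."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


print(sorted(ancestors("T4", PARENTS)))
```

An annotation made to a specific term is implicitly an annotation to all of its ancestors, which is why this kind of traversal underlies most uses of the ontology.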

 

Lecture 7: 10/19

Applications of GO

Study of research papers and subsequent discussion. Goal: To examine research applications of GO and what it means for biology.

 

Lecture 8: 10/31

Sequence-Structure-Function Relationships and Associated Reductionism [slides]

With so much data available it is necessary to produce non-redundant sets for many bioinformatics tasks. However, given the complex relationship between sequence, structure and function, non-redundancy means different things in each case. Goal: Understand this complex relationship and the associated meaning of reductionism. Reading: Chapter 19

Lecture 9: 11/02

Reductionism and Classification Require Detailed Comparison [slides]

3D structure comparison is a difficult problem when trying to achieve biologically meaningful results. Goal: Understand the problem and the methods used to address it, and explore the use of the distant sequence alignments arising from structure alignment. Reading: Chapter 16.

 

Lecture 10: 11/07

Alternative Forms of Representation [slides]

Most protein structure analysis is based upon the Cartesian atomic coordinates, but there are other forms of representation. We will consider a representation based on spherical harmonics and how it can be applied. Guest Lecturer: Dr. Apostol Gramada.

Lecture 11: 11/09

From Reductionism Comes Classification [slides]

To understand a complex dataset it helps to classify it in biologically meaningful ways. Goal: Study SCOP and CATH to understand the issues of classification and the value of the data once classified. Reading: Chapters 12 and 13.

Lecture 12: 11/14

Classification is Always Ambiguous [slides]

There is almost never a single answer when classifying biological data. Goal: To understand this statement by the analysis of techniques used to define protein domains from 3D structure. Reading: Chapter 18. Guest Lecturer: Dr. Stella Veretnik

 

Lecture 13: 11/16

From Reductionism Comes New Science [slides]

New Science: traditionally, protein structure has been studied by looking at evolution. More recently, evolution has been studied by looking at protein structure. Goal: Introduction to a new and exciting area.

Lecture 14: 11/21

Classification is Always Ambiguous – An Exception to the Rule? [slides]

Secondary structure assignment may be an exception to the rule. Goal: To explore the Kabsch-Sander algorithm and the impact it has had on the community. Also to explore other methods of secondary structure assignment. Reading: Chapter 17.
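
The Kabsch-Sander method assigns secondary structure from patterns of backbone hydrogen bonds, each identified with a simple electrostatic energy between a C=O group and an N-H group. The Python sketch below shows that energy term as it is commonly described: partial charges of 0.42e and 0.20e, a factor of 332 to give kcal/mol with distances in Angstroms, and a cutoff of -0.5 kcal/mol. The function names and the coordinates are illustrative assumptions, not taken from the course material.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]

Q1_Q2 = 0.42 * 0.20      # product of partial charges (units of e^2)
FACTOR = 332.0           # converts to kcal/mol for distances in Angstroms
HBOND_CUTOFF = -0.5      # kcal/mol; below this counts as a hydrogen bond


def hbond_energy(c: Point, o: Point, n: Point, h: Point) -> float:
    """Electrostatic energy between a C=O group and an N-H group."""
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1_Q2 * FACTOR * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c: Point, o: Point, n: Point, h: Point) -> bool:
    """True if the energy is below the cutoff."""
    return hbond_energy(c, o, n, h) < HBOND_CUTOFF


# Made-up, idealized linear geometry: C=O ... H-N with O...H = 1.9 Angstroms.
C, O = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0)
H, N = (3.13, 0.0, 0.0), (4.13, 0.0, 0.0)
energy = hbond_energy(C, O, N, H)
print(round(energy, 2), is_hbond(C, O, N, H))
```

Helices and sheets are then recognized as repeating patterns of such bonds, for example consecutive bonds from residue i to residue i+4 for an alpha helix.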

Lecture 15: 11/28

Protein Motion [slides]

Protein motions can be vital for biological function. Motions range from complete disorder to subtle allosteric interactions. Goal: Understand methods for characterizing and predicting protein motion.

Lecture 16: 11/30

Protein-Protein Interactions [slides]

Goal: Understand the importance of the study of protein-protein interactions at the structural level. Review, in journal-club style, a paper that predicts interaction sites.

Lecture 17: 12/05

Bioinformatics in Drug Discovery

Review how bioinformatics is and could be used in the drug discovery process. Guest Lecturer: Dr. Peter Rose.

 

Lecture 18: 12/07

Studying Protein-Ligand Interactions

One specific methodology for describing and searching for protein-ligand binding sites will be described, along with how it is being used to study side effects and repositioning of existing pharmaceuticals. Guest Lecturer: Dr. Lei Xie

Finals