Pharm 207/Bio 207 Home Page

Using Internet Resources in Molecular Biology - Lecture 4


Problem Solving with Entrez

Lecturer: Helge Weissig
Date & Time: 10/16/2001 3-5PM


Introduction

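WWW Entrez makes it possible to retrieve molecular biology data and bibliographic citations from the NCBI's integrated databases. Those used in the examples below include MEDLINE references, nucleotide sequences (GenBank) and macromolecular structures (MMDB).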

The lecture will provide a basic outline of the Entrez database system, its capabilities and some of its advanced features. Emphasis will be put on practical examples and hands-on training. Most of the lecture goals and the assignment will be worked through during the second part of the lecture.



Reading Materials

The introductory lecture on Entrez will follow the description by Schuler et al. (1996). This and additional, optional references are listed below (all publications are available at the UCSD Biomedical Library). Feel free to click on the "related articles" links to get to some more interesting publications!

  1. Schuler, GD; Epstein, JA; Ohkawa, H; Kans, JA. (1996)  Entrez: molecular biology database and retrieval system. Methods in Enzymology, 266:141-62. [Entrez]

    General description of the Entrez Database and Retrieval System. This paper is required for the course.

  2. Lowe, HJ; Barnett, GO. (1994)  Understanding and using the medical subject headings (MeSH) vocabulary to perform literature searches. JAMA, 271(14):1103-8. [Entrez]
  3. Darmoni, SJ; Thirion, B. (1995)  Understanding MeSH for literature searches [letter]. JAMA, 273(3):184; discussion 184-5. [Entrez]
  4. Fremer, E. (1995)  Understanding MeSH for literature searches [letter]. JAMA, 273(3):184; discussion 184-5. [Entrez]

    Introduction to the medical subject headings (MeSH) used for the neighbouring of MEDLINE references.

  5. Hogue, CW. (1996)  A dynamic look at structures: WWW-Entrez and the Molecular Modeling Database. Trends Biochem Sci, 21(6):226-9. [Entrez]

    Introduces the structural aspects of Entrez.

  6. Gibrat, JF; Madej, T; Bryant, SH. (1996)  Surprising similarities in structure comparison. Current Opinion in Structural Biology, 6(3):377-85. [Entrez]

    A description of different techniques used for the comparison of macromolecular structures. Relevant to the neighbouring technique used in Entrez.

  7. Benson, DA. (1994)  GenBank. Nucleic Acids Res, 22(17):3441-4. [Entrez]

    Describes the nucleotide database used by Entrez. (Have a look at the other database papers in that issue of NAR!)




Examples

Protein of choice: cAMP-dependent protein kinase

Entrez Searches and Results:

a) Primary Publication for the cAMP-dependent protein kinase structure:
Using the Structure Query part of the Entrez browser with camp dependent protein kinase as the search term and the default search field All Fields selected, 10 documents are obtained. To find the original (i.e. oldest) publication among the results, the button on the result page was used with none of the structures selected (which selects all of the structures by default). From the resulting page, it appears that 2CPK was the first cAMP-dependent kinase structure to be published (PDB deposition date: 21-Oct-92). [Coincidentally, 2CPK actually replaced 1CPK, a detail only apparent from the PDB information, since there are no references to 1CPK in MMDB. Both structures result from the same physical experiment at the same resolution, but they represent different models of the data; 1CPK is less refined than 2CPK.] Examination of the 3 MEDLINE links for 2CPK reveals that two structures were in fact published in 1991: the structure of the catalytic subunit alone (Knighton et al., 1991a) and the structure of a peptide inhibitor bound to the catalytic subunit (Knighton et al., 1991b). [Just using the neighbour option on these MEDLINE references, it is not clear which PDB structure actually corresponds to the second reference.]
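Picking the oldest structure from a result list amounts to comparing deposition dates. The following is only a minimal sketch of that step, not part of Entrez itself. It assumes dates in the DD-Mon-YY form quoted above and that two-digit years belong to the 1900s. The function names are made up for illustration, and apart from 2CPK the records are hypothetical placeholders, not real search results.

```python
from datetime import date

# English month abbreviations as they appear in dates like "21-Oct-92".
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_deposition_date(text):
    """Turn a 'DD-Mon-YY' string into a date object."""
    day, month, year = text.split("-")
    # Assumption: two-digit years are 19xx (true for the dates in this lecture).
    return date(1900 + int(year), MONTHS[month], int(day))


def oldest(records):
    """Return the (identifier, date string) pair with the earliest date."""
    return min(records, key=lambda record: parse_deposition_date(record[1]))


records = [
    ("HYP1", "05-Mar-95"),   # hypothetical placeholder entry
    ("2CPK", "21-Oct-92"),   # deposition date quoted in the text above
    ("HYP2", "14-Jan-94"),   # hypothetical placeholder entry
]

print(oldest(records)[0])
```

Note that the deposition date only orders the database entries; as the text shows, the MEDLINE links still have to be checked to find the actual publication year.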

b) Primary publication for the cAMP-dependent protein kinase cDNA/gene:
The attempt to locate nucleotide sequences for the cAMP-dependent protein kinase through the button on the first results page from above failed: no documents were found. This is a somewhat unexpected result, especially since a reference for the expression of the cDNA in E. coli was included with the search results for the structure's publication (compare the MEDLINE result from above).

However, using the nucleotide query pages of Entrez with the same query string as in a) (camp dependent protein kinase) on All Fields leads to 625 possible candidates, clearly too many to browse. Adding the search term mus musculus with the field specifier Organism narrows the result to 150 candidates. Lastly, restricting the search to complete sequences by adding complete cds as a Title Word query produces only 5 documents. [Note that browsing the first few documents of a very long result list often turns up useful leads and/or the desired documents.] Again, finding the original publication involves looking at the dates of all the publications returned by using the button. One of the two oldest references is Uhler et al. (1986), which indeed describes the isolation and sequence of the mouse cAMP-dependent protein kinase cDNA.
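The stepwise narrowing described above can be written down as a single combined query. The sketch below only assembles the query text; it does not contact Entrez. It assumes the bracketed `term[Field]` notation with AND between clauses, and it takes the field names (All Fields, Organism, Title Word) from the lecture text; the current Entrez interface may name fields differently. The function name `build_query` is made up for illustration.

```python
def build_query(clauses):
    """Join (term, field) pairs into one AND-combined query string."""
    parts = []
    for term, field in clauses:
        # Each clause restricts one search term to one field.
        parts.append(f"{term}[{field}]")
    return " AND ".join(parts)


# The three narrowing steps from the text; the counts in the comments
# are the ones reported in the lecture (625, 150 and 5 documents).
steps = [
    ("camp dependent protein kinase", "All Fields"),  # 625 candidates
    ("mus musculus", "Organism"),                     # 150 candidates
    ("complete cds", "Title Word"),                   # 5 documents
]

# Print the query as it grows with each added restriction.
for count in range(1, len(steps) + 1):
    print(build_query(steps[:count]))
```

Building the query incrementally mirrors the approach in the text: start broad, check the number of hits, then add one restriction at a time.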

