Pharm 207/Bio 207: Using Internet Resources in Molecular Biology - Lecture 4: Problem Solving with Entrez
Lecturer: Helge Weissig Date & Time: 10/16/2001 3-5PM
WWW Entrez makes it possible to retrieve molecular biology data and bibliographic citations from the NCBI's integrated databases. These include, among others, the MEDLINE, nucleotide and structure (MMDB) databases used in the examples further down.
The lecture will provide a basic outline of the Entrez database system, its capabilities and some of its advanced features. Emphasis will be put on practical examples and hands-on training. Most of the lecture's goals and the assignment will be covered during the second part of the lecture.
The introductory lecture on Entrez will follow the description by Schuler et al. (1996). This and additional, optional references are listed below (all publications are available at the UCSD Biomedical Library). Feel free to click on the "related articles" links to get to some more interesting publications!
General description of the Entrez Database and Retrieval System. This paper is required for the course.
Introduction to the medical subject headings (MeSH) used for the neighbouring of MEDLINE references.
Introduces the structural aspects of Entrez.
A description of different techniques used for the comparison of macromolecular structures. Relevant to the neighbouring technique used in Entrez. Compare also:
Describes the nucleotide database used by Entrez. (Have a look at the other database papers in that issue of NAR!)
Protein of choice: cAMP-dependent protein kinase
Entrez Searches and Results:
a) Primary Publication for the cAMP-dependent protein kinase structure:
Using the Structure Query part of the Entrez browser with camp dependent protein kinase as the search term and the default search field All Fields selected, these 10 documents are obtained. To find the original (i.e. oldest) publication among the results, the button on the result page was used with none of the structures selected (by default, this applies it to all of the structures). From the resulting page (the button above works as well!), it appears that 2CPK was the first cAMP-dependent kinase structure to be published (PDB deposition date: 21-Oct-92). [Coincidentally, 2CPK actually replaced 1CPK, a detail only apparent from the PDB information, since there are no references to 1CPK in MMDB. Both structures result from the same physical experiment at the same resolution, but they represent different models of the data; 1CPK is less refined than 2CPK.] Examination of the 3 Medline links for 2CPK reveals that two structures were actually published in 1991: the structure of the catalytic subunit alone (Knighton et al., 1991a) and the structure of a peptide inhibitor bound to the catalytic subunit (Knighton et al., 1991b). [Using only the neighbour option for these Medline references, it is not clear which PDB structure actually corresponds to the second reference.]
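The "find the oldest entry" step can also be done programmatically once the deposition dates have been collected. The following is only a minimal sketch in Python: the helper names `parse_pdb_date` and `oldest_record` are made up for this illustration, and of the three records only the 2CPK date is taken from the text; the other two are placeholders and not real PDB entries.

```python
from datetime import datetime


def parse_pdb_date(text):
    """Parse a PDB-style deposition date such as '21-Oct-92'.

    Two-digit years follow Python's %y convention (69-99 map to 1969-1999),
    which is an assumption that fits deposition dates of that era.
    """
    return datetime.strptime(text, "%d-%b-%y").date()


def oldest_record(records):
    """Return the (identifier, date string) pair with the earliest date."""
    return min(records, key=lambda rec: parse_pdb_date(rec[1]))


# Only the 2CPK date comes from the lecture text. The two PLACEHOLDER
# entries are invented so that the comparison has something to do.
records = [
    ("PLACEHOLDER_A", "05-Mar-95"),
    ("2CPK", "21-Oct-92"),
    ("PLACEHOLDER_B", "14-Jan-94"),
]

oldest_id, oldest_date = oldest_record(records)
print(oldest_id, parse_pdb_date(oldest_date).isoformat())
```

Parsing the dates before comparing them matters here: sorting the raw strings would order the entries by day of the month rather than by year.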
b) Primary publication for the cAMP-dependent protein kinase cDNA/gene:
The attempt to locate nucleotide sequences for the cAMP-dependent protein kinase through the button on the first results page from above failed: no documents were found. This is a somewhat unexpected result, especially since a reference for the expression of the cDNA in E. coli was included with the search results for the structure's publication (compare this MEDLINE result from above).
However, using the nucleotide query pages of Entrez with the first query string on All Fields leads to 625 possible candidates, clearly too many to browse. Adding the search term mus musculus with the field specifier Organism narrows the result to 150 candidates. Lastly, restricting the search to complete sequences by adding complete cds as a Title Word query leaves only 5 documents. [Note that browsing the first few of a large number of documents often turns up useful leads and/or the desired documents.] Again, finding the original publication involves looking at the dates of all the publications returned by using the button. One of the two oldest references is Uhler et al. (1986), which indeed describes the isolation and sequence of the mouse cAMP-dependent protein kinase cDNA.
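The stepwise narrowing of the nucleotide search can be written down as a single fielded query string. The sketch below is not part of the original lecture, which used the Entrez web forms: it assumes NCBI's E-utilities `esearch` endpoint, the database name `nuccore`, and the field tag `[Title]` as a stand-in for the "Title Word" field of the web interface. The helper names `build_term` and `esearch_url` are hypothetical. The code only builds the query strings and URLs and does not contact NCBI; hit counts today will differ from the 625, 150 and 5 reported above.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities esearch (not used in the lecture itself).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(clauses):
    """Join (text, field) pairs into 'text[field] AND text[field] ...'."""
    return " AND ".join(f"{text}[{field}]" for text, field in clauses)


def esearch_url(db, term, retmax=20):
    """Build (but do not fetch) an esearch URL for the given query term."""
    return f"{ESEARCH}?{urlencode({'db': db, 'term': term, 'retmax': retmax})}"


# The three refinement steps from the text. 'Title' is an assumed modern
# equivalent of the 'Title Word' field offered by the 2001 web form.
steps = [
    ("camp dependent protein kinase", "All Fields"),
    ("mus musculus", "Organism"),
    ("complete cds", "Title"),
]

# Print the query after each refinement, from broadest to narrowest.
for n in range(1, len(steps) + 1):
    term = build_term(steps[:n])
    print(term)
    print(esearch_url("nuccore", term))
```

Keeping the refinements in a list makes it easy to replay the search one restriction at a time, mirroring how the candidates were narrowed down by hand.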