Gribskov & Smith
BIMM 141 Laboratory
Spring 2001
Introduction to Bioinformatics
Exercise 5 focuses on use of Web facilities to learn more about organisms which have been extensively studied and annotated: genome sequence has been determined, organismal databases have been established, genetic disease is under study.
The focus will be on Human, since the first draft of the genome sequence was recently released and a variety of Web applications have been developed to give users access to this information. Other organisms, e.g. yeast, Drosophila, Dictyostelium, and the archaeon Methanococcus jannaschii, will also be included.
Much of this Exercise will permit you to examine the Web resources discussed in Lecture 3. The Web pages for Lecture 3 have many links that may be useful in execution of this Exercise 5.
The Main objectives in Exercise 5 are:
If you have any problems understanding what is asked or
carrying out a task, please send email with your
questions to Michael Gribskov, Doug Smith,
or Hiren Patel, TA for BIMM 140 / 141.
Relevant chapters in the Baxevanis-Ouellette textbook, 2nd Edition:
One could argue several others are relevant as well ...
{Answer the Questions at the end as you
proceed through the Exercise}
{A. TIGR - The Institute
for Genomic Research}
TIGR has determined the genome sequences of many prokaryotic organisms,
including the first three to be sequenced. TIGR has also been
a major player in other sequencing efforts, including
those of the mustard plant Arabidopsis thaliana, rice,
the fungus Aspergillus, the malaria parasite Plasmodium
falciparum, and the Human Genome.
{1. Peruse the TIGR site to see what is there and
get a feel for what TIGR does}
Examine links from the TIGR Home page. See "What's new".
Note the "Hot Links". Note the types of efforts TIGR
is involved with: Databases, Gene Indices, Software, etc.
Include a few comments about what you did in your Notebook. Answer
Exercise Questions 1-3.
{2. TIGR Microbial Database: Completed
Genomes}
Go to the TIGR Microbial Database, choose one of the organisms
whose genome is completely sequenced, and include some information
here about the organism: what it is, why it is interesting, what
kingdom it is from, who did the sequencing, size of the genome,
a few comments about the genome.
Work some with the Genome Browser from the TIGR CMR (Comprehensive
Microbial Resource).
Answer Exercise Questions 4-8.
Note: the TIGR Genome Browser is new since Lecture 3 of 6 April 2001!
{3. TIGR Microbial Database: Incomplete
Genomes}
Examine one or two of the eukaryotic genomes whose sequencing
is incomplete and for which links are provided. State in your
Notebook which organisms you examined.
{4. Other TIGR Databases}
Go to one of the other TIGR databases, for example, the TIGR Arabidopsis
thaliana database, and briefly describe in your Notebook the
types of information available. Answer Exercise Question 9.
{5. TIGR Gene Indices}
Go to the TIGR Gene Indices web pages, and check out some of the
organisms for which Gene Indices are available. Comment in your
Notebook on what you did. Answer Exercise Questions 10-11.
{6. TIGR Software Tools}
Go to the TIGR Software Tools web pages, and check out some of
the programs available. Answer Exercise Questions 12-13.
NCBI has taken a lead role in providing Web page access to all information related to Genome Sequences. The completed "Genome Sequences" include those of true organisms (eukaryotes, eubacteria, archaea), as well as viruses, organelles, and plasmids. Graphic resources are available as well as text, with links provided between NCBI sources of information and to sites elsewhere.
{1. Entrez Genomes}
Go to the NCBI Entrez Genome site and examine some of what
is there. Comment on how the three columns or tables of
the Web page are used. Answer Exercise Question 14.
{2. NCBI Microbial Genomes}
Go to the NCBI Microbial Genome Web page and find the organism
you used in A3 above. Peruse some of the information and
Web pages available. Comment on how the information and
display at NCBI compare with those at TIGR. Answer Exercise Questions
15-16.
{3. Examine the "Graphical view"
and "Coding regions" of a region of the chosen Genome}
Choose appropriate links (in the left table) to do this.
Comment on these views. Answer Exercise Questions 17-18.
{4. NCBI Specific Organism Resources}
From the NCBI Entrez Genome Web page, navigate to the
NCBI "Prominent Organisms" Web page. Look at the Home
Page for a few of these organisms, e.g. Arabidopsis, C. elegans,
D. melanogaster, and S. cerevisiae. Answer Exercise Questions 19-20.
Both NCBI and dedicated specific Web sites elsewhere have developed resources for viewing and obtaining information on the sequence and genomics of the major "model" organisms, in addition to Human.
{1. NCBI Yeast Genome Site}
From the NCBI Entrez Genome Web page, navigate to the
NCBI Yeast Genome Site and click on one of the Yeast Chromosomes.
Answer Exercise Question 21. Examine the REFSEQ link and answer
Exercise Question 22.
{2. Stanford Saccharomyces Genome Database
- SGD}
Connect from the NCBI Yeast Genome Site to SGD and examine some
of what is there. Comment on how the three columns or tables of
the Web page are used. Answer Exercise Question 23.
{3. Munich Information Center for Protein
Sequences - MIPS - for Yeast}
Link to the Yeast MIPS site in Germany; this is the other main
Yeast Web site. Examine some of what is there. Answer Exercise
Questions 24 and 25.
{4. NCBI Fly Genome Site}
Go to the NCBI Drosophila melanogaster Genome Site. Note
the "Entrez Map Viewer"; answer Exercise Questions 26
and 27.
{5. Examine Features of a Fly Chromosome
using the Map View Graphics}
Click on one of the Fly Chromosomes. Examine some of what is present
on the resulting Graphic of the Chromosome; answer Exercise Questions
28 and 29. Note the Vertical Maps displayed.
Click on "Display Settings" and try some other
combinations.
Under the "Select Region:" options, enter "9M" in the upper
box and "10M" in the lower box, then briefly describe
what you get.
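To make the region selection concrete, here is a minimal Python sketch of what entering "9M" and "10M" amounts to, assuming the suffix "M" stands for megabases (millions of base pairs). The function names and the three gene records are hypothetical and serve only as an illustration; they are not taken from the Map Viewer.

```python
def parse_position(text):
    """Convert a position such as '9M', '250K' or '12345' to base pairs."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    text = text.strip().upper()
    if text and text[-1] in multipliers:
        return int(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


def genes_in_region(genes, start, stop):
    """Return names of genes that overlap the interval [start, stop]."""
    return [name for name, g_start, g_stop in genes
            if g_start <= stop and g_stop >= start]


# Hypothetical (name, start, stop) records, made up for illustration only.
example_genes = [
    ("geneA", 8_950_000, 9_020_000),    # straddles the 9M boundary
    ("geneB", 9_400_000, 9_450_000),    # fully inside the region
    ("geneC", 10_100_000, 10_150_000),  # beyond the 10M boundary
]

start = parse_position("9M")
stop = parse_position("10M")
selected = genes_in_region(example_genes, start, stop)
print(start, stop, selected)
```

In this sketch a gene counts as selected if any part of it overlaps the region, which is one reasonable convention; the actual viewer may treat boundary cases differently.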
{6. Select a Fly Gene and Examine Information
Available}
Select a Fly gene either via a Search box or by displaying the
"Genes_seq" vertical map display in the Fly Map View
graphic, zooming in, and clicking on a Gene symbol. Answer Exercise
Question 30.
{Click on "LocusID" for your Fly Gene} This
brings you to the LocusLink page for your gene. Briefly describe
the types of information present.
{7. Examine your Fly Gene in FlyBase}
Link to FlyBase from the LocusLink page for your fly gene. Examine
and briefly describe some of the information available. Answer
Exercise Question 31.
{Find the link to "The Interactive Fly" and examine
this site}
{8. Examine your Fly Gene in GadFly
at BDGP}
Find a link for your gene to GadFly, and then go to the original
site at Berkeley (BDGP); answer Exercise Questions 32 and 33.
{Try using the "GeneSeen" Map Viewer}
What happened?
With the publication of the First Draft of the Human Genome Sequence in Science and Nature, from the Celera and Consortium efforts, respectively, only the tip of the iceberg of an incredible amount of information on the Human Genome has come into view. This genomic information is currently found at three main sites: NCBI, EBI via Ensembl, and UC Santa Cruz. Here we examine some of these information sources for a human gene of your choice.
{1. Find a Human Gene to work with ...
Gene Ontology terms}
Go to NCBI, choose the Genomic Biology link, and go to Human;
answer Exercise Question 34.
1) Now click on the LocusLink link, and then on the Gene Ontology
link under "New Features", to take you to the Home Page
for the Gene Ontology Consortium. This is a good source of "keywords"
that you can use to find a gene of interest.
Examine the "TEXT" links from the "Molecular Function",
"Biological Process", and "Cellular Component"
sections; answer Exercise Question 35.
Use these lists to find a keyword or two with which to find a Human
Gene to work with. Return to the LocusLink Home Page and do a Search
from the Search box using your keyword.
2) If you have a favorite keyword or two in mind, e.g. adrenergic,
you may of course use these.
3) Alternatively, if no words are attractive, go to the "Map
Viewer" and link to one of the chromosomes.
Find a gene of interest via descriptive words available.
LocusLink returns a list of genes satisfying your keyword criteria;
answer Exercise Question 36.
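The keyword search described above can be pictured as a case-insensitive substring match against gene descriptions. The following Python sketch shows the idea only; the record list is made up for illustration, and the way LocusLink actually indexes and matches text is not described here and is not assumed.

```python
def search_descriptions(records, keyword):
    """Return symbols whose description contains the keyword, ignoring case."""
    keyword = keyword.lower()
    return [symbol for symbol, description in records
            if keyword in description.lower()]


# Hypothetical (symbol, description) records, made up for illustration only.
records = [
    ("GENE1", "adrenergic receptor, beta 2"),
    ("GENE2", "alcohol dehydrogenase"),
    ("GENE3", "Adrenergic receptor kinase"),
]

hits = search_descriptions(records, "adrenergic")
print(hits)
```

A broad keyword returns many genes, so expect to narrow the list by reading the descriptions before choosing one gene to carry through the rest of the Exercise.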
{2. NCBI Resources available for your
Human Gene}
Click on the LocusID link to go to the LocusLink description for
your chosen gene.
Briefly describe some of the information available for your Gene.
Note the GO words used. Note the MapViewer (mv) links. Answer
Exercise Questions 37-39.
{3. GeneCard for your Human Gene}
Go to the GeneCard link for your gene. Answer Exercise Questions
40-41.
{4. Ensembl Resources available for your Human
Gene}
The Ensembl resources are those created at EBI in Europe for annotation
and information retrieval on the Human Genome Project. Examine
some of the links and information available at the Ensembl site,
and record what you did in your Notebook.
{a. Search Ensembl for your Gene using the keyword previously
used}
In the search facility on the Ensembl Main Page, select
"All" and search with the keyword you previously used.
Does Ensembl report the same genes as did NCBI LocusLink? What
information did you need to use to find the same gene as used
above? Answer Exercise Questions 42-43.
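One way to record the comparison in your Notebook is to treat the two result lists as sets and note what is shared and what is unique to each site. The Python sketch below illustrates this under the assumption that you have copied gene symbols from each result page by hand; the two lists are hypothetical examples, not actual search output.

```python
def compare_hits(first, second):
    """Summarize which items two result lists share and which are unique."""
    first, second = set(first), set(second)
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Hypothetical result lists, typed in by hand for illustration only.
locuslink_hits = ["ADRA1A", "ADRB1", "ADRB2"]
ensembl_hits = ["ADRB1", "ADRB2", "ADRB3"]

summary = compare_hits(locuslink_hits, ensembl_hits)
print(summary)
```

Symbols can differ between sites for the same gene, so a gene that appears unique to one list may simply be named differently; an accession number or chromosome location is a safer way to confirm a match.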
{b. View your Gene at Ensembl}
In the "Genome Location" part of the "Ensembl Gene
Report", click on the sequence Accession Number given, e.g.
an AP****** number. This brings up a Graphic view of your Gene
using the ContigView viewer; answer Exercise Question 44.
{5. UCSC Resources available for your Human Gene}
UC Santa Cruz is the third primary source of Human Genome Project
information at the current time, due largely to the efforts of
Jim Kent in the Haussler group in developing contig assembly and
Web viewer tools for the HGP Consortium effort. Click on the link
above to go to the UCSC Human Genome Project home page. Comment
on how this Web page compares with the main Human Genome Home
pages at NCBI and EBI.
{a. Search UCSC for your Gene using the keyword previously
used}
Search in the UCSC "Genome Browser" facility
using the keyword you previously used.
Does this "Genome Browser" report the same genes as
did NCBI LocusLink? What information did you need to use to find
the same gene as used above?
{b. View your Gene at UCSC}
Once you have located your same gene, click on the Gene ID number.
You should see the UCSC Graphic Viewer for your gene on its chromosome,
with many other objects and information; answer Exercise Question
45.
{c. Direct links between Ensembl and UCSC}
Note the direct links between displays of a given human
gene using EBI Ensembl and the UCSC Graphic Viewer; answer Exercise
Question 46.
Latest modification: 26 April, 2001