Exercise 6 - Folding of RNA or ssDNA Molecules


The objectives of this exercise are to examine nucleic acid secondary structure prediction programs:

  1. Zuker's MFOLD program is emphasized, along with the GCG programs STEMLOOP and PLOTFOLD and the graphics display programs SQUIGGLES, MOUNTAINS, CIRCLES, DOMES, and DOTPLOT.
  2. MFOLD and PLOTFOLD are used to investigate suboptimal RNA secondary structures.
  3. The optimal folding program MFOLD is compared with the simple stem-loop programs GCG STEMLOOP and GCG DOTPLOT.

Main Specific Tasks to Perform in Exercise 6:

  1. Select an RNA sequence/structure with which to work.
  2. MFOLD - multiple sub-optimal RNA structures.
    1. Learn about the GCG program MFOLD.
    2. Use PLOTFOLD to examine the squiggles, mountains, circles, domes, and dotplot output from MFOLD.
    3. Use PLOTFOLD to examine sub-optimal folds produced by MFOLD.
  3. MFOLD parameters: constraining the folding pattern.
    1. Based on your knowledge of the biology of your molecule, use some of the other parameters of MFOLD: /REMOVE /PREVENT /FORCE.
  4. Using the GCG programs STEMLOOP and DOTPLOT for RNA structures.
    1. Learn about the GCG program STEMLOOP.  Analyse your sequence using STEMLOOP and display your results using DOTPLOT.
    2. Compare these results with those from MFOLD as displayed using PLOTFOLD.
    3. Attempt to find a set of parameters for STEMLOOP which yields output in which you can see the stems predicted using MFOLD.
  5. Questions


BIMM 140: | Main | 140_Info | Syllabus | Lectures | Exams | DNASYSTEM | CMS MBR |
BIMM 141: | Main | 141_Info | Syllabus | Exercises | DNASYSTEM | CMS MBR |



Folding Servers

Zuker's RNA Mfold Server
Zuker's DNA Mfold Server
tRNAScan Server

RNA Structure Databases


tRNA gene database
rRNA SSU and LSU
RNase P Database
rRNA WWW server
Signal Recognition Particle Database
tmRDB
uRNA Database (e.g. U1 etc.)
 

Many other links can be found at

RNA World at Jena





{A. Locate an RNA sequence of interest.}

Choose a sequence that has a region of biologically significant intrastrand structure. Using one of the databases above, or another of your choice, identify a molecule with known secondary structure. Based on your knowledge of the biology of your molecule (and/or the annotation available at the database), choose an appropriate region of the sequence for intrastrand structure analysis. Because RNA folding is a CPU-intensive process, limit your analysis to a sequence no longer than 400 bases. If the sequence you are interested in is longer, simply cut out part of the molecule for analysis.
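Cutting a region out of a longer sequence can be done in any editor, but a short script makes the 400-base limit explicit. The following is a minimal Python sketch; the function name `extract_region` and the 1-based, inclusive coordinates are assumptions made for illustration and are not part of GCG or the Mfold server.

```python
MAX_BASES = 400  # length limit stated in this exercise


def extract_region(sequence, start, end, max_bases=MAX_BASES):
    """Return bases start..end (1-based, inclusive) of a sequence.

    Whitespace is removed and the sequence is upper-cased first.
    A ValueError is raised if the coordinates are invalid or the
    region is longer than max_bases.
    """
    cleaned = "".join(sequence.split()).upper()
    if not (1 <= start <= end <= len(cleaned)):
        raise ValueError("need 1 <= start <= end <= sequence length")
    region_length = end - start + 1
    if region_length > max_bases:
        raise ValueError(
            f"region is {region_length} bases; limit is {max_bases}"
        )
    return cleaned[start - 1:end]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    print(extract_region("acgu acgu", 2, 5))
```

Record the start and end coordinates you used in your notebook so that positions in the folded fragment can be mapped back to the full-length molecule.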

{Enter a description of your molecule and the known base-pairing into your notebook.}

{B. MFOLD - multiple sub-optimal RNA structures.}

{1. Learn about the GCG program MFOLD in GENHELP.} You may use either the GCG Mfold program or Zuker's Mfold server to fold your RNAs.  Note that Zuker's server is likely to be somewhat more up-to-date, and may in fact be faster.  If you use Zuker's server, you must copy the "GCG connect" file back to your local machine in order to view the mountains and domes plots, below.

{2. Use PLOTFOLD to examine the squiggles, mountains, circles, domes, and dotplot output from MFOLD.} If you ran your analysis on the Zuker server, you may use the "GCG connect" file with the programs SQUIGGLES, DOMES, CIRCLES, and MOUNTAINS to make the plots.
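To see what a mountains-style plot encodes, it can help to compute one by hand for a small structure. The sketch below assumes the structure is written in dot-bracket notation (a notation not used by the GCG programs in this exercise) and takes the height at each position to be the number of base pairs that enclose or include it. The function name `mountain_heights` is hypothetical, and this is not a reader for the "GCG connect" file.

```python
def mountain_heights(dot_bracket):
    """Return one height per position of a dot-bracket structure.

    "(" opens a pair, ")" closes a pair, "." is an unpaired base.
    The height is the number of base pairs enclosing the position,
    counting a pair that the position itself belongs to.
    """
    heights = []
    depth = 0
    for symbol in dot_bracket:
        if symbol == "(":
            depth += 1
            heights.append(depth)
        elif symbol == ")":
            if depth == 0:
                raise ValueError("unbalanced structure: extra ')'")
            heights.append(depth)
            depth -= 1
        elif symbol == ".":
            heights.append(depth)
        else:
            raise ValueError(f"unexpected symbol {symbol!r}")
    if depth != 0:
        raise ValueError("unbalanced structure: unclosed '('")
    return heights


if __name__ == "__main__":
    # A single hairpin: the stem forms the slopes, the loop the plateau.
    print(mountain_heights("((..))"))
```

In this picture a stem appears as a slope, a loop as a plateau, and separate domains as separate peaks, which is worth keeping in mind when you compare the plot types.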

{3. Use PLOTFOLD to examine at least five of the sub-optimal folds produced by MFOLD. What graphics output is most useful in this comparison?  Which one of these structures, if any, corresponds to the known structure of your RNA? }
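When comparing sub-optimal folds with the known structure, a simple numerical check is the fraction of known base pairs that each predicted fold reproduces. The following Python sketch assumes both structures have been transcribed by hand into dot-bracket notation; the function names `base_pairs` and `shared_fraction` are hypothetical and neither is part of MFOLD or PLOTFOLD.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs, 1-based, with i < j."""
    open_positions = []
    pairs = set()
    for position, symbol in enumerate(dot_bracket, start=1):
        if symbol == "(":
            open_positions.append(position)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((open_positions.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if open_positions:
        raise ValueError("unbalanced structure: unclosed '('")
    return pairs


def shared_fraction(predicted, known):
    """Fraction of the known base pairs that also occur in the prediction."""
    known_pairs = base_pairs(known)
    if not known_pairs:
        return 0.0
    return len(base_pairs(predicted) & known_pairs) / len(known_pairs)


if __name__ == "__main__":
    # Toy structures: the prediction recovers one of the two known pairs.
    print(shared_fraction("(....)", "((..))"))
```

A fold that scores highest by this measure is not necessarily the minimum free-energy fold, which is the point of examining the sub-optimal structures.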
 

{C. MFOLD parameters: constraining the folding pattern.}

{1. Based on your knowledge of the biology of your molecule (and/or the annotation in the sequence file), use some of the parameters of MFOLD: /REMOVE /PREVENT /FORCE.} These parameters constrain the predicted folding. By adding constraints that force base pairs you know should be present, or that prevent base pairs you know are absent, you can obtain the minimum free-energy structure that contains the biologically relevant features. For this exercise, use information from the known folded structure of your RNA. In a real situation, this information might come from laboratory experiments such as nuclease digestion studies.  Note that these options are available both in the GCG version and on Zuker's server.
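The idea behind forcing and preventing base pairs can be illustrated independently of MFOLD: given a candidate structure as a set of base pairs, check whether it contains every pair that must be present and none that must be absent. The Python sketch below only illustrates that check. The function name and argument format are hypothetical, and it does not reproduce the syntax or exact semantics of /REMOVE, /PREVENT, or /FORCE, which are described in GENHELP.

```python
def satisfies_constraints(pairs, forced=(), prevented=()):
    """Check a structure against known-present and known-absent base pairs.

    pairs, forced and prevented are iterables of (i, j) positions.
    The order within a pair does not matter: (5, 2) is treated as (2, 5).
    """
    pair_set = {tuple(sorted(pair)) for pair in pairs}
    has_all_forced = all(
        tuple(sorted(pair)) in pair_set for pair in forced
    )
    has_no_prevented = all(
        tuple(sorted(pair)) not in pair_set for pair in prevented
    )
    return has_all_forced and has_no_prevented


if __name__ == "__main__":
    # Toy structure with two base pairs (positions are illustrative only).
    structure = {(1, 6), (2, 5)}
    print(satisfies_constraints(structure, forced=[(2, 5)]))
    print(satisfies_constraints(structure, prevented=[(1, 6)]))
```

MFOLD applies constraints during the folding calculation rather than filtering finished structures afterwards, so this check is best used to confirm that a constrained fold contains the pairs you intended.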

{D. Using the GCG programs STEMLOOP and DOTPLOT for RNA structures.}

STEMLOOP is a very simple RNA structure finder.  In this section you will compare the results from this simple approach with the more accurate analysis you did in section B.
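To make the contrast concrete, a stem-loop search can be sketched as a scan for inverted repeats: every pair of positions is tried as the outer base pair of a stem, which is then extended inward for as long as the bases can pair. The Python sketch below is not the GCG STEMLOOP algorithm. The function name, the default minimum stem length and loop size, and the decision to allow G-U pairs are all assumptions chosen for illustration.

```python
# Base pairs accepted by this sketch (Watson-Crick plus G-U wobble).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(sequence, min_stem=4, min_loop=3):
    """Return maximal perfect stems as 1-based tuples:
    (five_prime_start, three_prime_end, stem_length, loop_size).
    """
    seq = sequence.upper().replace("T", "U")
    n = len(seq)
    stems = []
    for i in range(n):
        # Smallest j that leaves room for min_stem pairs and min_loop bases.
        for j in range(i + 2 * min_stem + min_loop - 1, n):
            # Skip stems that are only the inner part of a longer stem.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while at least min_loop bases stay unpaired.
            while (j - i - 2 * length - 1 >= min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_stem:
                loop_size = j - i - 2 * length + 1
                stems.append((i + 1, j + 1, length, loop_size))
    return stems


if __name__ == "__main__":
    # Toy hairpin: a four-pair G-C stem closing a four-base loop.
    print(find_stems("GGGGAAAACCCC"))
```

Unlike an energy-based folding program, a scan of this kind reports every stem that meets the thresholds, including stems that could not coexist in one structure, and it assigns no free energy to any of them.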

{1. Learn about the GCG STEMLOOP program using GENHELP.  Analyse your sequence using STEMLOOP and display your results using DOTPLOT.}

{2. Compare these results with those from MFOLD as displayed using PLOTFOLD.}

{3. Attempt to find a set of parameters for STEMLOOP which yields output in which you can see the stems predicted using MFOLD.}






{E. Questions.}

  1. What are the problems with the SQUIGGLES representation for RNA structures?
  2. Name a representation that corrects these problems.
  3. What is the difference between an internal loop and a bulge loop?
  4. What are the advantages of MFOLD relative to STEMLOOP?
  5. What are the disadvantages of MFOLD relative to STEMLOOP?
  6. What kinds of structures cannot be found by folding programs such as MFOLD?
  7. In general, which will destabilize a structure more: an internal loop or a bulge loop? Explain your answer and include energies.
  8. What are the main energy terms considered in computer prediction of RNA folding?
  9. List the favorable energy terms (those that stabilize structures) in RNA folding.
  10. When is a GU base pair less stable than an AU base pair?
  11. What laboratory methods might one use to test the correctness of a predicted structure?
  12. What non-laboratory (e.g., computational) methods might one use to test the correctness of a predicted structure?
  13. What is a P-num plot and why would one use it?