Pavel Pevzner
Dept of Computer Science & Engineering
UCSD
La Jolla, CA 92093
olson@scripps.edu
http://www.scripps.edu/pub/olson-web/
Abstract:
Children like puzzles and they usually assemble them by trying all
possible pairs of pieces and putting together pieces that match. Biologists
assemble genomes in a surprisingly similar way, the major difference
being that the number of pieces is larger. For the last twenty years,
fragment assembly in DNA sequencing has followed the "overlap - layout
- consensus" paradigm, which is used in all currently available
assembly tools. Trying all pairs of pieces corresponds to the overlap
step while putting the pieces together corresponds to the layout step
of the fragment assembly. Our new EULER algorithm is very different
from this natural approach - we never even try to match the pairs
of fragments together and we don't have the overlap step at all. Instead
we do a very counter-intuitive (some would say childish) thing: we
cut the existing pieces of a puzzle into even smaller pieces of regular
shape. Although it indeed looks childish and irresponsible, we do
it on purpose rather than for the fun of it. This operation leads
to the new EULER algorithm that resolves the 20-year-old "repeat
problem" in fragment assembly and provides some important advantages
over the Celera assembler.
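The idea of cutting the pieces into smaller pieces of regular shape can be illustrated with a toy sketch. It assumes the smaller pieces are fixed-length substrings (k-mers) of the reads, that each k-mer becomes an edge between its prefix and suffix, and that the genome is read off an Eulerian path through the resulting graph. The function names, the reads, and the choice k = 4 are made up for illustration; the sketch ignores sequencing errors and repeats, so it is not the EULER implementation itself.

```python
from collections import defaultdict


def kmers(read, k):
    """Cut one read into its overlapping length-k pieces."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn(reads, k):
    """Build a graph whose edges are the distinct k-mers of the reads.

    Each k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix.
    No pair of reads is ever compared directly.
    """
    graph = defaultdict(list)
    for kmer in sorted({p for r in reads for p in kmers(r, k)}):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    in_deg = defaultdict(int)
    for successors in graph.values():
        for w in successors:
            in_deg[w] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (v for v in graph if len(graph[v]) - in_deg[v] == 1),
        next(iter(graph)),
    )
    remaining = {v: list(ws) for v, ws in graph.items()}
    stack, path = [start], []
    while stack:
        v = stack[-1]
        if remaining.get(v):
            stack.append(remaining[v].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path (toy, error-free case)."""
    path = eulerian_path(de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Hypothetical error-free reads covering a short made-up sequence.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
print(assemble(reads, 4))
```

In this sketch the work grows with the number of k-mers rather than with the number of read pairs, which is the contrast with the overlap step drawn above.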
The major improvement of EULER over other algorithms is that it resolves
all repeats except long perfect repeats that are theoretically impossible
to resolve without additional PCR experiments. EULER, in contrast
to the Celera assembler, does not mask such repeats but uses them
instead as a powerful fragment assembly tool.
This is joint work with Haixu Tang and Michael Waterman at USC.
For more information, see http://www-cse.ucsd.edu/users/ppevzner/
The San Diego Supercomputer
Center (SDSC) is a research unit of the University of California,
San Diego, and the leading-edge site of the National Partnership for
Advanced Computational Infrastructure. SDSC researchers conduct studies
in computational science, develop high-performance computing and networking
technologies, and participate in NPACI activities.
SDSC -- UC San Diego, MC 0505 -- 9500 Gilman Drive -- La Jolla, CA
92093-0505 -- 858-534-5000 -- 858-534-5152 (fax)
info@sdsc.edu © 2001, The Regents of the University of California
