
Home > News Center > Publications > EnVision



Student Researchers at PACI Campuses
Make the Most of Their Summer Experiences


A few doors down the hallway from SDSC’s supercomputers, a student pored over research papers, probed online databases, and experienced the thrill of cutting-edge biological research under the guidance of a skilled mentor. Christopher Peabody, a participant in the Research Experiences for Undergraduates (REU) program, spent the summer channeling. He hunted for evidence of tiny channels on proteins that act like slippery conduits, moving molecules rapidly from one enzyme to another. Pharmaceutical companies are exploring the channels as potential targets for new drugs.

Figure 1. Double Icosahedron

Chan discovered that 78 neutral atoms could theoretically settle into this double icosahedral shape. Image courtesy of Robert Leary.

Like a detective, Peabody investigated computer databases–virtual haystacks of the results of millions of biological experiments and analyses–in search of a needle: the identity of a particular channel.

Peabody, a junior at UC Berkeley, is one of many students who participated in the REU program at SDSC and PACI partner universities in Kentucky, Maryland, Montana, Rhode Island, and Tennessee. The REU program, a national effort funded by the National Science Foundation, is designed to give undergraduates, particularly women, minorities, and people with disabilities–students underrepresented in the sciences and computing–rewarding research experiences. The REU students worked with scientists involved in computational projects ranging from Web-server software to a database on the brain.


Photos: Christopher Peabody; Kevin Chan

Scientists at UCSD and elsewhere used computer modeling to discover channels. The theoretical structures explain how enzymatic reactions can occur up to 20 times faster than if the substrate molecules simply moved from one enzyme to another by diffusion.

The homepage of the National Center for Biotechnology Information was Peabody’s doorway to one of the largest collections of biological data ever assembled. It is organized so that anybody can quickly find the amino acid sequence, three-dimensional structure, and other characteristics of thousands of proteins. Dihydrofolate reductase (DHFR) is the scientific name of the particular channeling enzyme Peabody investigated.

DHFR is important because each cell capable of making DNA, including microorganisms, plant cells, and most cells in the human body (even cancer cells), uses this enzyme. Unfortunately, scientists don’t know which amino acids out of dozens that make up DHFR form its channel.

"We don’t even know how many amino acid residues are needed to make a channel," said Peabody’s mentor, SDSC associate staff scientist Chris Smith. "Is it one, two, three, four, or more? What are the spaces between them?"

Peabody tried to find out. He compared the amino acid sequences of more than 800 DHFR enzymes in the NCBI database. He examined enzymes from bacteria to fruit flies, looking for homologous motifs, or short stretches of matching amino acids that could be candidates for channels. "I didn’t know anything about computers until I started here this summer," said Peabody, who is majoring in molecular and cellular biology and political science at Berkeley. "I’m learning a lot–that’s the whole benefit for me."
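To make the idea of a shared motif concrete, here is a minimal Python sketch that finds short stretches of amino acids appearing in every sequence of a set. It is only an illustration: the function name `shared_motifs` and the toy sequences are made up for this example, it requires exact matches, and it is not the method or data Peabody used. Real motif searches typically rely on sequence alignment and tolerate mismatches.

```python
def shared_motifs(sequences, length):
    """Return every substring of the given length found in all sequences."""
    if not sequences:
        return set()

    def kmers(seq):
        # All contiguous stretches of `length` residues in one sequence.
        return {seq[i:i + length] for i in range(len(seq) - length + 1)}

    common = kmers(sequences[0])
    for seq in sequences[1:]:
        common &= kmers(seq)  # keep only stretches seen in every sequence
    return common


# Made-up toy sequences in one-letter amino acid code (not real DHFR data).
toy_sequences = [
    "MKTLLVGAPWNDE",
    "AQLVGAPWSTRK",
    "GGLVGAPWHHM",
]
motifs = shared_motifs(toy_sequences, 5)
print(sorted(motifs))
```

Stretches that survive the intersection are only candidates; whether any of them forms part of a channel would still have to be checked against the protein's three-dimensional structure.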


While Peabody was channeling, REU student Kevin Chan, a sophomore at Harvard University majoring in mathematics, was "hopping" in another room at SDSC. Chan used a variation of a well-known mathematical technique to discover a novel arrangement of atoms missed by other scientists. He found that 78 neutral atoms could theoretically settle into the shape of a particular double icosahedron.

The structure Chan discovered (Figure 1) looks like two 55-atom icosahedrons, 20-sided objects, fused together, with a few atoms missing. "I enjoy studying concepts such as symmetry in mathematics," said Chan, "but I’m also fascinated with concrete examples of them. I find that very interesting."

Computer simulations such as the one Chan performed generate geometries that correlate well with the actual 3-D shapes of clusters found in nature. For example, molecules in supercooled liquids and simulated molecules in supercomputer modeling often form icosahedral clusters and polytetrahedral clusters–collectives of pyramid-shaped pieces.

Chan made his discovery while working under the direction of Robert Leary, an SDSC applied mathematician. Leary described Chan’s cluster in summer talks at SDSC. The results have been electronically published in The Cambridge Cluster Database, and Leary and Chan also are planning to submit the results to a scientific journal.

"Kevin’s result is really quite surprising when you consider the considerable attention given to this problem in previous computational studies," said Leary.

Leary and his colleagues use a variety of modeling techniques to examine how 10 to more than 100 neutral atoms arrange themselves into the lowest energy states possible. Their method works like an explorer scanning new countryside. An algorithm hops from one possible energy state to another, each corresponding to a different cluster geometry, looking for the deepest valley.


Such global-optimization research, conducted primarily with supercomputer simulations, may help solve a pressing problem in chemical physics–how proteins fold into their biologically active, low-energy shapes out of many trillions of possibilities. Mad cow disease involves a misfolded protein. Several other human diseases also involve aggregations of misfolded proteins, including "new variant" Creutzfeldt-Jakob disease, the human form of mad cow disease.

Chan used a mathematical expression called the Morse potential to describe the forces between all the possible pairs of atoms in atomic clusters. He accomplished the huge mathematical task with a Sun Microsystems Enterprise 10000 machine, a 32-processor system at SDSC.
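As a rough illustration of what summing a Morse potential over all pairs of atoms looks like, the Python sketch below uses a common reduced-unit form of the potential, in which a pair at the equilibrium distance contributes an energy of minus one well depth. The parameter values (`well_depth`, `r_eq`, `rho`) and the function names are assumptions chosen for this example, not the settings or code Chan used.

```python
import math
from itertools import combinations


def morse_pair_energy(r, well_depth=1.0, r_eq=1.0, rho=6.0):
    """Morse energy of one pair of atoms separated by distance r.

    Uses the form E = eps * x * (x - 2) with x = exp(rho * (1 - r / r_eq)).
    The default parameters are illustrative assumptions.
    """
    x = math.exp(rho * (1.0 - r / r_eq))
    return well_depth * x * (x - 2.0)


def cluster_energy(coords, **params):
    """Total energy of a cluster: the sum over every pair of atoms."""
    return sum(
        morse_pair_energy(math.dist(a, b), **params)
        for a, b in combinations(coords, 2)
    )


# Three atoms on an equilateral triangle with unit sides: three pairs,
# each sitting exactly at the equilibrium distance.
triangle = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.5, math.sqrt(3.0) / 2.0, 0.0),
]
triangle_energy = cluster_energy(triangle)
print(triangle_energy)
```

The number of pairs grows roughly with the square of the number of atoms, and a search must evaluate this sum for an enormous number of trial geometries, which is part of why such studies call for large machines.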

Chan started with a few atoms, then moved on to larger aggregations. His basin-hopping results agreed with those of other researchers until he reached the 78-atom cluster. Scientists had previously reported that the lowest energy state for this kind of cluster is derived from a Mackay icosahedron. To make that 78-atom structure, 23 atoms are added as a partial layer to a perfect, 55-atom, 20-sided icosahedron.

Traditional basin-hopping techniques search energy landscapes by going up and down, looking for the lowest valley–the global minimum. Leary’s technique only goes down. When it finds the lowest valley in a local energy landscape, the algorithm finds a random new starting place and searches anew for a different valley, repeating the cycle over and over.
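The downhill-only idea can be sketched on a toy problem. The Python example below searches a one-dimensional bumpy curve rather than an atomic cluster: it slides to the bottom of the nearest valley, accepts a hop only if the new valley is lower, and restarts from a random point once hops stop helping. The test function, step sizes, stall limit, and names such as `downhill_only_search` are all assumptions made for this illustration; this is not Leary's code.

```python
import math
import random


def energy(x):
    # Toy landscape: many valleys, with the deepest one at x = 0 (energy -1).
    return 0.1 * x * x - math.cos(3.0 * x)


def gradient(x):
    return 0.2 * x + 3.0 * math.sin(3.0 * x)


def local_minimum(x, step=0.05, iterations=500):
    """Slide downhill to the bottom of the current valley (gradient descent)."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


def downhill_only_search(restarts=20, hop_size=2.5, stall_limit=30, seed=0):
    """Accept only hops that lower the energy; restart when hops stop helping."""
    rng = random.Random(seed)
    best_x, best_e = None, math.inf
    for _ in range(restarts):
        # Random new starting place, relaxed into its local valley.
        x = local_minimum(rng.uniform(-10.0, 10.0))
        e = energy(x)
        failures = 0
        while failures < stall_limit:
            trial = local_minimum(x + rng.uniform(-hop_size, hop_size))
            trial_e = energy(trial)
            if trial_e < e - 1e-12:  # strictly lower valley: move there
                x, e, failures = trial, trial_e, 0
            else:  # never step uphill
                failures += 1
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = downhill_only_search()
print(best_x, best_e)
```

Because uphill moves are never accepted, each run can only get stuck or improve; the random restarts are what give the search a chance to reach valleys that a single run would miss.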

"This new double-icosahedral structure has a lower potential energy than the other structure," said Chan. This means that his structure, not the single icosahedral structure previously reported, may represent the actual 3-D structure of 78-atom clusters in nature.


Preserving the mushrooming data sets of the information age is a central focus of research in SDSC’s Data-Intensive Computing Environments (DICE) group. REU student Michelle Schumaker worked on software that addresses this problem under the supervision of SDSC researchers Bertram Ludäscher and Richard Marciano.

Schumaker helped develop the prototype of a data-packaging and archival toolkit that can bundle a collection of files or digital objects, and attach descriptions called metadata. The toolkit enables a person putting information into the collection to add metadata for each object. The toolkit will eventually work with plug-ins to automatically extract metadata, and to wrap, or translate, the file into the versatile exchange format, the Extensible Markup Language (XML). "This facilitates later access and migration to new technologies," said Schumaker. In this prototype, the metadata can be anything from free text to classic metadata or knowledge representations involving logic rules that state properties of the digital objects.
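A bundle of this kind can be pictured with a short Python sketch that wraps a few objects and their metadata in a single XML document. The element names (`package`, `object`, `metadata`, `field`, `content`), the function `build_package`, and the sample records are hypothetical, chosen only for illustration; the article does not describe the prototype's actual format.

```python
import xml.etree.ElementTree as ET


def build_package(collection_name, objects):
    """Wrap objects and their metadata in one XML document (returned as text)."""
    root = ET.Element("package", {"collection": collection_name})
    for obj in objects:
        node = ET.SubElement(root, "object", {"name": obj["name"]})
        meta = ET.SubElement(node, "metadata")
        for key, value in obj["metadata"].items():
            # One element per metadata entry, e.g. free text or a date.
            ET.SubElement(meta, "field", {"name": key}).text = value
        ET.SubElement(node, "content").text = obj["content"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample records, invented for this example.
sample_objects = [
    {
        "name": "report.txt",
        "metadata": {"description": "Summer project notes", "created": "2001-08-01"},
        "content": "Notes on the archival prototype.",
    },
    {
        "name": "readme.txt",
        "metadata": {"description": "How to read this collection"},
        "content": "Start with report.txt.",
    },
]
package_xml = build_package("reu-demo", sample_objects)
print(package_xml)
```

Keeping each description next to the object it describes means the package can be read back with any XML parser, without the software that originally produced it.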

The ability to access and move records as hardware and software evolve requires archival formats that are as infrastructure-independent and as self-contained as possible. "We need to store not only the digital objects in a self-describing format like XML, but also to preserve metadata, that is, information about the context of the data and its processing history, as well as knowledge about how to interpret and navigate the data," said Ludäscher.

"We met often and evaluated what we were doing, so it’s constantly evolving. I really liked the fast-moving nature of this research," said Schumaker. "It’s a stimulating environment. I learned a great deal every day from conversations with the researchers. It’s very helpful for my career choices, since I learned first-hand what research is like."


Chris Harper, a 2001 computer science graduate of San Diego State University (SDSU), said his REU experience was invaluable. "I probably learned more as an REU student than I did from my college classes," he said. "The experience was irreplaceable."

At the NPACI Education Center on Computational Science and Engineering (EdCenter) at SDSU, Harper worked on a Web-based project, collecting interactive learning materials, assignments, reviews, and the names of people with an interest in computational science. Harper used Java, HTML, and SQL to update the Computational Science Resources Community website. The EdCenter also learned from Harper and REU student Lindsay Stocks. "It has provided fresh perspectives on our projects," said Kirsten Barber, an EdCenter computer applications specialist.

Once Harper became involved, he realized the positive impact that his work could have on his career. "I told other students about it and they said things like, ‘You have such a cool internship.’ "

Kris Stewart, founder and director of the EdCenter, said the REU program gives her program the resources to try new things.

"The REU experience is an excellent opportunity for our students to explore interesting areas that are new to them, and to interact closely with leading professionals in the field in a collaborative manner whenever possible," said EdCenter staff scientist Jeff Sale. "This goes well beyond what they might ordinarily get exposed to in their other undergraduate coursework." –CF, RG, PT


REU Principal Investigator
Ann Redelfs

Greg Moses
University of Wisconsin-Madison

Participating NPACI Investigators
Kim Baldridge
Chaitan Baru
Kimberly Claffy
Tony Fountain
Greg Johnson
Tim Kaiser
Robert Leary
Bertram Ludäscher
Richard Marciano
Chris Smith
Mary Thomas

Jack Dongarra
University of Tennessee

Raquell Holmes
Brown University

Gwen Jacobs
Montana State University

Matt Kirschenbaum
University of Kentucky

Maryann Martone
Barbara Sawrey

J. Tilak Ratnanather
Johns Hopkins University

Kris Stewart
San Diego State University

REU Students
Cathy Agostino
Matt Browne
Athena Chuang
Ben Donahue
Desiree El-Chebeir
Christine Gould
Brandon Leong
Jennifer Mann
Ray Regno
Michelle Schumaker
Peter Shin
Ray Smith
Elena Toulson

Cynthia Browne
University of Tennessee

Kevin Chan
Harvard University

Stephanie Creech
Johns Hopkins University

Chris Harper
Tuan Le
Annalisa Ruskievicz
Lindsay Stocks
Nicole Wolter
San Diego State University

Christopher Peabody
UC Berkeley

Bonnie Kirkpatrick
Montana State University

Laura Reynolds
University of Kentucky

Dameon Shaw
Brown University