Seeing the folding patterns and detailed interactions between atoms reveals much about how and why a protein acts as it does, and knowing a molecule's structure is critical for designing drugs that bind to the protein and change its activity. With an accurate picture of the 3-D structure of proteins, DNA, and other molecules, biologists can better understand how these molecules work.
Most technologies that support the exploration of biological structure are limited in various ways. For example, 3-D structure is often shown on a 2-D screen or at best a stereoscopic display, and collaborating researchers are often required to crowd around a single computer display. To explore how new visualization techniques can benefit molecular modeling, improve methods for building molecular structures, and let remote researchers work together on the same 3-D model, researchers from NPACI's Molecular Science and Interaction Environments thrust areas are working together on the Molecular Interactive Collaborative Environment (MICE).
Myohemerythrin (2mhr in the Protein Data Bank) is an oxygen-carrying protein from a sipunculan worm. Its four-helix, up-and-down bundle motif is common to a number of functionally unrelated proteins, which suggests that the arrangement is energetically favorable.
For molecular scientists, the MICE project, funded by a $450,000 award from the National Science Foundation, has highlighted gaps in the ability of researchers to take advantage of the wide array of information on biological structure. "Biologists need improved methods for constructing and defining 3-D structures from text-based databases, invoking external applications to manipulate data, and accessing databases across the Internet," said Phil Bourne, SDSC senior staff scientist and principal investigator for the MICE project.
MICE defines several components that are designed to address these gaps and build on other SDSC efforts in biological data representation and query. In MICE, biologists view a molecular scene, a representation of available information on not only the structure, but also the biological function of a molecule. To capture this information, the MICE team is developing a Molecular Scene Description Language (MSDL) that extends the well-established International Union of Crystallography STAR/CIF format.
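As a rough illustration of what a text-based scene description in the STAR/CIF style might look like, the sketch below parses a single data block of tag/value pairs. It is not the MSDL specification: the block name, the tag names (including the `_msdl_scene.*` tags), and the `parse_simple_star` helper are all hypothetical, and the parser borrows shell-style quoting from Python's `shlex` module, which only approximates real CIF quoting rules.

```python
import shlex

# Hypothetical scene file. The data_ header and _tag/value layout follow
# STAR/CIF conventions; every tag name here is invented for illustration.
SCENE_TEXT = """
data_2MHR_scene
_entry.id                2MHR
_msdl_scene.title        'Myohemerythrin four-helix bundle'
_msdl_scene.function     'oxygen carrier'
# comment lines start with a hash mark
"""


def parse_simple_star(text):
    """Parse one data block of single-line tag/value pairs (no loop_ support)."""
    block_name = None
    items = {}
    for line in text.splitlines():
        # shlex handles the quoted values and drops '#' comments.
        tokens = shlex.split(line, comments=True)
        if not tokens:
            continue
        if tokens[0].startswith("data_"):
            block_name = tokens[0][len("data_"):]
        elif tokens[0].startswith("_") and len(tokens) == 2:
            items[tokens[0]] = tokens[1]
    return block_name, items


name, items = parse_simple_star(SCENE_TEXT)
print(name)
for tag, value in items.items():
    print(tag, "=", value)
```

Keeping structure and function annotations in one tagged block is what would let a gallery be queried on features of the scene rather than on free text alone.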
To collect MSDL scenes, MICE will also provide a Molecular Scene Gallery that can be queried with respect to features of the scene, not just text-based descriptions of the molecule. The scenes stored in the gallery have high semantic content--they have been annotated by experts in the field to express as completely as possible the structure-function relationships in a specific biological system. In this sense, the images represent a visual interface to the database of knowledge.
With a mouse click on the molecular scene, a biologist can invoke behaviors that include initiating compute- and memory-intensive applications on NPACI's supercomputers and retrieving associated data from one of SDSC's biological databases. For example, the MEME (Multiple EM for Motif Elicitation) application from UC San Diego and SDSC searches for common subsequences that match a protein sequence associated with the molecular scene. MEME is an example of a transparent supercomputing application, which provides access to the highest-performance computers while hiding the details of their operation.
Likewise, the databases available at SDSC for interacting with MICE include MOOSE for finding property patterns in protein structures, PDBObs for exploring the temporal characteristics of biological structure data, the Protein Kinase Resource for all information specific to the protein kinases, and a mirror of the Nucleic Acid Database for information on DNA and RNA.
"All this structural information has been gathered through long and often tedious experiments," Bourne said. "Because of the time already invested, it's important to the biological community that this data be put to widespread use."
In MICE, the needs of an NPACI application thrust area--in this case, Molecular Science--are driving the development of tools in the Interaction Environments thrust area that support collaboration over the Internet. By harnessing technologies at the cutting edge of networking and virtual reality, MICE represents a new vision of how scientists, from biology and many other disciplines, will take advantage of the Internet to conduct research.
Originally a prototype project for the Supercomputing '96 conference, MICE spans the computational science process--from computing data to visualizing and analyzing results to producing output. SDSC senior programmer/analysts Greg Johnson and John Moreland are leading the efforts to create the interactive collaborative environment (ICE) upon which MICE is layered. The ICE infrastructure opens up new research opportunities: viewing and sharing data, invoking external applications, accessing data from distributed databases, and using novel visualization techniques, including solid modeling and immersive exploration.
A typical user starts with a Web browser on a desktop computer, linking to a server running the collaborative components of MICE. On screen, the user can use the mouse to rotate, zoom, and move a 3-D representation of a molecule described in the Virtual Reality Modeling Language (VRML). Clicking on certain parts of the model causes annotations, such as sequence information, to appear in a separate browser window.
"The real advantages occur when more users link to the same molecular scene," Johnson said. "In principle, any Internet user worldwide with a VRML browser can join the collaboration. The server tracks who has connected to that scene." When multiple users are viewing the same scene, one user can take control--moving, rotating, or zooming the molecule--and the view in all of the collaborators' screens is updated by commands passed through the server. A user can also place a pointer to indicate a particular feature to other users and then relinquish control to another user.
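The control-passing scheme described above can be sketched as a small piece of server-side state. This is a minimal sketch under stated assumptions, not the MICE implementation: the `SharedScene` class, its method names, the message tuples, and the `outbox` list standing in for network delivery are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedScene:
    """Hypothetical server-side state for one shared molecular scene."""
    users: set = field(default_factory=set)        # who has connected
    controller: Optional[str] = None               # user currently in control
    view: dict = field(default_factory=lambda: {"rotate": 0.0, "zoom": 1.0})
    pointer: Optional[str] = None                  # feature being pointed at
    outbox: list = field(default_factory=list)     # (recipient, message) pairs

    def join(self, user):
        self.users.add(user)

    def take_control(self, user):
        # Control goes only to a connected user, and only when it is free.
        if user in self.users and self.controller is None:
            self.controller = user
            return True
        return False

    def release_control(self, user):
        if self.controller == user:
            self.controller = None

    def _broadcast(self, sender, message):
        # Queue the message for every collaborator except the sender.
        for other in sorted(self.users - {sender}):
            self.outbox.append((other, message))

    def update_view(self, user, **changes):
        if user != self.controller:
            return False  # only the controlling user may change the view
        self.view.update(changes)
        self._broadcast(user, ("view", dict(self.view)))
        return True

    def place_pointer(self, user, feature):
        if user != self.controller:
            return False
        self.pointer = feature
        self._broadcast(user, ("pointer", feature))
        return True


scene = SharedScene()
for name in ("alice", "bob", "carol"):
    scene.join(name)
scene.take_control("alice")
scene.update_view("alice", rotate=90.0)
scene.place_pointer("alice", "helix A")
scene.release_control("alice")
scene.take_control("bob")
```

Routing every change through one server-held controller keeps all collaborators' views consistent, at the cost of allowing only one person to manipulate the molecule at a time.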
Related research at SDSC is already demonstrating the potential of VRML in disciplines ranging from engineering to psychotherapy. By extending this work to the ICE environment, engineers might jointly explore the results of finite-element model simulations, oceanographers might virtually visit the sea floor, and architects might collaborate on building designs.
While collaborating with the VRML worlds does not require high network bandwidth, the full spectrum of MICE components, particularly transparent supercomputing and database discovery, does require significant bandwidth as the results of computations and database searches are passed between computers, archival storage, and the virtual worlds. Incorporating audio and video for collaboration will consume even greater amounts of bandwidth.
Because researchers and educators can collaborate in real time, they avoid the considerable time and expense of traveling to a common location, which can significantly increase productivity for all participants. The combination of individual and collaborative exploration with a graphical interface also suggests that MICE could be a powerful classroom tool.
As part of the NSF grant, MICE will be evaluated as a distance learning tool in a high school classroom. Steve Wavra, a teacher at Southwest High School in San Diego, has been training juniors and seniors in the basics of protein structure. These students will have the opportunity to interact with an expert at UC San Diego using the MICE system. The success of this prototype experiment will help in the project's design and future development.
"Evaluation is the key," Bourne said. "What encourages people to collaborate? It is part personality, part common interest. A virtual collaboration must not be hindered by technological impediments. With MICE, we have a common interest in the shared molecular scene, which is compelling and offers a rich environment in which to collaborate."--DH