Toward Computational Cell Biology 
PROJECT LEADERS
Nathan A. Baker and Michael J. Holst, UC San Diego
PARTICIPANTS
Burak Aksoylu 
Feng Wang
UCLA and UC San Diego 
Computation in molecular biology is changing as computers grow in power. Molecular simulation programs that elucidate the structures and interactions of individual protein molecules must now be extended to simulate the more varied biology of molecular complexes within the cell that participate in processes like metabolism and signal transduction. Work on such intracellular processes has become more important as the overall pace of biological discovery accelerates along with the need for new drugs to combat disease. Many scientists are developing new methods for this oncoming era of computational cell biology. At UC San Diego, graduate student Nathan Baker and math professor Michael Holst have been using NPACI resources at SDSC to explore the potential of new algorithms that aid in the realistic simulation of protein activity in complex subcellular structures, including such tiny structures as the microtubules to which an anticancer drug like taxol must bind to do its work in treating breast and ovarian cancer.
SOLVING MOLECULAR PUZZLES

Figure 1. Actin Filament Electrostatic Potential
The solution for the negative (red) and positive (blue) electrostatic potential of a 30-mer actin filament (top) and a 10-mer filament (bottom), obtained by David Sept of UC San Diego's Department of Chemistry and Biochemistry using a structure supplied in K. Holmes, D. Popp, W. Gebhard, and W. Kabsch (1990): Atomic model of the actin filament. Nature 347, 44-49.

The algorithms are an adaptive finite-element computer program called the Manifold Code, developed by Holst, an associate professor in the Scientific Computation Group of the Department of Mathematics, and an Adaptive Poisson-Boltzmann Solver (APBS), developed by Baker. The codes permit the calculation of the electric potential of arbitrarily complex structures. Baker and colleagues in the group of J. Andrew McCammon, Department of Chemistry and Biochemistry, collaborated with Holst and other members of the Scientific Computation Group to adapt the Manifold Code via APBS for biological problems.
Intermolecular interaction is strongly constrained by the electrostatics: the way in which the landscape of electrical charge is laid out in the molecular environment. Knowledge about the electric potential is needed across the full domain of interest, which may contain one or many molecules, usually in solvent. "To find the potential around a typical charged biological structure immersed in a salt solution, computational biochemists must solve a nonlinear second-order partial differential equation called the Poisson-Boltzmann equation," Baker said. "This equation has been a formidable challenge, and many techniques for approximating the solutions have been developed over the past decade, with widely varying degrees of accuracy and efficiency."
The equation (PBE for short) is difficult to solve both speedily and accurately because of several numerical problems it presents, including sharp, nonlinear discontinuities of its parameters at interfaces between charged structure and solvent. APBS was written by Baker to allow the Manifold Code to cope with the specifics of the PBE. "Work on this problem constitutes my dissertation," Baker said, "but the method is so promising and has such wide application that I've been joined by postdoctoral researcher David Sept to extend the research further."
In addition to Holst, the mathematicians include professor Randolph E. Bank, UCLA postdoctoral fellow Feng Wang, and UC San Diego math graduate student Burak Aksoylu. Baker is supported by a predoctoral fellowship from the Howard Hughes Medical Institute (HHMI), and he and Aksoylu are members of the Burroughs Wellcome Fund La Jolla Interfaces in Science (LJIS) Training Program.
The melding of APBS and the Manifold Code into a powerful engine of multimolecular simulation took nearly a year. "It was a beautiful example of the value of interdisciplinary programs like LJIS," Holst said. "LJIS support let us bat the problems back and forth between biomathematicians and mathematical biologists until they were solved."
With Wang, Baker and Holst first used the APBS-Manifold combination to examine four biomolecules having distinct electrostatic properties: a neutral HIV integrase dimer, mouse acetylcholinesterase (which has an overall negative charge), fasciculin-2 (a component of snake venom with a net positive charge), and a 36-unit polymer of DNA. In each case, Baker said, "the codes provided a rapid and extraordinarily accurate approximation technique." In fact, the solutions were obtained sequentially on a 500-MHz Intel Pentium III computer running Linux.
Two papers describing the application of the codes to these benchmark biomolecular problems will appear in the Journal of Computational Chemistry. "The properties of the codes have now enabled us to take the next step in the scale of biomolecular systems we can investigate," Baker said.
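For readers who want to see the equation Baker describes, one common dimensionless form of the nonlinear Poisson-Boltzmann equation is given here. The article itself does not write the equation out, so the notation is an assumption borrowed from standard treatments: u is the dimensionless potential, epsilon the position-dependent dielectric coefficient, kappa-bar the modified Debye-Hückel parameter (zero inside the molecule), and z_i and x_i the charges and positions of the N atoms.

```latex
% Nonlinear Poisson-Boltzmann equation (standard dimensionless form; notation assumed)
-\nabla \cdot \bigl( \epsilon(x) \, \nabla u(x) \bigr)
  + \bar{\kappa}^{2}(x) \, \sinh\bigl( u(x) \bigr)
  = \frac{4 \pi e_c^{2}}{k_B T} \sum_{i=1}^{N} z_i \, \delta(x - x_i),
\qquad u(x) = \frac{e_c \, \phi(x)}{k_B T}
```

The jumps in epsilon and kappa-bar at the molecular surface, the sinh nonlinearity, and the point-charge delta functions on the right-hand side are the features that make the equation hard to solve both quickly and accurately.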
REFERENCES
N. Baker, D. Sept, M. Holst, and J.A. McCammon (2000): The adaptive multilevel finite element solution of the Poisson-Boltzmann equation on massively parallel computers, IBM Systems Journal (submitted).
R. Bank and M. Holst (2000): A new paradigm for parallel adaptive meshing algorithms, SIAM J. Sci. Computing (in press).
M. Holst, N. Baker, and F. Wang (2000): Adaptive multilevel finite element solution of the Poisson-Boltzmann equation I: Algorithms and examples, J. Comput. Chem. (in press).
M. Holst, N. Baker, and F. Wang (2000): Adaptive multilevel finite element solution of the Poisson-Boltzmann equation II: Refinement at solvent accessible surfaces in biomolecular systems, J. Comput. Chem. (in press).
THE MANIFOLD CODE AND THE MULTIMOLECULAR
The Manifold Code, developed by Holst over several years at Caltech and UC San Diego (the latter under an NSF CAREER grant), is an adaptive, multilevel finite-element method, applicable to a general class of problems that include the PBE. It is similar in spirit to a code developed earlier by Randolph Bank that solves problems in two dimensions, but Holst's code can also handle 3D problems. The finite elements used are the simplest (and are called, collectively, "simplices"): triangles in 2D problems and tetrahedra in 3D problems. It is here that Baker's APBS code is essential for setting up the problem, Holst notes.
The code first solves a small version of the problem on a coarse mesh of simplices. A procedure to estimate the errors in the solution is then used to partition the problem into subdomains whose error is about the same. The subdomains are then further divided into new meshes of tetrahedra. Solutions are calculated on the new meshes, the errors are reestimated, and, if necessary, the problem is further parceled out. The iterative procedure ensures, as Baker puts it, "that the calculation takes place where the action is: where there are large gradients or discontinuities in the potential, where proteins are folded or overlaid on one another." Thus the adaptive part of both codes enables them to calculate on as refined a mesh as needed, just where it is needed and nowhere else.
Each iteration produces a system of nonlinear finite-element equations to be solved, which is done using a variant of the well-known Newton method (in this case, a global inexact-Newton algorithm), via an algebraic multilevel algorithm. When the solutions have reached a desired degree of accuracy, the subdomains are knit back together in an overall solution. A parallel version of the multilevel adaptive finite-element method was recently developed by Holst and Bank.
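The cycle of solving, estimating errors, and refining described above can be illustrated on a far smaller problem. The Python sketch below is not the Manifold Code or APBS. It assumes a one-dimensional model equation, -u'' = f on [0, 1] with zero boundary values, a hypothetical sharply peaked source f, linear elements (in 1D the "simplices" are intervals), and a simple residual-based error indicator with a bisection rule, all chosen only for illustration.

```python
import numpy as np


def source(x):
    # Hypothetical sharply peaked source, standing in for a localized charge.
    return np.exp(-((x - 0.5) / 0.02) ** 2)


def solve_p1(nodes):
    """Linear finite-element solve of -u'' = source with u(0) = u(1) = 0."""
    h = np.diff(nodes)
    n = len(nodes)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for k in range(n - 1):
        # Stiffness matrix of one linear element of length h[k].
        A[k:k + 2, k:k + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h[k]
        # Load vector from the midpoint quadrature rule.
        mid = 0.5 * (nodes[k] + nodes[k + 1])
        b[k:k + 2] += 0.5 * h[k] * source(mid)
    u = np.zeros(n)
    # Impose the zero boundary values by solving for interior nodes only.
    u[1:-1] = np.linalg.solve(A[1:-1, 1:-1], b[1:-1])
    return u


def error_indicators(nodes):
    """Per-element indicator h_K * ||f||_L2(K), using the midpoint rule.

    For linear elements u_h'' vanishes inside each element, so the element
    residual of this model problem reduces to the source term alone.
    """
    h = np.diff(nodes)
    mid = 0.5 * (nodes[:-1] + nodes[1:])
    return h * np.sqrt(h) * np.abs(source(mid))


def refine(nodes, eta, fraction=0.5):
    """Bisect every element whose indicator exceeds fraction * max(eta)."""
    mid = 0.5 * (nodes[:-1] + nodes[1:])
    marked = eta > fraction * eta.max()
    return np.sort(np.concatenate([nodes, mid[marked]]))


def adaptive_solve(n_coarse=9, sweeps=8):
    """Solve on a coarse mesh, then repeat: estimate, refine, solve again."""
    nodes = np.linspace(0.0, 1.0, n_coarse)
    history = []
    for _ in range(sweeps):
        u = solve_p1(nodes)
        history.append((len(nodes), float(u.max())))
        nodes = refine(nodes, error_indicators(nodes))
    return nodes, solve_p1(nodes), history


if __name__ == "__main__":
    nodes, u, history = adaptive_solve()
    spacing = np.diff(nodes)
    print("nodes:", len(nodes))
    print("smallest and largest interval:", spacing.min(), spacing.max())
    for count, peak in history:
        print(count, peak)
```

In this toy setting the intervals far from the peak at x = 0.5 keep their coarse size while those near it are bisected repeatedly, which is the one-dimensional analogue of refining the mesh "just where it is needed and nowhere else."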
"We have immediately gone on to construct larger problems with APBS that can make use of parallel machines like NPACI's Blue Horizon at SDSC," Baker said. "With the Linux box able to handle structures containing 200,000 atoms, we wanted to tackle much bigger problems using the parallel version." 
mccammon.ucsd.edu

Figure 2. Microtubule Electrostatic Potential
The protein backbone of a 15-protofilament microtubule (top), the structure of which is based on coordinates from Eva Nogales of UC Berkeley and HHMI and Ken Downing of Lawrence Berkeley National Laboratory. Four views (bottom) of the negative (red) and positive (blue) electrostatic potential contours obtained by solving the Poisson-Boltzmann equation for the same microtubule.
ACTIN AND MICROTUBULES
Much bigger problems presented themselves in the form of actin and tubulin. Actin monomers, the most abundant protein in the cell, and tubulin dimers, which constitute the microtubules that transport proteins throughout a cell (among many other functions), are both highly conserved globular proteins. "Since they are such important components of all eukaryotic cells, we wanted to take a computational look at them," Baker said.
Actin filaments, together with associated proteins, are important in controlling cell shape, motility, and transport. They were first found in muscle tissue and, together with myosin and other proteins, supply the mechanism of muscle contraction. Baker and Sept used APBS and the Manifold Code to investigate the electrostatic properties of an actin filament (Figure 1).
By contrast with actin filaments, microtubules appear to be constituted to resist compressive forces, and they perform dynamic intracellular functions. Microtubules, for example, pull at the spindles formed during cell division (mitosis). "A drug like taxol operates by binding to microtubules and disrupting the division of cancer cells," Sept said. "A more complete understanding of these structures should be helpful in structure-based drug discovery."
Sept and Baker used Blue Horizon to calculate the electrostatic potential of a 15-protofilament microtubule (Figure 2). Some of these calculations involved nearly a million atoms, and runs on up to 32 processors permitted the use of a mesh refined into more than 6 million simplices. "The calculational box was 90 nanometers on a side," Baker said, "and the smallest simplex had a longest edge of 0.088 nanometers." "These are very exciting calculations," McCammon said. "We are now analyzing them with a view to pushing the envelope further and tackling a number of large-scale problems in intracellular activity."
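The figures Baker quotes give a rough sense of what adaptivity saves. The short Python calculation below is a back-of-the-envelope sketch, not a result from the project. It assumes a hypothetical uniform grid of cubic cells whose edge equals the 0.088-nanometer edge quoted for the smallest simplex, and it takes the adaptive mesh to hold exactly 6 million simplices (the article says "more than"). Cubes and tetrahedra are not the same kind of cell, so the ratio is only indicative.

```python
# Figures quoted in the article.
box_nm = 90.0              # edge of the cubic computational box
smallest_edge_nm = 0.088   # longest edge of the smallest simplex
adaptive_simplices = 6_000_000  # "more than 6 million"; lower bound assumed

# Hypothetical uniform grid resolving the whole box at the finest scale.
cells_per_side = box_nm / smallest_edge_nm
uniform_cells = cells_per_side ** 3

# How many times more cells the uniform grid would need.
ratio = uniform_cells / adaptive_simplices

print(f"cells per side: {cells_per_side:.0f}")
print(f"uniform cells:  {uniform_cells:.2e}")
print(f"ratio:          {ratio:.0f}")
```

Under these assumptions a uniform grid would need on the order of a billion cells, more than a hundred times the size of the adaptive mesh.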
ASSESSING THE POTENTIAL
"A faster PBE solver that can maintain solution accuracy has certainly been needed," Baker said, "especially as we extend our domains to accommodate multiple molecules, multimolecular interactions, and processes occurring on subcellular scales."
"Many investigators have been using scaled-up molecular simulation codes to look at multimolecular interactions," said McCammon, "and now we can contribute mightily to the realism and accuracy of such studies, with rapid and accurate determinations of the electrostatic potential of structures and processes of major biological and pharmacological importance."
The approach taken by Bank and Holst to the parallelization reduced interprocessor communication to a minimum and improved load balancing. "Our codes were written as sequential codes, but they can run in a parallel environment without a large investment in recoding," Holst said. Most important, according to Holst and Baker, is that the use of multilevel finite-element methods yields orders of magnitude reductions in solution time compared to uniform mesh approaches.
"We had been working for years on new and better computational solutions for the elliptical partial differential equations that arise in many scientific problems," said Holst, "and we were just delighted to find that Andy McCammon's group was pursuing exactly the right problems to test our ability to deliver accurate numerical solutions in a computationally efficient fashion, despite the well-known difficulty of the problems." MM