Molecular Motion
- Binding of substrates by proteins
- Folding of proteins and peptides
- Chemical reactions
Simulating Flexibility
Normal mode analysis
Harmonic motion about equilibrium
Vibrational entropy/free energy
Cannot cross energy barriers (e.g., local motion around minima)
Monte Carlo (MC)
Random movement (study thermal ensemble of structures)
Time independent
Consistent with thermal energy
Molecular Dynamics (MD)
Newton's equations of motion
Time dependent
Diffusion and folding
Limited to picosecs to nanosecs
Not restricted to harmonic motion
Classical Approach
- Empirical Fit
Correlation with Quantum Mechanics is serendipitous
- There are events that cannot be observed
Electronic transitions
Electron transport phenomena
Bond breaking/formation
Proton transfer (acid/base reactions)
Molecular Dynamics
History-
Molecular Dynamics methods are nearly as old as the Metropolis MC method.
The first applications of MD techniques for molecular simulation were made
for simple fluids:
Alder & Wainwright, J. Chem. Phys.,27, 1208 (1957)
Alder & Wainwright, J. Chem. Phys.,31, 459 (1959)
Simulations of complex liquids followed
Rahman & Stillinger, J. Chem. Phys., 60,1545 (1974) (also, 1974, 1974)
and
the first MD simulation of a biomacromolecule was performed over 10 years
ago (BPTI):
Definitions
A specification of -
position & velocity of each particle of the system at some instant of time,
- forces acting on the particles
Newton's 2nd Law: Given the state of a system at any time, its
future state and future motions are exactly determined.
- MD (unlike MC) is a deterministic procedure. Atoms move according to
the laws of Newtonian mechanics.
- A MD simulation is performed by integrating Newton's equations of
motion over time for the molecular system.
- Crash course on stat mech(!)
- Partition function(Z)
- Free Energy(logZ)
- Entropy (log Z at T; kTlnZ)
- Average E (derivative of Z wrt1/kT0
,li>Heat Capacity - Size of typical variations in E (derivative of E wrt T)
Used to find S
- Average property
- Atoms defined by position and velocity in MD: (x,y,z,vx,vy,vz)
- Kinetic Energy: 1/2Emi(vx2 + vy2 +vz2) (sum over atoms)
- Temperature: 3/2Nkt (equipartition theorem)
- Solving Newton's Equations of motion:
Schematic flow chart for dynamics:
- Calculating the new position at the end of the time step completes
one cycle of the following cycle, and the procedure is iterated to produce
the total trajectory.
Atomic positions---------->
(coordinate file)
Covalent Structure------------->............---------->Total Potential
Energy
Potential Energy File----------->====> -------------> Forces on each atom
Additional Atoms-------------->...........----------->Effective Temperature
(H's, heteroatoms, solvent, counterions)
Special Features----------------->
Periodic Boundary Conditions
Constraints
Restraints
Atomic Velocities-------------->
- A simulation as outlined here is performed in the
microcanonical ensemble (constant number of Particles, V and total E).
Taylor Series Expansion
- Although the Taylor series is exact, it must be truncated to
determine x(t + dt). Fundamental differences in MD algorithms is
-
How this truncation is handled
- The particular numerical technique used to solve it
- Simple test to determine how well algorithm performs: by Newton's third
law, the net force acting on the total system of particles is zero, so the
total E of the system (PE + KE) as well as momentum is conserved. Small
fluctuations or slow drifts in the Emay occur due to the numerical round
so E conservation is an important test of algorithm performance.
- Simplest approach would be to truncate after the second term, producing
a equation of motion with constant acceleration. This is poor since it
attempts to project motion forward in time based only on position, velocity
and acceleration at the beginning of the interval.
- Improvements:
- Verlet Algorithm(Verlet, 1967)
- Beeman Algorithm, (Beeman, 1976; Levitt & Meirovitch, 1983)
- Fifth Order Gear Method (Gear, 1971; McCammon et al., 1977, 1979)
- Constrained Verlet (SHAKE) Method (Rychaert et al., 1977; Van Gunsteren
& Berendsen, 1977).
MD Protocol- Minimization
- Heating
- Slow heating to the desired temperature
- Trajectory essentially slowly learns the surface of the well, feeling
its way out.
- Do not want to carry this out quickly/high T since you will pop out to
some unknown place on the PES.
- Gaussian Equilibrium
- Hot (near minimum) and cold (near tails)
regions within macromolecule. A repetitive redistribution of velocities is
done to equalize hot/cold regions over time. The redistribution, via
Gaussian reassignment of velocities, will allow molecule to feel a more
uniform velocity (T). No dependence on initial state this way.
- Length of time of this stage ( in terms of ps) depends on KE relaxation
(Length of time for KE (velocity) to exchange and distribute through the
molecule (~1-2 ps for proteins)
- Usually do about 4 ps Gaussian
Equilibration, reassigning velocities every 1 ps.
- Gaussian random number distribution is much easier to work with than
Maxwell Boltzmann, and is a good approximation.
- Equilibrium
- In this stage, you want T to settle to the desired value.
Velocities are scaled, and not reassigned through this stage.
- Scale factor is
-
looking for PE and KE to settle down with no more scaling necessary. Time
dependent analysis can not be done between scaling segments. The
configurations sampled within differently scaled periods do not come from
the same ensemble.
- Atoms themselves can serve as bath (absorbing part of the translational
E, thus preventing transitions).
- Want to specify as small of a T window as possible (10K or 5K). Like to
see at least 25 ps without scaling.
- Trajectory
- Begun when scaling is no longer needed. Sample
every 5 - 10 ps for as long as you can do the simulation (100's of ps
usually).
- Questionable how reliable this can be because of the limit in the
amount of time one can usually sample. Activated processes can take on the
order of nanoseconds and milliseconds, and so, doing MD for small numbers
of ps may give misleading results.
- Such things as restraints may be employed to carry out specific
analysis of activated processes.
Monitoring Variables:
- Energy
- RMS
- Pseudo Temperature
Factors important in convergence:
- Integration time step:
For biomolecular systems, the iteration time step typically used is one of
the order of 1 femptosecond (1.0x10-15 sec.). This small step size is
necessary to ensure numerical stability for integration of the equations of
motion.
Thus, the equations of motion for systems with 3N d.o.f. must be integrated
in an iterative fashion for 10(5) time steps to generate a trajectory of
100 psec (1 psec = 1.0x10-12 sec).
- As in minimization techniques, frequently internal d.o.f.,
especially bond stretching modes are constrained. The most common restraint
methodology is the SHAKE procedure, which applies additional forces to keep
bond lengths fixed at equilibrium values. The restraint of high frequency
modes like bond vibrations permits a slightly larger integration time step
in MD simulations.
- In principle, all equilibrium and dynamic properties of the system, can
be calculated from an MD trajectory. Derivative properties (e.g., heat
capacity) will converge more slowly than average properties (e.g. internal
energy). Unlike MC, properties fromMD calculations are computer as time
averages rather than ensemble averages. The ergodic hypothesis states that
an ensemble average is equivalent to time average in the limits of adequate
sampling and sufficient time, respectively. Properties that require larger
ensemble averaging in MC will require longer trajectories in MD. The larger
number of d.o.f. for complex biomolecules make s evaluation of potential
energy and energy derivatives computationally intense. Typically
simulations are limited to a few hundred psec. In those cases, caution is
required when computing average properties for the system since it is not
clear that this is a long enough simulation to yield significant time
averages.
Periodic Boundary Conditions
Size of molecular system also plays an important role in convergence.
Simulations performed on systems with too few particles (e.g., too few
solvent molecules to properly solvate a solute) may give misleading
results. At the same time, the size of the system must be limited to
computational facilities.
For biomolecules in liquid or crystal environments, these contradictory
requirements can be satisfied through the use of PBC's.
Nonbonded cutoffs
The exact nature of the truncation of long-range forces can have a dramatic
effect on the behavior of a molecular system in a simulation. A sharp
atom-based truncation will lead to large discontinuities in forces and can
cause serious instabilities in simulations. A good solution is the use of
smoothing functions. Special long-range corrections for electrostatic
forces may be needed to compensate fro the truncation of Coulombic
potentials.<.ol>
Modified MD methods:- Size reduction for 'active region'
dynamics
For many problems, it may be necessary (satisfactory) to
study only a small portion of the biomolecular system. For example, only in
a region around the active site, including solvent molecules and
counterions.
- Simplest method is to define atoms within the region of interest as
mobile, and restrain all others to their initial position. The influence of
the environment surrounding the active region is retained in some average
way, but the computational demands are greatly reduced. PBC's are not used.
- Stochastic BC's: Here, the active region is defined as before. Atoms
outside the active region are discarded, and the effect of the discarded
atoms compensated for with stochastic forces. An outer shell of the active
region is defined as a buffer zone in these calculations. As active region
atoms move into the buffer zone, they experience stochastic forces that
induce mean atomic fluctuations and heat dissipation that would normally
occur if the environment atoms were present explicitly. Static BC are also
included to represent the average interaction of active region atoms with
the environment. These boundary forces help maintain the average structure
of the active region in the absence of the discarded environmental atoms.
The reliability of the stochastic method on the quality of the stochastic
and boundary forces used (from expt).
'Rare Events- Many interesting processes in
biological systems occur rapidly, but infrequently because an energy
barrier must be overcome. Ligand binding and aromatic amino acid side chain
rotation are two common examples. Standard MD methods are not suited to
study activated processes because the time scale of occurrence for these
events is often in the nanosecond or even microsecond time regime.
- Umbrella Sampling Technique
Valleau & Torrie, in Statistical Mechanics Part A: Equilibrium Techniques
.
In this method, one defines a reaction coordinate which is the pathway that
carries the system from the initial to the final state. Simulations are
then carried out with a set of constraining potentials that biases the
conformational sampling to various overlapping regions along the reaction
coordinate. The effect of the biasing potential can be removed by piecing
together the overlapping windows of the umbrella sampling to yield a
continuous function that reasonably approximates the free energy as a
function of the reaction coordinate (the potential of mean force) for the
process.
Applications
- Analysis of
trajectories
In the simplest applications, trajectories are
generated for biological systems and analyzed for- structural
- dynamic
- thermodynamic properties
- The average motions of atoms
over time may be observed, as well as
- Time correlations for atomic positions and velocities.
- RMS position fluctuations and other RMS geometric fluctuations can be
computed and compared with expt observables.
- Spectral densities, order parameters S, dipole vector correlation
functions, and other properties may be computed from MD trajectories, and
compared with NMR data. As a result, MD simulations can be useful in
interpretation and analysis of NMR spectra.
- Vibrational spectral properties relating to IR, Raman, and inelastic
neutron scattering spectra may also be obtained from MD simulation,
although harmonic dynamics is often used for this.
- MD simulations are used for energetic and structural features of
biological systems, yielding qualitative information about binding
energetics, conformational transitions, and induced fit in a complex
formation.
- Useful as a tool in the refinement of 3D biomolecular structures from
Xray diffraction and 3D NMR data. In the refinement, MD trajectories are
run at elevated temperatures (several thousand degrees Kelvin) to enhance
conformational sampling. The systems are cooled periodically to permit the
trajectories to settle into local minimum E conformations. Constraint terms
based on Xray structure factors or NMR NOE distances are added to the
standard PEF so that the MD trajectories relax to conformations that
satisfy the expt data as the system is cooled. The use of simulated
annealing MD calculations together with distance geometry (for NMR data)
and conventional Xray refinement (for Xray data) provides a powerful tool
for the analysis of exptal structural data for biomolecules.
This information can give insight regarding structure-function relations
and mechanistic details.