This tutorial is about the use of experimental data, in particular NMR data, either as collective variables or as replica-averaged restraints in MD simulations. While the former is just a simple extension of what we have already been doing in previous tutorials, the latter is an approach that can be used to improve the quality of a force field in describing the properties of a specific system.
Once this tutorial is completed students will:
The tarball for this project contains the following:
Previous tutorials have often discussed the possibility of measuring a distance with respect to a structure representing some state of a system (see, e.g., Belfast tutorial: Out of equilibrium dynamics). An alternative is to use as a reference a set of experimental data that represent a state and to measure the current deviation from that set. The following NMR experimental observables are currently implemented in PLUMED: chemical shifts (only for proteins), through CS2BACKBONE and CH3SHIFTS; NOE distances, through NOE; and residual dipolar couplings, through RDC. In addition, the NOE collective variable can also be used for PRE distances, and 3J couplings can be implemented using TORSION and MATHEVAL. Among these collective variables, those based on chemical shifts make use of an external library, ALMOST, that must be downloaded and compiled separately, and PLUMED must be configured to link against ALMOST. Detailed instructions on how to compile PLUMED with ALMOST can be found in CS2BACKBONE.
In the following we will write the CS2BACKBONE collective variable that was used in Granata et al. (2013).
prot: GROUP ATOMS=1-862
WHOLEMOLECULES ENTITY0=prot
cs: CS2BACKBONE ATOMS=prot DATA=data FF=a03_gromacs.mdb NRES=56 FLAT=1.0 WRITE_CS=50
PRINT ARG=cs FILE=COLVAR STRIDE=100
ENDPLUMED
In this case the chemical shifts are those measured for the native state of the protein and can be used, together with other CVs and Bias-Exchange Metadynamics, to guide the system back and forth from the native structure. The experimental chemical shifts are stored in six files inside the "data/" folder (see the first example in the resources tarball), one file for each nucleus. A chemical shift of 0 is used where a chemical shift does not exist (e.g. CB of GLY) or where it has not been assigned. Additionally the data folder contains:
This example can be executed as
mdrun_mpi -s topol -plumed plumed
NMR data, like all equilibrium experimental data, are the result of a measurement over an ensemble of structures and over time. In principle a "perfect" molecular dynamics simulation, that is, a simulation with a perfect force field and perfect sampling, can predict the outcome of an experiment quantitatively. In practice, in most cases obtaining qualitative agreement is already a fortunate outcome. To increase the accuracy of a force field in a system-dependent manner, it is possible to add to the force field a term based on the agreement with a set of experimental data. This agreement is not enforced as a simple restraint, because that would require the system to agree with all the experimental data at every instant. Instead, the restraint is applied to an AVERAGED COLLECTIVE VARIABLE, where the average is performed over multiple identical simulations (replicas). In this way no single replica has to be in agreement with the experimental data; the replicas only have to agree on average. It has been shown that this approach is equivalent to solving the problem of finding a modified version of the force field that reproduces the provided set of experimental data without any additional assumption on the data themselves.
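As a compact way of reading the paragraph above, the restrained energy can be sketched as follows. The notation is introduced here for illustration only and is not taken from the PLUMED documentation: N is the number of replicas, x_r the coordinates of replica r, \delta_k the k-th observable (for instance a chemical shift), and m a restraint strength. The linear form matches the linear restraint on a sum of squared differences used in the replica-averaged chemical shift example of this tutorial; any per-observable weights or flat-bottom tolerances are deliberately left out.

```latex
% Replica-averaged collective variable: average first, then compare with experiment
\mathrm{CV}(x_1,\dots,x_N) = \sum_{k}\left(\frac{1}{N}\sum_{r=1}^{N}\delta_k^{\mathrm{calc}}(x_r) - \delta_k^{\mathrm{exp}}\right)^{2}

% Energy acting on the set of replicas: force field plus a linear term in the averaged CV
E_{\mathrm{tot}}(x_1,\dots,x_N) = \sum_{r=1}^{N} E_{\mathrm{FF}}(x_r) + m\,\mathrm{CV}(x_1,\dots,x_N)
```

Because the average over replicas is taken inside the square, the restraint term vanishes whenever the ensemble average matches the experiment, even if every individual replica deviates from it.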
Currently ENSEMBLE AVERAGING of a collective variable can be performed only using the NMR variables (CS2BACKBONE, CH3SHIFTS, NOE and RDC).
The second example included in the resources shows how the Amber force field can be improved in the case of the protein domain GB3 using the native-state chemical shifts as a replica-averaged restraint. Because replica averaging requires multiple replicas simulated in parallel under the same conditions, it is easily complemented with BIAS-EXCHANGE or MULTIPLE WALKER metadynamics to enhance the sampling.
prot: GROUP ATOMS=1-862
WHOLEMOLECULES ENTITY0=prot
cs: CS2BACKBONE ATOMS=prot DATA=data FF=a03_gromacs.mdb NRES=56 FLAT=0.0 WRITE_CS=500 ENSEMBLE
cse: RESTRAINT ARG=cs AT=0. KAPPA=0. SLOPE=24
PRINT ARG=cs FILE=COLVAR STRIDE=10
ENDPLUMED
Compared with the case in which chemical shifts are used to define a standard collective variable, here the keyword ENSEMBLE tells PLUMED to calculate all the chemical shifts for each replica (e.g. 4 replicas), average them over the replicas, and only after the averaging calculate the difference with respect to the experimental ones. A linear RESTRAINT can then be applied to this difference, which is the AVERAGED collective variable (linear because the variable is already a sum of squared differences); this restraint is the new term added to the underlying force field.
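The order of operations described above (average over replicas first, compare with experiment afterwards) can be illustrated with a minimal Python sketch. It is not the CS2BACKBONE implementation: the function names and the toy numbers are hypothetical, no per-nucleus weights or flat region are applied, and skipping entries whose experimental value is 0 is an assumption based on the convention used in the data files.

```python
import numpy as np

def replica_averaged_cv(calc_shifts, exp_shifts):
    """Sum of squared deviations between replica-averaged and experimental shifts.

    calc_shifts: array (n_replicas, n_shifts), shifts back-calculated per replica
    exp_shifts:  array (n_shifts,), experimental values; 0 marks a missing shift
    """
    calc_shifts = np.asarray(calc_shifts, dtype=float)
    exp_shifts = np.asarray(exp_shifts, dtype=float)
    assigned = exp_shifts != 0.0             # assumption: 0 means "skip this shift"
    mean_shifts = calc_shifts.mean(axis=0)   # average over replicas first
    diff = mean_shifts[assigned] - exp_shifts[assigned]
    return float(np.sum(diff ** 2))

def linear_restraint_energy(cv, slope=24.0, at=0.0):
    """Linear restraint slope * (cv - at), the form obtained with KAPPA=0."""
    return slope * (cv - at)

# Toy data (hypothetical): 4 replicas, 3 shifts, the last one unassigned
calc = np.array([[8.0, 120.0, 30.0],
                 [8.4, 118.0, 31.0],
                 [7.6, 122.0, 29.0],
                 [8.0, 120.0, 30.0]])
exp = np.array([8.0, 120.0, 0.0])

cv_avg = replica_averaged_cv(calc, exp)
# For comparison: each replica scored on its own against the same data
cv_single = [replica_averaged_cv(calc[i:i + 1], exp) for i in range(len(calc))]
energy = linear_restraint_energy(cv_avg)
```

In this toy case the averaged variable is essentially zero although two of the four replicas deviate individually, which is the behaviour that distinguishes a replica-averaged restraint from a restraint applied to each simulation separately.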
This example can be executed as
mpiexec -np 4 mdrun_mpi -s topol -plumed plumed -multi 4
The third example shows how RDCs (calculated with the theta-method) can be employed in the same way, in this case to describe the native state of Ubiquitin. In particular, it is possible to observe how the replica-averaged RDC restraint, applied to the correlation between the calculated and experimental N-H and CA-HA RDCs, results in an increase of the correlation of the RDCs for other bonds already on a very short time scale.
RDC ...
ENSEMBLE
CORRELATION
GYROM=-72.5388
SCALE=0.001060
ATOMS1=20,21 COUPLING1=8.17
ATOMS2=37,38 COUPLING2=-8.271
ATOMS3=56,57 COUPLING3=-10.489
ATOMS4=76,77 COUPLING4=-9.871
#continue....
In this input the first four N-H RDCs are defined.
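To make the idea of a correlation between replica-averaged and experimental RDCs concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the PLUMED RDC code: the function names are hypothetical, the alignment axis is assumed to be z, the physical prefactor (which in the input above involves GYROM and SCALE) is replaced by an arbitrary constant, and the bond vectors are random toy data. Only the four experimental couplings are taken from the input.

```python
import numpy as np

def rdc_theta(bond_vectors, prefactor=1.0, field_axis=(0.0, 0.0, 1.0)):
    """RDC per bond as prefactor * (3 cos^2(theta) - 1) / (2 r^3).

    theta is the angle between each bond vector and field_axis (assumed z).
    bond_vectors: array (n_bonds, 3); prefactor is a placeholder constant.
    """
    v = np.asarray(bond_vectors, dtype=float)
    axis = np.asarray(field_axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    r = np.linalg.norm(v, axis=1)
    cos_theta = (v @ axis) / r
    return prefactor * 0.5 * (3.0 * cos_theta ** 2 - 1.0) / r ** 3

def averaged_rdc_correlation(replica_bond_vectors, exp_rdc, prefactor=1.0):
    """Pearson correlation between replica-averaged and experimental RDCs."""
    per_replica = np.array([rdc_theta(b, prefactor) for b in replica_bond_vectors])
    mean_rdc = per_replica.mean(axis=0)      # average over replicas first
    return float(np.corrcoef(mean_rdc, exp_rdc)[0, 1])

# Toy data (hypothetical): 8 replicas, 4 bonds of length 1.02
rng = np.random.default_rng(0)
bonds = rng.normal(size=(8, 4, 3))
bonds = 1.02 * bonds / np.linalg.norm(bonds, axis=2, keepdims=True)

# The four N-H couplings listed in the input above
exp = np.array([8.17, -8.271, -10.489, -9.871])
corr = averaged_rdc_correlation(bonds, exp)
```

Since the Pearson correlation is unchanged by a positive rescaling of the calculated values, the magnitude of the prefactor does not affect it, although its sign does.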
This example can be executed as
mpiexec -np 8 mdrun_mpi -s topol -plumed plumed -multi 8