This Masterclass describes how to bias simulations to agree with experimental data using experiment directed simulation.
Once this Masterclass is completed, you will know how to:
We assume that you are familiar with PLUMED and enhanced sampling calculations. If you are not, the 2021 PLUMED Masterclass is a great place to start. In particular, you should be familiar with specifying collective variables.
Furthermore, some Python or other programming knowledge is required for this masterclass to generate plots and perform some analysis calculations.
Experiment directed simulation (EDS) is a maximum entropy method for biasing specific collective variables (CVs) to agree with set values. These are typically from experimental data, like a known radius of gyration or NMR chemical shift. Biasing a CV is an under-determined problem because there are many ways to change the systems' potential energy to agree with a set point. If we further maximize ensemble entropy while matching the set point, the problem has a unique solution of
\[ U'(x) = U(x) + \lambda f(x) \]
where \(U(x)\) is the potential energy of the system, \(f(x)\) is the CV, and \(\lambda\) is a fit parameter. EDS is a time-dependent method that finds \(\lambda\) while the simulation is running. It typically converges much faster than free-energy methods, but comes with the same caveats that insufficient sampling or rare events can affect the method. Another important detail is that EDS/maximum entropy biasing is for matching the set point on average (in expectation), rather than at every frame.
The only data needed to complete the exercises of this Masterclass can be found on an the following github page data, with alanine inputs being borrowed from an earlier one GitHub-22-03. Simulations will be performed using PLUMED's pesmd module, and gromacs.
The exercises are presented below.
Set up a plumed file to use with pesmd that has a harmonic potential. Then run using the 1d input provided in the github. For example:
d1: DISTANCEATOMS=1,2the pair of atom that we are calculating the distance between.COMPONENTSff: MATHEVAL( default=off ) calculate the x, y and z components of the distance separately and store them as label.x,ARG=d1.xthe input for this action is the scalar output from one or more other actions.PERIODIC=NOcompulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.FUNC=0.5*10*(x^2) bb: BIASVALUEcompulsory keyword the function you wish to evaluateARG=ff PRINTthe input for this action is the scalar output from one or more other actions.ARG=d1.xthe input for this action is the scalar output from one or more other actions.STRIDE=25compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be outputFILE=_PREFIX_.colvars.datthe name of the file on which to output these quantities
plumed pesmd < harmonic_1d.in
Histogram your data, and see if it fits the ideal Boltzmann distribution for this harmonic oscillator. Make sure to normalize your distribution properly!
Now try adding a harmonic bias using the RESTRAINT function, so that the data becomes centered near \(x=1\).
Finally, we will add an EDS bias. Make your EDS bias also be centered at 1.
d1: DISTANCEATOMS=1,2the pair of atom that we are calculating the distance between.COMPONENTSff: MATHEVAL( default=off ) calculate the x, y and z components of the distance separately and store them as label.x,ARG=d1.xthe input for this action is the scalar output from one or more other actions.PERIODIC=NOcompulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.FUNC=0.5*10*(x^2) bb: BIASVALUEcompulsory keyword the function you wish to evaluateARG=ff eds: EDSthe input for this action is the scalar output from one or more other actions.ARG=d1.xthe input for this action is the scalar output from one or more other actions.CENTER=__FILL__The desired centers (equilibrium values) which will be sought during the adaptive linear biasing.PERIOD=__FILL__Steps over which to adjust bias for adaptive or rampingOUT_RESTART=__FILL__Output file for all information needed to continue EDS simulation.TEMP=1.0The system temperature.BIAS_SCALE=1 PRINTA divisor to set the units of the bias.ARG=*the input for this action is the scalar output from one or more other actions.STRIDE=100compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be outputFILE=_PREFIX_.colvars.datthe name of the file on which to output these quantities
Note, we have to add TEMP=1 because pesmd does not provide the temperature to the EDS module.
Now look at how the bias factor changes with time. Can you also compute the bias from this value, and the value of the position versus time? Look in the colvar file and the restart file to see what values are there.
Now we will move on to a 2d harmonic oscillator. This way we can see the effect of biasing x, y, or a combination. Note, there is a different pesmd input file for this.
d1: DISTANCEATOMS=1,2the pair of atom that we are calculating the distance between.COMPONENTSff: MATHEVAL( default=off ) calculate the x, y and z components of the distance separately and store them as label.x,ARG=__FILL__the input for this action is the scalar output from one or more other actions.PERIODIC=NOcompulsory keyword if the output of your function is periodic then you should specify the periodicity of the function.FUNC=__FILL__ bb: BIASVALUEcompulsory keyword the function you wish to evaluateARG=ff PRINTthe input for this action is the scalar output from one or more other actions.ARG=__FILL__the input for this action is the scalar output from one or more other actions.STRIDE=25compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be outputFILE=_PREFIX_.colvars.datthe name of the file on which to output these quantities
Try to bias either the x or y directions separately. Try biasing a combination of them. We biased the average of x*y to be = 1. Use the CUSTOM command to try that out. What do you think it should look like? Write a python program to histogram the x and y data and see. It should look like below.
Bias alanine dipeptide using gromacs and the input files provided in the github.
MOLINFOSTRUCTURE=./input.ala2.pdb phi: TORSIONcompulsory keyword a file in pdb format containing a reference structure.ATOMS=@phi-2 psi: TORSIONthe four atoms involved in the torsional angleATOMS=@psi-2 eds: EDSthe four atoms involved in the torsional angleARG=__FILL__the input for this action is the scalar output from one or more other actions.CENTER=__FILL__The desired centers (equilibrium values) which will be sought during the adaptive linear biasing.PERIOD=__FILL__Steps over which to adjust bias for adaptive or rampingOUT_RESTART=_PREFIX_.restart.datOutput file for all information needed to continue EDS simulation.BIAS_SCALE=1 PRINTA divisor to set the units of the bias.ARG=*the input for this action is the scalar output from one or more other actions.STRIDE=500compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be outputFILE=_PREFIX_.colvars.datthe name of the file on which to output these quantities
Try biasing phi and psi independently. Try biasing them at the same time. What is a good PERIOD to choose?
Take a protein of interest, or alanine 2, and bias more than one variable. To try CGDS, choose the center of mass of some different parts of your protein. Can you bias the distance between these COMs and the angle between them at the same time? Does using COVAR or LM affect the speed of convergence?