The aim of this tutorial is to introduce the users to the plumed syntax. We will go through the writing of simple collective variable and we will use them to analyse a trajectory in terms of probabilty distributions and free energy.
Once this tutorial is completed students will:
The tarball for this project contains the following files:
PLUMED2 is a library that can be accessed by multiple codes adding a relatively simple and well documented interface. Once PLUMED is installed you can run a plumed executable that can be used for multiple purposes:
plumed --help
some of the listed options report about the plumed available functionalities, other can be used to tell plumed to do something: from analysing a trajectory to patch the source code of a MD code and so on. All the commands have further options that can be seen using plumed command –help, i.e.:
plumed driver --help
In the following we are going to see how to write an input file for plumed2 that can be used to analyse a trajectory.
By default the PLUMED inputs and outputs quantities in the following units:
If you want to change these units you can do this using the UNITS keyword.
A typical input file for PLUMED input is composed by specification of one or more CVs, the printout frequency and a termination line. Comments are denoted with a # and the termination of the input for PLUMED is marked with the keyword ENDPLUMED. Whatever it follows is ignored by PLUMED. You can introduce blank lines. They are not interpreted by PLUMED.
In the following input we will analyse the DISTANCE between the two terminal carbons of a 16 residues peptide, and we will PRINT the results in file named COLVAR.
#my first plumed input: DISTANCE ATOMS=2,253 LABEL=e2edist #printout frequency PRINT ARG=e2edist STRIDE=1 FILE=COLVAR #endofinput ENDPLUMED here I can write what I want it won't be read.
Now we can use this simple input file to analyse the trajectory included in the RESOURCES:
plumed driver --plumed plumed.dat --ixyz trajectory-short.xyz --length-units 0.1
NOTE: –length-units 0.1, xyz files, as well as pdb files, are in Angstrom.
You should have a file COLVAR, if you look at it (i.e. more COLVAR) the first two lines should be:
#! FIELDS time e2edist 0.000000 2.5613161
NOTE: the first line of the file COLVAR tells you what is the content of each column.
In PLUMED2 the commands defined in the input files are executed in the same order in which they are written, this means that the following input file is wrong:
#printout frequency PRINT ARG=cvdist STRIDE=1 FILE=COLVAR #my first plumed input: DISTANCE ATOMS=2,253 LABEL=e2edist #endofinput ENDPLUMED here I can write what I want it won't be read.
Try to run it.
Sometimes, when calculating a collective variable, you may not want to use the positions of a number of atoms directly. Instead you may wish to use the position of a virtual atom whose position is generated based on the positions of a collection of other atoms. For example you might want to use the center of mass of a group of atoms (COM):
Since PLUMED executes the input in order you need to define the new Virtual Atom before using it:
first: COM ATOMS=1,2,3,4,5,6 last: COM ATOMS=251-256 e2edist: DISTANCE ATOMS=2,253 comdist: DISTANCE ATOMS=first,last PRINT ARG=e2edist,comdist STRIDE=1 FILE=COLVAR ENDPLUMED
NOTE: an action (i.e. COM or DISTANCE here) can be either label using LABEL as we did before or as label: ACTION as we have just done here.
With the above input this is what happen inside PLUMED with a STRIDE=1:
In the above input we have used to different ways of writing the atoms used in COM calculation:
ranges of atoms can be defined with a stride which can also be negative:
Now by plotting the content of the COLVAR file we can compare the behaviour in this trajectory of both the terminal carbons as well as of the centre of masses of the terminal residues.
gnuplot
What do you expect to see now by looking at the trajectory? Let's have a look at it
vmd template.pdb trajectory-short.xyz
Virtual atoms can be used in place of standard atoms everywhere an atom can be given as input, they can also be used together with standard atoms. So for example we can analyse the TORSION angle for a set of Virtual and Standard atoms:
first: COM ATOMS=1-6 last: COM ATOMS=251-256 cvtor: TORSION ATOMS=first,102,138,last PRINT ARG=cvtor STRIDE=1 FILE=COLVAR ENDPLUMED
The above CV don't look smart to learn something about the system we are looking at. In principle CV are used to reduce the complexity of a system by looking at a small number of properties that could be enough to rationalise its behaviour.
Now try to write a collective variable that measures the Radius of Gyration of the system: GYRATION.
NOTE: if what you need for one or more variables is a long list of atoms and not a virtual atom one can use the keyword GROUP. A GROUP can be defined using ATOMS in the same way we saw before, in addition it is also possible to define a GROUP by reading a GROMACS index file.
ca: GROUP ATOMS=9,16,31,55,69,90,102,114,124,138,160,174,194,208,224,238
Now 'ca' is not a virtual atom but a simple list of atoms.
Sometimes it can be useful to calculate properties of many similar collective variables at the same time, for example one can be interested in calculating the properties of the distances between a group of atoms, or properties linked to the distribution of the dihedral angles of a chain and so on. In PLUMED2 this kind of collective variables fall under the name of MULTICOLVAR (cf. MultiColvar Documentation.) Here we are going to analyse the distances between CA carbons along the chain:
ca: GROUP ATOMS=9,16,31,55,69,90,102,114,124,138,160,174,194,208,224,238 dd: DISTANCES GROUP=ca MEAN MIN={BETA=50} MAX={BETA=0.02} MOMENTS=2 PRINT ARG=dd.mean,dd.min,dd.max,dd.moment-2 STRIDE=1 FILE=COLVAR ENDPLUMED
The above input tells PLUMED to calculate all the distances between CA carbons and then look for the mean distance, the minimum distance, the maximum distance and the variance. In this way we have defined four collective variables that are calculated using the distances. These four collective variables are stored as components of the defined action 'dd': dd.mean, dd.min, dd.max, dd.moment-2.
The infrastracture of multicolvar has been used to develop many PLUMED2 collective variables as for example the set of Secondary Structure CVs (ANTIBETARMSD, PARABETARMSD and ALPHARMSD).
MOLINFO STRUCTURE=template.pdb abeta: ANTIBETARMSD RESIDUES=all TYPE=DRMSD LESS_THAN={RATIONAL R_0=0.08 NN=8 MM=12} STRANDS_CUTOFF=1 PRINT ARG=abeta.lessthan STRIDE=1 FILE=COLVAR ENDPLUMED
We have now seen how to write the input some of the many CVs available in PLUMED. More complex CVs will be discussed in the next workshop, Belfast tutorial: Adaptive variables I.
Collective variables are usually used to visualize the Free Energy of a system. Given a system evolving at fixed temperature, fixed number of particles and fixed volume, it will explore different conformations with a probability
\[ P(q)\propto e^{-\frac{U(q)}{kb_BT}} \]
where \( q \) are the microscopic coordinates and \( k_B \) is the Boltzmann constant.
It is possible to analyse the above probabilty as a function of one or more collective variable \( s(q)\):
\[ P(s)\propto \int dq e^{-\frac{U(q)}{kb_BT}} \delta(s-s(q)) \]
where the \( \delta \) function means that to for a given value \( s\) of the collective variable are counted only those conformations for which the CV is \( s\). The probability can be recast to a free energy by taking its logarithm:
\[ F(s)=-k_B T \log P(s) \]
This means that by estimating the probability distribution of a CV it is possible to know the free energy of a system along that CV. Estimating the probability distribution of the conformations of a system is what is called 'sampling'.
In order to estimate a probability distribution one needs to make HISTOGRAM from the calculated CVs. PLUMED2 includes the possibility of histogramming data both on the fly as well as a posteriori as we are going to do now.
MOLINFO STRUCTURE=template.pdb abeta: ANTIBETARMSD RESIDUES=all TYPE=DRMSD LESS_THAN={RATIONAL R_0=0.08 NN=8 MM=12} STRANDS_CUTOFF=1 ca: GROUP ATOMS=9,16,31,55,69,90,102,114,124,138,160,174,194,208,224,238 DISTANCES ... GROUP=ca MEAN MIN={BETA=50} MAX={BETA=0.02} MOMENTS=2 LABEL=dd ... DISTANCES PRINT ARG=abeta.lessthan,dd.mean,dd.min,dd.max,dd.moment-2 STRIDE=1 FILE=COLVAR HISTOGRAM ... ARG=abeta.lessthan,dd.mean USE_ALL_DATA KERNEL=discrete GRID_MIN=0,0.8 GRID_MAX=4,1.2 GRID_BIN=40,40 GRID_WFILE=histo ... HISTOGRAM ENDPLUMED
NOTE: HISTOGRAM ... means that what follow is part of the HISTOGRAM function, the same can be done for any action in PLUMED.
The above input tells PLUMED to accumulate the two collective variables on a GRID. In addition the probability can be converted to a free-energy using the flag FREE-ENERGY and setting the temperature using TEMP (i.e. 300K). Histograms can be accumulated in a smoother way by using a KERNEL function, a kernel is a normalised function, for example a normalised gaussian is the default kernel in PLUMED, that is added to the histogram centered in the position of the data. Estimating a probability density using kernels can in principle give more accurate results, on the other hand in addition to the choice of the binning one has to choose a parameter that is the WIDTH of the kernel function. As a rule of thumb: the grid spacing should be smaller (i.e. one half or less) than the BANDWIDTH and the BANDWIDTH should be smaller (i.e. one order of magnitude) than the variance observed/expected for the variable.
HISTOGRAM ... ARG=abeta.lessthan,dd.mean USE_ALL_DATA GRID_MIN=0,0.8 GRID_MAX=4,1.2 GRID_SPACING=0.04,0.004 BANDWIDTH=0.08,0.008 GRID_WFILE=histo ... HISTOGRAM ENDPLUMED
If you have time less at the end of the session read the manual and look for alternative collective variables to analyse the trajectory. Furthemore try to play with the HISTOGRAM parameters to see the effect of using KERNEL in analysing data.