Learning Outcomes

Once this tutorial is completed students will learn to:

Perform metadynamics simulations using PLUMED 2 and LAMMPS
Construct a bias potential on 1 and 2 collective variables (CVs)
Assess the convergence of the free energy surface
Distinguish between good and bad CVs
Reweight with more than one bias potential

Resources

The tarball for this project contains the following folders:

Example1 : Contains the input file for the unbiased simulation.
Example2 : Contains the input files for one of the biased simulations. The rest of the biased simulations inputs should be created by modifying this one.

Instructions

The system

We consider the association/dissociation of NaCl in aqueous solution. The dissociation barrier is expected to be around 2.5 \(k_\mathrm{B} T\). One interesting aspect of the ion dissociation problem is that collective solvent motions play an important role in the transition. This problem has been considered in the original metadynamics paper [80] and also in reference [17] . We will use the potential developed in ref. [104] for NaCl and TIP3P water with parameters corrected to be used with long-range Coulomb solvers [115]. The system contains 1 Na, 1 Cl, and 106 water molecules (total 320 atoms).

NaCl in water

Perform an unbiased simulation and control the distance Na-Cl

We first perform a standard MD simulation and control the distance Na-Cl. All the files needed for this example are contained in the folder Example1 . The distance Na-Cl can be calculated in Plumed 2 using:

Click on the labels of the actions for more information on what each action computes

d1: DISTANCE ATOMSthe pair of atom that we are calculating the distance between. =319,320 The DISTANCE action with label d1 calculates a single scalar value

The coordination number of Na with respect to O in water will also be calculated for later use. This variable will represent the collective motion of the solvent.

Click on the labels of the actions for more information on what each action computes

coord: COORDINATION ...
   GROUPAFirst list of atoms. =319 
   GROUPBSecond list of atoms (if empty, N*(N-1)/2 pairs in GROUPA are counted). =1-318:3 
   SWITCHThis keyword is used if you want to employ an alternative to the continuous switching
function defined above. ={RATIONAL R_0=0.315 D_MAX=0.5 NN=12 MM=24}  
   NLIST( default=off ) Use a neighbor list to speed up the calculation  
   NL_CUTOFFThe cutoff for the neighbor list =0.55 
   NL_STRIDEThe frequency with which we are updating the atoms in the neighbor list =10 
   
...The COORDINATION action with label coord calculates a single scalar value

To run LAMMPS you can use the run.sh script:

#!/bin/bash

############################################################################
# Definition of variables
############################################################################
EXE=lmp_mpi
totalCores=2
############################################################################

mpirun -np ${totalCores} ${EXE} < start.lmp > out.lmp

This command runs LAMMPS using 2 MPI threads. The use of partitions will be discussed when using multiple walkers

Once the simulation is launched, the so called COLVAR file is written. In this case it contains the following:

#! FIELDS time d1 coord
 0.200000 0.568067 5.506808
 0.400000 0.500148 4.994588
 0.600000 0.449778 4.931140
 0.800000 0.528272 5.105816
 1.000000 0.474371 5.089863
 1.200000 0.430620 5.091551
 1.400000 0.470374 4.993886
 1.600000 0.458768 4.940097
 1.800000 0.471886 4.952868
 2.000000 0.489058 4.897593
.
.
.

If you plot the time (column 1) vs the distance (column 2), for instance in gnuplot:

pl  "./COLVAR" u 1:2 w lp,

you will see that the ion pair is stuck in the dissociated state during the 1 ns simulation. It is unable to cross the \(\sim5k_\mathrm{B}T\) barrier located at a distance of approximately 0.4 nm. You can also observe this behavior in the trajectory using VMD:

vmd out.dcd -psf nacl.psf

The trajectory has been saved in unwrapped format in order to avoid bonds stretching from one side to the box to the other due to periodic boundary conditions. In VMD we can wrap the atoms without breaking the bonds and show the box using the commands:

pbc wrap -compound res -all
pbc box

You can play with different visualization styles and options that VMD has. Therefore, if we want the system to go back and forth between the associated and dissociated state, we will need enhanced sampling.

Construct a bias potential on the distance Na-Cl

We now construct a bias potential \( V(\mathbf{s}) \) on the distance Na-Cl using well-tempered metadynamics. The files for this example are contained in the directory Example2. As argument for the construction of the potential we will use the distance Na-Cl (label d1). We choose a gaussian height of 1 kJ/mol which is slightly less than 0.5 \(k_\mathrm{B} T \). The gaussian width is 0.02 nm, in the same order of the features in the FES. A rule of thumb for choosing the gaussian width is to use the standard deviation of the unbiased fluctuations of the CV. The bias factor is set to 5 since the largest barrier in the FES is expected to be roughly 5 \(k_\mathrm{B}T\). Once the metadynamics simulation is converged, the bias will be (up to an arbitrary constant):

\[ V(\mathbf{s})= - \left ( 1- \frac{1}{\gamma} \right ) F(\mathbf{s}) \]

and therefore the system will evolve under an effective free energy:

\[ \tilde{F}(\mathbf{s})=F(\mathbf{s})+V(\mathbf{s})= \frac{F(\mathbf{s})}{\gamma}, \]

that is to say, the largest barrier will be of around 1 \(k_\mathrm{B} T\). The input is:

Click on the labels of the actions for more information on what each action computes

d1: DISTANCE ATOMSthe pair of atom that we are calculating the distance between. =319,320 The DISTANCE action with label d1 calculates a single scalar value
metad: METAD ...
   
   ARGthe input for this action is the scalar output from one or more other actions. =d1 
   SIGMAcompulsory keyword 
the widths of the Gaussian hills =0.02 
   HEIGHTthe heights of the Gaussian hills. =1. 
   BIASFACTORuse well tempered metadynamics and use this bias factor. =5 
   TEMPthe system temperature - this is only needed if you are doing well-tempered metadynamics
=300.0 
   PACEcompulsory keyword 
the frequency for hill addition =500 
   GRID_MINthe lower bounds for the grid =0.2 
   GRID_MAXthe upper bounds for the grid =1.0 
   GRID_BINthe number of bins for the grid =300 
   CALC_RCT( default=off ) calculate the c(t) reweighting factor and use that to obtain the
normalized bias [rbias=bias-rct].This  
...The METAD action with label metad calculates the following quantities:

 Quantity    Description  
metad.bias the instantaneous value of the bias potential
metad.rbias the instantaneous value of the bias normalized using the c(t) reweighting factor [rbias=bias-rct].This component can be used to obtain a reweighted histogram.
metad.rct the reweighting factor \f$c(t)\f$.

Here the CALC_RCT keyword turns on the calculation of the time dependent constant \( c(t) \) that we will use below when reweighting the simulations.

We will also limit the exploration of the CV space by introducing an upper wall bias \( V_{wall}(\mathbf{s}) \):

\[ V_{wall}(s) = \kappa (s-s_0)^2 \mathrm{\: if \:} s>s_0 \mathrm{ \: and \: 0 \: otherwise}. \]

The wall will focus the sampling in the most interesting region of the free energy surface. The effect of this bias potential will have to be corrected later in order to calculate ensemble averages. The syntax in Plumed 2 is:

Click on the labels of the actions for more information on what each action computes

d1: DISTANCE ATOMSthe pair of atom that we are calculating the distance between. =319,320 The DISTANCE action with label d1 calculates a single scalar value
uwall: UPPER_WALLS ...
   ARGthe input for this action is the scalar output from one or more other actions. =d1 
   ATcompulsory keyword 
the positions of the wall. =0.6 
   KAPPAcompulsory keyword 
the force constant for the wall. =2000.0 
   EXPcompulsory keyword ( default=2.0 )
the powers for the walls. =2 
   EPScompulsory keyword ( default=1.0 )
the values for s_i in the expression for a wall =1 
   OFFSETcompulsory keyword ( default=0.0 )
the offset for the start of the wall. =0. 
   
...The UPPER_WALLS action with label uwall calculates the following quantities:

 Quantity    Description  
uwall.bias the instantaneous value of the bias potential
uwall.force2 the instantaneous value of the squared force due to this bias potential

You can run the simulation with the run.sh script as done in the previous example.

It is possible to try different bias factors to check the effect that it has on the trajectory and the effective FES.

In principle the cost of a metadynamics simulation should increase as the simulation progresses due to the need of adding an increasing number of Gaussian kernels to calculate the bias. However, since a grid is used to build up the bias this effect is not observed. You can check what happens if you do not use the GRID_* keywords. Remember that the bins in the grid should be small enough to describe the changes in the bias created by the Gaussian kernels. Normally a bin of size \( \sigma / 5 \) (with \( \sigma \) the gaussian width) is small enough.

Assess convergence

One way to identify if a well tempered metadynamics simulation has converged is observing the estimated free energy surface at different times. The FES is estimated by using the relation (again up to an arbitrary constant):

\[ F(\mathbf{s}) = - \left ( \frac{\gamma}{\gamma-1} \right ) V(\mathbf{s},t) \]

and the bias potential is calculated as a sum of Gaussian kernels. This can be done with Plumed 2 using for instance:

plumed sum_hills --hills ../HILLS --min 0.1 --max 0.8 --bin 300 --stride 100

Most of the flags are self explanatory. The flag –stride 100 will result in the FES been written every 100 Gaussian kernels, i.e. 100 ps. Inside the folder FES_calculation you will find a script run.sh that executes the sum_hills command and a gnuplot script plot.gpi that can be used typing:.

gnuplot plot.gpi

After roughly 3 ns the free energy surface does not change significantly with time except for an immaterial constant \( c(t) \) that grows in time. This is in line with well-tempered metadynamics asymptotic behavior:

\[ V(\mathbf{s},t)= - \left ( 1- \frac{1}{\gamma} \right ) F(\mathbf{s}) + c(t). \]

The behavior of \( c(t) \) will be studied with greater detail later.

Evolution of the estimated free energy

It should be stressed that we are actually not calculating the free energy \( F(\mathbf{s}) \) but rather \( F(\mathbf{s})+V_{wall}(\mathbf{s}) \). For this reason for distances higher than 0.6 nm the free energy increases sharply.

An alternative way to observe the evolution of the bias is plotting the final free energy plus the instantaneous bias. The scripts for this example can be found in the folder Bias_calculation. In this case we observe only the first 200 ps of the simulation. The (negative) bias can be calculated using:

plumed sum_hills --hills ../HILLS --min 0.1 --max 0.8 --bin 300 --stride 10 --negbias

This plot illustrates clearly how the bias is constructed to progressively "fill" the FES.

Evolution of the bias potential

It is also possible to track convergence by controlling the evolution of some quantity connected to the free energy surface. In this case we will calculate the dissociation barrier, e.g. the height of the barrier that separates the associated state from the dissociated one. The scripts for this example are found in the folder Barrier_calculation. Using the python script calculate_barrier.py we can compute the barrier, for instance:

import numpy as np
 
# Total number of fes files in folder
total_files=101
# Min and max initial guesses
min_min=50
min_max=90
max_min=90
max_max=130
 
for i in range(total_files):
        file_name="fes_" + str(i) + ".dat"
        matrix=np.genfromtxt(file_name)
        minimum=np.amin(matrix[min_min:min_max,1])
        maximum=np.amax(matrix[max_min:max_max,1])
        print(str(i) + " " + str(minimum) + " " + str(maximum) + " " + str(maximum-minimum))

The script can be executed using:

python barrier_calculation.py > barrier.txt

and the results can be plotted using the gnuplot script plot.gpi. After roughly 4 ns the barrier stabilizes around 3 and 3.5 \(k_\mathrm{B} T\).

Dissociation barrier as a function of simulation time

It is important to stress that it is only possible to calculate the free energy difference between two points if the system has gone back and forth between these points several times. This applies both for the calculation of a barrier and the difference in free energy between two basins. It is also important to understand that none of the free energy methods described in this series of tutorials will be able to calculate free energies of regions that have not been sampled, i.e. visited.

Reweighting the simulation on the same CV that was used for biasing can also be used as a test of convergence. We will show that in the next section.

Reweight the simulation

We first reweight the simulation on the distance Na-Cl, the same CV used for biasing. This reweighting is useful to check convergence and to have an estimate of the free energy that does not rely on using kernels. For instance if some features of the FES could not be captured by the kernels, the reweighting procedure will show them. This will become clearer in the VES tutorial. The scripts to perform this calculation are found in the folder ReweightDistance. In metadynamics quasi-stationary limit the weight assigned to a given configuration is [130] :

\[ w(\mathbf{R}) \propto e^{\beta ( V(\mathbf{s},t) - c(t) )}. \]

By plotting time (column 1) versus \( c(t) \) (column 6) using the COLVAR file, the importance of taking \( c(t) \) into account becomes clear. \( c(t) \) keeps growing even after long times, reflecting the approximately rigid shift of the bias with time. Normally the first part of the trajectory is not used for reweighting since during this period the simulation has not reached the quasi-stationary limit. In this case we can discard the first 2 or 3 ns of simulation. To disregard the first 3 ns of simulation we can use sed to delete the first 15000 lines from the COLVAR file:

sed '2,15000d' ../COLVAR > COLVAR

We then use Plumed to calculate two histograms, one taking into account the wall bias and the other one neglecting it. The weights for the reweighting involving only the metadynamics bias have already been discussed while the weights considering both biases are:

\[ w(\mathbf{R}) \propto e^{\beta ( V(\mathbf{s},t) - c(t) + V_{wall}(\mathbf{s}) )}. \]

The header of the COLVAR file should be:

#! FIELDS time d1 coord metad.bias metad.rbias metad.rct metad.work uwall.bias

The input script for Plumed is:

Click on the labels of the actions for more information on what each action computes

# Read COLVAR file
distance: READ FILEcompulsory keyword 
the name of the file from which to read these quantities =COLVAR IGNORE_TIME( default=off ) ignore the time in the colvar file.  VALUEScompulsory keyword 
the values to read from the file =d1 The READ action with label distance calculates a single scalar value
metad: READ FILEcompulsory keyword 
the name of the file from which to read these quantities =COLVAR IGNORE_TIME( default=off ) ignore the time in the colvar file.  VALUEScompulsory keyword 
the values to read from the file =metad.rbias The READ action with label metad calculates a single scalar value
uwall: READ FILEcompulsory keyword 
the name of the file from which to read these quantities =COLVAR IGNORE_TIME( default=off ) ignore the time in the colvar file.  VALUEScompulsory keyword 
the values to read from the file =uwall.bias The READ action with label uwall calculates a single scalar value
# Define weights
weights1: REWEIGHT_METAD TEMPthe system temperature. =300 The REWEIGHT_METAD action with label weights1 calculates a single scalar value
weights2: REWEIGHT_BIAS TEMPthe system temperature. =300 ARGcompulsory keyword ( default=*.bias )
the biases that must be taken into account when reweighting =metad.rbias,uwall.bias The REWEIGHT_BIAS action with label weights2 calculates a single scalar value
# Calculate histograms
hh1: HISTOGRAM ...
   ARGthe input for this action is the scalar output from one or more other actions. =distance 
   GRID_MINcompulsory keyword 
the lower bounds for the grid =0.2 
   GRID_MAXcompulsory keyword 
the upper bounds for the grid =0.8 
   GRID_BINthe number of bins for the grid =100 
   BANDWIDTHcompulsory keyword 
the bandwidths for kernel density estimation =0.002 
   LOGWEIGHTSlist of actions that calculates log weights that should be used to weight configurations
when calculating averages =weights1 
   
...The HISTOGRAM action with label hh1
hh2: HISTOGRAM ...
   ARGthe input for this action is the scalar output from one or more other actions. =distance 
   GRID_MINcompulsory keyword 
the lower bounds for the grid =0.2 
   GRID_MAXcompulsory keyword 
the upper bounds for the grid =0.8 
   GRID_BINthe number of bins for the grid =100 
   BANDWIDTHcompulsory keyword 
the bandwidths for kernel density estimation =0.002 
   LOGWEIGHTSlist of actions that calculates log weights that should be used to weight configurations
when calculating averages =weights2 
   
...The HISTOGRAM action with label hh2
# Print histograms to file
DUMPGRID GRIDcompulsory keyword 
the action that creates the grid you would like to output =hh1 FILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo FMTthe format that should be used to output real numbers =%24.16e The DUMPGRID action with label 
DUMPGRID GRIDcompulsory keyword 
the action that creates the grid you would like to output =hh2 FILEcompulsory keyword ( default=density )
the file on which to write the grid. =histo_wall FMTthe format that should be used to output real numbers =%24.16e The DUMPGRID action with label

This example can be run with:

plumed --no-mpi driver --plumed plumed.dat --noatoms > plumed.out

and will generate the files histo and histo_wall. The histograms represent the probability \(p(\mathbf{s})\) of observing a given value of the CV \( \mathbf{s} \) . From the histograms the FES can be calculated using:

\[ \beta F(\mathbf{s})= - \log p(\mathbf{s}) \]

and therefore we plot the FES in gnuplot using for instance:

pl  "./histo" u 1:(-log($2)) w lp

The next plot compares the estimations of the FES from sum_hills, reweighting with metadynamics bias, and reweighting using both the metadynamics bias and the upper wall bias You will find a gnuplot script plot.gpi to make this plot inside the ReweightDistance folder.

FES estimated from sum_hills, reweighting using only the metadynamics bias, and reweighting using both the metadynamics bias and the upper wall bias

We can obtain important information of the system by reweighting on 2 CVs: The distance Na-Cl and the coordination of Na with O. This reweighting is similar to the one already done and the files that you will need are located in the ReweightBoth folder. Additionally the COLVAR file with the omitted first steps is required. The plot of the FES as a function of these 2 CVs provides important information of the association/dissociation mechanism. In the dissociated state, Na can have a coordination of 5 or 6, though it is more likely to find a coordination number of 6. However, in order to associate Na must have a coordination with O of 5. In the associated state Na can have a coordination of 3, 4 or 5. The transition state is characterized by a coordination number of ~5.

Free energy as a function of distance Na-Cl and the coordination number of Na and O.

Construct a bias potential on the coordination Na-O

As an exercise, you can write the input files for a simulation in which a bias potential is constructed on the coordination Na-0, i.e. the solvent degree of freedom. You can use the same gaussian height as before and \( \sigma = 0.1 \).

You will find that the exploration of the CV space is not efficient. The reason is that there is a slow degree of freedom that it is not being biased: the distance Na-Cl. Furthermore you can see in the 2 CV reweighting that the coordination Na-O shows significant overlap between the associated and dissociated states.

Bear in mind that this is a rather trivial example since the existing barriers are relatively low. Real problems in materials science usually involve large barriers and are not as forgiving as this example; a bad CV may lead to huge hysteresis and problems in convergence.

Construct a bias potential on both CVs

We will now construct a bias potential on both CVs. We have already calculated the FES as a function of both CVs through reweighting. In this example the FES will be calculated using the metadynamics bias potential. You can use the input files from Example2.tar and changed the plumed.dat file. To construct a 2 dimensional bias with metadynamics use the following input:

Click on the labels of the actions for more information on what each action computes

__FILL__  You cannot view the components that are calculated by each action for this input file. Sorry 
metad: METAD ...
   
   ARGthe input for this action is the scalar output from one or more other actions. =d1,coord 
   SIGMAcompulsory keyword 
the widths of the Gaussian hills =0.02,0.1 
   HEIGHTthe heights of the Gaussian hills. =1. 
   BIASFACTORuse well tempered metadynamics and use this bias factor. =5 
   TEMPthe system temperature - this is only needed if you are doing well-tempered metadynamics
=300.0 
   PACEcompulsory keyword 
the frequency for hill addition =500 
   GRID_MINthe lower bounds for the grid =0.15,2. 
   GRID_MAXthe upper bounds for the grid =0.9,9. 
   GRID_BINthe number of bins for the grid =400,400 
   CALC_RCT( default=off ) calculate the c(t) reweighting factor and use that to obtain the
normalized bias [rbias=bias-rct].This  
... You cannot view the components that are calculated by each action for this input file. Sorry

Once that the simulation is completed you can run plumed sum_hills to calculate the FES:

plumed sum_hills --hills HILLS --mintozero

and plot the results using the following lines in gnuplot:

set pm3d map
set zr [0:15]
spl "fes.dat" u 1:2:3

Final remarks

Some valuable tools for metadynamics simulations will be discussed in the VES tutorial. These include:

Restarting a simulation.
Using Plumed driver to calculate a CV that was not calculated during the simulation. A reweighting can then be performed on this CV.
Constructing biased histograms, i.e. histograms without weights to calculate the effective FES \( \tilde{F}(\mathbf{s}) = F(\mathbf{s}) + V(\mathbf{s}) \).
Use multiple walkers to improve the exploration of CV space.

Quantity	Description
metad.bias	the instantaneous value of the bias potential
metad.rbias	the instantaneous value of the bias normalized using the c(t) reweighting factor [rbias=bias-rct].This component can be used to obtain a reweighted histogram.
metad.rct	the reweighting factor \f$c(t)\f$.

Quantity	Description
uwall.bias	the instantaneous value of the bias potential
uwall.force2	the instantaneous value of the squared force due to this bias potential