Shortcut: PAMM

Module: pamm
Description: Probabilistic analysis of molecular motifs.

Output components

This action can calculate the values in the following table when the associated keyword is included in the input for the action. These values can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the value required from the list below.

Name | Type | Keyword | Description
lessthan | scalar | LESS_THAN | the number of colvars that have a value less than a threshold
morethan | scalar | MORE_THAN | the number of colvars that have a value more than a threshold
altmin | scalar | ALT_MIN | the minimum value of the cv
min | scalar | MIN | the minimum colvar
max | scalar | MAX | the maximum colvar
between | scalar | BETWEEN | the number of colvars that have a value that lies in a particular interval
highest | scalar | HIGHEST | the largest of the colvars
lowest | scalar | LOWEST | the smallest of the colvars
sum | scalar | SUM | the sum of the colvars
mean | scalar | MEAN | the mean of the colvars

Further details and examples

Probabilistic analysis of molecular motifs (PAMM) was introduced in the papers in the bibliography. The essence of this approach involves calculating some large set of collective variables for a set of atoms in a short trajectory and fitting this data using a Gaussian Mixture Model. The idea is that modes in these distributions can be used to identify features such as hydrogen bonds or secondary structure types.

This implementation assumes that the Gaussian mixture model has already been fitted by a separate code. You therefore provide this action with an input file that contains the means, covariance matrices and weights for a set of Gaussian kernels, $\{\phi\}$. The values and derivatives of the following quantities are then computed:

$$ s_k = \frac{\phi_k}{\sum_i \phi_i} $$

Each $\phi_k$ is a Gaussian function that acts on a set of quantities that might be calculated using, for example, a TORSION, DISTANCE or ANGLE action. These quantities are inserted into the set of $n$ kernels defined in the input file. This is done for multiple sets of values of the input quantities, and a final quantity is calculated by summing the above $s_k$ values or some transformation of them. This sounds more complicated than it is and is best understood by looking through the example given below, which can be expanded to show the full set of operations that PLUMED is performing.
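To make the normalization concrete, here is a minimal Python sketch that turns one set of kernel values into the $s_k$ fractions. The function name `pamm_fractions` and the sample kernel values are hypothetical. This is not PLUMED's implementation, which also computes derivatives and controls the size of the denominator through the REGULARISE keyword.

```python
def pamm_fractions(kernel_values):
    """Return s_k = phi_k / sum_i phi_i for one set of kernel values.

    Illustrative only: no derivatives and no regularisation of the
    denominator, so a vanishing sum is reported as an error.
    """
    total = sum(kernel_values)
    if total <= 0.0:
        raise ValueError("sum of kernel values must be positive")
    return [phi / total for phi in kernel_values]


# Hypothetical kernel values for a single set of input quantities.
phis = [0.2, 0.6, 0.2]
fractions = pamm_fractions(phis)
print(fractions)
```

Because every $s_k$ shares the same denominator, the fractions for one set of input quantities always sum to one.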

Warning: mixing input variables that are periodic with variables that are not periodic has not been tested.

Examples

In this example I will explain in detail what the following input is computing:

Tested on 2.11.

#SETTINGS MOLFILE=regtest/pamm/rt-pamm-periodic/M1d.pdb INPUTFILES=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp
MOLINFO MOLTYPE=protein STRUCTURE=regtest/pamm/rt-pamm-periodic/M1d.pdb
psi: TORSION ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
phi: TORSION ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
p: PAMM ARG=phi,psi CLUSTERS=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp MEAN
PRINT ARG=p-1_mean,p-2_mean FILE=colvar

Extract from regtest/pamm/rt-pamm-periodic/M1d.pdb:

ATOM      1  CH3 ACE     1      -0.595   0.734   0.353  1.00  0.00           C  
ATOM      2  C   ACE     1       0.767   0.829  -0.308  1.00  0.00           C  
ATOM      3  O   ACE     1       0.968  -0.029  -1.182  1.00  0.00           O  
ATOM      4 1HH3 ACE     1      -0.561   0.034   1.174  1.00  0.00           H  
ATOM      5 2HH3 ACE     1      -0.893   1.700   0.729  1.00  0.00           H  
...
ATOM    380  H   NME    23      14.888 -21.273  -9.959  1.00  0.00           H  
ATOM    381 HH31 NME    23      17.241 -22.202 -11.437  1.00  0.00           H  
ATOM    382 HH32 NME    23      15.946 -23.378 -11.638  1.00  0.00           H  
ATOM    383 HH33 NME    23      15.734 -21.723 -12.191  1.00  0.00           H  
END

Extract from regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp:

#! FIELDS height phi psi sigma_phi_phi sigma_phi_psi sigma_psi_phi sigma_psi_psi 
#! SET kerneltype von-misses
      2.97197455E-0001     -1.91983118E+0000      2.25029540E+0000      2.45960237E-0001     -1.30615381E-0001     -1.30615381E-0001      2.40239117E-0001  
      2.29131448E-0002      1.39809354E+0000      9.54585380E-0002      9.61755708E-0002     -3.55657919E-0002     -3.55657919E-0002      1.06147253E-0001  
      5.06676398E-0001     -1.09648066E+0000     -7.17867907E-0001      1.40523052E-0001     -1.05385552E-0001     -1.05385552E-0001      1.63290557E-0001  
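As a rough illustration of how the columns of the cluster file map onto kernel parameters, the Python sketch below reads text in the layout shown above. The function `parse_clusters` is hypothetical and is not part of PLUMED. It assumes that the `height` column holds the kernel weight, that the columns between `height` and the first `sigma_` column are the center coordinates, and that the `sigma_a_b` columns are the entries of the covariance matrix.

```python
def parse_clusters(text):
    """Parse cluster-file text into a list of kernel dictionaries.

    Assumed layout (hypothetical reading of the header): the first field
    is the weight, the following non-sigma fields are the center, and
    sigma_<a>_<b> fields are covariance matrix entries.
    """
    fields = None
    kernels = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#! FIELDS"):
            fields = line.split()[2:]
            continue
        if line.startswith("#"):
            continue  # other header lines, e.g. "#! SET ..."
        if fields is None:
            raise ValueError("data row found before the FIELDS header")
        row = dict(zip(fields, (float(x) for x in line.split())))
        names = [f for f in fields[1:] if not f.startswith("sigma_")]
        kernels.append({
            "names": names,
            "weight": row[fields[0]],
            "center": [row[n] for n in names],
            "cov": [[row[f"sigma_{a}_{b}"] for b in names] for a in names],
        })
    return kernels


# First data row of the extract above.
SAMPLE = """#! FIELDS height phi psi sigma_phi_phi sigma_phi_psi sigma_psi_phi sigma_psi_psi
#! SET kerneltype von-misses
 2.97197455E-0001 -1.91983118E+0000 2.25029540E+0000 2.45960237E-0001 -1.30615381E-0001 -1.30615381E-0001 2.40239117E-0001
"""
kernels = parse_clusters(SAMPLE)
print(kernels[0])
```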

The best place to start our explanation is with the contents of the 2D-testc-0.75.pammp file, an extract of which is shown with the input above. This file contains the parameters of two two-dimensional Gaussian functions. Each of these Gaussian kernels has a weight, $w_k$, a vector that specifies the position of its center, $c_k$, and a covariance matrix, $\Sigma_k$. The $\phi_k$ functions that we use to calculate our PAMM components are thus:

$$ \phi_k = \frac{w_k}{N_k} \exp\left( -(s - c_k)^T \Sigma_k^{-1} (s - c_k) \right) $$

In the above, $N_k$ is a normalization factor that is calculated from $\Sigma_k$. The vector $s$ contains the quantities that are calculated by the input TORSION actions. This vector must be two dimensional, and in this case each component is the value of a torsion angle. The two TORSION actions above calculate the $\phi$ and $\psi$ backbone torsional angles in a protein (note the use of MOLINFO to make the specification of atoms straightforward). We thus calculate the values of our 2 $\{\phi\}$ kernels 3 times: first with the $\phi$ and $\psi$ angles of the second residue of the protein, then with those of the third residue, and finally with those of the fourth residue. The final two quantities that are output by the PRINT command, p-1_mean and p-2_mean, are the averages over these three residues of the quantities:

$$ s_1 = \frac{\phi_1}{\phi_1 + \phi_2} $$

and

$$ s_2 = \frac{\phi_2}{\phi_1 + \phi_2} $$
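The calculation described above can be sketched in a few lines of Python. Everything here is a simplified assumption rather than PLUMED's implementation: the kernel parameters and torsion values are invented for illustration, periodicity of the torsions is ignored, derivatives are not computed, and $N_k$ is taken to be $\pi \sqrt{\det \Sigma_k}$, the factor that normalizes the exponential as written in two dimensions. Because both $s_k$ share the same denominator, any factor common to all kernels cancels.

```python
import math


def kernel(s, weight, center, cov):
    """phi_k = w_k / N_k * exp(-(s - c_k)^T Sigma_k^{-1} (s - c_k)) in 2D."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (s[0] - center[0], s[1] - center[1])
    quad = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    norm = math.pi * math.sqrt(det)  # assumed form of N_k, see lead-in
    return weight / norm * math.exp(-quad)


def mean_fractions(samples, kernels):
    """Average s_k = phi_k / sum_i phi_i over all samples."""
    totals = [0.0] * len(kernels)
    for s in samples:
        phis = [kernel(s, *params) for params in kernels]
        denom = sum(phis)
        for k, phi in enumerate(phis):
            totals[k] += phi / denom
    return [t / len(samples) for t in totals]


# Hypothetical kernels: (weight, center, covariance matrix).
kernels = [
    (0.5, (-1.0, 1.0), ((0.2, 0.0), (0.0, 0.2))),
    (0.5, (1.0, -1.0), ((0.2, 0.0), (0.0, 0.2))),
]
# Hypothetical (phi, psi) pairs standing in for residues 2, 3 and 4.
samples = [(-1.0, 1.0), (1.0, -1.0), (0.0, 0.0)]
means = mean_fractions(samples, kernels)
print(means)
```

The two returned values play the role of p-1_mean and p-2_mean in the input above: one averaged fraction per kernel.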

There is a great deal of flexibility in this input. We can work with, and examine, any number of components. We can compute these PAMM variables from any set of collective variables. We can also transform the PAMM variables themselves in a large number of different ways when computing these sums. Furthermore, by expanding the shortcuts in the example above we can obtain insight into how the PAMM method operates.

References

More information about how this action can be used is available in the following articles:
- P. Gasparotto, M. Ceriotti, Recognizing molecular patterns by machine learning: An agnostic structural definition of the hydrogen bond. The Journal of Chemical Physics. 141 (2014)
- P. Gasparotto, R. H. Meißner, M. Ceriotti, Recognizing Local and Global Structural Motifs at the Atomic Scale. Journal of Chemical Theory and Computation. 14, 486–498 (2018)

Syntax

The following table describes the keywords and options that can be used with this action

Keyword | Type | Default | Description
ARG | compulsory | none | the vectors from which the pamm coordinates are calculated
CLUSTERS | compulsory | none | the name of the file that contains the definitions of all the clusters
REGULARISE | compulsory | 0.001 | don't allow the denominator to be smaller than this value
KERNELS | compulsory | all | which kernels are we computing the PAMM values for
LESS_THAN | optional | not used | calculate the number of variables that are less than a certain target value. Options for this keyword are explained in the documentation for LESS_THAN.
MORE_THAN | optional | not used | calculate the number of variables that are more than a certain target value. Options for this keyword are explained in the documentation for MORE_THAN.
ALT_MIN | optional | not used | calculate the minimum value
MIN | optional | not used | calculate the minimum value
MAX | optional | not used | calculate the maximum value
BETWEEN | optional | not used | calculate the number of values that are within a certain range. Options for this keyword are explained in the documentation for BETWEEN.
HIGHEST | optional | false | this flag allows you to recover the highest of these variables
HISTOGRAM | optional | not used | calculate a discretized histogram of the distribution of values
LOWEST | optional | false | this flag allows you to recover the lowest of these variables
SUM | optional | false | calculate the sum of all the quantities
MEAN | optional | false | calculate the mean of all the quantities