LCOV - code coverage report
Current view: top level - pamm - PAMM.cpp (source / functions) Hit Total Coverage
Test: plumed test coverage Lines: 47 53 88.7 %
Date: 2025-03-25 09:33:27 Functions: 2 3 66.7 %

          Line data    Source code
       1             : /* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
       2             :    Copyright (c) 2015-2020 The plumed team
       3             :    (see the PEOPLE file at the root of the distribution for a list of names)
       4             : 
       5             :    See http://www.plumed.org for more information.
       6             : 
       7             :    This file is part of plumed, version 2.
       8             : 
       9             :    plumed is free software: you can redistribute it and/or modify
      10             :    it under the terms of the GNU Lesser General Public License as published by
      11             :    the Free Software Foundation, either version 3 of the License, or
      12             :    (at your option) any later version.
      13             : 
      14             :    plumed is distributed in the hope that it will be useful,
      15             :    but WITHOUT ANY WARRANTY; without even the implied warranty of
      16             :    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
      17             :    GNU Lesser General Public License for more details.
      18             : 
      19             :    You should have received a copy of the GNU Lesser General Public License
      20             :    along with plumed.  If not, see <http://www.gnu.org/licenses/>.
      21             : +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
      22             : #include "core/ActionRegister.h"
      23             : #include "core/ActionShortcut.h"
      24             : #include "multicolvar/MultiColvarShortcuts.h"
      25             : #include "tools/IFile.h"
      26             : #include "core/ActionSetup.h"
      27             : 
      28             : //+PLUMEDOC MCOLVARF PAMM
      29             : /*
      30             : Probabilistic analysis of molecular motifs.
      31             : 
      32             : Probabilistic analysis of molecular motifs (PAMM) was introduced in the papers in the bibliography.
      33             : The essence of this approach involves calculating some large set of collective variables
      34             : for a set of atoms in a short trajectory and fitting this data using a Gaussian Mixture Model.
      35             : The idea is that modes in these distributions can be used to identify features such as hydrogen bonds or
      36             : secondary structure types.
      37             : 
      38             : The assumption within this implementation is that the fitting of the Gaussian mixture model has been
      39             : done elsewhere by a separate code.  You thus provide an input file to this action which contains the
      40             : means, covariance matrices and weights for a set of Gaussian kernels, $\{\phi\}$.  The values and
      41             : derivatives for the following set of quantities is then computed:
      42             : 
      43             : $$
      44             : s_k = \frac{ \phi_k}{ \sum_i \phi_i }
      45             : $$
      46             : 
      47             : Each of the $\phi_k$ is a Gaussian function that acts on a set in quantities calculated that might be calculated
      48             : using a [TORSION](TORSION.md), [DISTANCE](DISTANCE.md) or [ANGLE](ANGLE.md) action for example.
      49             : These quantities are then inserted into the set of $n$ kernels that are in the the input file.   This will be done for multiple sets of values
      50             : for the input quantities and a final quantity will be calculated by summing the above $s_k$ values or
      51             : some transformation of the above.  This sounds less complicated than it is and is best understood by
      52             : looking through the example given below, which can be expanded to show the full set of operations that PLUMED is performing.
      53             : 
      54             : \warning Mixing input variables that are periodic with variables that are not periodic has not been tested
      55             : 
      56             : ## Examples
      57             : 
      58             : In this example I will explain in detail what the following input is computing:
      59             : 
      60             : ```plumed
      61             : #SETTINGS MOLFILE=regtest/pamm/rt-pamm-periodic/M1d.pdb INPUTFILES=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp
      62             : MOLINFO MOLTYPE=protein STRUCTURE=regtest/pamm/rt-pamm-periodic/M1d.pdb
      63             : psi: TORSION ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
      64             : phi: TORSION ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
      65             : p: PAMM ARG=phi,psi CLUSTERS=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp MEAN
      66             : PRINT ARG=p-1_mean,p-2_mean FILE=colvar
      67             : ```
      68             : 
      69             : The best place to start our explanation is to look at the contents of the `2D-testc-0.75.pammp` file, which you can do
      70             : by clicking on the links in the annotated input above.  This files contains the parameters of two two-dimensional Gaussian functions.
      71             : Each of these Gaussian kernels has a weight, $w_k$, a vector that specifies the position of its center, $\mathbf{c}_k$, and a covariance matrix, $\Sigma_k$.
      72             : The $\phi_k$ functions that we use to calculate our PAMM components are thus:
      73             : 
      74             : $$
      75             : \phi_k = \frac{w_k}{N_k} \exp\left( -(\mathbf{s} - \mathbf{c}_k)^T \Sigma^{-1}_k (\mathbf{s} - \mathbf{c}_k) \right)
      76             : $$
      77             : 
      78             : In the above $N_k$ is a normalization factor that is calculated based on $\Sigma$.  The vector $\mathbf{s}$ is a vector of quantities
      79             : that are calculated by the input [TORSION](TORSION.md) actions.  This vector must be two dimensional and in this case each component is the value of a
      80             : torsion angle.  If we look at the two TORSION actions in the above we are calculating the $\phi$ and $\psi$ backbone torsional
      81             : angles in a protein (Note the use of [MOLINFO](MOLINFO.md) to make specification of atoms straightforward).  We thus calculate the values of our
      82             : 2 $ \{ \phi \} $  kernels 3 times.  The first time we use the $\phi$ and $\psi$ angles in the second residue of the protein,
      83             : the second time it is the $\phi$ and $\psi$ angles of the third residue of the protein and the third time it is the $\phi$ and $\psi$ angles
      84             : of the fourth residue in the protein.  The final two quantities that are output by the print command, p.mean-1 and p.mean-2, are the averages
      85             : over these three residues for the quantities:
      86             : 
      87             : $$
      88             : s_1 = \frac{ \phi_1}{ \phi_1 + \phi_2 }
      89             : $$
      90             : 
      91             : and
      92             : 
      93             : $$
      94             : s_2 = \frac{ \phi_2}{ \phi_1 + \phi_2 }
      95             : $$
      96             : 
      97             : There is a great deal of flexibility in this input.  We can work with, and examine, any number of components, we can use any set of collective variables
      98             : and compute these PAMM variables and we can transform the PAMM variables themselves in a large number of different ways when computing these sums.  Furthermore,
      99             : by expanding the shortcuts in the example above we can obtain insight into how the PAMM method operates.
     100             : */
     101             : //+ENDPLUMEDOC
     102             : 
     103             : namespace PLMD {
     104             : namespace pamm {
     105             : 
     106             : class PAMM : public ActionShortcut {
     107             : public:
     108             :   static void registerKeywords( Keywords& keys );
     109             :   explicit PAMM(const ActionOptions&);
     110             : };
     111             : 
     112             : PLUMED_REGISTER_ACTION(PAMM,"PAMM")
     113             : 
     114           4 : void PAMM::registerKeywords( Keywords& keys ) {
     115           4 :   ActionShortcut::registerKeywords( keys );
     116           4 :   keys.add("compulsory","ARG","the vectors from which the pamm coordinates are calculated");
     117           4 :   keys.add("compulsory","CLUSTERS","the name of the file that contains the definitions of all the clusters");
     118           4 :   keys.add("compulsory","REGULARISE","0.001","don't allow the denominator to be smaller then this value");
     119           4 :   keys.add("compulsory","KERNELS","all","which kernels are we computing the PAMM values for");
     120           4 :   multicolvar::MultiColvarShortcuts::shortcutKeywords( keys );
     121           4 :   keys.needsAction("KERNEL");
     122           4 :   keys.needsAction("COMBINE");
     123           4 :   keys.addDOI("10.1063/1.4900655");
     124           4 :   keys.addDOI("10.1021/acs.jctc.7b00993");
     125           4 : }
     126             : 
     127           2 : PAMM::PAMM(const ActionOptions& ao) :
     128             :   Action(ao),
     129           2 :   ActionShortcut(ao) {
     130             :   // Must get list of input value names
     131             :   std::vector<std::string> valnames;
     132           4 :   parseVector("ARG",valnames);
     133             :   // Create input values
     134           2 :   std::string argstr=" ARG=" + valnames[0];
     135           3 :   for(unsigned j=1; j<valnames.size(); ++j) {
     136           2 :     argstr += "," + valnames[j];
     137             :   }
     138             : 
     139             :   // Create actions to calculate all pamm kernels
     140             :   unsigned nkernels = 0;
     141             :   double h;
     142             :   std::string fname;
     143           2 :   parse("CLUSTERS",fname);
     144           2 :   IFile ifile;
     145           2 :   ifile.open(fname);
     146           2 :   ifile.allowIgnoredFields();
     147             :   for(unsigned k=0;; ++k) {
     148          22 :     if( !ifile.scanField("height",h) ) {
     149             :       break;
     150             :     }
     151             :     // Create a kernel for this cluster
     152             :     std::string num, wstr, ktype;
     153           9 :     Tools::convert( k+1, num );
     154           9 :     Tools::convert(h,wstr);
     155           9 :     ifile.scanField("kerneltype",ktype);
     156          18 :     readInputLine( getShortcutLabel() + "_kernel-" + num + ": KERNEL NORMALIZED" + argstr  + " NUMBER=" + num + " REFERENCE=" + fname + " WEIGHT=" + wstr + " TYPE=" + ktype );
     157           9 :     nkernels++;
     158           9 :     ifile.scanField();
     159           9 :   }
     160           2 :   ifile.close();
     161             : 
     162             :   // And add on the regularization
     163             :   std::string regparam;
     164           4 :   parse("REGULARISE",regparam);
     165             :   // Now combine all the PAMM objects with the regparam
     166           2 :   std::string paramstr, cinput = getShortcutLabel() + "_ksum: COMBINE PERIODIC=NO";
     167          11 :   for(unsigned k=0; k<nkernels; ++k) {
     168             :     std::string num;
     169           9 :     Tools::convert( k+1, num );
     170           9 :     if( k==0 ) {
     171             :       cinput += " ARG=";
     172           4 :       paramstr=" PARAMETERS=-" + regparam;
     173             :     } else {
     174             :       cinput += ",";
     175             :       paramstr += ",0";
     176             :     }
     177          18 :     cinput += getShortcutLabel() + "_kernel-" + num;
     178             :   }
     179           4 :   readInputLine( cinput + paramstr );
     180             : 
     181             :   // And now compute all the pamm kernels
     182             :   std::string kchoice;
     183           4 :   parse("KERNELS",kchoice);
     184             :   std::map<std::string,std::string> keymap;
     185           2 :   multicolvar::MultiColvarShortcuts::readShortcutKeywords( keymap, this );
     186           2 :   if( kchoice=="all" ) {
     187          11 :     for(unsigned k=0; k<nkernels; ++k) {
     188             :       std::string num;
     189           9 :       Tools::convert( k+1, num );
     190          18 :       readInputLine( getShortcutLabel() + "-" + num + ": CUSTOM ARG=" + getShortcutLabel() + "_kernel-" + num + "," + getShortcutLabel() + "_ksum FUNC=x/y PERIODIC=NO");
     191          18 :       multicolvar::MultiColvarShortcuts::expandFunctions( getShortcutLabel() + "-" + num, getShortcutLabel() + "-" + num, "", keymap, this );
     192             :     }
     193             :   } else {
     194           0 :     std::vector<std::string> awords=Tools::getWords(kchoice,"\t\n ,");
     195           0 :     Tools::interpretRanges( awords );
     196           0 :     for(unsigned k=0; k<awords.size(); ++k) {
     197           0 :       readInputLine( getShortcutLabel() + "-" + awords[k] + ": CUSTOM ARG=" + getShortcutLabel() + "_kernel-" + awords[k] + "," + getShortcutLabel() + "_ksum FUNC=x/y PERIODIC=NO");
     198           0 :       multicolvar::MultiColvarShortcuts::expandFunctions( getShortcutLabel() + "-" + awords[k], getShortcutLabel() + "-" + awords[k], "", keymap, this );
     199             :     }
     200           0 :   }
     201           4 : }
     202             : 
     203             : }
     204             : }

Generated by: LCOV version 1.16