LCOV - code coverage report
Current view: top level - pamm - PAMM.cpp (source / functions) Hit Total Coverage
Test: plumed test coverage Lines: 38 43 88.4 %
Date: 2024-10-18 13:59:31 Functions: 2 3 66.7 %

          Line data    Source code
       1             : /* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
       2             :    Copyright (c) 2015-2020 The plumed team
       3             :    (see the PEOPLE file at the root of the distribution for a list of names)
       4             : 
       5             :    See http://www.plumed.org for more information.
       6             : 
       7             :    This file is part of plumed, version 2.
       8             : 
       9             :    plumed is free software: you can redistribute it and/or modify
      10             :    it under the terms of the GNU Lesser General Public License as published by
      11             :    the Free Software Foundation, either version 3 of the License, or
      12             :    (at your option) any later version.
      13             : 
      14             :    plumed is distributed in the hope that it will be useful,
      15             :    but WITHOUT ANY WARRANTY; without even the implied warranty of
      16             :    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
      17             :    GNU Lesser General Public License for more details.
      18             : 
      19             :    You should have received a copy of the GNU Lesser General Public License
      20             :    along with plumed.  If not, see <http://www.gnu.org/licenses/>.
      21             : +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
      22             : #include "core/ActionRegister.h"
      23             : #include "core/ActionShortcut.h"
      24             : #include "multicolvar/MultiColvarShortcuts.h"
      25             : #include "tools/IFile.h"
      26             : #include "core/ActionSetup.h"
      27             : 
      28             : //+PLUMEDOC MCOLVARF PAMM
      29             : /*
      30             : Probabilistic analysis of molecular motifs.
      31             : 
      32             : Probabilistic analysis of molecular motifs (PAMM) was introduced in this paper \cite pamm.
      33             : The essence of this approach involves calculating some large set of collective variables
      34             : for a set of atoms in a short trajectory and fitting this data using a Gaussian Mixture Model.
      35             : The idea is that modes in these distributions can be used to identify features such as hydrogen bonds or
      36             : secondary structure types.
      37             : 
      38             : The assumption within this implementation is that the fitting of the Gaussian mixture model has been
      39             : done elsewhere by a separate code.  You thus provide an input file to this action which contains the
      40             : means, covariance matrices and weights for a set of Gaussian kernels, \f$\{ \phi \}\f$.  The values and
      41             : derivatives for the following set of quantities is then computed:
      42             : 
      43             : \f[
      44             : s_k = \frac{ \phi_k}{ \sum_i \phi_i }
      45             : \f]
      46             : 
      47             : Each of the \f$\phi_k\f$ is a Gaussian function that acts on a set of quantities calculated within
      48             : a \ref mcolv .  These might be \ref TORSIONS, \ref DISTANCES, \ref ANGLES or any one of the many
      49             : symmetry functions that are available within \ref mcolv actions.  These quantities are then inserted into
      50             : the set of \f$n\f$ kernels that are in the the input file.   This will be done for multiple sets of values
      51             : for the input quantities and a final quantity will be calculated by summing the above \f$s_k\f$ values or
      52             : some transformation of the above.  This sounds less complicated than it is and is best understood by
      53             : looking through the example given below.
      54             : 
      55             : \warning Mixing \ref mcolv actions that are periodic with variables that are not periodic has not been tested
      56             : 
      57             : \par Examples
      58             : 
      59             : In this example I will explain in detail what the following input is computing:
      60             : 
      61             : \plumedfile
      62             : #SETTINGS MOLFILE=regtest/basic/rt32/helix.pdb
      63             : MOLINFO MOLTYPE=protein STRUCTURE=M1d.pdb
      64             : psi: TORSIONS ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
      65             : phi: TORSIONS ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
      66             : p: PAMM DATA=phi,psi CLUSTERS=clusters.pamm MEAN1={COMPONENT=1} MEAN2={COMPONENT=2}
      67             : PRINT ARG=p.mean-1,p.mean-2 FILE=colvar
      68             : \endplumedfile
      69             : 
      70             : The best place to start our explanation is to look at the contents of the clusters.pamm file
      71             : 
      72             : \auxfile{clusters.pamm}
      73             : #! FIELDS height phi psi sigma_phi_phi sigma_phi_psi sigma_psi_phi sigma_psi_psi
      74             : #! SET multivariate von-misses
      75             : #! SET kerneltype gaussian
      76             :       2.97197455E-0001     -1.91983118E+0000      2.25029540E+0000      2.45960237E-0001     -1.30615381E-0001     -1.30615381E-0001      2.40239117E-0001
      77             :       2.29131448E-0002      1.39809354E+0000      9.54585380E-0002      9.61755708E-0002     -3.55657919E-0002     -3.55657919E-0002      1.06147253E-0001
      78             :       5.06676398E-0001     -1.09648066E+0000     -7.17867907E-0001      1.40523052E-0001     -1.05385552E-0001     -1.05385552E-0001      1.63290557E-0001
      79             : \endauxfile
      80             : 
      81             : This files contains the parameters of two two-dimensional Gaussian functions.  Each of these Gaussian kernels has a weight, \f$w_k\f$,
      82             : a vector that specifies the position of its center, \f$\mathbf{c}_k\f$, and a covariance matrix, \f$\Sigma_k\f$.  The \f$\phi_k\f$ functions that
      83             : we use to calculate our PAMM components are thus:
      84             : 
      85             : \f[
      86             : \phi_k = \frac{w_k}{N_k} \exp\left( -(\mathbf{s} - \mathbf{c}_k)^T \Sigma^{-1}_k (\mathbf{s} - \mathbf{c}_k) \right)
      87             : \f]
      88             : 
      89             : In the above \f$N_k\f$ is a normalization factor that is calculated based on \f$\Sigma\f$.  The vector \f$\mathbf{s}\f$ is a vector of quantities
      90             : that are calculated by the \ref TORSIONS actions.  This vector must be two dimensional and in this case each component is the value of a
      91             : torsion angle.  If we look at the two \ref TORSIONS actions in the above we are calculating the \f$\phi\f$ and \f$\psi\f$ backbone torsional
      92             : angles in a protein (Note the use of \ref MOLINFO to make specification of atoms straightforward).  We thus calculate the values of our
      93             : 2 \f$ \{ \phi \} \f$  kernels 3 times.  The first time we use the \f$\phi\f$ and \f$\psi\f$ angles in the second residue of the protein,
      94             : the second time it is the \f$\phi\f$ and \f$\psi\f$ angles of the third residue of the protein and the third time it is the \f$\phi\f$ and \f$\psi\f$ angles
      95             : of the fourth residue in the protein.  The final two quantities that are output by the print command, p.mean-1 and p.mean-2, are the averages
      96             : over these three residues for the quantities:
      97             : \f[
      98             : s_1 = \frac{ \phi_1}{ \phi_1 + \phi_2 }
      99             : \f]
     100             : and
     101             : \f[
     102             : s_2 = \frac{ \phi_2}{ \phi_1 + \phi_2 }
     103             : \f]
     104             : There is a great deal of flexibility in this input.  We can work with, and examine, any number of components, we can use any set of collective variables
     105             : and compute these PAMM variables and we can transform the PAMM variables themselves in a large number of different ways when computing these sums.
     106             : */
     107             : //+ENDPLUMEDOC
     108             : 
     109             : namespace PLMD {
     110             : namespace pamm {
     111             : 
     112             : class PAMM : public ActionShortcut {
     113             : public:
     114             :   static void registerKeywords( Keywords& keys );
     115             :   explicit PAMM(const ActionOptions&);
     116             : };
     117             : 
     118             : PLUMED_REGISTER_ACTION(PAMM,"PAMM")
     119             : 
     120           4 : void PAMM::registerKeywords( Keywords& keys ) {
     121           4 :   ActionShortcut::registerKeywords( keys );
     122           8 :   keys.add("compulsory","ARG","the vectors from which the pamm coordinates are calculated");
     123           8 :   keys.add("compulsory","CLUSTERS","the name of the file that contains the definitions of all the clusters");
     124           8 :   keys.add("compulsory","REGULARISE","0.001","don't allow the denominator to be smaller then this value");
     125           8 :   keys.add("compulsory","KERNELS","all","which kernels are we computing the PAMM values for");
     126           4 :   multicolvar::MultiColvarShortcuts::shortcutKeywords( keys );
     127           8 :   keys.needsAction("KERNEL"); keys.needsAction("COMBINE");
     128           4 : }
     129             : 
     130           2 : PAMM::PAMM(const ActionOptions& ao) :
     131             :   Action(ao),
     132           2 :   ActionShortcut(ao)
     133             : {
     134             :   // Must get list of input value names
     135           4 :   std::vector<std::string> valnames; parseVector("ARG",valnames);
     136             :   // Create input values
     137           2 :   std::string argstr=" ARG=" + valnames[0];
     138           3 :   for(unsigned j=1; j<valnames.size(); ++j) argstr += "," + valnames[j];
     139             : 
     140             :   // Create actions to calculate all pamm kernels
     141             :   unsigned nkernels = 0;  double h;
     142           2 :   std::string fname; parse("CLUSTERS",fname);
     143           2 :   IFile ifile; ifile.open(fname); ifile.allowIgnoredFields();
     144             :   for(unsigned k=0;; ++k) {
     145          22 :     if( !ifile.scanField("height",h) ) break;
     146             :     // Create a kernel for this cluster
     147           9 :     std::string num, wstr, ktype; Tools::convert( k+1, num ); Tools::convert(h,wstr); ifile.scanField("kerneltype",ktype);
     148          18 :     readInputLine( getShortcutLabel() + "_kernel-" + num + ": KERNEL NORMALIZED" + argstr  + " NUMBER=" + num + " REFERENCE=" + fname + " WEIGHT=" + wstr + " TYPE=" + ktype );
     149           9 :     nkernels++; ifile.scanField();
     150           9 :   }
     151           2 :   ifile.close();
     152             : 
     153             :   // And add on the regularization
     154           4 :   std::string regparam; parse("REGULARISE",regparam);
     155             :   // Now combine all the PAMM objects with the regparam
     156           2 :   std::string paramstr, cinput = getShortcutLabel() + "_ksum: COMBINE PERIODIC=NO";
     157          11 :   for(unsigned k=0; k<nkernels; ++k) {
     158           9 :     std::string num; Tools::convert( k+1, num );
     159           9 :     if( k==0 ) {
     160           4 :       cinput += " ARG="; paramstr=" PARAMETERS=-" + regparam;
     161             :     } else {
     162             :       cinput += ","; paramstr += ",0";
     163             :     }
     164          18 :     cinput += getShortcutLabel() + "_kernel-" + num;
     165             :   }
     166           4 :   readInputLine( cinput + paramstr );
     167             : 
     168             :   // And now compute all the pamm kernels
     169           4 :   std::string kchoice; parse("KERNELS",kchoice);
     170           2 :   std::map<std::string,std::string> keymap; multicolvar::MultiColvarShortcuts::readShortcutKeywords( keymap, this );
     171           2 :   if( kchoice=="all" ) {
     172          11 :     for(unsigned k=0; k<nkernels; ++k) {
     173           9 :       std::string num; Tools::convert( k+1, num );
     174          18 :       readInputLine( getShortcutLabel() + "-" + num + ": CUSTOM ARG=" + getShortcutLabel() + "_kernel-" + num + "," + getShortcutLabel() + "_ksum FUNC=x/y PERIODIC=NO");
     175          18 :       multicolvar::MultiColvarShortcuts::expandFunctions( getShortcutLabel() + "-" + num, getShortcutLabel() + "-" + num, "", keymap, this );
     176             :     }
     177             :   } else {
     178           0 :     std::vector<std::string> awords=Tools::getWords(kchoice,"\t\n ,"); Tools::interpretRanges( awords );
     179           0 :     for(unsigned k=0; k<awords.size(); ++k) {
     180           0 :       readInputLine( getShortcutLabel() + "-" + awords[k] + ": CUSTOM ARG=" + getShortcutLabel() + "_kernel-" + awords[k] + "," + getShortcutLabel() + "_ksum FUNC=x/y PERIODIC=NO");
     181           0 :       multicolvar::MultiColvarShortcuts::expandFunctions( getShortcutLabel() + "-" + awords[k], getShortcutLabel() + "-" + awords[k], "", keymap, this );
     182             :     }
     183           0 :   }
     184           4 : }
     185             : 
     186             : }
     187             : }

Generated by: LCOV version 1.16