LCOV - code coverage report
Current view: top level - pamm - PAMM.cpp (source / functions) Hit Total Coverage
Test: plumed test coverage Lines: 50 65 76.9 %
Date: 2024-10-11 08:09:47 Functions: 8 11 72.7 %

          Line data    Source code
       1             : /* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
       2             :    Copyright (c) 2015-2023 The plumed team
       3             :    (see the PEOPLE file at the root of the distribution for a list of names)
       4             : 
       5             :    See http://www.plumed.org for more information.
       6             : 
       7             :    This file is part of plumed, version 2.
       8             : 
       9             :    plumed is free software: you can redistribute it and/or modify
      10             :    it under the terms of the GNU Lesser General Public License as published by
      11             :    the Free Software Foundation, either version 3 of the License, or
      12             :    (at your option) any later version.
      13             : 
      14             :    plumed is distributed in the hope that it will be useful,
      15             :    but WITHOUT ANY WARRANTY; without even the implied warranty of
      16             :    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
      17             :    GNU Lesser General Public License for more details.
      18             : 
      19             :    You should have received a copy of the GNU Lesser General Public License
      20             :    along with plumed.  If not, see <http://www.gnu.org/licenses/>.
      21             : +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
      22             : #include "core/ActionRegister.h"
      23             : #include "tools/KernelFunctions.h"
      24             : #include "tools/IFile.h"
      25             : #include "multicolvar/MultiColvarBase.h"
      26             : #include "multicolvar/AtomValuePack.h"
      27             : #include "PammObject.h"
      28             : 
      29             : //+PLUMEDOC MCOLVARF PAMM
      30             : /*
      31             : Probabilistic analysis of molecular motifs.
      32             : 
      33             : Probabilistic analysis of molecular motifs (PAMM) was introduced in this paper \cite pamm.
      34             : The essence of this approach involves calculating some large set of collective variables
      35             : for a set of atoms in a short trajectory and fitting this data using a Gaussian Mixture Model.
      36             : The idea is that modes in these distributions can be used to identify features such as hydrogen bonds or
      37             : secondary structure types.
      38             : 
      39             : The assumption within this implementation is that the fitting of the Gaussian mixture model has been
      40             : done elsewhere by a separate code.  You thus provide an input file to this action which contains the
      41             : means, covariance matrices and weights for a set of Gaussian kernels, \f$\{ \phi \}\f$.  The values and
      42             : derivatives for the following set of quantities is then computed:
      43             : 
      44             : \f[
      45             : s_k = \frac{ \phi_k}{ \sum_i \phi_i }
      46             : \f]
      47             : 
      48             : Each of the \f$\phi_k\f$ is a Gaussian function that acts on a set of quantities calculated within
      49             : a \ref mcolv .  These might be \ref TORSIONS, \ref DISTANCES, \ref ANGLES or any one of the many
      50             : symmetry functions that are available within \ref mcolv actions.  These quantities are then inserted into
      51             : the set of \f$n\f$ kernels that are in the the input file.   This will be done for multiple sets of values
      52             : for the input quantities and a final quantity will be calculated by summing the above \f$s_k\f$ values or
      53             : some transformation of the above.  This sounds less complicated than it is and is best understood by
      54             : looking through the example given below.
      55             : 
      56             : \warning Mixing \ref mcolv actions that are periodic with variables that are not periodic has not been tested
      57             : 
      58             : \par Examples
      59             : 
      60             : In this example I will explain in detail what the following input is computing:
      61             : 
      62             : \plumedfile
      63             : #SETTINGS MOLFILE=regtest/basic/rt32/helix.pdb
      64             : MOLINFO MOLTYPE=protein STRUCTURE=M1d.pdb
      65             : psi: TORSIONS ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
      66             : phi: TORSIONS ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
      67             : p: PAMM DATA=phi,psi CLUSTERS=clusters.pamm MEAN1={COMPONENT=1} MEAN2={COMPONENT=2}
      68             : PRINT ARG=p.mean-1,p.mean-2 FILE=colvar
      69             : \endplumedfile
      70             : 
      71             : The best place to start our explanation is to look at the contents of the clusters.pamm file
      72             : 
      73             : \auxfile{clusters.pamm}
      74             : #! FIELDS height phi psi sigma_phi_phi sigma_phi_psi sigma_psi_phi sigma_psi_psi
      75             : #! SET multivariate von-misses
      76             : #! SET kerneltype gaussian
      77             :       2.97197455E-0001     -1.91983118E+0000      2.25029540E+0000      2.45960237E-0001     -1.30615381E-0001     -1.30615381E-0001      2.40239117E-0001
      78             :       2.29131448E-0002      1.39809354E+0000      9.54585380E-0002      9.61755708E-0002     -3.55657919E-0002     -3.55657919E-0002      1.06147253E-0001
      79             :       5.06676398E-0001     -1.09648066E+0000     -7.17867907E-0001      1.40523052E-0001     -1.05385552E-0001     -1.05385552E-0001      1.63290557E-0001
      80             : \endauxfile
      81             : 
      82             : This files contains the parameters of two two-dimensional Gaussian functions.  Each of these Gaussian kernels has a weight, \f$w_k\f$,
      83             : a vector that specifies the position of its center, \f$\mathbf{c}_k\f$, and a covariance matrix, \f$\Sigma_k\f$.  The \f$\phi_k\f$ functions that
      84             : we use to calculate our PAMM components are thus:
      85             : 
      86             : \f[
      87             : \phi_k = \frac{w_k}{N_k} \exp\left( -(\mathbf{s} - \mathbf{c}_k)^T \Sigma^{-1}_k (\mathbf{s} - \mathbf{c}_k) \right)
      88             : \f]
      89             : 
      90             : In the above \f$N_k\f$ is a normalization factor that is calculated based on \f$\Sigma\f$.  The vector \f$\mathbf{s}\f$ is a vector of quantities
      91             : that are calculated by the \ref TORSIONS actions.  This vector must be two dimensional and in this case each component is the value of a
      92             : torsion angle.  If we look at the two \ref TORSIONS actions in the above we are calculating the \f$\phi\f$ and \f$\psi\f$ backbone torsional
      93             : angles in a protein (Note the use of \ref MOLINFO to make specification of atoms straightforward).  We thus calculate the values of our
      94             : 2 \f$ \{ \phi \} \f$  kernels 3 times.  The first time we use the \f$\phi\f$ and \f$\psi\f$ angles in the second residue of the protein,
      95             : the second time it is the \f$\phi\f$ and \f$\psi\f$ angles of the third residue of the protein and the third time it is the \f$\phi\f$ and \f$\psi\f$ angles
      96             : of the fourth residue in the protein.  The final two quantities that are output by the print command, p.mean-1 and p.mean-2, are the averages
      97             : over these three residues for the quantities:
      98             : \f[
      99             : s_1 = \frac{ \phi_1}{ \phi_1 + \phi_2 }
     100             : \f]
     101             : and
     102             : \f[
     103             : s_2 = \frac{ \phi_2}{ \phi_1 + \phi_2 }
     104             : \f]
     105             : There is a great deal of flexibility in this input.  We can work with, and examine, any number of components, we can use any set of collective variables
     106             : and compute these PAMM variables and we can transform the PAMM variables themselves in a large number of different ways when computing these sums.
     107             : */
     108             : //+ENDPLUMEDOC
     109             : 
     110             : namespace PLMD {
     111             : namespace pamm {
     112             : 
     113             : class PAMM : public multicolvar::MultiColvarBase {
     114             : private:
     115             :   PammObject mypamm;
     116             : public:
     117             :   static void registerKeywords( Keywords& keys );
     118             :   explicit PAMM(const ActionOptions&);
     119             : /// We have to overwrite this here
     120             :   unsigned getNumberOfQuantities() const override;
     121             : /// Calculate the weight of this object ( average of input weights )
     122             :   using PLMD::multicolvar::MultiColvarBase::calculateWeight;
     123             :   void calculateWeight( multicolvar::AtomValuePack& myatoms );
     124             : /// Actually do the calculation
     125             :   double compute( const unsigned& tindex, multicolvar::AtomValuePack& myatoms ) const override;
     126             : /// This returns the position of the central atom
     127             :   Vector getCentralAtom();
     128             : /// Is the variable periodic
     129          16 :   bool isPeriodic() override { return false; }
     130             : };
     131             : 
     132       10423 : PLUMED_REGISTER_ACTION(PAMM,"PAMM")
     133             : 
     134           3 : void PAMM::registerKeywords( Keywords& keys ) {
     135           3 :   MultiColvarBase::registerKeywords( keys );
     136           6 :   keys.add("compulsory","DATA","the multicolvars from which the pamm coordinates are calculated");
     137           6 :   keys.add("compulsory","CLUSTERS","the name of the file that contains the definitions of all the clusters");
     138           6 :   keys.add("compulsory","REGULARISE","0.001","don't allow the denominator to be smaller then this value");
     139          15 :   keys.use("MEAN"); keys.use("MORE_THAN"); keys.use("SUM"); keys.use("LESS_THAN"); keys.use("HISTOGRAM");
     140          21 :   keys.use("MIN"); keys.use("MAX"); keys.use("LOWEST"); keys.use("HIGHEST"); keys.use("ALT_MIN"); keys.use("BETWEEN"); keys.use("MOMENTS");
     141           3 :   keys.setComponentsIntroduction("When the label of this action is used as the input for a second you are not referring to a scalar quantity as you are in "
     142             :                                  "regular collective variables.  The label is used to reference the full set of quantities calculated by "
     143             :                                  "the action.  This is usual when using \\ref multicolvarfunction. Generally when doing this the set of PAMM variables "
     144             :                                  "will be referenced using the DATA keyword rather than ARG.\n\n"
     145             :                                  "This Action can be used to calculate the following scalar quantities directly from the underlying set of PAMM variables. "
     146             :                                  "These quantities are calculated by employing the keywords listed below and they can be referenced elsewhere in the input "
     147             :                                  "file by using this Action's label followed by a dot and the name of the quantity.  The particular PAMM variable that should "
     148             :                                  "be averaged in a MEAN command or transformed by a switching function in a LESS_THAN command is specified using the COMPONENT "
     149             :                                  "keyword. COMPONENT=1 refers to the PAMM variable in which the first kernel in your input file is on the numerator, COMPONENT=2 refers to "
     150             :                                  "PAMM variable in which the second kernel in the input file is on the numerator and so on.  The same quantity can be calculated "
     151             :                                  "multiple times for different PAMM components by a single PAMM action.  In this case the relevant keyword must appear multiple "
     152             :                                  "times on the input line followed by a numerical identifier i.e. MEAN1, MEAN2, ...  The quantities calculated when multiple "
     153             :                                  "MEAN commands appear on the input line can be reference elsewhere in the input file by using the name of the quantity followed "
     154             :                                  "followed by a numerical identifier e.g. <em>label</em>.lessthan-1, <em>label</em>.lessthan-2 etc.  Alternatively, you can "
     155             :                                  "customize the labels of the quantities by using the LABEL keyword in the description of the keyword.");
     156           3 :   keys.remove("ALL_INPUT_SAME_TYPE");
     157           3 : }
     158             : 
     159           2 : PAMM::PAMM(const ActionOptions& ao):
     160             :   Action(ao),
     161           2 :   MultiColvarBase(ao)
     162             : {
     163             :   // This builds the lists
     164           2 :   buildSets();
     165             :   // Check for reasonable input
     166           5 :   for(unsigned i=0; i<getNumberOfBaseMultiColvars(); ++i) {
     167           3 :     if( getBaseMultiColvar(i)->getNumberOfQuantities()!=2 ) error("cannot use PAMM with " + getBaseMultiColvar(i)->getName() );
     168             :   }
     169             : 
     170           2 :   bool mixed=getBaseMultiColvar(0)->isPeriodic();
     171           2 :   std::vector<bool> pbc( getNumberOfBaseMultiColvars() );
     172           2 :   std::vector<std::string> valnames( getNumberOfBaseMultiColvars() );
     173           2 :   std::vector<std::string> min( getNumberOfBaseMultiColvars() ), max( getNumberOfBaseMultiColvars() );
     174           5 :   for(unsigned i=0; i<getNumberOfBaseMultiColvars(); ++i) {
     175           3 :     if( getBaseMultiColvar(i)->isPeriodic()!=mixed ) warning("mixing of periodic and aperiodic base variables in pamm is untested");
     176           3 :     pbc[i]=getBaseMultiColvar(i)->isPeriodic();
     177           3 :     if( pbc[i] ) getBaseMultiColvar(i)->retrieveDomain( min[i], max[i] );
     178           3 :     valnames[i]=getBaseMultiColvar(i)->getLabel();
     179             :   }
     180             : 
     181           4 :   double regulariser; parse("REGULARISE",regulariser);
     182           2 :   std::string errorstr, filename; parse("CLUSTERS",filename);
     183           2 :   mypamm.setup( filename, regulariser, valnames, pbc, min, max, errorstr );
     184           2 :   if( errorstr.length()>0 ) error( errorstr );
     185           4 : }
     186             : 
     187          86 : unsigned PAMM::getNumberOfQuantities() const {
     188          86 :   return 1 + mypamm.getNumberOfKernels();
     189             : }
     190             : 
     191           0 : void PAMM::calculateWeight( multicolvar::AtomValuePack& myatoms ) {
     192             :   unsigned nvars = getNumberOfBaseMultiColvars();
     193             :   // Weight of point is average of weights of input colvars?
     194           0 :   std::vector<double> tval(2); double ww=0;
     195           0 :   for(unsigned i=0; i<nvars; ++i) {
     196           0 :     getInputData( i, false, myatoms, tval ); ww+=tval[0];
     197             :   }
     198           0 :   myatoms.setValue( 0, ww / static_cast<double>( nvars ) );
     199             : 
     200           0 :   if(!doNotCalculateDerivatives() ) {
     201           0 :     double pref = 1.0 / static_cast<double>( nvars );
     202           0 :     for(unsigned ivar=0; ivar<nvars; ++ivar) {
     203             :       // Get the values of derivatives
     204           0 :       MultiValue& myder=getInputDerivatives( ivar, false, myatoms );
     205           0 :       for(unsigned j=0; j<myder.getNumberActive(); ++j) {
     206           0 :         unsigned jder=myder.getActiveIndex(j);
     207           0 :         myatoms.addDerivative( 0, jder, pref*myder.getDerivative( 0, jder ) );
     208             :       }
     209             :     }
     210             :   }
     211           0 : }
     212             : 
     213          28 : double PAMM::compute( const unsigned& tindex, multicolvar::AtomValuePack& myatoms ) const {
     214             :   unsigned nvars = getNumberOfBaseMultiColvars();
     215          28 :   std::vector<std::vector<double> > tderiv( mypamm.getNumberOfKernels() );
     216         174 :   for(unsigned i=0; i<tderiv.size(); ++i) tderiv[i].resize( nvars );
     217          28 :   std::vector<double> tval(2), invals( nvars ), vals( mypamm.getNumberOfKernels() );
     218             : 
     219          74 :   for(unsigned i=0; i<nvars; ++i) {
     220          46 :     getInputData( i, false, myatoms, tval ); invals[i]=tval[1];
     221             :   }
     222          28 :   mypamm.evaluate( invals, vals, tderiv );
     223             : 
     224             :   // Now set all values other than the first one
     225             :   // This is because of some peverse choices in multicolvar
     226         146 :   for(unsigned i=1; i<vals.size(); ++i) myatoms.setValue( 1+i, vals[i] );
     227             : 
     228          28 :   if( !doNotCalculateDerivatives() ) {
     229          28 :     std::vector<double> mypref( 1 + vals.size() );
     230          74 :     for(unsigned ivar=0; ivar<nvars; ++ivar) {
     231             :       // Get the values of the derivatives
     232          46 :       MultiValue& myder = getInputDerivatives( ivar, false, myatoms );
     233             :       // And calculate the derivatives
     234         318 :       for(unsigned i=0; i<vals.size(); ++i) mypref[1+i] = tderiv[i][ivar];
     235             :       // This is basically doing the chain rule to get the final derivatives
     236          46 :       splitInputDerivatives( 1, 1, 1+vals.size(), ivar, mypref, myder, myatoms );
     237             :       // And clear the derivatives
     238          46 :       myder.clearAll();
     239             :     }
     240             :   }
     241             : 
     242          56 :   return vals[0];
     243          28 : }
     244             : 
     245           0 : Vector PAMM::getCentralAtom() {
     246             :   // Who knows how this should work
     247           0 :   plumed_error();
     248             :   return Vector(1.0,0.0,0.0);
     249             : }
     250             : 
     251             : }
     252             : }

Generated by: LCOV version 1.15