MOLINFO
This is part of the generic module

This command is used to provide information on the molecules that are present in your system.

The information on the molecules in your system can either be provided in the form of a pdb file or as a set of lists of atoms that describe the various chains in your system. If a pdb file is used plumed the MOLINFO command will endeavor to recognize the various chains and residues that make up the molecules in your system using the chainIDs and resnumbers from the pdb file. You can then use this information in later commands to specify atom lists in terms residues. For example using this command you can find the backbone atoms in your structure automatically. Starting with PLUMED 2.7 you can use multiple MOLINFO actions. Every time you perform an atom selection, the last available MOLINFO action will be used. This allows you to provide multiple PDB files, for instance using different naming conventions (see #134).

Warning
Please be aware that the PDB parser in plumed is far from perfect. You should thus check the log file and examine what plumed is actually doing whenever you use the MOLINFO action. Also make sure that the atoms are listed in the pdb with the correct order. If you are using gromacs, the safest way is to use reference pdb file generated with gmx editconf -f topol.tpr -o reference.pdb.

More information of the PDB parser implemented in PLUMED can be found at this page.

Providing MOLTYPE=protein, MOLTYPE=rna, or MOLTYPE=dna will instruct plumed to look for known residues from these three types of molecule. In other words, this is available for historical reasons and to allow future extensions where alternative lists will be provided. As of now, you can just ignore this keyword.

Using MOLINFO extends the possibility of atoms selection using the @ special symbol. The following shortcuts are available that do not refer to one specific residue:

@nucleic : all atoms that are part of a DNA or RNA molecule
@protein : all atoms that are part of a protein
@water : all water molecules
@ions : all the ions
@hydrogens : all hydrogen atoms (those for which the first non-number in the name is a H)
@nonhydrogens : all non hydrogen atoms (those for which the first non-number in the name is not a H)
Warning
Be careful since these choices are based on common names used in PDB files. Always check if the selected atoms are correct.

In addition, atoms from a specific residue can be selected with a symbol in this form:

@"definition"-chain_residuenum
@"definition"-chainresiduenum
@"definition"-residuenum

So for example

@psi-1 will select the atoms defining the psi torsion of residue 1
@psi-C1  or @psi-C_1 will define the same torsion for residue 1 of chain C.
@psi-3_1 will define the same torsion for residue 1 of chain 3.

Using the underscore to separate chain and residue is available as of PLUMED 2.5 and allows selecting chains with a numeric id.

In the following are listed the current available definitions:

For protein residues, the following groups are available:

# quadruplets for dihedral angles
@phi-#
@psi-#
@omega-#
@chi1-#
@chi2-#
@chi3-#
@chi4-#
@chi5-#

# all sidechain atoms (excluding glycine, including all hydrogens)
@sidechain-#
# all backbone atoms (including hydrogens)
@back-#

that select the appropriate atoms that define each dihedral angle for residue #.

For DNA or RNA residues, the following groups are available:

# quadruplets for backbone dihedral angles
@alpha-#
@beta-#
@gamma-#
@delta-#
@epsilon-#
@zeta-#

# quadruplets for sugar dihedral angles
@v0-#
@v1-#
@v2-#
@v3-#
@v4-#

# quadruplet corresponding to the chi torsional angle
@chi-#

# backbone, sugar, and base heavy atoms
@back-#
@sugar-#
@base-#

# ordered triplets of atoms on the 6-membered ring of nucleobases
# namely:
#  C2/C4/C6 for pyrimidines
#  C2/C6/C4 for purines
@lcs-#

Notice that zeta and epsilon groups should not be used on 3' end residue and alpha and beta should not be used on 5' end residue.

Furthermore it is also possible to pick single atoms using the syntax atom-chain_residuenum, @atom-chainresiduenum or @atom-residuenum. As of PLUMED 2.5, this also works when the residue is not a protein/rna/dna residue. For instance, @OW-100 will select oxygen of water molecule with residue number 100.

Finally, notice that other shortcuts are available even when not using the MOLINFO command (see Specifying Atoms).

Warning
If a residue-chain is repeated twice in the reference pdb only the first entry will be selected.
Bug:
At the moment the HA1 atoms in a GLY residues are treated as if they are the CB atoms. This may or may not be true - GLY is problematic for secondary structure residues as it is achiral.
Bug:
If you use WHOLEMOLECULES RESIDUES=1-10 for a 18 amino acid protein ( 18 amino acids + 2 terminal groups = 20 residues ) the code will fail as it will not be able to interpret terminal residue 1.
Advanced atom selection with mdtraj or MDAnalysis

Since PLUMED 2.6 it is possible to use the expressive selection syntax of mdtraj and/or MDAnalysis:

Click on the labels of the actions for more information on what each action computes
tested on v2.9
MOLINFO 
STRUCTURE
compulsory keyword a file in pdb format containing a reference structure.
=helix.pdb
PYTHON_BIN
compulsory keyword ( default=default ) python interpreter
=python g1: GROUP
ATOMS
the numerical indexes for the set of atoms in the group.
=@mda:backbone g2: GROUP
ATOMS
the numerical indexes for the set of atoms in the group.
={@mda:{resnum 1 or resid 3:5}} g3: GROUP
ATOMS
the numerical indexes for the set of atoms in the group.
={@mda:{resid 3:5} @mda:{resnum 1}} g4: GROUP
ATOMS
the numerical indexes for the set of atoms in the group.
={@mdt:{protein and (backbone or resname ALA)}} g5: GROUP
ATOMS
the numerical indexes for the set of atoms in the group.
={@mdt:{mass 5.5 to 20}} # masses guessed by mdtraj based on atom type! g6: GROUP
ATOMS
the numerical indexes for the set of atoms in the group.
={@mda:{resid 3:5} @mda:{resnum 1} 1-10}

Here @mda: indicates that MDAnalysis language is used, whereas @mdt: indicates that mdtraj language is used. Notice that these languages typically select atoms in order. If you want to specify a different order, you can chain definitions as in g3 above (compare with g2). Selections can be also chained with standard PLUMED selections (see g6).

The double braces are required due to the way PLUMED parses atom lists. In particular:

  • The outer braces are needed to show PLUMED where the ATOMS=... option ends.
  • The inner braces are needed to show PLUMED where each selector ends.

MDAnalysis also supports geometric selectors based on atomic coordinates. These selectors are static and return lists computed using the coordinates stored in the MOLINFO pdb file.

In order to use this syntax you should check the following points at runtime:

  1. plumed --no-mpi config has subprocess prints subprocess on (should be ok on most UNIX systems).
  2. You have a python interpreter with mdtraj and/or MDAnalysis installed. You can check using:

    • python -c "import mdtraj"
    • python -c "import MDAnalysis"

    In order to install these packages refer to their documentation. Pip or conda install should be ok, provided you make sure the correct python interpreter is in the execution PATH at runtime. Notice that you will only need the package(s) related to the syntax that you want to use.

  3. In case you installed these modules on a python with a different name (e.g. python3.6), the correct check is:

    • python3.6 -c "import mdtraj"
    • python3.6 -c "import MDAnalysis"

    If this is the case, you should set the environment variable export PYTHON_BIN=python3.6 or export PLUMED_PYTHON_BIN=python3.6 (higher priority). Alternatively, directly provide the interpreter in the PLUMED input file using MOLINFO PYTHON_BIN=python3.6 (even higher priority).

  4. The PDB file that you provide to MOLINFO should have consecutive atom numbers starting from 1. This is currently enforced since reading atom numbers out of order (as PLUMED does) is not supported by other packages.
Advanced atom selection with VMD (experimental)

Similarly to the @mda: and @mdt: selectors above, you can use the two following selectors in order to access to VMD syntax for atoms selection:

  • @vmdexec:: This selector launches an instance of VMD, so vmd executable should be in your execution path. Might be very slow or even crash your simulation. Notice that even if vmd executable is used, the implementation is still python based and so a working python interpreter should be provided.
  • @vmd:: This selector tries to import the vmd python module. Notice that the best way to obtain this module is not within the standard VMD installer but rather by installing the python module that can be found at this link. The module is also available on conda. You should make sure the module is available in the python interpreter used by MOLINFO (check using the command python -c "import vmd").

These two selectors are experimental and might be removed at some point.

Examples

In the following example the MOLINFO command is used to provide the information on which atoms are in the backbone of a protein to the ALPHARMSD CV.

Click on the labels of the actions for more information on what each action computes
tested on v2.9
#SETTINGS MOLFILE=regtest/basic/rt32/helix.pdb
MOLINFO 
STRUCTURE
compulsory keyword a file in pdb format containing a reference structure.
=reference.pdb a: ALPHARMSD
RESIDUES
this command is used to specify the set of residues that could conceivably form part of the secondary structure.
=all
TYPE
compulsory keyword ( default=DRMSD ) the manner in which RMSD alignment is performed.
=DRMSD
LESS_THAN
calculate the number of variables less than a certain target value.
={RATIONAL R_0=0.08 NN=8 MM=12}

The following example prints the distance corresponding to the hydrogen bonds in a GC Watson-Crick pair.

Click on the labels of the actions for more information on what each action computes
tested on v2.9
#SETTINGS MOLFILE=regtest/basic/rt-ermsd/ref.pdb
MOLINFO 
STRUCTURE
compulsory keyword a file in pdb format containing a reference structure.
=reference.pdb
MOLTYPE
compulsory keyword ( default=protein ) what kind of molecule is contained in the pdb file - usually not needed since protein/RNA/DNA are compatible
=dna hb1: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=@N2-2,@O2-15 hb2: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=@N1-2,@N3-15 hb3: DISTANCE
ATOMS
the pair of atom that we are calculating the distance between.
=@O6-2,@N4-15 PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=hb1,hb2,hb3

This example use MOLINFO to calculate torsion angles

Click on the labels of the actions for more information on what each action computes
tested on v2.9
#SETTINGS MOLFILE=regtest/basic/rt32/helix.pdb
MOLINFO 
MOLTYPE
compulsory keyword ( default=protein ) what kind of molecule is contained in the pdb file - usually not needed since protein/RNA/DNA are compatible
=protein
STRUCTURE
compulsory keyword a file in pdb format containing a reference structure.
=myprotein.pdb t1: TORSION
ATOMS
the four atoms involved in the torsional angle
=@phi-3 t2: TORSION
ATOMS
the four atoms involved in the torsional angle
=@psi-4 PRINT
ARG
the input for this action is the scalar output from one or more other actions.
=t1,t2
FILE
the name of the file on which to output these quantities
=colvar
STRIDE
compulsory keyword ( default=1 ) the frequency with which the quantities of interest should be output
=10
Glossary of keywords and components
The atoms involved can be specified using
CHAIN (for masochists ( mostly Davide Branduardi ) ) The atoms involved in each of the chains of interest in the structure.. For more information on how to specify lists of atoms see Groups and Virtual Atoms
Compulsory keywords
STRUCTURE a file in pdb format containing a reference structure. This is used to defines the atoms in the various residues, chains, etc . For more details on the PDB file format visit http://www.wwpdb.org/docs.html
MOLTYPE ( default=protein ) what kind of molecule is contained in the pdb file - usually not needed since protein/RNA/DNA are compatible
PYTHON_BIN ( default=default ) python interpreter
Options
WHOLE

( default=off ) The reference structure is whole, i.e. not broken by PBC