CS2BACKBONE
This is part of the isdb module

Calculates the backbone chemical shifts for a protein.

The functional form is that of CamShift [78]. The chemical shifts of the selected nuclei can be saved as components. Alternatively one can calculate either the CAMSHIFT score (useful as a collective variable [60] or as a scoring function [119]) or a METAINFERENCE score (using DOSCORE). In these two latter cases experimental chemical shifts must be provided.

The CS2BACKBONE calculation can be relatively expensive because it often involves a large number of atoms; it can be run in parallel using MPI and OpenMP.

As a general rule, when using CS2BACKBONE or other experimental restraints it may be better to increase the accuracy of the constraint algorithm due to the increased strain on the bonded structure. In the case of GROMACS it is safer to use lincs-iter=2 and lincs-order=6.

In general the system for which chemical shifts are calculated must be completely included in ATOMS, and a TEMPLATE pdb file for the same atoms should be provided in the folder DATADIR. The system is made whole automatically unless NOPBC is used. In particular, if the system is made of multiple chains it is usually better to use NOPBC and make the molecule whole with WHOLEMOLECULES, selecting an appropriate order of the atoms.

The pdb file is needed to generate a simple topology of the protein. For histidine residues in protonation states different from D the HIE/HSE or HIP/HSP names should be used. GLH and ASH can be used for the alternative protonation of GLU and ASP. Non-standard amino acids and other molecules are not yet supported, but in principle they can be named UNK. If multiple chains are present the chain identifier must be in the standard PDB format, together with the TER keyword at the end of each chain. Terminal groups like ACE or NME should be removed from the TEMPLATE pdb because they are not recognized by CS2BACKBONE.

Atom indices in the TEMPLATE file should be numbered from 1 to N, where N is the number of atoms used in ATOMS. This is not a problem in simple cases where the atoms already run from 1 to N, but care is needed when a terminal group is removed from the PDB file.
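The renumbering described above can be sketched with a small helper script. This is only an illustration, not part of PLUMED: the function name renumber_template is hypothetical, the column positions follow the fixed-width PDB ATOM/HETATM record layout, and it is an assumption of this sketch that TER and other records can be passed through unchanged.

```python
def renumber_template(lines, drop_resnames=("ACE", "NME")):
    """Drop capping residues and renumber atom serials from 1.

    Hypothetical helper: only ATOM/HETATM records are touched,
    every other record (e.g. TER) is copied unchanged.
    """
    out = []
    serial = 0
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # Residue name occupies PDB columns 18-20.
            if line[17:20].strip() in drop_resnames:
                continue
            serial += 1
            # Atom serial number occupies PDB columns 7-11.
            line = f"{line[:6]}{serial:5d}{line[11:]}"
        out.append(line)
    return out


# Minimal usage example with an ACE cap in front of one alanine.
sample = [
    "ATOM      1  CH3 ACE A   1       0.000   0.000   0.000",
    "ATOM      2  N   ALA A   2       1.000   0.000   0.000",
    "ATOM      3  CA  ALA A   2       2.000   0.000   0.000",
    "TER",
]
for record in renumber_template(sample):
    print(record)
```

After such a renumbering, the number of atoms in the TEMPLATE file should be checked against the number of atoms selected with ATOMS.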

In addition to a pdb file one needs to provide a list of chemical shifts to be calculated, using one file per nucleus type (CAshifts.dat, CBshifts.dat, Cshifts.dat, Hshifts.dat, HAshifts.dat, Nshifts.dat). Add only the files for the nuclei you need, but each file should include all protein residues. A chemical shift for a nucleus is calculated if a value greater than 0 is provided. For practical purposes the value can correspond to the experimental value. Residue numbers should match those used in the pdb file and must be positive, so double check the pdb. The first and last residue of each chain should be preceded by a # character.

CAshifts.dat:
#1 0.0
2 55.5
3 58.4
.
.
#last 0.0
#first of second chain
.
#last of second chain

The default behavior is to store the values for the active nuclei in components (ca-#, cb-#, co-#, ha-#, hn-#, nh-# and expca-#, expcb-#, expco-#, expha-#, exphn-#, expnh-#), where # includes a chain and residue number. With NOEXP it is possible to store only the back-calculated values.

One additional file is always needed in the folder DATADIR: camshift.db. This file includes all the parameters needed to calculate the chemical shifts and can be found in regtest/isdb/rt-cs2backbone/data/.

Additional material and examples can also be found in the tutorial ISDB: setting up a Metadynamics Metainference simulation, as well as in the cs2backbone regtests in the isdb folder.

Examples

In this first example the chemical shifts are used to calculate a collective variable to be used in NMR driven Metadynamics [60]:

tested on v2.9
#SETTINGS AUXFOLDER=regtest/isdb/rt-cs2backbone/data
whole: GROUP ATOMS=2612-2514:-1,961-1:-1,2466-962:-1,2513-2467:-1
WHOLEMOLECULES ENTITY0=whole
cs: CS2BACKBONE ATOMS=1-2612 DATADIR=data/ TEMPLATE=template.pdb CAMSHIFT NOPBC
metad: METAD ARG=cs HEIGHT=0.5 SIGMA=0.1 PACE=200 BIASFACTOR=10
PRINT ARG=cs,metad.bias FILE=COLVAR STRIDE=100

In this second example the chemical shifts are used as replica-averaged restraints as in [37] [38].

tested on v2.9
#SETTINGS AUXFOLDER=regtest/isdb/rt-cs2backbone/data NREPLICAS=2
cs: CS2BACKBONE ATOMS=1-174 DATADIR=data/
encs: ENSEMBLE ARG=(cs\.hn-.*),(cs\.nh-.*)
stcs: STATS ARG=encs.* SQDEVSUM PARARG=(cs\.exphn-.*),(cs\.expnh-.*)
RESTRAINT ARG=stcs.sqdevsum AT=0 KAPPA=0 SLOPE=24
PRINT ARG=(cs\.hn-.*),(cs\.nh-.*) FILE=RESTRAINT STRIDE=100
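The arithmetic behind this input can be illustrated with a short Python sketch. It is not PLUMED code and all function names are hypothetical. It assumes that ENSEMBLE performs a plain (unweighted) average over replicas, that SQDEVSUM is the sum of squared deviations between the averaged and the experimental values, and that the restraint energy has the form 0.5*KAPPA*(x-AT)^2 + SLOPE*(x-AT). Forces are ignored.

```python
def replica_average(per_replica):
    """Unweighted average of each observable over the replicas."""
    n = len(per_replica)
    return [sum(values) / n for values in zip(*per_replica)]


def sqdevsum(calculated, experimental):
    """Sum of squared deviations between two equally long lists."""
    return sum((c - e) ** 2 for c, e in zip(calculated, experimental))


def restraint_energy(x, at=0.0, kappa=0.0, slope=24.0):
    """Harmonic plus linear restraint; the defaults mirror the input above."""
    d = x - at
    return 0.5 * kappa * d * d + slope * d


# Illustrative numbers only: two replicas, one H and one N shift each.
calc_per_replica = [[8.0, 120.0], [8.5, 122.0]]
experimental = [8.25, 120.0]

averaged = replica_average(calc_per_replica)
deviation = sqdevsum(averaged, experimental)
energy = restraint_energy(deviation)
print(averaged, deviation, energy)
```

With KAPPA=0 the restraint is purely linear in the squared deviation, so the penalty grows quadratically with the error of the replica-averaged shifts rather than with the error of any single replica.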

This third example shows how to use chemical shifts to calculate a METAINFERENCE score.

tested on v2.9
#SETTINGS AUXFOLDER=regtest/isdb/rt-cs2backbone/data
cs: CS2BACKBONE ATOMS=1-174 DATADIR=data/ SIGMA_MEAN0=1.0 DOSCORE
csbias: BIASVALUE ARG=cs.score
PRINT ARG=(cs\.hn-.*),(cs\.nh-.*) FILE=CS.dat STRIDE=1000
PRINT ARG=cs.score FILE=BIAS STRIDE=100

Glossary of keywords and components
Description of components

By default this Action calculates the following quantities. These quantities can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the quantity required from the list below.

Quantity     Description
score        the Metainference score
sigma        uncertainty parameter
sigmaMean    uncertainty in the mean estimate
neff         effective number of replicas
acceptSigma  MC acceptance for sigma values
ha           the calculated Ha hydrogen chemical shifts
hn           the calculated H hydrogen chemical shifts
nh           the calculated N nitrogen chemical shifts
ca           the calculated Ca carbon chemical shifts
cb           the calculated Cb carbon chemical shifts
co           the calculated C' carbon chemical shifts
expha        the experimental Ha hydrogen chemical shifts
exphn        the experimental H hydrogen chemical shifts
expnh        the experimental N nitrogen chemical shifts
expca        the experimental Ca carbon chemical shifts
expcb        the experimental Cb carbon chemical shifts
expco        the experimental C' carbon chemical shifts

In addition the following quantities can be calculated by employing the keywords listed below

Quantity     Keyword    Description
acceptScale  SCALEDATA  MC acceptance for scale value
acceptFT     GENERIC    MC acceptance for general metainference f tilde value
weight       REWEIGHT   weights of the weighted average
biasDer      REWEIGHT   derivatives with respect to the bias
scale        SCALEDATA  scale parameter
offset       ADDOFFSET  offset parameter
ftilde       GENERIC    ensemble average estimator

The atoms involved can be specified using
ATOMS The atoms to be included in the calculation, e.g. the whole protein. For more information on how to specify lists of atoms see Groups and Virtual Atoms
Compulsory keywords
NOISETYPE ( default=MGAUSS ) functional form of the noise (GAUSS,MGAUSS,OUTLIERS,MOUTLIERS,GENERIC)
LIKELIHOOD ( default=GAUSS ) the likelihood for the GENERIC metainference model, GAUSS or LOGN
DFTILDE ( default=0.1 ) fraction of sigma_mean used to evolve ftilde
SCALE0 ( default=1.0 ) initial value of the scaling factor
SCALE_PRIOR ( default=FLAT ) either FLAT or GAUSSIAN
OFFSET0 ( default=0.0 ) initial value of the offset
OFFSET_PRIOR ( default=FLAT ) either FLAT or GAUSSIAN
SIGMA0 ( default=1.0 ) initial value of the uncertainty parameter
SIGMA_MIN ( default=0.0 ) minimum value of the uncertainty parameter
SIGMA_MAX ( default=10. ) maximum value of the uncertainty parameter
OPTSIGMAMEAN ( default=NONE ) Set to NONE/SEM to manually set sigma mean, or to estimate it on the fly
WRITE_STRIDE ( default=10000 ) write the status to a file every N steps, this can be used for restart/continuation
DATADIR ( default=data/ ) The folder with the experimental chemical shifts.
TEMPLATE ( default=template.pdb ) A PDB file of the protein system.
NEIGH_FREQ ( default=20 ) Period in steps for neighbor list update.
Options
NUMERICAL_DERIVATIVES ( default=off ) calculate the derivatives for these quantities numerically
DOSCORE ( default=off ) activate metainference
NOENSEMBLE ( default=off ) don't perform any replica-averaging
REWEIGHT ( default=off ) simple REWEIGHT using the ARG as energy
SCALEDATA ( default=off ) Set to TRUE if you want to sample a scaling factor common to all values and replicas
ADDOFFSET ( default=off ) Set to TRUE if you want to sample an offset common to all values and replicas
NOPBC ( default=off ) ignore the periodic boundary conditions when calculating distances
SERIAL ( default=off ) Perform the calculation in serial - for debug purpose
CAMSHIFT ( default=off ) Set to TRUE if you want to calculate a single CamShift score.
NOEXP ( default=off ) Set to TRUE if you don't want to have fixed components with the experimental values.

ARG the input for this action is the scalar output from one or more other actions. The particular scalars that you will use are referenced using the label of the action. If the label appears on its own then it is assumed that the Action calculates a single scalar value. The value of this scalar is thus used as the input to this new action. If * or *.* appears the scalars calculated by all the preceding actions in the input file are taken. Some actions have multi-component outputs and each component of the output has a specific label. For example a DISTANCE action labelled dist may have three components x, y and z. To take just the x component you should use dist.x; if you wish to take all three components then use dist.*. More information on the referencing of Actions can be found in the section of the manual on the PLUMED Getting Started. Scalar values can also be referenced using POSIX regular expressions as detailed in the section on Regular Expressions. To use this feature you must compile PLUMED with the appropriate flag. You can use multiple instances of this keyword i.e. ARG1, ARG2, ARG3...
AVERAGING Stride for calculation of averaged weights and sigma_mean
SCALE_MIN minimum value of the scaling factor
SCALE_MAX maximum value of the scaling factor
DSCALE maximum MC move of the scaling factor
OFFSET_MIN minimum value of the offset
OFFSET_MAX maximum value of the offset
DOFFSET maximum MC move of the offset
REGRES_ZERO stride for regression with zero offset
DSIGMA maximum MC move of the uncertainty parameter
SIGMA_MEAN0 starting value for the uncertainty in the mean estimate
SIGMA_MAX_STEPS Number of steps used to optimise SIGMA_MAX, before that the SIGMA_MAX value is used
TEMP the system temperature - this is only needed if code doesn't pass the temperature to plumed
MC_STEPS number of MC steps
MC_CHUNKSIZE MC chunksize
STATUS_FILE write a file with all the data useful for restart/continuation of Metainference
SELECTOR name of selector
NSELECT range of values for selector [0, N-1]
RESTART allows per-action setting of restart (YES/NO/AUTO)