This is part of the dimred module.
It is only available if you configure PLUMED with ./configure --enable-modules=dimred. Furthermore, this feature is still being developed, so take care when using it and report any problems on the mailing list.
Perform principal component analysis (PCA) using either the positions of the atoms or a large number of collective variables as input.
Principal component analysis is a statistical technique that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of linearly uncorrelated variables. You can read more about the specifics of this technique here: https://en.wikipedia.org/wiki/Principal_component_analysis
When used with molecular dynamics simulations, the input is either a set of frames taken from the trajectory, \(\{X_i\}\), or the values of a number of collective variables calculated from those frames. In the second case the input to the PCA algorithm is a set of high-dimensional vectors of collective variables. In either case, the assumption is that the input trajectory is a set of possibly correlated, high-dimensional vectors. After principal component analysis has been performed, the output is a set of orthogonal vectors that describe the directions in which the largest motions have been seen. In other words, principal component analysis provides a method for lowering the dimensionality of the data contained in a trajectory. These output directions are linear combinations of the \(x\), \(y\) and \(z\) positions if the positions were used as input, or linear combinations of the input collective variables if a high-dimensional vector of collective variables was used as input.
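The procedure described above can be illustrated with a minimal pure-Python sketch. It is not PLUMED's implementation: the function names `mean_and_covariance` and `leading_component` are hypothetical, the covariance is normalized by the number of frames (an assumption), and power iteration stands in for a full eigendecomposition, so only the first principal component is recovered.

```python
import math

def mean_and_covariance(frames):
    """Average vector and covariance matrix of equal-length input vectors."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    cov = [[sum((f[a] - mean[a]) * (f[b] - mean[b]) for f in frames) / n
            for b in range(dim)]
           for a in range(dim)]
    return mean, cov

def leading_component(cov, iterations=200):
    """Power iteration: approximates the eigenvector with the largest eigenvalue."""
    dim = len(cov)
    # Arbitrary, deliberately uneven starting guess. Power iteration fails if
    # the guess happens to be orthogonal to the leading eigenvector.
    vec = [1.0 + 0.1 * k for k in range(dim)]
    for _ in range(iterations):
        new = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:  # no variance along the current guess
            break
        vec = [x / norm for x in new]
    return vec

# Toy input: three two-dimensional "frames" that all lie on the line y = x.
frames = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
mean, cov = mean_and_covariance(frames)
first_pc = leading_component(cov)
```

For this toy data all of the variance lies along the diagonal, so `first_pc` points along the direction in which both coordinates change together.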
As explained on the Wikipedia page, you must calculate the average and covariance for each of the input coordinates. In other words, you must calculate the average structure and the amount the system fluctuates around this average structure. The problem in doing so when the \(x\), \(y\) and \(z\) coordinates of a molecule are used as input is that the majority of the changes in the positions of the atoms come from the translational and rotational degrees of freedom of the molecule. The first six principal components will thus, most likely, be uninteresting. To remedy this problem, PLUMED provides the functionality to perform an RMSD alignment of all the structures to be analyzed to the first frame in the trajectory. This can be used to effectively remove translational and/or rotational motions from consideration. The resulting principal components thus describe vibrational motions of the molecule.
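As a small illustration of why alignment matters, the sketch below removes only the translational part of the motion by moving each frame's centre of geometry to the origin. The function name `remove_translation` is hypothetical and the unweighted centre is an assumption. The rotational fit that an optimal RMSD alignment also performs is not shown here.

```python
def remove_translation(frame):
    """Shift a frame, given as a list of (x, y, z) tuples, so that its
    unweighted centre of geometry sits at the origin."""
    n = len(frame)
    centre = tuple(sum(atom[k] for atom in frame) / n for k in range(3))
    return [tuple(atom[k] - centre[k] for k in range(3)) for atom in frame]

# Two copies of the same two-atom structure, the second rigidly translated.
frame = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [tuple(x + 5.0 for x in atom) for atom in frame]

centred_a = remove_translation(frame)
centred_b = remove_translation(shifted)
```

After centring, the two frames are identical, so the rigid translation no longer contributes to the covariance matrix.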
If you wish to calculate the projection of a trajectory on a set of principal components calculated from this PCA action then the output can be used as input for the PCAVARS action.
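The idea of projecting a frame on a set of principal components can be sketched in a few lines of Python. This is an illustration rather than the PCAVARS implementation: the function `project` and its arguments are hypothetical, and for atomic positions the frame would first have to be aligned to the reference structure, which is omitted here.

```python
def project(frame, mean, components):
    """Project the displacement of a frame from the average onto each component."""
    displacement = [x - m for x, m in zip(frame, mean)]
    return [sum(d * c for d, c in zip(displacement, comp)) for comp in components]

# Toy two-dimensional example with axis-aligned unit components.
mean = [1.0, 1.0]
components = [[1.0, 0.0], [0.0, 1.0]]
coords = project([2.0, 1.0], mean, components)
```

Each entry of `coords` is the dot product of the displacement from the average with one component, so the frame is described by as many numbers as there are components.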
The following input instructs PLUMED to perform a principal component analysis in which the covariance matrix is calculated from changes in the positions of the first 22 atoms. The METRIC=OPTIMAL instruction ensures that translational and rotational degrees of freedom are removed from consideration. The first two principal components will be output to a file called PCA-comp.pdb. Trajectory frames will be collected on every step and the PCA calculation will be performed at the end of the simulation.
ff: COLLECT_FRAMES ATOMS=1-22 STRIDE=1
pca: PCA USE_OUTPUT_DATA_FROM=ff METRIC=OPTIMAL NLOW_DIM=2
OUTPUT_PCA_PROJECTION USE_OUTPUT_DATA_FROM=pca FILE=PCA-comp.pdb
The following input instructs PLUMED to perform a principal component analysis in which the covariance matrix is calculated from changes in the six distances, d1 to d6, defined in the input. Notice that here the METRIC=EUCLIDEAN keyword is used to indicate that no alignment has to be done when calculating the various elements of the covariance matrix from the input vectors. In this calculation the first two principal components will be output to a file called PCA-comp.pdb. Trajectory frames will be collected every five steps and the PCA calculation is performed every 1000 steps. Consequently, if you run a 2000 step simulation the PCA analysis will be performed twice. The REWEIGHT_BIAS action in this input tells PLUMED that, rather than ascribing a weight of one to each of the frames when calculating averages and covariance matrices, a reweighting should be performed in which each frame's weight is determined from the current value of the instantaneous bias (see REWEIGHT_BIAS).
d1: DISTANCE ATOMS=1,2
d2: DISTANCE ATOMS=1,3
d3: DISTANCE ATOMS=1,4
d4: DISTANCE ATOMS=2,3
d5: DISTANCE ATOMS=2,4
d6: DISTANCE ATOMS=3,4
rr: RESTRAINT ARG=d1 AT=0.1 KAPPA=10
rbias: REWEIGHT_BIAS TEMP=300
ff: COLLECT_FRAMES ARG=d1,d2,d3,d4,d5,d6 LOGWEIGHTS=rbias STRIDE=5
pca: PCA USE_OUTPUT_DATA_FROM=ff METRIC=EUCLIDEAN NLOW_DIM=2
OUTPUT_PCA_PROJECTION USE_OUTPUT_DATA_FROM=pca STRIDE=100 FILE=PCA-comp.pdb
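The effect of reweighting on the averages and covariance matrix can be sketched as follows. This is a hedged illustration, not PLUMED code: the helper names are hypothetical, the weight of a frame is assumed to be proportional to \(\exp(V/k_B T)\) where \(V\) is the instantaneous bias, and energies are assumed to be in kJ/mol.

```python
import math

# Boltzmann constant in kJ/(mol K); assumes energies are given in kJ/mol.
KB = 0.008314462618

def bias_weights(bias_values, temperature):
    """Normalized frame weights, assumed proportional to exp(V / kT)."""
    kt = KB * temperature
    shift = max(bias_values)  # subtract the largest bias so exp() cannot overflow
    raw = [math.exp((v - shift) / kt) for v in bias_values]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_mean_and_covariance(frames, weights):
    """Weighted average and covariance; weights are assumed to sum to one."""
    dim = len(frames[0])
    mean = [sum(w * f[k] for w, f in zip(weights, frames)) for k in range(dim)]
    cov = [[sum(w * (f[a] - mean[a]) * (f[b] - mean[b])
                for w, f in zip(weights, frames))
            for b in range(dim)]
           for a in range(dim)]
    return mean, cov

# Frames with equal bias get equal weights, recovering the unweighted case.
equal = bias_weights([1.0, 1.0], 300.0)
# A frame visited under a larger bias is given a larger weight.
uneven = bias_weights([0.0, 5.0], 300.0)
mean, cov = weighted_mean_and_covariance([[0.0], [2.0]], [0.25, 0.75])
```

With all weights equal this reduces to the ordinary average and covariance, which is the behaviour described for frames that each carry a weight of one.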
Quantity | Description |
.#!value | the projections of the input coordinates on the PCA components that were found from the covariance matrix |
Keyword | Description |
ARG | the arguments that you would like to use as input for the analysis |
NLOW_DIM | number of low-dimensional coordinates required |
STRIDE | ( default=0 ) the frequency with which to perform this analysis |
FILE | the file on which to output the low dimensional coordinates |
FMT | the format to use when outputting the low dimensional coordinates |