This is part of the isdb module.
Calculate the fit of a structure or ensemble of structures with a cryo-EM density map.
This action implements the multi-scale Bayesian approach to cryo-EM data fitting introduced in Ref. [52]. This method allows efficient and accurate structural modeling of cryo-electron microscopy density maps at multiple scales, from coarse-grained to atomistic resolution, by addressing the presence of random and systematic errors in the data, sample heterogeneity, data correlation, and noise correlation.
The experimental density map is fit by a Gaussian Mixture Model (GMM), which is provided as an external file specified by the keyword GMM_FILE. We are currently working on a web server to perform this operation. In the meantime, the user can request a stand-alone version of the GMM code at massimiliano.bonomi_AT_gmail.com.
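As a rough illustration of what a GMM representation of a density map means, the sketch below evaluates a weighted sum of three-dimensional Gaussians at a single point. The function names and the (weight, mean, covariance) tuple layout are hypothetical, and the normalisation used by the actual GMM code may differ; the block only shows the textbook mixture form.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gaussian3(x, mean, cov):
    """Normalised 3D Gaussian with full covariance, evaluated at x."""
    d = [x[k] - mean[k] for k in range(3)]
    det = det3(cov)
    if det <= 0.0:
        # minimal sanity check only; it does not prove positive definiteness
        raise ValueError("covariance matrix must have a positive determinant")
    # Solve cov * y = d with Cramer's rule, so that q = d^T cov^-1 d = d . y
    y = []
    for k in range(3):
        m = [row[:] for row in cov]
        for r in range(3):
            m[r][k] = d[r]
        y.append(det3(m) / det)
    q = sum(d[k] * y[k] for k in range(3))
    norm = math.sqrt((2.0 * math.pi) ** 3 * det)
    return math.exp(-0.5 * q) / norm

def gmm_density(x, components):
    """Sum of weight * Gaussian over (weight, mean, cov) tuples."""
    return sum(w * gaussian3(x, mu, cov) for w, mu, cov in components)

# Illustrative values, not taken from any real map
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
components = [(2.0, [0.0, 0.0, 0.0], identity),
              (3.0, [1.0, 0.0, 0.0], identity)]
print(gmm_density([0.5, 0.0, 0.0], components))
```

Cramer's rule keeps the example free of external dependencies; with NumPy available, `numpy.linalg.solve` would be the more natural choice for the quadratic form.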
When run in single-replica mode, this action allows atomistic, flexible refinement of an individual structure into a density map. Combined with a multi-replica framework (such as the -multi option in GROMACS), the user can model an ensemble of structures using the Metainference approach [16].
By default this Action calculates the following quantities. These quantities can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the quantity required from the list below.
Quantity | Description |
score | Bayesian score |
scoreb | Beta Bayesian score |
Keyword | Description |
ATOMS | atoms for which we calculate the density map, typically all heavy atoms. For more information on how to specify lists of atoms see Groups and Virtual Atoms |
GMM_FILE | file with the parameters of the GMM components |
TEMP | temperature |
NL_CUTOFF | The cutoff in overlap for the neighbor list |
NL_STRIDE | The frequency with which we are updating the neighbor list |
SIGMA_MEAN | starting value for the uncertainty in the mean estimate |
NUMERICAL_DERIVATIVES | ( default=off ) calculate the derivatives for these quantities numerically |
NOPBC | ( default=off ) ignore the periodic boundary conditions when calculating distances |
NO_AVER | ( default=off ) don't do ensemble averaging in multi-replica mode |
ANALYSIS | ( default=off ) run in analysis mode |
In this example, we perform a single-structure refinement based on an experimental cryo-EM map. The map is fit with a GMM, whose parameters are listed in the file GMM_fit.dat. This file contains one line per GMM component in the following format:
#! FIELDS Id Weight Mean_0 Mean_1 Mean_2 Cov_00 Cov_01 Cov_02 Cov_11 Cov_12 Cov_22 Beta
0 2.9993805e+01 6.54628 10.37820 -0.92988 2.078920e-02 1.216254e-03 5.990827e-04 2.556246e-02 8.411835e-03 2.486254e-02 1
1 2.3468312e+01 6.56095 10.34790 -0.87808 1.879859e-02 6.636049e-03 3.682865e-04 3.194490e-02 1.750524e-03 3.017100e-02 1
...
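The column layout above can be read with a few lines of Python. This is a minimal sketch that assumes whitespace-separated columns named exactly as in the FIELDS header; the function name `read_gmm` and the returned dictionary keys are hypothetical and are not part of PLUMED.

```python
import io

def read_gmm(handle):
    """Read GMM components from text with a '#! FIELDS ...' header.

    Returns a list of dicts with keys: id, weight, mean, cov, beta.
    """
    fields = None
    components = []
    for line in handle:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == "#!":
            # header line: remember the column names
            if len(tokens) > 2 and tokens[1] == "FIELDS":
                fields = tokens[2:]
            continue
        if tokens[0].startswith("#"):
            continue  # ordinary comment line
        if fields is None:
            raise ValueError("data found before the '#! FIELDS' header")
        if len(tokens) != len(fields):
            raise ValueError("expected %d columns, got %d"
                             % (len(fields), len(tokens)))
        row = dict(zip(fields, tokens))
        c = {k: float(row["Cov_" + k])
             for k in ("00", "01", "02", "11", "12", "22")}
        components.append({
            "id": int(row["Id"]),
            "weight": float(row["Weight"]),
            "mean": [float(row["Mean_%d" % k]) for k in range(3)],
            # rebuild the symmetric 3x3 matrix from its upper triangle
            "cov": [[c["00"], c["01"], c["02"]],
                    [c["01"], c["11"], c["12"]],
                    [c["02"], c["12"], c["22"]]],
            "beta": int(row["Beta"]),
        })
    return components

# The two components shown above, used here as sample input
sample = """#! FIELDS Id Weight Mean_0 Mean_1 Mean_2 Cov_00 Cov_01 Cov_02 Cov_11 Cov_12 Cov_22 Beta
0 2.9993805e+01 6.54628 10.37820 -0.92988 2.078920e-02 1.216254e-03 5.990827e-04 2.556246e-02 8.411835e-03 2.486254e-02 1
1 2.3468312e+01 6.56095 10.34790 -0.87808 1.879859e-02 6.636049e-03 3.682865e-04 3.194490e-02 1.750524e-03 3.017100e-02 1
"""
components = read_gmm(io.StringIO(sample))
print(len(components), components[0]["mean"])
```

For a file on disk, the same function can be called as `read_gmm(open("GMM_fit.dat"))`. Only the six independent covariance entries are stored in the file, so the parser mirrors them to rebuild the full symmetric matrix.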
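To accelerate the computation of the Bayesian score, one can use a neighbor list, controlled by the keywords NL_CUTOFF and NL_STRIDE, and apply the resulting bias only every few steps, as done with STRIDE=2 in the input file below.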
All the heavy atoms of the system are used to calculate the density map. This list can conveniently be provided using a GROMACS index file.
The input file looks as follows:
# include pdb info
MOLINFO STRUCTURE=prot.pdb
# all heavy atoms
protein-h: GROUP NDX_FILE=index.ndx NDX_GROUP=Protein-H
# create EMMI score
gmm: EMMI NOPBC SIGMA_MEAN=0.01 TEMP=300.0 NL_STRIDE=100 NL_CUTOFF=0.01 GMM_FILE=GMM_fit.dat ATOMS=protein-h
# translate into bias - apply every 2 steps
emr: BIASVALUE ARG=gmm.scoreb STRIDE=2
PRINT ARG=emr.* FILE=COLVAR STRIDE=500 FMT=%20.10f