Shortcut: LANDMARK_SELECT_FPS

Module landmarks
Description Usage
Select a set of landmarks from a large set of configurations using farthest point sampling. Used in 1 tutorial and 1 egg.

Output components

This action can calculate the values in the following table when the associated keyword is included in the input for the action. These values can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the value required from the list below.

Name Type Keyword Description
data matrix ARG the data that is being collected by this action
logweights vector ARG the logarithms of the weights of the data points
rectdissims matrix DISSIMILARITIES a rectangular matrix containing the distances between the landmark points and the rest of the points
sqrdissims matrix DISSIMILARITIES a square matrix containing the distances between each pair of landmark points
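
To make the two matrix outputs in the table concrete, the following Python sketch extracts both blocks from a full matrix of pairwise dissimilarities. It is an illustration only and not PLUMED's code: the function name, the argument names and the row and column ordering of the rectangular block (landmarks as rows, all input points as columns) are assumptions.

```python
def landmark_dissimilarity_blocks(dissims, landmarks):
    """Split a full dissimilarity matrix into landmark blocks.

    dissims   -- square list of lists, dissims[i][j] is the dissimilarity
                 between input points i and j
    landmarks -- indices of the points selected as landmarks

    Returns (sqr, rect). The layout of rect is an assumption made for
    illustration: one row per landmark, one column per input point.
    """
    n_points = len(dissims)
    # Square block: landmark against landmark.
    sqr = [[dissims[a][b] for b in landmarks] for a in landmarks]
    # Rectangular block: landmark against every input point.
    rect = [[dissims[a][j] for j in range(n_points)] for a in landmarks]
    return sqr, rect
```

With N input points and M landmarks, this sketch returns an M by M square block and an M by N rectangular block.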

Further details and examples

Select a set of landmarks from a large set of configurations using farthest point sampling.

If you have collected a set of trajectory frames using COLLECT_FRAMES you can use this action to select a subset of the configurations you have collected. This shortcut makes the selection using FARTHEST_POINT_SAMPLING. The first point is selected at random. Each remaining point is then selected by taking the unselected point in the input data set that is the furthest from all the points that have been selected thus far. The following input demonstrates how you can use this method:

Tested on 2.11
# This stores the positions of all the first 10 atoms in the system for later analysis
cc: COLLECT_FRAMES ATOMS=1,2,3,4,5,6,7,8,9,10 ALIGN=OPTIMAL STRIDE=1 CLEAR=1000

# Select landmarks
ll: LANDMARK_SELECT_FPS ARG=cc NLANDMARKS=100
# Output the data to a file
DUMPPDB ATOMS=ll_data ATOM_INDICES=1,2,3,4,5,6,7,8,9,10 FILE=traj.pdb STRIDE=1000
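
The selection procedure described before this input can be sketched in a few lines of Python. This is a minimal illustration and not PLUMED's implementation. It assumes that "furthest from all the points that have been selected" means maximising the distance to the nearest already-selected point, which is the usual form of farthest point sampling. The function and argument names are hypothetical.

```python
import random


def farthest_point_sampling(dissims, n_landmarks, seed=None):
    """Return the indices of n_landmarks points chosen by farthest point sampling.

    dissims -- square list of lists of pairwise dissimilarities
    seed    -- optional seed for the random choice of the first landmark
    """
    n_points = len(dissims)
    if not 0 < n_landmarks <= n_points:
        raise ValueError("n_landmarks must be between 1 and the number of points")

    rng = random.Random(seed)
    # The first landmark is chosen at random.
    selected = [rng.randrange(n_points)]
    chosen = set(selected)
    # min_dist[i] is the distance from point i to its nearest landmark so far.
    min_dist = list(dissims[selected[0]])

    while len(selected) < n_landmarks:
        # Assumed criterion: take the unselected point whose nearest
        # landmark is farthest away.
        candidates = [i for i in range(n_points) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        # Account for the new landmark in the nearest-landmark distances.
        min_dist = [min(d, e) for d, e in zip(min_dist, dissims[nxt])]

    return selected
```

Keeping a running nearest-landmark distance means each new landmark costs one pass over the input points rather than a comparison against every landmark chosen so far.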

If you expand the shortcuts in the input above you will notice that the LANDMARK_SELECT_FPS shortcut creates a DISSIMILARITIES action that calculates the distances between the input frames. These dissimilarities are required to perform the farthest point sampling, so you cannot use the NODISSIMILARITIES flag with this action. The dissimilarities are also needed to compute the weights of the landmarks, as this is done by performing a VORONOI analysis. If you would like to turn off the computation of the VORONOI weights you can use the NOVORONOI flag.
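
As a rough illustration of how a Voronoi analysis can turn dissimilarities into landmark weights, the Python sketch below assigns every input point to its nearest landmark and sums the weights of the points assigned to each one. This shows the general idea only. The function name, the argument names and the layout of the input matrix (one row per landmark, one column per input point) are assumptions, and PLUMED's own VORONOI action may differ in detail.

```python
def voronoi_weights(rect_dissims, weights=None):
    """Sum the weights of the input points that fall in each landmark's cell.

    rect_dissims -- rect_dissims[k][i] is the dissimilarity between
                    landmark k and input point i (assumed layout)
    weights      -- optional weight for each input point, defaults to 1.0
    """
    n_landmarks = len(rect_dissims)
    n_points = len(rect_dissims[0])
    if weights is None:
        weights = [1.0] * n_points

    totals = [0.0] * n_landmarks
    for i in range(n_points):
        # Each point belongs to the Voronoi cell of its nearest landmark.
        nearest = min(range(n_landmarks), key=lambda k: rect_dissims[k][i])
        totals[nearest] += weights[i]
    return totals
```

Since the action reports a `logweights` output, one would, under these assumptions, take the logarithm of each total to compare with that output.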

If you have already computed the dissimilarities between the collected frames you can pass them as input to the LANDMARK_SELECT_FPS shortcut as shown below:

Tested on 2.11
# This stores the positions of all the first 10 atoms in the system for later analysis
cc: COLLECT_FRAMES ATOMS=1,2,3,4,5,6,7,8,9,10 ALIGN=OPTIMAL STRIDE=1 CLEAR=1000

# This calculates the dissimilarities between the stored frames
cc_dataT: TRANSPOSE ARG=cc_data
dd: DISSIMILARITIES ARG=cc_data,cc_dataT
# Select landmarks
ll: LANDMARK_SELECT_FPS ARG=cc DISSIMILARITIES=dd NLANDMARKS=100
# Output the data to a file
DUMPPDB ATOMS=ll_data ATOM_INDICES=1,2,3,4,5,6,7,8,9,10 FILE=traj.pdb STRIDE=1000

Notice that you can also read in dissimilarities from a file using a CONSTANT action and pass these directly to the LANDMARK_SELECT_FPS shortcut, which avoids the need for COLLECT_FRAMES.

You can learn how to use landmark selection for dimensionality reduction calculations by working through this tutorial

Syntax

The following table describes the keywords and options that can be used with this action

Keyword Type Default Description
NLANDMARKS compulsory none the number of landmarks you would like to create
ARG optional not used the COLLECT_FRAMES action that you used to get the data
DISSIMILARITIES optional not used the matrix of dissimilarities; if this is not provided the squared dissimilarities are calculated
SEED optional not used a random number seed
NOVORONOI optional false do not do a Voronoi analysis of the data to determine weights of final points
NODISSIMILARITIES optional false do not calculate the dissimilarities