Making lepton library faster

If you use many CUSTOM functions or switching functions, note that these commands depend on the lepton library that is included in PLUMED. This library has replaced libmatheval since PLUMED 2.5, and by itself it is significantly faster than libmatheval. However, you can make it even faster using a just-in-time compiler. As of PLUMED 2.6, the correct version of ASMJIT is embedded in PLUMED. As of PLUMED 2.8, ASMJIT is enabled by default on supported architectures (X86/X64). You can disable it at runtime by setting the environment variable PLUMED_USE_ASMJIT:

export PLUMED_USE_ASMJIT=no

In some cases, using a custom expression is almost as fast as using a hard-coded function. For instance, with an input that contained the following lines:

c: COORDINATION GROUPA=1-108 GROUPB=1-108 R_0=1
d_fast: COORDINATION GROUPA=1-108 GROUPB=1-108 SWITCH={CUSTOM FUNC=1/(1+x2^3) R_0=1}

I (GB) obtained the following timings (on a MacBook laptop):

...
PLUMED: 4A  1 c                                          108     0.126592     0.001172     0.000701     0.002532
PLUMED: 4A  2 d_fast                                      108     0.135210     0.001252     0.000755     0.002623
...
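
The relative cost of the two actions can be read off the timings above with a few lines of Python. This is only a sketch: it assumes that the last five columns of each line are cycles, total, average, minimum and maximum time, as in the usual PLUMED timing summary, and the helper name `parse_timing` is hypothetical.

```python
# Two timing lines copied from the output above.
TIMINGS = """\
PLUMED: 4A  1 c                                          108     0.126592     0.001172     0.000701     0.002532
PLUMED: 4A  2 d_fast                                      108     0.135210     0.001252     0.000755     0.002623
"""

def parse_timing(line):
    """Return (label, cycles, total) from one timing line.

    Assumed column layout (not stated in the text above):
    ... label cycles total average minimum maximum
    """
    fields = line.split()
    label = fields[-6]
    cycles = int(fields[-5])
    total = float(fields[-4])
    return label, cycles, total

# Map each action label to its total time.
totals = {}
for line in TIMINGS.splitlines():
    label, cycles, total = parse_timing(line)
    totals[label] = total

# Relative overhead of the custom (asmjit) action with respect to the hard-coded one.
overhead = totals["d_fast"] / totals["c"] - 1.0
print(f"d_fast overhead relative to c: {overhead:.1%}")
```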

Notice the use of x2 as the variable of the switching function (see switchingfunction), which avoids an unnecessary square root calculation (the hard-coded switching functions do this automatically when you use only even powers). The asmjit calculation (d_fast) takes about 7% longer than the hard-coded one (c), that is, less than 10% more.
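
To see why the x2 form avoids the square root, the following Python sketch compares the two expressions outside PLUMED. It assumes that the hard-coded action uses the rational switching function with the default exponents NN=6 and MM=12 and with D_0=0, in which case (1-x^6)/(1-x^12) simplifies to 1/(1+x^6), that is 1/(1+x2^3). The function names are hypothetical and the code is not taken from PLUMED.

```python
import math

def rational_from_distance(d, r0=1.0, nn=6, mm=12):
    """Rational switching function (1 - x^nn) / (1 - x^mm) with x = d / r0.

    Assumed defaults: nn=6, mm=12, d0=0.
    """
    x = d / r0
    if math.isclose(x, 1.0):
        return nn / mm  # analytic limit of the ratio at x = 1
    return (1.0 - x**nn) / (1.0 - x**mm)

def custom_from_squared_distance(d2, r0=1.0):
    """Equivalent of FUNC=1/(1+x2^3), with x2 = d2 / r0^2. No square root is taken."""
    x2 = d2 / (r0 * r0)
    return 1.0 / (1.0 + x2**3)

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Illustrative pair of positions (made-up coordinates).
atom_i = (0.0, 0.0, 0.0)
atom_j = (0.3, 0.4, 1.2)

d2 = squared_distance(atom_i, atom_j)      # available without sqrt
s_custom = custom_from_squared_distance(d2)
s_rational = rational_from_distance(math.sqrt(d2))  # needs the sqrt

print(s_custom, s_rational)
```

Both routes give the same value up to rounding, but only the second one has to compute the distance itself.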