A brief introduction to the plumed core

Plumed 2, unlike its predecessor plumed 1, which was written in plain C, is written in C++.
C++, unlike C, fortran and many of the other languages that are commonly used in the atomistic simulation community, is an object oriented programming language. As such the way things are done in plumed may feel unfamiliar to developers in the scientific community who, if our experience is anything to go by, are more used to programming in non-object oriented languages. For this reason we have tried in what follows to explain how we have used the features of object oriented programming in plumed 2. We hope that this guide is helpful and apologize in advance to any developers who feel patronized.

Object oriented programming

The main objective in object oriented programming is to write code that is more resilient to bugs. There are two ways that object oriented programing allows us to achieve these aims:

  • In object oriented programs one generally needs fewer lines of code
  • Object oriented programming allows us to use the compiler to do many more checks of the code

To be clear though object oriented programming does not allow us to do things that would be impossible with other programming languages. All programs perform some set of mathematical operations and any programming language can be used to implement these mathematical operations. The only advantage of C++ is that the advanced, object-oriented features of the language make implementing things more straighforward.

As you are no doubt aware, in C one can create structures to order the variables in your code.
A naive way of thinking about the objects, or more correctly classes, that one uses in C++ is that these are structures that also contain functions. This is useful for making neat header files as the parameters are kept near the functions. However, at this level of thinking the C++ way of doing things:

class fclass {
bool param0;
int param1,param2;
double param3;
double f( double x );
}; 

is not much better than the C way of doing things:

struct fparams {
bool param0;
int param1,param2;
double param3;
}; 
double f( double x, struct fparams myparams ); 

Nevertheless for reasons that will hopefully become clear as you read this document, every bias, colvar and function that is implemented in plumed 2 is inside its own separate class.

Public and private members

The variables in a C struct can be accessed anywhere in the code - any function in the code can copy the information from a structure or change the values of the variables in the structure. This was a particularly fun feature of plumed 1.0, every function in the old code could change any of the variables in the code! Obviously this causes problems as new functions can accidentally change the values of some variable in a widely used structure that should never have been changed. As one can imagine this can cause drastic problems. To prevent errors like this C++ provides a set of functionalities to allow one to specify what any given function can do to the members of a class. This is not possible in C and it was the ability to use this functionality to create flexible, easily-extendable code that motivated our choice of C++ for plumed 2. The example class below shows how this is done in practise:

class myclass{
private:
  bool b;  //--This can only be accessed when you are in one of the methods in the class
public:
  int i;  //--This can be acessed by anywhere in the code
};

\section constructors Constructors

As someone who learnt to program in fortran it was this aspect of C++, more than any other, that confused me 
the most. In actually though it is rather simple and I don't really know why I was confused. In essence every 
class must contain some method for creating it. This class should set the initial values of all the variables
inside the class. Obviously the functions (or more correctly methods) that the class
contains cannot be used until an instance of the class has been created using the constructor.

An example of how all this works in practise is given below:

\verbatim
class myclass{
private:
  bool b;
  int i;
  double d;
public:
  myclass( bool bb, int ii, double dd ) : b(bb), i(ii), d(dd) {}
  static double g(int j);
  double f( double x );
};

// Here there are currently no instances of myclass and so I cannot run f
// I can run g however as it is a static member of the class - I run it using
double d=myclass::g(j);

// Now I create an instance of the class using the constructor
myclass thisIsTheInstance(false, 3, 6.5);
// so that I can run the method called f 
double s=thisIsTheInstance.f(4.0); 

In plumed 2 all the lines in the input file are read in inside the constructors. This ensures that the parameters inside any given method are set correctly from the outset.

Operators

Addition, subtraction, multiplication, division and so on are all functions (they are obviously not variables). We usually don't think of them as functions however because we use these operations all the time. C++ recognizes that the short cuts of +, -, *, / and so on are very useful. It thus allows one to define operators in our new classes that explain to the compiler what a given symbol means for a given class. Among other things we can define:

  • How to perform the logical operators !=, ==, etc
  • How to perform arithmatic for a class: +, -, /, *, +=, -= etc
  • What brackets mean i.e. the meanings of (), []

We do not use this extensively in plumed 2 but it does occasionally appear.

Including the functionality of one class in a new class 1: Inclusion

There are various ways that one can include the functionality of one class inside a second class. By far the simplest is to create an instance of class 1 inside class 2 as shown below:

class class1 {
private:
  double d1,d2,d3;
public:
  class1();
  void f(double x);
};

class class2 {
private:
  class1 myclass;
public:
  class2();
  // The methods of class 2 here
};

This is really simple one includes a class in this way in exactly the same way that one includes a double, int or whatever variable.

This kind of inclusion is used extensively in plumed 1.0 and there are a huge number of classes that you can re-use in this way to create new colvars, functions or biases. For a full list of the classes that are available see Tool Box.

Including the functionality of one class in a second class 2: Inheritance

There is an alternate way of reusing the functionality from one class in a second class that is available in C++. This method is called inheritance and it offers some advantages over simply including class A inside class B as described above. To create a class called B that inherits from A one writes:

class B : public A {
// Contents of class B
};  

One advantage of this method over inclusion is that I can use protected members to more closely control what members of A class B is allowed to access. Hence, rather than simply having private and public members I now have:

public These members can be accessed by anyone
protected These members can only be accessed by the methods of class A and class B
private These members can only by accessed by the methods of class A (not by class B)

In addition, I can use inheritance to treat pointers to objects of class B as if they were pointers to objects of class A. In other words, if I create an object of type B I can convert it to an object of type A using dynamic_cast as shown below:

B* mynewB=new B();   // This is a special way of calling the constructor so you get a pointer
A* BpretendingToBeA=dynamic_cast<A*>(mynewB); 

All the colvars and free energy methods of plumed use inheritance. In fact all these methods are built on a single base class called PLMD::Action. This class contains all the functionality for reading stuff from input, the stuff for controlling the dependencies Actions and a set of controls that decide which actions are performed when. All the functionality for the different methods is then built on this root. As you can see (PLMD::Action) the inheritance tree for the code is quite complicated. However, in practise if you are implementing a CV, function, bias or virtual atom the correct start point is with one of the classes listed on this page Base classes for CVs, functions, biases, etc. all of which contain detailed descriptions of how to use them.

Including the functionality of one class in a second class 3: Multiple inheritance

Immediately above the PLMD::Action root of the inheritance tree in plumed there is a very complicated looking layer in the inheritance structure of the code. This layer looks ugly because in this layer we are using multiple inheritance - the classes in the layer above inherit from multiple classes simultaneously. This way of incorporating functionality from classes is unique to C++ and brings with it a special set of difficulties in programming. Its great advantage is though that one can create classes that incorporate bring a set of particular attributes. This will perhaps be most clear if you look at what each of the classes in the multiple inheritance layer is doing (see Classes for multiple inheritance) and see how these functionalities are used in Colvars, Functions and Biases. Please be aware that, unless you are doing something really wacky, you should be able to implement whatever you need to implement without writing classes that take advantage of multiple inheritance. Furthermore, you should not need to touch the classes in this region of the code. The information here is there for completeness only. If you feel you really must change something in this part of the code please contact the developers before doing anything.

Static Polymorphism

Polymorhpism is a way of using the same code to do many different things. As an example consider a Matrix. The elements of a Matrix can be ints, doubles, floats or even some fancy new class but we would still want the operator (i,j) to return the element in row i and column j. That is to say the operator (const int i, const int j) of a matrix is independent of what is actually inside the matrix. Using C++ we can use so called template classes to implement thse kinds of things and can then re-use them to do an enormous variety of different operations. To see how this works in practise take a look at PLMD::Matrix, which is a working version of our Matrix example. Be aware that all the routines in a template class must be inside the header file. To use a template within the code you declare it as follows:

Matrix<double> mat;   // This is a matrix of doubles

The most common way we use this kind of functionality in plumed 2 is when we take advantage of the features that are available in the C++ standard library. For more details on the standard library visit http://www.cplusplus.com/reference/

Dynamic Polymorhpism

When you run a calculation with plumed the code calculates a number of CVs. The bias and the forces due to the bias are then calculated and in the final step these forces are propegated back onto the atoms using the chain rule. For example PLMD::colvar::Distance contains the function that calculates a distance between atoms, while PLMD::bias::MetaD contains the function for doing metadynamics. What may thus seem remarkable to the programmer unfamiliar with C++ is that the class that calls the functions that calculate the CVs, biases and so on only uses PLMD::Action. To make that clear it looks like the code can calculate the distances between atoms without ever calling any of the routines from PLMD::colvar::Distance!

We can program in this way because we take advantage of dynamic polymorhpism. If you look at the documenation for PLMD::Action you will see that the method PLMD::Action::calculate is declare inside PLMD::Action as:

virtual void calculate()=0;

This kind of declaration promises two things to a class:

  • That the class will only ever be used in derived classes. No PLMD::Action class is ever constructed in the code. The functionality in PLMD::Action is only ever used in the derived classes that inherit PLMD::Action. Classes like PLMD::Action are called abstract base classes.
  • That in one of the classes that inherits from PLMD::Action a method called calculate will be defined.

The great advantage of declaring calculate() inside PLMD::Action in this way is that the calculate routine that we declare in the derived class is a member of PLMD::Action. We thus can thus write a class for doing all the business of plumed in the manner described previously.

Forward declaration

One problem of including classes inside other classes in C++ is that this enforces one to include one .h file into another one, thus leading to a large set of objects needing to be recompiled just because a single .h file was touched. In some cases this is not avoidable, e.g. when classes inherits from other classes. However, when only a pointer (or a reference) to another class is used, it might be better to just use a forward declaration as in this example:

/////////////////////////////////////////////
// This is file A.h
namespace PLMD{

class A{
  int pippo;
};

}

/////////////////////////////////////////////
// This is file B-bad.h
// it has to include A.h
#include "A.h"
namespace PLMD{

class B{
public:
// notice that here we only use a reference to class A
  int do_something(A&a);
};

}

/////////////////////////////////////////////
// This is file B-good.h
namespace PLMD{

// this command just instructs the compiler that A is a class:
class A;
// no inclusion of A.h is required!

class B{
public:
// notice that here we only use a reference to class A
  int do_something(A&a);
};

}

This trick however does not work is a class is including an instance of another class. E.g., if B contains an instance of A one should know exactly the A declaration to build a B object. In this case, a similar effect can be obtained at the price of adding some more lines of code in constructor and destructor of B as in the following example

/////////////////////////////////////////////
// This is file B-bad.h
// it has to include A.h
#include "A.h"
namespace PLMD{

class B{
  A content;
};

}

/////////////////////////////////////////////
// This is file B-good.h
namespace PLMD{

// this command just instructs the compiler that A is a class:
class A;
// no inclusion of A.h is required!

class B{
// As "content" is a reference, it can be used exactly as if it was a normal object
// However, it is represents by the compiler as a pointer.
  A& content;
public:
  B();
  ~B();
};

}

/////////////////////////////////////////////
// Using B-good.h enforces to add something in B-good.cpp

#include "A.h"
#include "B-good.h"

using namespace PLMD;

B::B():
// now "content" needs to be explicitly allocated ...
  content(*new A){
}

B::~B(){
// ... and deallocated
  delete &content;
}

Notice that this trick cannot be always be applied, e.g., if the constructor to be A needs parameter, or if object "content" is to be accessed by inline methods of B for efficiency. Another example where this does not work is when inline methods are used because of template expressions.

This trick is extensively used in plumed so as to avoid too many indirect dependencies among .h files.

C++11 Features

Since PLUMED 2.4 we systematically use C++11 features. Some of the most important ones are discussed here.

Using auto

Auto allows to implicitly declare the type of a variable. This is particularly handy when using iterators from STL containers. For instance, you can replace the following:

   map<string,vector<AtomNumber> >::const_iterator m=atoms.groups.find(strings[i]);

with:

   const auto m=atoms.groups.find(strings[i]);

Notice that the syntax is significantly simpler, especially if you do not remember which exact type of map is the variable groups.

Iterators are often used in loops. Thus, you can now replace

  for(std::map<AtomNumber,Tensor>::const_iterator p=gradients.begin();p!=gradients.end();++p){
    a+=(*p).second;
  }

with

  for(auto p=gradients.begin();p!=gradients.end();++p){
    a+=(*p).second;
  }

However, in cases where you do not need to explicitly use p as an iterator, you might find even more convenient to use range-based loops:

  for(const auto & p : gradients){
    a+=p.second;
  }

Notice that now p is a constant reference, so it is not anymore necessary to use the * operator.

Using smart pointers

There are many resources on the web about this topic. Have a look at this link for a concise introduction.

Smart pointers can be most of the time used in place of regular pointers so as to better manage memory. In PLUMED you can find many times sections of code such as

  object* obj;
  if(...) obj=new type1;
  else obj=new type2;

  ...

  obj->method();

  ..

  delete obj;

Here we use a pointer to allow dynamic polymorphism.

In this case, the object pointed by obj is not transferred anywhere else. In other words, you can think that the obj pointer owns the object itself. You can replace it with a std::unique_ptr as follows:

  std::unique_ptr<object> obj;
  if(...) obj.reset(new type1);
  else obj.reset(new type2);

  ...
  
  obj->method();

  ..

Notice that instead of assigning it with = you should assign it with reset(). This is because the std::unique_ptr cannot be copied and so does not understand the assignment operator. More importantly, notice that the delete command has disappeared. Indeed, when obj goes out of scope, the pointee is automatically deleted.

You can also use vectors of pointers. Consider the following example

  std::vector<object*> objs;
  for(unsigned i=0;i<10;i++) objs.push_back(new object);

  ...

  for(unsigned i=0;i<10;i++) delete objs[i];

This can be replaced with

  std::vector<std::unique_ptr<object>> objs;
  for(unsigned i=0;i<10;i++) objs.emplace_back(new object);

  ...

Notice that instead of using push_back() we used emplace_back(). The reason is that the latter move the pointer instead of copying it. More importantly, notice that the delete command has disappeared. When the vector is cleared, also the contained objects are deleted.

Notice that emplace_back needs to be fed with a so-called rvalue. In case you created the unique_ptr in advance, you should insert it with the following syntax

  std::vector<std::unique_ptr<object>> objs;
  std::unique_ptr<object> to_insert;
  to_insert.reset(new object);
  objs.emplace_back(std::move(to_insert));

Forward declarations using C++11

Notice that also forward declarations discussed above are a bit simpler to implement using C++11 syntax. This can be done using a std::unique_ptr or, even better, using the class ForwardDecl, which is a small utility class that only implement two methods:

  • A constructor, which takes an arbitrary number of parameters and use them to construct an internally stored std::unique_ptr.
  • A operator *, which returns a pointer to the object.

An example usage is below:

/////////////////////////////////////////////
// This is file B-good.h
#include "tools/ForwardDecl.h"

namespace PLMD{

// this command just instructs the compiler that A is a class:
class A;
// no inclusion of A.h is required!

class B{
  ForwardDecl<A> content_fwd;
  A& content1=*content1_fwd;
  ForwardDecl<A> content_fwd;
  A& content2=*content2_fwd;
public:
  B();
  ~B();
};

}

/////////////////////////////////////////////
// Using B-good.h enforces to add something in B-good.cpp

#include "A.h"
#include "B-good.h"

using namespace PLMD;

B::B():
// constructors that need no argument can be omitted
  content2_fwd(argument)
{
}

B::~B(){
// empty destructor
}

Notice that it is necessary to add a destructor, even though it is empty. The reason is that if the compiler tries to construct an inline destructor for this class it will not be able to create it (the class is not completely defined in B.h. However, the advantage is that objects are deallocated in the correct order as if they were normal members of class B, that is the inverse of the initialization order.

Conclusion

The above is meant to give you some feel as to how plumed works. If there is stuff you do not understand it is not necessarily that important. The great advantage of the code as it is currently written is that you can implement new methods without touching the core of the code. So in conclusion give it a go and have fun!

Notes

More information about C++ http://www.parashift.com/c++-faq-lite/