How to use MPI with a hybrid C++/Python code

The Message Passing Interface (MPI) is the go-to technology for developing distributed parallel programs. In this blog post, I will explain, with examples, how you can expose a Python interface to an MPI-parallel program. This strategy is helpful in a number of situations:

  1. You have a C++ MPI-parallel library and you would like to provide a Python interface to it.
  2. You have an MPI-parallel Python program and you want to rewrite some core functionality in C++ for performance reasons.

The final goal is to be able to manipulate MPI objects, such as communicators and groups, within a Python script, while still being able to use them effectively within the core of the C++ program. The technique we will discuss in this blog post is a cornerstone of the VeloxChem quantum chemistry program. It is not a new technique, but it is underdocumented: I hope this blog post will clarify how it works. I will refer to an example project hosted on GitLab: https://gitlab.com/robertodr/pybind11-mpi4py

The ingredients for this technique are:

  • The mpi4py Python package. You can obtain it through your system's package manager, by compiling the source distribution from PyPI, or with the conda package manager.
  • An MPI-parallel C++ code.
  • A Python/C++ binding layer. This can be achieved in a variety of ways; we chose pybind11.

The MPI-parallel C++ code

The C++ code consists of a single function, say_hello, accepting a communicator: an object of type MPI_Comm. The function checks the size of the communicator and the rank of the calling process, then prints them to screen:

#include <iostream>

#include <mpi.h>

void say_hello(MPI_Comm comm) {
  int size = 0;
  MPI_Comm_size(comm, &size);

  int rank = 0;
  MPI_Comm_rank(comm, &rank);

  std::cout << "Hello from rank " << rank << " of " << size << std::endl;
}

Binding C++ and Python

The binding code will generate a Python extension module out of the C++ code. First of all, we need to initialize the C API of the mpi4py module, which already implements all the “glue” functionality between many MPI implementations and Python:

if (import_mpi4py() < 0) throw py::error_already_set();
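This call has to run when the extension module is first imported, so it belongs at the top of the module definition. A minimal sketch of the module skeleton, assuming the module is named pb11mpi to match the import used in the Python script:

```cpp
#include <mpi4py/mpi4py.h>
#include <pybind11/pybind11.h>

namespace py = pybind11;

PYBIND11_MODULE(pb11mpi, m) {
  // Initialize the mpi4py C API before any of its functions are called.
  if (import_mpi4py() < 0) throw py::error_already_set();

  // Bindings, such as the greetings function, are registered here.
}
```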

Next, we expose a function, greetings, that accepts a communicator:

m.def(
      "greetings",
      [](py::object py_comm) {
        auto comm = get_mpi_comm(py_comm);
        say_hello(*comm);
      },
      R"pbdoc(
          Print greetings.
      )pbdoc");

Note that we do not bind the corresponding C++ core function directly, but rather use a lambda function. mpi4py provides the binding layer between MPI objects and Python through a C API: pybind11 cannot automatically determine the type cast between the Python and C representations, hence the use of a C++ lambda to perform the conversion and call the core function. This is handled by the get_mpi_comm function:

MPI_Comm *get_mpi_comm(py::object py_comm) {
  auto comm_ptr = PyMPIComm_Get(py_comm.ptr());

  if (!comm_ptr)
    throw py::error_already_set();

  return comm_ptr;
}

This function is itself a wrapper around the PyMPIComm_Get function offered by the mpi4py C API. For correctness, we check whether the typecast was successful and re-raise the Python exception otherwise.

We can now compile the project and write a greetings program in Python:

from mpi4py import MPI

from pb11mpi import greetings

comm = MPI.COMM_WORLD
greetings(comm)

This script can be run, for example on two processes, with:

mpiexec -n 2 python greetings.py
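Since greetings accepts any mpi4py communicator, the Python script is free to manipulate MPI objects before handing them to C++. As a sketch, assuming the extension module was built as pb11mpi, we can split the world communicator and greet within each half:

```python
from mpi4py import MPI

from pb11mpi import greetings

comm = MPI.COMM_WORLD

# Split ranks into two sub-communicators based on parity of the rank.
sub_comm = comm.Split(color=comm.Get_rank() % 2, key=comm.Get_rank())

# On the C++ side, the sub-communicator arrives as a plain MPI_Comm.
greetings(sub_comm)
```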

The example repository has a complete CMake build system. Furthermore, the project is tested on Linux and Windows using the continuous integration framework provided by GitLab.
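For reference, the essential CMake logic looks roughly like the following sketch; the target and file names (pb11mpi, bindings.cpp, say_hello.cpp) are assumptions, and the authoritative version lives in the example repository:

```cmake
cmake_minimum_required(VERSION 3.14)
project(pb11mpi LANGUAGES CXX)

find_package(MPI REQUIRED)
find_package(Python COMPONENTS Interpreter Development REQUIRED)
find_package(pybind11 CONFIG REQUIRED)

# Ask the active Python interpreter where the mpi4py C headers live.
execute_process(
  COMMAND ${Python_EXECUTABLE} -c "import mpi4py; print(mpi4py.get_include())"
  OUTPUT_VARIABLE _mpi4py_include_dir
  OUTPUT_STRIP_TRAILING_WHITESPACE
  )

pybind11_add_module(pb11mpi bindings.cpp say_hello.cpp)
target_include_directories(pb11mpi PRIVATE ${_mpi4py_include_dir})
target_link_libraries(pb11mpi PRIVATE MPI::MPI_CXX)
```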

Gotchas on Windows

You can use this technique and code also when working on Windows. Microsoft offers the MS-MPI library: their own implementation of the MPI standard. As explained here, in order to avoid runtime failures, one needs to add the following lines before including the mpi4py C headers:

#ifdef MSMPI_VER
#define PyMPI_HAVE_MPI_Message 1
#endif

#include <mpi4py/mpi4py.h>
