mpi4py and VTK

August 12, 2014

We recently added mpi4py to VTK as one of its third-party libraries. The following quote from the mpi4py documentation explains what it is.

MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.

This package is constructed on top of the MPI–1/MPI–2 specification and provides an object oriented interface which closely follows MPI–2 C++ bindings. It supports point-to-point (sends, receives) and collective (broadcasts, scatters, gathers) communications of any picklable Python object as well as optimized communications of Python object exposing the single-segment buffer interface (NumPy arrays, builtin bytes/string/array objects).

See the mpi4py page for details.

We have been using mpi4py in ParaView for several years, and with the recent introduction of the numpy_interface module to VTK, we decided to move the mpi4py dependency to VTK as well. This allowed us to support data parallelism with MPI in the numpy_interface module. I will discuss this in more detail in an upcoming blog post.

Using mpi4py is straightforward. The following import works from vtkpython, pvtkpython, and python.

from mpi4py import MPI

Note that if you are going to mix parallel VTK and mpi4py, we recommend using pvtkpython, which initializes several VTK data structures that make it easier for algorithms to access MPI communicators.

VTK also provides a Python-accessible interface to MPI in the vtkMPIController and vtkMPICommunicator classes. However, these classes were not designed to be used from Python and as such provide only a small set of methods. Most often, you will use mpi4py when coding in Python.

In some cases, especially when using MPI groups, it is necessary to pass the communicator used by VTK to mpi4py or vice versa. We developed a simple utility class, vtkMPI4PyCommunicator, to enable this. It is used as follows.

import vtk
from mpi4py import MPI

# GlobalController is defined automatically when running pvtkpython
# Otherwise, you need to manually create a vtkMPIController and set
# it yourself.
contr = vtk.vtkMultiProcessController.GetGlobalController()
# VTK communicator -> mpi4py communicator
comm = vtk.vtkMPI4PyCommunicator.ConvertToPython(contr.GetCommunicator())

# mpi4py communicator -> VTK communicator, attached to a new controller
acomm = vtk.vtkMPI4PyCommunicator.ConvertToVTK(comm)
acontr = vtk.vtkMPIController()
acontr.SetCommunicator(acomm)

Since mpi4py works well with numpy arrays and VTKArray is a subclass of numpy.ndarray (see previous posts [1], [2] and [3]), it is straightforward to use them together, as follows.

import vtk
from vtk.numpy_interface import dataset_adapter as dsa
from mpi4py import MPI
import numpy

gc = vtk.vtkMultiProcessController.GetGlobalController()

rank = gc.GetLocalProcessId()

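# Each rank fills a 10-element array with its own rank id;
# rank 0 additionally sets element 3 to 10.
fa = vtk.vtkFloatArray()
fa.SetNumberOfTuples(10)
fa.FillComponent(0, rank)
if rank == 0:
    fa.SetValue(3, 10)

# Wrap the VTK array as a VTKArray (a numpy.ndarray subclass) and
# copy it to get a separate output buffer of the same shape and type.
vtk_array = dsa.vtkDataArrayToVTKArray(fa)
result = numpy.array(vtk_array)

# Element-wise maximum across all ranks, using VTK's communicator.
comm = vtk.vtkMPI4PyCommunicator.ConvertToPython(gc.GetCommunicator())
comm.Allreduce([vtk_array, MPI.FLOAT], [result, MPI.FLOAT], MPI.MAX)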

if rank == 0:
    print(result)

When this is executed as

mpiexec -n 2 pvtkpython parallel_array.py

it prints

[  1.   1.   1.  10.   1.   1.   1.   1.   1.   1.]
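To see where these values come from, the reduction can be reproduced without MPI. The sketch below is only an illustration using numpy: it rebuilds the array that each of the two ranks holds and takes the element-wise maximum, which is what Allreduce with MPI.MAX computes. The names num_ranks and per_rank are hypothetical helpers, not part of VTK or mpi4py.

```python
import numpy

num_ranks = 2  # matches "mpiexec -n 2" above

# Rebuild the array each rank holds before the reduction.
per_rank = []
for rank in range(num_ranks):
    local = numpy.full(10, rank, dtype=numpy.float32)
    if rank == 0:
        local[3] = 10
    per_rank.append(local)

# MPI.MAX reduces element by element across ranks.
result = numpy.maximum.reduce(per_rank)
print(result)
```

Rank 0 contributes zeros everywhere except the 10 at index 3, and rank 1 contributes ones, so the maximum is 1 everywhere except index 3.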

In my next blog post, I will talk about how to use the algorithms module in parallel. Until then, happy Message Passing.

Many thanks to Ben Boeckel for moving mpi4py to VTK and implementing vtkMPI4PyCommunicator.

2 comments to mpi4py and VTK

  1. The syntax in your examples is a little confusing. Did you import vtkMPI4Py? When I try to use it with vtk.vtkMPI4Py I am getting an error that it is not defined. Also you have a variable controller which I’m assuming should be contr (or else it is not defined either). I have a script that I can run with pvtkpython, but I am trying to test setting up my own controller. Can you clear up these definitions please?
