Improved VTK – numpy integration (part 4)

Welcome to another blog where we continue to discover VTK’s numpy_interface module. If you are not familiar with this module, I recommend checking out my previous blogs on it ([1], [2], [3]). In this blog, I will talk about how numpy_interface can be used in a data parallel way. We will be using VTK’s MPI support for this – through VTK’s own vtkMultiProcessController and mpi4py. You may want to check my last blog on VTK’s mpi4py integration for some details.

Let’s get started. First, if you want to run these examples yourself, make sure that VTK is compiled with MPI support on by setting VTK_Group_MPI to ON during the CMake configuration step. The simplest way to run these examples is to use the pvtkpython executable that gets compiled when Python and MPI support are on. Pvtkpython can be run from the command line as follows (check out your MPI documentation for details).

Now, let’s start with a simple example.

When I run this example on my laptop, I get the following output:

Depending on a particular run and your OS, you may see something similar or something a bit garbled. Since I didn’t restrict the print call to a particular rank, all processes will print out roughly at the same time and depending on the timing and buffering, may end up mixing up with each others’ output. If you examine the output above, you will notice that the overall range of the scalars is (37.35310363769531, 276.8288269042969), which we can confirm by running the example serially.

Note that vtkRTAnalyticSource is a parallel-aware source and produces partitioned data. The following lines are what tell vtkRTAnalyticSource to produce its output in a distributed way

For more details, see this page.

So how can we find out the global min and max of the scalars (RTData) array? One way is to use mpi4py to perform a reduction of local values of each process. Or we can use the numpy_interface.algorithms module. Add the following code to the end of our example.

This should print the following.

That simple! All algorithms in the numpy_interface.algorithms module work properly in parallel. Note that min, max and any other parallel algorithm will return the same value on all ranks.

It is possible to force these algorithms to produce local values by setting their controller argument as follows.

This will print the following.

All algorithms in the numpy_interface.algorithms module were designed to work in parallel. If you use numpy algorithms directly, you will have to use mpi4py and do the proper reduction.

One final thing. Numpy.dataset_adapter and numpy.algorithms were designed to work properly even when an array does not exist on one or more of the ranks. This occurs when a source can produce a limited number of pieces (1 being the most common case) and the size of the parallel job is larger. Let’s start with an example:

On my machine, this prints out the following.

Note how the normals is a VTKNoneArray on 2 of the ranks. This is because vtkCubeSource is not parallel-aware and will produce output only on the first rank. On all other ranks, it will produce empty output. Consider the use case where we want to do something like this with normals.

One would expect that this would throw an exception on ranks where the normals array does not exist. In fact, the first implementation of the numpy_interface.dataset_adapter returned a None object and threw an exception in such cases as expected. However, this design had a significant flaw. Because of the exception, ranks that did not have the array could not participate in the calculation of global values, which are calculated by performing MPI_Allreduce. This function will hang unless all ranks participate in the reduction. We addressed this flaw by developing the VTKNoneArray class. This class supports all operators that regular arrays support and always returns VTKNoneArray. Furthermore, parallel algorithms function properly when asked to work on a VTKNoneArray.

We have covered a lot ground so far. In the next blog, which will be the last one in this series, I will talk about composite datasets and composite arrays.

Continue on to Part 5.

All posts in this series: Part 1, Part 2, Part 3, Part 4, and Part 5.

Questions or comments are always welcome!