New Universal Ghost Cells Generator


The next version of VTK will include a new class, vtkGhostCellsGenerator, for generating ghost cells for most data types in VTK. This filter will replace the previous one in ParaView 5.10. It aims to unify the ghost cells generator API so that it is available for all vtkDataSets in VTK. At the time of writing this post, only vtkExplicitStructuredGrid and vtkHyperTreeGrid are not compatible with this new filter. Hence, all VTK classes previously available for generating ghost cells (except for vtkHyperTreeGridGhostCellsGenerator) are now deprecated. This includes vtkUniformGridGhostDataGenerator, vtkPUniformGridGhostDataGenerator, vtkStructuredGridGhostDataGenerator, vtkPStructuredGridGhostDataGenerator, vtkUnstructuredGridGhostCellsGenerator, and vtkPUnstructuredGridGhostCellsGenerator.

Before diving into technicalities about vtkGhostCellsGenerator, let us talk a bit more about input types. In addition to regular vtkDataSet inputs, this filter can also take vtkMultiBlockDataSet, vtkPartitionedDataSet and vtkPartitionedDataSetCollection as inputs (see the VTK documentation for a description of these data models). In such cases, ghosts are generated across blocks / partitions, but not between two separate vtkPartitionedDataSets (which can happen inside a vtkPartitionedDataSetCollection). There are some assumptions that one needs to respect when using composite data set inputs: inputs inside a common vtkPartitionedDataSet or a common vtkMultiBlockDataSet are assumed to hold the same point data and cell data structure. They should all have the same attached arrays, listed in the same order, and partial arrays are not supported.

Ghost cells array and ghost points array are accessible through the vtkDataObject API, using the function:

By definition, a ghost array is a vtkUnsignedCharArray, and the returned array should be downcast to vtkUnsignedCharArray using vtkArrayDownCast<vtkUnsignedCharArray>.

What are ghost cells (and ghost points)?

Ghost cells are cells inside a data set that are copies of the interfacing cells of an adjacent data set. They are typically used in multi-process environments, but can also be used in a single-process environment when using vtkPartitionedDataSet. Ghost cells are required when running filters that need cell neighbor information. For instance, in order to compute the gradient of an image, one needs to have access to one layer of ghosts in the input. In order to compute the Laplacian of an image, one needs to have access to two layers of ghosts in the input. The filter vtkCellDataToPointData requires one layer of ghosts in the input. Connecting a filter that needs n layers of ghosts to a filter that needs m layers of ghosts results in needing n + m layers of ghosts in the source.
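To make the n + m rule concrete, here is a minimal sketch in plain Python (no VTK involved). It uses a 1D list of samples as a stand-in for an image, and the helper central_gradient is a hypothetical name introduced only for this illustration. One pass of the stencil needs one ghost layer per partition; chaining two passes needs two.

```python
def central_gradient(values):
    """Central-difference gradient at the interior samples of a 1D list."""
    return [(values[i + 1] - values[i - 1]) / 2.0 for i in range(1, len(values) - 1)]

# A 1D "image" split into two partitions of four samples each.
whole = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
left, right = whole[:4], whole[4:]

# One ghost layer: each partition receives a copy of its neighbor's closest sample.
left_1 = left + right[:1]
right_1 = left[-1:] + right
one_pass = central_gradient(left_1) + central_gradient(right_1)

# Two chained passes (1 + 1 layers): each partition needs two ghost samples.
left_2 = left + right[:2]
right_2 = left[-2:] + right
two_passes = (central_gradient(central_gradient(left_2))
              + central_gradient(central_gradient(right_2)))

# References computed on the unpartitioned data.
reference_one = central_gradient(whole)
reference_two = central_gradient(central_gradient(whole))
```

With enough ghost layers, the per-partition results concatenate to exactly the result computed on the unpartitioned data.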

Layer 0 is defined as the set of cells in the outer crust of the data set when ghosts are removed. The nth layer of cells is then the set of cells that are adjacent to at least one cell of the (n-1)th layer, and that are not part of any layer strictly lower than n.
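The layer definition can be sketched on a 2D grid of cells indexed by (i, j). This is an illustration in plain Python, not the filter's implementation: the function ghost_layers is hypothetical, and it assumes that two cells are adjacent when they share at least one point (the 8-neighborhood on a regular grid).

```python
def ghost_layers(owned, others, max_layer):
    """Map each candidate cell in `others` to its layer index (1..max_layer)."""
    def neighbors(cell):
        i, j = cell
        return {(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)} - {cell}

    # Layer 0: owned cells that touch at least one cell that is not owned.
    previous = {c for c in owned if not neighbors(c) <= owned}
    layers = {}
    for n in range(1, max_layer + 1):
        # Layer n: unassigned candidates adjacent to a cell of layer n - 1.
        current = {c for c in others if c not in layers and neighbors(c) & previous}
        for c in current:
            layers[c] = n
        previous = current
    return layers

# A 3x3 partition and its 3x3 neighbor located to the right.
owned = {(i, j) for i in range(3) for j in range(3)}
others = {(i, j) for i in range(3, 6) for j in range(3)}
layers = ghost_layers(owned, others, max_layer=2)
```

Here the column i = 3 of the neighbor forms layer 1, the column i = 4 forms layer 2, and the column i = 5 is not needed when only two layers are requested.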

Ghost points are points that are copies of a unique concrete point in some other partition. A ghost point should always belong to at least one ghost cell. The number of non-ghost cells and the number of non-ghost points should equal the number of cells and points that would result from merging the partitioned data set into a single data set, with duplicate points merged. In other words, one should be able to do statistics on point data or cell data by just eliminating ghost cells and ghost points from consideration.

In the illustration below, an input partitioned data set of two adjacent image data (on the left, one image data is blue, the other is red) is passed into the ghost cells generator. The output is displayed on the right, showing how each partition has changed in the process when generating one layer of ghosts. Ghost cells and ghost points are displayed in gray.

Notice that some points that were initially in the red image data became ghosts in the process (they turned gray in the above image). The ghost cells generator keeps one instance of each point at the interface between partitions as a non-ghost point, while all other duplicate instances become ghosts. If this pattern wasn't followed, then statistics computed on point data across partitions would oversample points on the interfaces between the partitions.
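The oversampling issue can be illustrated with a small plain-Python sketch. The data is made up for the example: five points on a line, split into two partitions that each received one ghost layer. The constant DUPLICATEPOINT is assumed to mirror the value of vtkDataSetAttributes::DUPLICATEPOINT.

```python
DUPLICATEPOINT = 1  # assumed to mirror vtkDataSetAttributes::DUPLICATEPOINT

# Point data of the merged data set (points p0..p4 on a line).
merged = [1.0, 2.0, 3.0, 4.0, 10.0]

# Partition A holds p0..p3, where p3 is a ghost copied from B.
values_a = [1.0, 2.0, 3.0, 4.0]
ghosts_a = [0, 0, 0, DUPLICATEPOINT]

# Partition B holds p1..p4. The interface point p2 is kept as non-ghost
# in A only, so it is a ghost here, as is p1.
values_b = [2.0, 3.0, 4.0, 10.0]
ghosts_b = [DUPLICATEPOINT, DUPLICATEPOINT, 0, 0]

def non_ghost(values, ghosts):
    """Keep only the values whose ghost flag is zero."""
    return [v for v, g in zip(values, ghosts) if g == 0]

kept = non_ghost(values_a, ghosts_a) + non_ghost(values_b, ghosts_b)
mean_without_ghosts = sum(kept) / len(kept)
mean_with_ghosts = sum(values_a + values_b) / len(values_a + values_b)
mean_merged = sum(merged) / len(merged)
```

Skipping ghost points recovers the mean of the merged data set, while averaging every stored point counts the points near the interface twice.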

In VTK, ghost points and ghost cells can be tagged using multiple values. The ghost cells generator only outputs vtkDataSetAttributes::DUPLICATECELL and vtkDataSetAttributes::HIDDENCELL, as well as vtkDataSetAttributes::DUPLICATEPOINT. The tag HIDDENCELL is only used on structured data sets (image data, rectilinear grids, structured grids), in the case where a larger grid needs to be allocated but no neighbor is available to fill some of the newly allocated cells. A cell tagged HIDDENCELL should be ignored by any downstream filter.
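As a rough sketch of how a consumer might interpret these tags, the plain-Python snippet below treats the ghost values as bit flags. The numeric constants are assumed to mirror the VTK enum values and should be checked against vtkDataSetAttributes.h; the function describe is a hypothetical helper.

```python
# Assumed to mirror the values of the corresponding VTK enums.
DUPLICATECELL = 1
HIDDENCELL = 32

def describe(ghost_value):
    """Classify one entry of a cell ghost array."""
    if ghost_value & HIDDENCELL:
        return "hidden"
    if ghost_value & DUPLICATECELL:
        return "duplicate"
    return "regular"

# Example ghost array for four cells of a structured data set.
cell_ghosts = [0, 0, DUPLICATECELL, HIDDENCELL]
labels = [describe(v) for v in cell_ghosts]

# Cells that a statistics computation would keep.
usable = [i for i, v in enumerate(cell_ghosts) if v == 0]
```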

Processing composite data sets

As mentioned in the introduction, the ghost cells generator can process multi block data sets, partitioned data sets, as well as partitioned data set collections. Multi block data sets are processed in the same way as partitioned data sets. Partitioned data sets inside a collection are processed separately from one another.

Inside a multi block or a set of partitions, ghosts are only exchanged between data sets that share a common supported ancestor type. For instance, an image data cannot share ghost cells with a rectilinear grid, and an unstructured grid cannot share ghost cells with a structured grid. But if the filter is run on a set of partitions including, for example, one image data and two unstructured grids, then the image data is untouched, and the two unstructured grids share ghosts if they are adjacent.
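The grouping described in this example can be sketched in plain Python. This is only an illustration: the type labels are placeholder strings rather than VTK class names, exchange_groups is a hypothetical helper, and the adjacency test that decides whether ghosts are actually exchanged inside a group is left out.

```python
from collections import defaultdict

def exchange_groups(partitions):
    """Group partition names by type label; ghosts stay within a group."""
    groups = defaultdict(list)
    for name, type_label in partitions:
        groups[type_label].append(name)
    return dict(groups)

# One image data and two unstructured grids, as in the example above.
partitions = [("p0", "image"), ("p1", "unstructured"), ("p2", "unstructured")]
groups = exchange_groups(partitions)

# Partitions that have no peer of a compatible type are left untouched.
untouched = [names[0] for names in groups.values() if len(names) == 1]
```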

Let us illustrate how ghost cells are exchanged when the input is a partitioned data set (or, equivalently, a multi block). Pentagons represent one data set type, and stars another type that is also supported by the filter but does not share any common supported ancestor with the first. Let us assume that we are running in a multi-process environment, and that blue data sets are on rank 0, while red data sets are on rank 1. The lines show between which data sets the filter will try to exchange ghosts. Data sets that are not adjacent to any other data set of the same type remain untouched at the end of the pipeline.

Determining data set adjacency

Data set adjacency is only determined by matching common points between partitions. In an unstructured grid, if at least one point matches a point of another unstructured grid, ghosts are exchanged between them. Point matching can be done either by looking at point positions in 3D, or by providing global IDs for the points, by adding an ID array using vtkPointData::SetGlobalIds(vtkDataArray*). In the latter case, point positions are ignored and point global IDs are blindly used. This means that two points at the same position in 3D with different global IDs will be considered as different points, and that such points will not create a connection between the corresponding partitions at that location. Note that global IDs are not used on structured data sets. In this case, the point positions are used, even if global point IDs are provided.
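The difference between the two matching modes can be sketched in plain Python. The function are_adjacent is hypothetical, and it compares positions exactly, which is a simplifying assumption made for the illustration and not a statement about how the filter compares coordinates.

```python
def are_adjacent(points_a, points_b, ids_a=None, ids_b=None):
    """Return True if the two partitions share at least one point."""
    if ids_a is not None and ids_b is not None:
        # Global IDs are provided: positions are ignored entirely.
        return bool(set(ids_a) & set(ids_b))
    # No global IDs: match points by position.
    return bool(set(points_a) & set(points_b))

# Two partitions that share the position (1.0, 0.0, 0.0).
points_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
points_b = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

by_position = are_adjacent(points_a, points_b)

# The shared position carries different global IDs (1 and 7): no connection.
by_distinct_ids = are_adjacent(points_a, points_b, ids_a=[0, 1], ids_b=[7, 8])

# The shared position carries the same global ID in both partitions.
by_shared_ids = are_adjacent(points_a, points_b, ids_a=[0, 1], ids_b=[1, 2])
```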


To preserve backwards compatibility, this filter behaves by default exactly as vtkUnstructuredGridGhostCellsGenerator did. By default, vtkGhostCellsGenerator's BuildIfRequired flag is on. In that case, the number of ghost layers set using SetNumberOfGhostLayers is ignored, and the number requested by the streaming pipeline through vtkStreamingDemandDrivenPipeline::UPDATE_NUMBER_OF_GHOST_LEVELS() is used instead. If BuildIfRequired is off, then the maximum of the number of layers that was set and the number requested by the streaming pipeline is used to generate the ghosts.
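The rule for choosing the number of layers can be summarized with a few lines of plain Python. The function and parameter names are hypothetical and only restate the behavior described above; they are not part of the VTK API.

```python
def effective_ghost_layers(number_of_ghost_layers, pipeline_request, build_if_required=True):
    """Number of ghost layers generated, following the rule described above."""
    if build_if_required:
        # The value set on the filter is ignored.
        return pipeline_request
    return max(number_of_ghost_layers, pipeline_request)

# Two layers set on the filter, one requested by the downstream pipeline.
with_flag_on = effective_ghost_layers(2, 1, build_if_required=True)
with_flag_off = effective_ghost_layers(2, 1, build_if_required=False)
```

In this sketch, setting two layers on the filter only has an effect once BuildIfRequired is turned off.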


Questions or comments are always welcome!