Catalyst improvements in ParaView 5.9

The ParaView 5.9 release includes several major changes to ParaView’s in situ data processing and visualization capabilities. My intent with this post is to provide the motivation for, and an overview of, some of these changes, with links to documentation for those seeking additional details.

Broadly speaking, these changes fall into three areas:

  • setting up and exporting Python files for Catalyst from ParaView GUI,
  • structure of Catalyst Python scripts, and
  • API for developing Catalyst adaptors and instrumenting simulations.

Setup / Export Catalyst Python files from ParaView GUI

One of the nice features of ParaView has been that you can use the GUI to set up a visualization pipeline and then save it out as a Python script. This script can then be used with Catalyst-instrumented simulation codes to execute the analysis and visualization pipeline in situ. One of the challenges, however, has been describing the outputs that the analysis generates: if you want to save files for datasets produced by certain filters in the pipeline, or save images from certain views, how do you describe those? Over the years, we tried various approaches. The initial implementation relied on an Export Wizard that let you choose which views to save out and when. The past couple of releases introduced an Export Inspector, which replaced the wizard. I won’t go into the details of the usability challenges with either of these approaches; suffice it to say that neither was easy to use. Those experiments, however, led us to a cleaner solution in the form of Extractors.

Extractors are a brand new concept introduced with ParaView 5.9. They are pipeline objects, just like readers and filters: you can create them in the UI, they show up in the Pipeline Browser, and you can change their properties using the Properties panel. There are two types of extractors: data extractors and image extractors. Data extractors act like the Save Data menu action: when triggered, they save datasets generated by data sources or filters to disk. Image extractors act like the Save Screenshot menu action: when triggered, they save results from views as images on disk. Unlike the comparable menu actions, however, extractors are triggered automatically whenever the simulation produces a new timestep, and are thus repeatable. You can, of course, customize the triggering, for example limiting it to every n-th timestep using the Properties panel. See the ParaView Guide for more details on how to create, set up, and use these extractors.
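To make the triggering behavior concrete, here is a minimal sketch of an every-n-th-timestep trigger in plain Python. This is purely an illustration of the concept; the class name and API below are invented for this post and are not ParaView’s actual extractor implementation.

```python
class TimeStepTrigger:
    """Illustrative stand-in for an extractor trigger: fires on every
    `frequency`-th timestep, starting at `start`."""

    def __init__(self, frequency=1, start=0):
        self.frequency = frequency
        self.start = start

    def should_trigger(self, timestep):
        # Skip timesteps before the configured start, then fire
        # whenever the offset is a multiple of the frequency.
        if timestep < self.start:
            return False
        return (timestep - self.start) % self.frequency == 0


# An extractor configured to save every 5th timestep would fire here:
trigger = TimeStepTrigger(frequency=5)
fired = [t for t in range(20) if trigger.should_trigger(t)]
print(fired)  # [0, 5, 10, 15]
```

In the GUI, the equivalent knobs (frequency, start timestep, and so on) live on the extractor’s Properties panel.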

Side note: the ParaView Guide is now an online edition hosted on ReadTheDocs. In coming releases, we plan to migrate the tutorials and the Catalyst guide there as well, providing a single entry point for much of the useful user documentation.

Catalyst Python scripts

Traditionally, the Python scripts intended for in situ use with Catalyst were quite different from those intended for pvbatch or pvpython. While large sections appeared similar, there were considerable differences that made it nearly impossible to use the same script for both the in situ and post-processing use cases. Unifying these two types of scripts was one objective of the changes to the Catalyst Python scripts in this release; making them simpler and easier to debug was the second. The extractors brought us closer to achieving both objectives.

For a step-by-step description of the capabilities of these new Catalyst Python scripts, refer to the documentation here. The page also describes how to use some of the mini-apps now packaged with ParaView to test these Catalyst scripts.
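To give a flavor of the overall shape of these scripts without requiring a ParaView install, here is a hedged sketch. A new-style script is essentially an ordinary pipeline setup plus optional per-timestep callbacks such as `catalyst_execute`; the driver loop below is a stand-in for the Catalyst runtime, and the pipeline dictionary is a stand-in for real paraview.simple calls, both invented for illustration.

```python
# Stand-in record of which timesteps the callback ran for.
executed_timesteps = []


# ---- roughly the shape of a new-style Catalyst script ------------
def build_pipeline():
    # In a real script, this section would create sources, filters,
    # and extractors via paraview.simple; a dict stands in here.
    return {"source": "wavelet", "extractor": "png-every-step"}


def catalyst_execute(info):
    # Optional per-timestep callback; `info` carries timestep/time.
    executed_timesteps.append(info["timestep"])


# ---- stand-in for the Catalyst runtime (not the real driver) -----
pipeline = build_pipeline()              # run once, as pvpython would
for step in range(3):                    # the simulation loop drives callbacks
    catalyst_execute({"timestep": step, "time": step * 0.1})

print(executed_timesteps)  # [0, 1, 2]
```

Because the pipeline-setup portion is plain paraview.simple code, the same script can also be run for post-processing; that unification is exactly what the documentation linked above walks through.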

These new scripts are *not* backwards compatible with older versions of Catalyst Python scripts. Note, however, that those older versions are *still supported*. So if you have scripts that you have built up over the years, fret not: they will continue to work in this release.

Catalyst Instrumentation and Adaptor API

The final set of changes affects how simulations are instrumented to use Catalyst. These changes should be considered a preview or alpha release at this time: they are not fully functional yet and hence may not be suitable for production use. Early adopters, however, are definitely welcome, and would help iron out kinks and prioritize development. Also outstanding are updates to documentation, including the Catalyst User Guide, Catalyst tutorials, and examples. Expect those to be updated in the coming months as well.

The motivation for these changes is to minimize and simplify the effort needed to instrument a simulation to use Catalyst. Instrumenting a simulation refers to the code development necessary to enable use of Catalyst for in situ processing. In previous releases, this instrumentation required a decent understanding of VTK and of how to efficiently transform simulation data structures into VTK ones. Over the years, it became apparent that this was not an easy task, especially for folks only peripherally aware of VTK’s nuances. It was very easy, for example, to accidentally use an API that causes deep copies, resulting in unnecessary overheads that could easily have been avoided. Once you had a simulation code instrumented, keeping it up to date with new releases of ParaView/VTK was just as daunting. First, you needed a ParaView build from source, since you can’t use the ParaView binaries distributed for, or available on, the HPC system for post-processing use. Second, the APIs often change, requiring additional upkeep on the Catalyst adaptor side. With 5.9, we introduce a new way of instrumenting simulations for use with Catalyst that attempts to avoid these difficulties.

A detailed exposition on this new API is beyond the scope of this introductory post. Suffice it to say that more posts/tutorials are planned that will cover this in detail. A short summary of this new approach is as follows:

Catalyst is now a new, VTK/ParaView-independent project. It defines an API specification developed for simulations (and other scientific data producers) to analyze and visualize data in situ. The project includes a lightweight, canonical implementation of this API specification. This implementation, called the stub, can be used when instrumenting simulation codes. It is possible to develop implementations of the Catalyst API that are ABI-compatible with the stub, and ParaView binaries will include one such implementation, called ParaView-Catalyst. A simulation instrumented with the stub can seamlessly switch to ParaView-Catalyst at runtime, by manipulating environment variables, to use ParaView for in situ processing. Multiple versions of ParaView can continue to provide implementations ABI-compatible with the stub, since the Catalyst API itself is independent of ParaView and hence doesn’t need to change with each ParaView release.
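The stub-versus-implementation idea can be sketched conceptually in plain Python. In the real Catalyst library the switch happens by dynamically loading an ABI-compatible shared library; here a class swap driven by an environment variable stands in for that, and the variable name, class names, and return strings are all invented for illustration.

```python
import os


class StubCatalyst:
    """Stand-in for the canonical stub: accepts calls, does nothing."""

    def execute(self, node):
        return "stub: no-op"


class ParaViewCatalyst:
    """Stand-in for an ABI-compatible implementation that actually
    processes the data it is handed."""

    def execute(self, node):
        return "paraview: processed " + node["name"]


def load_implementation():
    # The real selection is done by the Catalyst loader via dynamic
    # library loading; this environment variable is purely illustrative.
    if os.environ.get("DEMO_CATALYST_IMPLEMENTATION") == "paraview":
        return ParaViewCatalyst()
    return StubCatalyst()


impl = load_implementation()
print(impl.execute({"name": "mesh"}))  # "stub: no-op" when the variable is unset
```

The point is that the simulation code calls the same API either way; which implementation answers is decided at runtime, not at compile time.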

Instead of converting simulation data structures to VTK, the new approach relies on the simulation describing its data structures. All the mapping necessary is handled by the Catalyst implementation, in our case ParaView-Catalyst. To describe the data, we simply use Conduit’s C API. The Catalyst API, too, is C-based; this is crucial for simplifying the development of ABI-compatible implementations.
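To show what “describing” data looks like, here is a sketch of a Conduit-style hierarchical description, built with a plain Python dict keyed by slash-separated paths rather than the real Conduit API (Conduit’s actual interface is its C API or its `conduit.Node` Python binding). The paths follow Conduit’s Mesh Blueprint layout (a coordset, a topology, and fields); the grid sizes and values are illustrative.

```python
def set_path(tree, path, value):
    """Set a nested value using a slash-separated, Conduit-style path."""
    keys = path.split("/")
    for key in keys[:-1]:
        tree = tree.setdefault(key, {})
    tree[keys[-1]] = value


mesh = {}
# Describe a small uniform grid: where the points are...
set_path(mesh, "coordsets/coords/type", "uniform")
set_path(mesh, "coordsets/coords/dims/i", 3)
set_path(mesh, "coordsets/coords/dims/j", 3)
# ...how they are connected...
set_path(mesh, "topologies/mesh/type", "uniform")
set_path(mesh, "topologies/mesh/coordset", "coords")
# ...and a field defined on the vertices (9 points for a 3x3 grid).
set_path(mesh, "fields/pressure/association", "vertex")
set_path(mesh, "fields/pressure/topology", "mesh")
set_path(mesh, "fields/pressure/values", [0.0] * 9)

print(mesh["coordsets"]["coords"]["type"])  # uniform
```

The simulation only fills in such a description, typically pointing at its existing arrays rather than copying them; the Catalyst implementation decides how to map it to VTK (or anything else) on the other side.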

More documentation is on its way, so if this has piqued your interest, stay tuned. If you’re interested in reading the discussion that led to this design, check out this Discourse post.

Catalyst docs are available here; these describe the Catalyst API. ParaView-Catalyst, the ParaView-based implementation of Catalyst, is documented in ParaView’s Doxygen pages here. Look for the links under the ParaView Catalyst heading.

Questions or comments are always welcome!