Supporting tomorrow’s data storage systems in VTK / ParaView


Context

As part of the SAGE2 project, funded by the European Union, Kitware worked on porting VTK to a new generation of data storage systems.

SAGE2 is a consortium of European experts in data storage, data management and supercomputing technologies, led by Seagate.
The precursor SAGE project built a prototype storage system in 2017 that is now running and being extended at the Juelich Supercomputing Centre in Germany.
This prototype:

  • Gives us a deeper understanding of how Non-Volatile Memory (NVM) is used
  • Is a storage system that can accommodate any storage device type (Disk, SSD, NVM)
  • Runs on software that can help the storage system to keep growing indefinitely
  • Is a storage system that can also do computations
  • Can work with low-power processing technology based on Arm

Most of the project was open-sourced as part of the CORTX community.
A distributed object store called Motr was developed to replace the usual filesystem.

The main challenge for Kitware was to use one of the higher-level APIs of Motr to read data from the object store and pass it to a visualization pipeline.

Developments

To interact with Motr, we chose the Ummap-io library, improved by Atos, because it offers a generic API that works with several back-ends (in our case, the filesystem and Motr).

A first implementation was done as a ParaView plugin. This allows us to use a release version of ParaView instead of creating patches or branches that are harder to maintain and update.
The plugin adds a new source to ParaView, the MeroVTKReader (Mero is the old name of Motr), with two properties: the URI and the size of the object.
Internally, we forward this information to Ummap-io and get back a pointer to the corresponding memory.
Then we pass this pointer to a VTK ASCII reader, which creates a vtkDataObject to feed the pipeline.
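
The data flow above (an object size from the reader properties, a pointer from the mapping, then an ASCII parser working on that memory) can be sketched with the standard library alone. This is only an illustration under stated assumptions: LegacyVtkHeader and parseLegacyVtkHeader are hypothetical names, not part of VTK, Ummap-io or the plugin, and the sketch only reads the four header lines of the legacy VTK ASCII format rather than building a vtkDataObject.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical description of a legacy VTK file header; not a VTK type.
struct LegacyVtkHeader
{
  std::string version;  // e.g. "# vtk DataFile Version 3.0"
  std::string title;    // free-form title line
  std::string encoding; // "ASCII" or "BINARY"
  std::string dataset;  // e.g. "DATASET POLYDATA"
};

// Return the next line of `text` (without the newline) and advance past it.
static std::string_view nextLine(std::string_view& text)
{
  const std::size_t end = text.find('\n');
  const std::string_view line = text.substr(0, end);
  text.remove_prefix(end == std::string_view::npos ? text.size() : end + 1);
  return line;
}

// Parse the header straight from a raw pointer and a size, which is all
// the reader has once the object is mapped. No file path is involved.
LegacyVtkHeader parseLegacyVtkHeader(const char* data, std::size_t size)
{
  std::string_view text(data, size);
  LegacyVtkHeader header;
  header.version = std::string(nextLine(text));
  header.title = std::string(nextLine(text));
  header.encoding = std::string(nextLine(text));
  header.dataset = std::string(nextLine(text));
  return header;
}
```

In the plugin, the pointer would come from the Ummap-io mapping; for a quick experiment, any in-memory buffer holding a legacy VTK file (for instance a std::string) can play that role.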

ParaView plugin in action on the SAGE2 system

Developing a ParaView plugin has a nice side effect: it brings the SAGE2 technologies to the Web!

Visualizer supports the plugin natively, and ParaView Lite only requires a few additional UI elements.

Using Ummapio to load data in Visualizer
Easy customization of ParaView Lite

Good points: this solution is small, efficient and non-intrusive.
Not so good point: most file readers cannot read from a raw pointer. They require a file path and create an fstream from it, so our approach cannot be used for them.

Next step: Boost streams

Then we turned to the Boost.Iostreams library, an easy way to write a custom STL-compatible stream.
We wrote a UmmapIoSource that creates a ummap-io mapping and can be used as a standard stream, for instance:

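  namespace io = boost::iostreams; // from <boost/iostreams/stream.hpp>

  // The stream forwards uri to the UmmapIoSource constructor.
  io::stream<UmmapIoSource> fileStream(uri);
  for (std::string line; std::getline(fileStream, line);)
  {
    std::cout << line << std::endl;
  }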

The idea is to hook vtksys::ifstream and replace it with a stream built on our UmmapIoSource, so that most VTK readers would work with Motr without additional porting effort.
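
To illustrate why swapping the stream is enough, here is a minimal sketch that uses only the standard library as a stand-in for Boost.Iostreams. MemoryStreamBuf and countLines are hypothetical names chosen for this example and are not part of VTK or Ummap-io. The point is that a reader written against std::istream does not care whether the bytes come from a file or from a mapped memory region.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Read-only stream buffer over an existing memory region, without copying.
// Hypothetical stand-in for a stream backed by a ummap-io mapping.
class MemoryStreamBuf : public std::streambuf
{
public:
  MemoryStreamBuf(const char* data, std::size_t size)
  {
    // setg() takes non-const pointers, but this buffer never writes
    // through them: it only exposes the region as the get area.
    char* begin = const_cast<char*>(data);
    this->setg(begin, begin, begin + size);
  }
};

// Example "reader": it only sees a std::istream, like code that
// would otherwise open an ifstream from a file path.
std::size_t countLines(std::istream& input)
{
  std::size_t count = 0;
  for (std::string line; std::getline(input, line);)
  {
    ++count;
  }
  return count;
}
```

A caller would build a MemoryStreamBuf from the pointer and size, wrap it in a std::istream, and pass that to the reader. This sketch does not implement seeking (seekoff and seekpos are not overridden), so a reader that calls seekg would need more than what is shown here.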

Pros: with this approach, many readers and writers could benefit from new storage technologies. We did it with SAGE2 Motr, but we can imagine even more: reading from the network, from a database, and so on.
Cons: it requires an actual modification of the inner tools (the VTK library). Adding a new dependency (such as UmmapIo) and modifying core code is harder and takes longer than writing a plugin.

While the plugin has been developed and tested, the Boost approach is still at the proof-of-concept stage.

Acknowledgments

  • SAGE2 consortium funded by the European Union
  • Sebastien Vallat @ Atos for ummap-io support.
