CMake and the Wolfram Language

April 17, 2014

For the past 25 years, Wolfram has “incubated” the burgeoning Wolfram Language inside its flagship product, Mathematica, within the system’s computation module, called the kernel. The language and the kernel have since evolved into the Wolfram Language and the Wolfram Computation Engine, respectively.

The following text comes from the Wolfram Language product description:
“The Wolfram Language is a highly developed knowledge-based language that unifies a broad range of programming paradigms and uses its unique concept of symbolic programming to add a new level of flexibility to the very concept of programming.”

In this article, I will describe how the Wolfram team uses CMake to facilitate the development and deployment of new builds of the Wolfram Language and the Wolfram Computation Engine that executes it.

Wolfram Language Software Engineering

The Wolfram Language comprises a complex software system made up of several million lines of C/C++, Wolfram Language, and Java code. Wolfram uses a customized enhancement of C (hereafter called Wolfram-C) that supports object-oriented behavior and customized memory management operations for certain data structures. The source files written in Wolfram-C are processed by a tool called the precompiler prior to compilation with standard C compilers. Our build environments can also build the Wolfram-C files as C++ sources, but we do not use the C++ version for production because the C version offers speed enhancements and memory optimizations. After the base portion of the Wolfram Engine is compiled from Wolfram-C, a large portion of the remainder of the system is implemented in the Wolfram Language itself. In addition to the primary Wolfram-C sources and the Wolfram Language sources, creating the Wolfram Engine also requires a set of third-party libraries built specifically for each platform.
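The precompile-then-compile flow described above can be sketched in CMake roughly as follows. This is a minimal illustration rather than Wolfram’s actual build code: the PRECOMPILER_EXECUTABLE variable, the precompiler’s command-line arguments, the .wlc extension, and the file names are all assumptions made for the example.

```cmake
cmake_minimum_required(VERSION 3.10)
project(PrecompileSketch C)

# Hypothetical: path to the precompiler tool, supplied by the developer.
set(PRECOMPILER_EXECUTABLE "" CACHE FILEPATH "Path to the precompiler (assumed)")
if(NOT PRECOMPILER_EXECUTABLE)
  message(FATAL_ERROR "Set PRECOMPILER_EXECUTABLE to the precompiler tool")
endif()

# Hypothetical Wolfram-C sources; the real names and extension may differ.
set(WOLFRAMC_SOURCES
  kernel/eval.wlc
  kernel/memory.wlc
)

set(GENERATED_C_SOURCES)
foreach(src IN LISTS WOLFRAMC_SOURCES)
  get_filename_component(name ${src} NAME_WE)
  set(out ${CMAKE_CURRENT_BINARY_DIR}/${name}.c)

  # Regenerate the plain C file whenever its Wolfram-C source changes.
  # The "<input> <output>" argument order is an assumed interface.
  add_custom_command(
    OUTPUT ${out}
    COMMAND ${PRECOMPILER_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/${src} ${out}
    DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/${src}
    COMMENT "Precompiling ${src}"
    VERBATIM
  )
  list(APPEND GENERATED_C_SOURCES ${out})
endforeach()

# The generated files are then compiled by the standard C compiler.
add_library(engine_core STATIC ${GENERATED_C_SOURCES})
```

Because each generated file names its source in DEPENDS, a build set up this way would re-run the precompiler only for sources that have changed.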

Development Pipeline

Along with developing features and functionality for the Wolfram Language, Wolfram has also worked to build a streamlined development pipeline, capable of building and delivering versions of Mathematica (and the Wolfram Language) from stable sources in as little time as possible. In 2008, having narrowed our list of supported platforms to the three major platforms available today (Macintosh, Windows, and Linux), we reached a point where we had to maintain three separate build environments to build the Wolfram Engine. This three-way build strategy caused no end of trouble for our developers.

For example, the addition or deletion of source material required updates in three separate systems.  Developers wasted an inordinate amount of time just communicating changes to sources.

Furthermore, over time, each build environment had acquired its own development characteristics and facilities, which made it difficult for developers to leave one environment and work on problems on another platform. Those same idiosyncrasies made it difficult for new developers to get started on a particular platform.

In addition to these overall problems in the development pipeline, each platform’s build environment had its own set of quirks that caused problems for developers.

Linux

On Linux, we used a build environment based on make, but configured using the X11 project’s Imake tool. The Imake-derived build system suffered from several problems.

First, the system did not correctly handle source file dependencies, resulting in developers wasting time with unnecessary module rebuilds.

Second, Imake configuration, as a combination of C preprocessor syntax and make syntax, eventually led to a system that was too confusing for many developers to change.

Third, the Imake-based system did not correctly run builds in parallel.

Fourth, the original designer of the Imake-based system created a large monolithic build environment that built every module for each build.  Therefore, it was not easy for the developers of individual modules to only build the modules they needed.

Windows

On Windows, we made use of a custom combination of make and Visual Studio.  For each build, the system used make to invoke the precompiler tool on each Wolfram-C source file to create the generic C source suitable for use with the Visual Studio compiler.  After that step, a Visual Studio project drove the compilation of the C portion of the system, and custom scripts drove the processing of the Wolfram Language source files.  This system suffered from several problems.

For one, the initial portion required a make environment for pre-processing the Wolfram-C source files with the precompiler tool. This requirement led most Windows developers to build the C++ version of the Wolfram Engine, which did not require the precompiler tool. As a result, most Windows developers never built or directly tested the production version of the Wolfram Engine.

In addition, the varied stages of this system made it difficult to modify configuration parameters. The project files contained the build settings, so changing a setting required spelunking through projects to find the correct location.

Furthermore, the Visual Studio projects did not assemble the software into finished file layouts but, instead, relied on registry entries to point various components to pre-existing layouts constructed by the primary product installer.

Finally, the system lacked cohesive documentation, a gap that most often showed up as problems acquiring the correct set of third-party external libraries used to build the Wolfram Engine for Windows.

Macintosh

The Xcode build of the Wolfram Engine had numerous problems, which are listed below.

The Xcode projects only built the C++ version of the Engine.  As a result, Macintosh production builds did not use the Xcode projects. Instead, they used the Imake-configured make system.

In addition, Xcode does not provide a mechanism for informing the Xcode build processor that it should compile files with custom extensions as particular languages.  For example, you could not (from inside of Xcode on a global basis) have Xcode compile all .wlc files as C or C++ source files.  In Xcode, you can change the source file type for each file individually, but when you have thousands of source files with custom extensions, the Xcode environment quickly becomes unwieldy, making the develop->compile->debug cycle difficult.
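For comparison, stock CMake can state this kind of rule once for a whole list of files through the LANGUAGE source file property. The sketch below is illustrative only: the target and file names are hypothetical, and it assumes CMake 3.20 or later, where policy CMP0119 has CMake pass an explicit language flag to compilers that would not otherwise recognize the extension. How each generator honors the property may vary.

```cmake
cmake_minimum_required(VERSION 3.20)
project(CustomExtensionSketch C)

# Hypothetical sources that carry a custom extension.
set(CUSTOM_EXT_SOURCES
  src/eval.wlc
  src/memory.wlc
)

# Declare once, for the whole list, that these files are C sources.
set_source_files_properties(${CUSTOM_EXT_SOURCES} PROPERTIES LANGUAGE C)

add_library(custom_ext_demo STATIC ${CUSTOM_EXT_SOURCES})
```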

To avoid these two problems, most Macintosh developers did not use the Xcode project. As a result, the Xcode project source lists were often out of date.

Enter CMake

I first encountered CMake while experimenting with Kitware’s Insight Segmentation and Registration Toolkit (ITK) for image processing. I quickly realized that CMake had all the necessary features to solve our build environment problems. CMake allowed us to merge all three build systems into one system, while still serving the needs of each individual platform. CMake provided three gigantic productivity gains.

First, all three build platforms now use the same set of configuration files to enumerate sources and resources, so developers no longer need to communicate source changes.

Second, CMake can configure all three build environments to have the same set of features: they can all easily acquire the requisite third-party libraries before builds, build the production version using the precompiler tool, generate file layouts, build the C++ version, and so on.

Third, CMake produces dependency-correct build systems that build correctly in parallel in every build environment where the native build tool supports parallel builds, drastically reducing build times for production builds on our build servers. For example, Linux builds of the Wolfram Engine went from around an hour to on the order of five minutes, thanks to the ability to leverage multiple processors on the build servers.
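The first and third gains can be illustrated with a small sketch: one source list shared by every platform, with platform-specific files appended beside it, and a parallel build driven from the command line. The project, target, and file names are placeholders, not the Wolfram Engine’s real layout.

```cmake
cmake_minimum_required(VERSION 3.13)
project(UnifiedBuildSketch C)

# One shared source list for every platform (file names are placeholders).
set(ENGINE_SOURCES
  src/evaluator.c
  src/memory.c
)

# Platform-specific additions live next to the shared list.
if(WIN32)
  list(APPEND ENGINE_SOURCES src/platform_windows.c)
elseif(APPLE)
  list(APPEND ENGINE_SOURCES src/platform_macos.c)
else()
  # Other Unix-like systems, including Linux.
  list(APPEND ENGINE_SOURCES src/platform_linux.c)
endif()

add_library(engine STATIC ${ENGINE_SOURCES})

# Typical command-line usage with this file:
#   cmake -S . -B build                # configure for the native build tool
#   cmake --build build --parallel 8   # build with up to 8 parallel jobs
```

Adding or removing a source file then means editing a single list, and the generated Makefiles, Visual Studio projects, or Xcode projects pick up the change the next time CMake runs.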

In addition to addressing the primary problems with our build environments, having standardized on CMake has made transitioning to mobile platforms particularly easy.  CMake’s build description language is flexible enough to allow us to easily extend our support for the Wolfram Engine on iOS and Android, as well as hobbyist systems such as Raspberry Pi and Intel’s new platform, Edison.

Customizing CMake

While CMake met the majority of our needs, we did have to customize it to handle some build requirements that the standard implementation did not cover. For example, we needed add_custom_command() syntax that allowed custom commands to vary by build configuration. We changed the standard add_custom_command() signature to the following, where the arrows mark the added CONFIG option:

add_custom_command(OUTPUT output1 [output2 ...]
                   COMMAND command1 [ARGS] [args1...]
                   [COMMAND command2 [ARGS] [args2...] ...]
                   [MAIN_DEPENDENCY depend]
                   [DEPENDS [depends...]]
                   [IMPLICIT_DEPENDS <lang1> depend1
                                     [<lang2> depend2] ...]
                   [WORKING_DIRECTORY dir]
                   [COMMENT comment] [VERBATIM] [APPEND]
           --->    [CONFIG Debug | MinSizeRel | Release |
                           RelWithDebInfo | ...]
                   )

add_custom_command(TARGET target
                   PRE_BUILD | PRE_LINK | POST_BUILD
                   COMMAND command1 [ARGS] [args1...]
                   [COMMAND command2 [ARGS] [args2...] ...]
                   [WORKING_DIRECTORY dir]
                   [COMMENT comment] [VERBATIM]
           --->    [CONFIG Debug | MinSizeRel | Release |
                           RelWithDebInfo | ...]
                   )

The CONFIG option specifies that the custom command(s) should run only when the given build configuration is active. This option allows custom commands to have configuration-specific behavior.
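Stock CMake does not provide the CONFIG keyword shown above. As a rough point of comparison, and not as a description of our patch, configuration-dependent behavior can be approximated there with the $<CONFIG> generator expression. In this sketch the target name, the BUILD_CONFIG and BUILT_FILE variables, and the helper script are all hypothetical.

```cmake
cmake_minimum_required(VERSION 3.10)
project(ConfigSketch C)

# Tiny source file, generated here only to keep the sketch self-contained.
file(WRITE ${CMAKE_CURRENT_BINARY_DIR}/demo.c "int main(void) { return 0; }\n")
add_executable(demo ${CMAKE_CURRENT_BINARY_DIR}/demo.c)

# Hypothetical helper script: it does its work only for Debug builds.
file(WRITE ${CMAKE_CURRENT_BINARY_DIR}/post_build.cmake
"if(BUILD_CONFIG STREQUAL \"Debug\")
  message(STATUS \"Debug-only step for \${BUILT_FILE}\")
endif()
")

# $<CONFIG> is evaluated at build time, so the same rule serves both
# single-configuration and multi-configuration generators.
add_custom_command(TARGET demo POST_BUILD
  COMMAND ${CMAKE_COMMAND}
          -DBUILD_CONFIG=$<CONFIG>
          -DBUILT_FILE=$<TARGET_FILE:demo>
          -P ${CMAKE_CURRENT_BINARY_DIR}/post_build.cmake
  COMMENT "Running configuration-aware post-build step"
  VERBATIM
)
```

Unlike a CONFIG keyword, this approach still launches the command in every configuration and leaves the decision to the script.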

We also patched CMake to correct various minor problems in the Visual Studio and Xcode generators that prevented them from correctly handling certain linker and compiler flags. Finally, we added Objective-C as a fully supported CMake language. Here is an example of a CMakeLists.txt file configured for building an Objective-C library:

cmake_minimum_required(VERSION 2.8)

# OBJC as a project language relies on the Objective-C support described above.
project(ObjectiveKernel OBJC C CXX)

# Public headers, installed alongside the library.
set(OBJECTIVEKERNEL_HEADERS
    ObjectiveKernel.h
    OKExprStackElement.h
    OKKernel.h
    OKKernelDelegate.h
)

# Library sources; the headers are listed so that they appear in IDE projects.
set(OBJECTIVEKERNEL_SOURCES
    OKExprStackElement.m
    OKExprStackElement.h
    OKKernel.m
    OKKernel.h
    OKKernelDelegate.h
)

# Mark the headers so that CMake never tries to compile them.
set_source_files_properties(
    OKExprStackElement.h
    OKKernel.h
    OKKernelDelegate.h
    PROPERTIES HEADER_FILE_ONLY ON
)

add_library(ObjectiveKernel
    STATIC
    ${OBJECTIVEKERNEL_SOURCES}
)

install(TARGETS ObjectiveKernel
    ARCHIVE DESTINATION lib
)

install(FILES ${OBJECTIVEKERNEL_HEADERS}
    DESTINATION include
)

 

I am working with the CMake developers to merge these customizations back into the main CMake product.

Conclusion

Any organization doing non-trivial, cross-platform development with C, C++, Fortran, Objective-C, or another similar language should seriously consider using CMake to minimize the stress that cross-platform development places on the development pipeline. CMake smoothed out the wrinkles in our development processes and allowed for fast turnaround times on our build servers. In addition, CMake allowed us to fix platform-specific build idiosyncrasies that caused problems for unified development efforts. On behalf of our developers, I offer thanks to the CMake project and its developers for providing such a useful and productive software development tool.

Steve Wilson is a Senior Technical Staff Member in the Core Mathematica Engineering group at Wolfram.  He works on communication protocols for the Wolfram Language, as well as software engineering and software infrastructure for the Wolfram Engine.
