SciPy 2014

Event Details

Members of Kitware are participating in several presentations and activities.

‘Reproducible Science: Walking the Walk.’ This tutorial will be presented by Luis Ibáñez, Aashish Chaudhary, Jean-Christophe Fillion-Robin, and Matt McCormick.

Description

Reproducibility verification is the core principle of the scientific method. The goal of this tutorial is to train reproducible research warriors on the real principles of scientific research and on practices and tools that make it possible to implement an end-to-end data analysis workflow enabling reproducibility verification. These tools, all open source, combined with open data practices and open access publications, empower anyone to practice reproducible research today.

The tutorial will expose attendees to the tools and practices through hands-on activities and an exercise that covers data gathering, storage, and analysis, through to publication as a reproducible article. The target audience is data scientists and researchers who generate scientific content (data, analysis code, and articles). Attendees are expected to have basic familiarity with scientific Python and Git, and to be interested in the scientific publication and validation process.

‘SimpleITK – Advanced Image Analysis for Python.’ This talk is co-authored by Luis and Matt.

Description

SimpleITK brings advanced image analysis capabilities to Python. SimpleITK exposes a large collection of image processing filters from ITK, including image segmentation and registration. SimpleITK is freely available as an open source package under the Apache 2.0 License.

While many Python packages process 2D photographic images, scientific image analysis imposes additional requirements. Images encountered in these domains often have anisotropic pixel spacing or a spatial orientation, so calculations are best performed in physical space rather than pixel space.
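To illustrate why physical space matters, here is a minimal plain-Python sketch that maps a pixel index to a physical point as origin + direction x (spacing * index), the convention ITK documents for its images. The function name `index_to_physical` and the sample spacing values are assumptions made for illustration and are not part of the SimpleITK API; SimpleITK images offer their own `TransformIndexToPhysicalPoint` method for this purpose.

```python
from math import dist


def index_to_physical(index, origin, spacing, direction):
    """Map a pixel index to a physical point.

    Computes origin + direction @ (spacing * index), where `direction`
    is a row-major matrix given as a sequence of rows.
    """
    scaled = [s * i for s, i in zip(spacing, index)]
    return tuple(
        o + sum(d * v for d, v in zip(row, scaled))
        for o, row in zip(origin, direction)
    )


# Hypothetical volume: 0.5 mm in-plane spacing, 3.0 mm between slices.
origin = (0.0, 0.0, 0.0)
spacing = (0.5, 0.5, 3.0)
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

start = index_to_physical((0, 0, 0), origin, spacing, identity)
along_x = index_to_physical((4, 0, 0), origin, spacing, identity)
along_z = index_to_physical((0, 0, 4), origin, spacing, identity)

# Both points are 4 pixels from the start, yet their physical distances differ.
print(dist(start, along_x))  # 2.0
print(dist(start, along_z))  # 12.0
```

A distance measured in pixels would report both displacements as equal, which is why measurements on anisotropic data are usually taken in physical units.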

SimpleITK brings to Python a plethora of capabilities for performing image analysis. Although SimpleITK was developed by the biomedical imaging community, it is also used for generic image processing. It differs from OpenCV by offering 3D and multi-component images, and from SciPy by offering the abstraction of image classes and their associated data structures. This applies to imaging modalities such as CT, MRI, fMRI, and ultrasound, and to microscopy modalities such as confocal, SEM, TEM, and traditional bright and dark field.

Among the key functionalities supported by SimpleITK are over 260 advanced image filtering and segmentation algorithms, as well as access to scientific image file formats, including specialized formats such as DICOM, NIfTI, NRRD, VTK, and other formats that preserve 3D metadata.

SimpleITK development is sponsored by the US National Library of Medicine.

‘Python cross-compilation and platform builds for HPC and scientific computing.’ This presentation will be given by Jean-Christophe, Aashish, and Matt.

Description

This talk describes a system for faster, more straightforward cross-platform CPython builds across HPC, desktop, and mobile platforms, with support for multiple build system generators and easy integration with C/C++/Fortran scientific computing libraries.

Abstract

While Python has been implemented in a number of languages as it has grown in popularity, the original C implementation, CPython, remains the most widely adopted for scientific computing. This can be largely attributed to the ubiquitous presence of C build systems on scientific computing platforms and to the large number of libraries with a C interface, which are bridged to Python through CPython extension modules. CMake [1] is a popular cross-platform build system that performs reliable system introspection and configuration of C/C++/Fortran builds. The scientific computing community has equipped the python.org CPython distribution with a CMake configuration so that CPython can be built with CMake. This enables, for example, cross-compilation for HPC clusters, the Raspberry Pi, and ARM architectures such as those found in mobile platforms; static and shared builds; and a static Python with C extension modules included in the library. Additionally, it provides faster compilation, cross-platform builds with multiple build system generators, easy integration with other CMake-configured projects, and configuration and linking of other scientific computing libraries into C extensions. It is a community-maintained, open source project available on GitHub [2], with nightly test results submitted to a software quality dashboard [3].
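As a small illustration of the static versus shared distinction, the sketch below asks a running interpreter how it was built, using only the standard library. The helper name `describe_build` is hypothetical and is not part of the CMake build system described here. Note that `Py_ENABLE_SHARED` is a build-time configuration variable that may be unset on some platforms (Windows, for instance), in which case this sketch reports `False`.

```python
import sys
import sysconfig


def describe_build():
    """Summarize how the running CPython interpreter was built."""
    return {
        # "cpython" for the reference implementation.
        "implementation": sys.implementation.name,
        # Platform tag of the build, e.g. "linux-x86_64".
        "platform": sysconfig.get_platform(),
        # True when the interpreter links against a shared libpython.
        "shared_libpython": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        # Modules compiled directly into the interpreter binary.
        "builtin_modules": sorted(sys.builtin_module_names),
    }


info = describe_build()
print(info["implementation"], info["platform"])
print("shared libpython:", info["shared_libpython"])
print("modules built into the interpreter:", len(info["builtin_modules"]))
```

In a static build with extension modules included in the library, the list of built-in modules would be expected to grow accordingly.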

[1] http://www.cmake.org
[2] https://github.com/davidsansome/python-cmake-buildsystem
[3] http://open.cdash.org/index.php?project=CPython

‘Climate & GIS: User friendly data access; workflows; manipulation; analysis and visualization of climate models.’ This presentation will be given by Aashish.

Description

The impact of climate change will resonate through a broad range of fields including public health, infrastructure, water resources, and many others. Long-term coordinated planning, funding, and action are required for climate change adaptation and mitigation. Unfortunately, widespread use of climate data (simulated and observed) in non-climate science communities is impeded by factors such as large data size, lack of adequate metadata, poor documentation, and lack of sufficient computational and visualization resources. Additionally, working with climate data in its native format is not ideal for all types of analyses and use cases, and it often requires technical skills (and software) that are unnecessary when working with other geospatial data formats.

We present open source tools developed as part of ClimatePipes and OpenClimateGIS that address many of these challenges. Together they form an open source platform that provides state-of-the-art, user-friendly data access, processing, analysis, and visualization for climate and other relevant geospatial datasets, making these data available to non-researchers, decision-makers, and other stakeholders.

The overarching goals are:

  • Enable users to explore real-world questions related to environment and climate change.
  • Provide tools for data access, geo-processing, analysis, and visualization.
  • Facilitate collaboration by enabling users to share datasets, workflows, and visualizations.

Some of the key technical features include 1) support for multiprocessing of large datasets using the Celery distributed task queue for Python, 2) generic iterators that allow data to be streamed to arbitrary formats (relatively) easily (e.g., ESRI Shapefile, CSV, keyed ESRI Shapefile-CSV, NetCDF), 3) NumPy-based array computations that allow calculations such as monthly means or heat indices, optionally on temporally grouped data slices, 4) decorators that expose an existing Python API as a RESTful API, and 5) a simple-to-use, lightweight web framework and JavaScript libraries for analyzing and visualizing geospatial datasets using D3 and WebGL.
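The ideas of temporal grouping and streaming to an output format can be sketched with the standard library alone. This is an illustrative example under stated assumptions, not the OpenClimateGIS API: the function names `monthly_means` and `stream_csv` and the sample records are hypothetical, and a real implementation would operate on NumPy arrays rather than Python lists.

```python
import csv
import io
from collections import defaultdict
from datetime import date


def monthly_means(records):
    """Group (date, value) pairs by (year, month) and yield one mean per group.

    Grouping holds the values in memory; only the results are yielded lazily.
    """
    groups = defaultdict(list)
    for day, value in records:
        groups[(day.year, day.month)].append(value)
    for year, month in sorted(groups):
        values = groups[(year, month)]
        yield year, month, sum(values) / len(values)


def stream_csv(rows, header, out):
    """Write rows from any iterator to a file-like object as CSV."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow(row)


# Hypothetical daily temperature samples.
records = [
    (date(2014, 1, 1), 10.0),
    (date(2014, 1, 2), 14.0),
    (date(2014, 2, 1), 20.0),
]

buffer = io.StringIO()
stream_csv(monthly_means(records), ("year", "month", "mean"), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Because `stream_csv` accepts any iterator of rows, the same grouped results could be passed to a different writer without changing the computation, which is the appeal of keeping the iterator generic.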

‘Web-based Analysis and Visualization for Large Geospatial Datasets for Climate Scientists.’ This presentation will be given by Aashish.

Description

At present, the majority of the climate science community still relies heavily on primitive analysis and visualization tools that are based on the thick (or fat) client application concept, meaning that the user must download software to the machines or hardware where the data resides (e.g., laptops, desktops, or HPC machines). In such cases, users encounter multiple levels of installation challenges, such as finding the right prerequisite software packages, software versions, and currently supported hardware and operating systems. Analysis and visualization tools have thus begun moving toward the thin client application concept, where users install very little software. In most cases, only a web browser is needed. The analysis and visualization software is deployed on a central server rather than on each individual system, eliminating the installation and operating system challenges that users face. This approach also provides the flexibility to install the software on a user's system using a virtual machine (VM) if scientists need a local installation.

The need for highly scalable, collaborative software for large climate and geospatial data analysis and visualization that is easy to install and use led us to develop the UVis toolkit. UVis utilizes the latest web technologies, such as RESTful APIs and HTML5, to provide powerful visualization and analysis capabilities in modern web browsers. Underneath, it is built on top of the scientific Python-based UV-CDAT, ParaViewWeb, and the Django Python web framework. The UVis backend Python API is developed on top of UV-CDAT to utilize its analysis and visualization capabilities. The web API of UVis utilizes ParaViewWeb at its core. ParaViewWeb enables communication with a UV-CDAT server running on a remote visualization node or cluster using a lightweight JavaScript API. By utilizing UV-CDAT and ParaViewWeb, UVis provides interactive 2D and 3D visualization, remote job submission and processing, and exploratory and batch-mode analysis for scientific models and observational datasets.

Matt is chairing the Birds-of-a-Feather (BOF) sessions and the Vision, Imaging, and Visualization symposium.

Matt, Luis, Aashish, and Jean-Christophe are participating in the conference's developer sprints.

Matt and Aashish are members of the Program Committee.

In addition, Kitware is a Silver sponsor for the event.

 

Time

July 6 (Sunday) - 12 (Saturday)
