Introducing Versio: Distributed Version Control for Spatial Data

Boundless is pleased to introduce Versio, our new data management and collaboration platform built specifically for spatial data. We’ve been working on Versio for over a year, extending the power of the GeoGig approach into an online platform that will transform the way organizations collaborate on, control, and share their spatial data.

Versio is now available in private beta with select customers and community members. We expect to release the full platform in early 2015, but if you would like to get an early look and help us improve the product, please request an invitation at

The Problem

Spatial data is dynamic, and anyone who works with it knows that keeping it updated is a challenge. The larger the organization, the worse this problem gets. Growing teams, field-based data collection, and integrated workflows that depend on timely, updated datasets all increase the complexity of managing spatial data. Too often this challenge is met by emailing shapefiles around, resulting in a file-naming nightmare and a labor-intensive process to merge data together. Alternatively, if you have the money, an expensive versioning database can be used, but even that suffers from potential centralized outages, file locks, and complex system administration.

The Solution

To solve this problem, we looked to a model of collaboration that has revolutionized software development: distributed version control systems. Boundless first introduced the distributed versioning concept to Spatial IT with GeoGig, an open source tool that draws inspiration from Git but adapts its core concepts specifically for spatial data. As GeoGig matured, we built Versio to broaden the audience by including a high-performance server and an easy-to-use web interface.

Versio’s distributed versioning model and repository data structure enable new approaches to collaboration and data management. On the collaboration side, data owners and GIS analysts determine specifically who they collaborate with, creating private data repositories and inviting others to join, or sharing a data repository for the crowd to update. Changes to a dataset can be stored in a different “branch” of the repository, and data owners have explicit control over what changes are merged back into the core repository.

The repository data structure preserves the entire lineage of a dataset, allowing a user to track and visualize changes over time. Rather than relying on a central database, each individual’s working copy is a complete repository of the spatial data. Each “commit” to the repository is saved as a discrete version, yet also maintains its relationship to previous versions in the repository. This allows users to visualize the lineage of features across all versions, as well as the changes between different versions. Additionally, file sizes are minimized because only the changes between versions are stored (eliminating redundant data), and it is easy to roll back to a previous version of a dataset.
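To make the idea concrete, here is a minimal sketch of a commit history that stores only per-feature changes and rebuilds any version by replaying them. It is a toy model written for illustration, not GeoGig's or Versio's actual storage format or API; the names ToyRepo, Commit, checkout, diff, and rollback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A feature is just a dict of attributes here (hypothetical; real features
# would also carry a geometry).
Feature = Dict[str, object]


@dataclass
class Commit:
    id: int
    parent: Optional[int]                   # link to the previous version
    message: str
    changes: Dict[str, Optional[Feature]]   # feature id -> new value, None = deleted


@dataclass
class ToyRepo:
    commits: List[Commit] = field(default_factory=list)
    head: Optional[int] = None

    def commit(self, message: str, changes: Dict[str, Optional[Feature]]) -> int:
        """Record only what changed, linked to the current head."""
        c = Commit(len(self.commits), self.head, message, dict(changes))
        self.commits.append(c)
        self.head = c.id
        return c.id

    def checkout(self, commit_id: int) -> Dict[str, Feature]:
        """Rebuild a full dataset by replaying changes from the root."""
        chain = []
        cur: Optional[int] = commit_id
        while cur is not None:              # walk parent links back to the root
            chain.append(self.commits[cur])
            cur = self.commits[cur].parent
        state: Dict[str, Feature] = {}
        for c in reversed(chain):           # replay oldest first
            for fid, feat in c.changes.items():
                if feat is None:
                    state.pop(fid, None)
                else:
                    state[fid] = feat
        return state

    def diff(self, old_id: int, new_id: int) -> Dict[str, Tuple[Optional[Feature], Optional[Feature]]]:
        """Return (before, after) for every feature that differs."""
        old, new = self.checkout(old_id), self.checkout(new_id)
        return {
            fid: (old.get(fid), new.get(fid))
            for fid in sorted(old.keys() | new.keys())
            if old.get(fid) != new.get(fid)
        }

    def rollback(self, commit_id: int) -> None:
        """Point head at an earlier version; history is kept, not erased."""
        self.head = commit_id


repo = ToyRepo()
v0 = repo.commit("import roads", {"r1": {"name": "Main St"}, "r2": {"name": "Oak Ave"}})
v1 = repo.commit("rename r1, drop r2", {"r1": {"name": "Main Street"}, "r2": None})
```

In this sketch the second commit stores two small changes rather than a second copy of the dataset, and both versions remain recoverable through checkout. Replaying from the root on every checkout is the simplest possible approach; a real system would likely use a more efficient structure.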

With Versio, we want to support as many editing workflows as possible, both online and offline. So we’ve made the platform client-agnostic to support traditional desktop GIS software as well as web, mobile, or custom applications built on the Versio API.

