GeoGig Library for Python Announced

Victor Olaya

At Boundless, we usually describe GeoGig (formerly GeoGit) not as an application itself, but as a library. We see it as a basic component of geospatial data management on top of which other applications can be built. While GeoGig currently has a command-line interface (CLI), adding new ways of interacting with GeoGig will increase the possibilities for creating new projects that rely on GeoGig to manage changes to geospatial data. We hope to see GeoGig as the core of an ecosystem of tools that solve a variety of problems.

We have started developing a Python library, the geogig-py library (formerly geogit-py), to make it much easier to create GeoGig-based applications. Since Python is a widespread scripting language, this will allow other developers to incorporate GeoGig-based capabilities into many other applications. In fact, we are already using it to create a plugin to bring the versioning capabilities of GeoGig into QGIS.

Basic GeoGig Automation

The geogig-py library will also make it easier to automate tasks when working with a GeoGig repository, since all the features of the Python language can be used alongside GeoGig methods. This is especially useful for those who use GeoGig in a workflow that can be partially or fully automated.

Here are some examples to provide an idea of what using the library is like. A basic workflow should start with something like this:

# Create repo
repo = Repository('path/to/repo/folder', init = True)

# Configure
repo.config(geogig.USER_NAME, 'myuser')
repo.config(geogig.USER_EMAIL, '')

# Add some data and create a snapshot
repo.addandcommit('first import')

You can automate this first step and easily import a set of layers, creating a different snapshot for each one. Assuming that we have a set of shapefiles in a folder, the following code will do it.

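import os

for f in os.listdir(folder):
    if f.endswith('.shp'):
        path = os.path.join(folder, f)
        # importshp is assumed to be the repository's shapefile import method
        repo.importshp(path)
        repo.addandcommit('Imported ' + f)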
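The same idea can be split into small functions that are easy to test without a repository. This is only a sketch: `shapefiles_in`, `commit_message` and `import_folder` are hypothetical helpers, not part of geogig-py, and `import_layer` and `commit` are callables supplied by the caller.

```python
import os


def shapefiles_in(folder):
    """Return the full paths of all .shp files in `folder`, sorted by name."""
    names = sorted(f for f in os.listdir(folder) if f.lower().endswith('.shp'))
    return [os.path.join(folder, name) for name in names]


def commit_message(path):
    """Build the snapshot message for one imported layer."""
    return 'Imported ' + os.path.basename(path)


def import_folder(folder, import_layer, commit):
    """Import every shapefile in `folder`, creating one snapshot per layer.

    `import_layer` and `commit` are hypothetical callables, for example
    bound methods of a repository object. Returns the commit messages used.
    """
    messages = []
    for path in shapefiles_in(folder):
        import_layer(path)
        message = commit_message(path)
        commit(message)
        messages.append(message)
    return messages
```

With a repository object, this might be called as `import_folder(folder, import_method, repo.addandcommit)`, where `import_method` is whichever import call the repository provides.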

Editing Features

In a normal GeoGig workflow, you export from a GeoGig repository, edit the exported data using the tool of your choice (e.g. a desktop GIS like QGIS), and then import the changed layer so GeoGig can compute the changes that have been introduced, which are later used to create a new snapshot.

With geogig-py, that approach is still possible, but you can also modify a feature directly, without exporting the layer first. Internally, geogig-py still calls GeoGig import/export commands but wraps them to expose them in a more practical way. It is also more efficient, since it does not import/export the whole layer. Here’s an example.

# Take a feature and modify its geometry
feature = repo.feature(geogig.HEAD, 'parks/1')
geom = feature.geom
attributes = feature.attributesnogeom
newgeom = geom.buffer(5.0)

# insert the modified geometry and create a new snapshot with the changes
repo.insertfeature(feature.path, attributes, newgeom)
repo.addandcommit('modified parks/1 (buffer computed)')

In this case we have computed a buffer, but you can modify the geometry as you like. Geometries are Shapely objects, so you can use all the methods in that powerful library to work with them. You can also modify the non-geometry attributes in the feature (though we haven’t done so in this example).
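Editing the non-geometry attributes could follow the same pattern. The sketch below assumes that `feature.attributesnogeom` behaves like a dictionary mapping attribute names to values, which is an assumption rather than something confirmed here. `with_updated_attributes` is a hypothetical helper, not part of geogig-py.

```python
def with_updated_attributes(attributes, **changes):
    """Return a copy of `attributes` with `changes` applied.

    Raises KeyError if a change names an attribute the feature does not
    have, so a typo cannot silently add a new attribute.
    """
    unknown = sorted(set(changes) - set(attributes))
    if unknown:
        raise KeyError('unknown attributes: ' + ', '.join(unknown))
    updated = dict(attributes)
    updated.update(changes)
    return updated


# Plain dictionary standing in for feature.attributesnogeom (assumed dict-like)
original = {'name': 'Central Park', 'area': 3.41}
edited = with_updated_attributes(original, name='Central Park (buffered)')
```

The result could then be passed to the call shown earlier, as in `repo.insertfeature(feature.path, edited, feature.geom)`, followed by `repo.addandcommit(...)` to create the snapshot.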

Working with GeoGig Branches

Working with branches is also rather simple:

# Create a branch at the current HEAD commit to work on it
repo.createbranch(repo.head, 'mybranch', checkout = True)

# [...] Perform some work on the branch, modifying the repo data and creating new commits

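# Bring changes to master branch (which might itself have changes)
# Assumes the repository object exposes checkout() and merge()
repo.checkout('master')
try:
    repo.merge('mybranch')
    print('Merge correctly executed. No merge conflicts')
except GeoGigConflictException:
    print('Cannot merge. There are merge conflicts')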
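In an automated workflow it can be handy to turn the conflict exception into a return value. The following is a minimal sketch under stated assumptions: the `merge` method name is assumed, `try_merge` and `FakeRepo` are hypothetical, and `GeoGigConflictException` is redefined locally as a stand-in so the example runs without geogig-py installed.

```python
class GeoGigConflictException(Exception):
    """Stand-in for the library's conflict exception (defined here so the sketch runs alone)."""


def try_merge(repo, branch):
    """Merge `branch` into the current branch.

    Returns True if the merge succeeded and False if it raised a conflict.
    Assumes `repo` has a merge(branch) method.
    """
    try:
        repo.merge(branch)
    except GeoGigConflictException:
        return False
    return True


class FakeRepo(object):
    """Hypothetical test double that reports conflicts for selected branches."""

    def __init__(self, conflicting):
        self.conflicting = set(conflicting)
        self.merged = []

    def merge(self, branch):
        if branch in self.conflicting:
            raise GeoGigConflictException(branch)
        self.merged.append(branch)


repo = FakeRepo(conflicting=['risky'])
results = {name: try_merge(repo, name) for name in ['mybranch', 'risky']}
```

With a real repository, `try_merge(repo, 'mybranch')` would let a script decide what to do next, for example skip the branch or report it for manual conflict resolution.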

Growing GeoGig

Although most of the functionality of GeoGig is already available through geogig-py, some features are not yet supported. Unsupported features mostly correspond to convenience options that can very easily be implemented, or replicated with a few lines of Python code.

A side effect of developing geogig-py is that it has helped us improve GeoGig itself. Several new features have been added to GeoGig commands to allow for better integration, and some new commands have even been implemented. Using GeoGig from geogig-py has given us more insight into ways that GeoGig can be used by different applications and services, and has helped us shape and improve it.

A comprehensive test suite is included with geogig-py. It also serves as a real-world test suite for GeoGig itself, adding to the large collection of tests that GeoGig already has. Moreover, we plan to use geogig-py as part of our quality assurance, specifically for testing GeoGig and prototyping GeoGig use cases.

Soon we will release some of the projects that we are working on that rely on geogig-py. Stay tuned for further updates.