Clustered Configuration with GeoServer Geospatial Software Part 2

In our last post on clustering, we talked about the theory behind several options for clustering. In this post, we’ll go into an example of clustering, taken from our recent experience with one of our OpenGeo Suite Enterprise clients. If you’ll be attending FOSS4G-NA and want to learn more about clustering and GeoServer, consider attending our GeoServer training and Juan Marin’s GeoServer in Production presentation (scheduled for 5/23/2013 at 11:30 am).

Clustering Scenario

In the following scenario, we will work through the installation and configuration of two GeoServers, each inside its own servlet container instance on the same machine. Each servlet container will use the same JRE and the same container binaries (Apache Tomcat 7), but they will have independent configurations that allow them to run on different ports. These two GeoServer/Tomcat instances will be fronted by a local software proxy called HAProxy, which acts as an HTTP/TCP load balancer. The load balancer configuration used here provides very basic “round robin” balancing of the GeoServers. More sophisticated load-balancing configurations are possible, but are beyond the scope of this example. Both GeoServers will be deployed as WAR files placed into each Tomcat instance’s webapps directory. It is possible to have multiple instances of Tomcat share a single web application through the use of contexts. This is useful if you anticipate that your web application (GeoServer) will be changed or updated frequently, but it isn’t necessary.

Implementation

The following steps walk through the installation and configuration of a basic cluster containing two GeoServers in separate Tomcat servlet containers, behind an HAProxy load balancer / proxy on the same machine (a high-performance configuration). The steps are:

  1. Download and unpack Tomcat binaries
  2. Create individual Tomcat instances
  3. Start Tomcat instances and deploy GeoServer applications
  4. Install and configure load balancer / proxy server

Following this walk-through, we will discuss other options for and extensions of this configuration such as high-availability, alternate proxy tools/configurations, database-backed catalogs, and triggered configuration reloads.

Download and unpack Tomcat binaries

  1. Start by downloading a binary distribution of Tomcat to your machine. This example uses the latest version (7.0.39 at the time of writing) as a .tar.gz file from http://tomcat.apache.org/download-70.cgi.
  2. Unpack this archive to a suitable location: In this example the entire contents of the archive are extracted to /var/tomcat. By convention, we’ll refer to this directory as $CATALINA_HOME.
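
The two steps above can be sketched in shell roughly as follows. The helper name unpack_tomcat is hypothetical, and the archive URL in the usage comment is an assumption based on the usual Apache archive layout rather than something taken from this post, so check it against the download page.

```shell
#!/bin/sh
# Hypothetical helper: unpack a Tomcat .tar.gz so that its contents land
# directly in the target directory (the future $CATALINA_HOME) instead of
# in a versioned subdirectory.
unpack_tomcat() {
    archive=$1
    target=$2
    mkdir -p "$target" &&
    tar -xzf "$archive" -C "$target" --strip-components=1
}

# Example usage (run as root; the URL is an assumption):
#   wget https://archive.apache.org/dist/tomcat/tomcat-7/v7.0.39/bin/apache-tomcat-7.0.39.tar.gz
#   unpack_tomcat apache-tomcat-7.0.39.tar.gz /var/tomcat
```

After this, /var/tomcat/bin/catalina.sh should exist, which is the script the instances are started with later on.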

Create individual Tomcat instances

Next we’ll make two directories for two separate instances of Tomcat to run from. Both instances will use the same Tomcat binaries from the directory created above; the directories created here will just hold the logs and configuration for each of our instances. In this example we’ll make two instance directories, /var/tomcat1 and /var/tomcat2. These will be referred to as $CATALINA_BASE directories. Each of these directories needs a basic structure and some initial content to host a Tomcat instance. Run the following commands (or a variation thereof) to create the basic directory structure.

# mkdir -p /var/tomcat1/conf
# mkdir -p /var/tomcat1/logs
# mkdir -p /var/tomcat1/temp
# mkdir -p /var/tomcat1/webapps
# mkdir -p /var/tomcat1/work
# mkdir -p /var/tomcat2/conf
# mkdir -p /var/tomcat2/logs
# mkdir -p /var/tomcat2/temp
# mkdir -p /var/tomcat2/webapps
# mkdir -p /var/tomcat2/work

Each Tomcat instance needs two configuration files to start with that define how the service will run (name, host(s), and ports). Copy the files server.xml and web.xml from the $CATALINA_HOME/conf directory into each of the $CATALINA_BASE/conf directories. We’ll need to edit each of the server.xml files so that each Tomcat instance runs and shuts down on different ports. In each file, look for lines like:

<Server port="8005" shutdown="SHUTDOWN">

<Connector port="8080" protocol = "HTTP/1.1"
 connectionTimeout="20000"
 redirectPort="8443" />

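Change these port values. The values can typically be any unused port above 1024. Within each file the SHUTDOWN and HTTP Connector ports must differ from each other, and the values in tomcat1/conf/server.xml must differ from those in tomcat2/conf/server.xml. If the AJP connector (port 8009 in the stock server.xml) is enabled, give it a distinct port in each file as well, or comment it out if you don’t use it. For example: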

  • On tomcat1: SHUTDOWN on port 8005, serve HTTP requests on 8085
  • On tomcat2: SHUTDOWN on port 8006, serve HTTP requests on 8086
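
The copy and port edits described above could be scripted along these lines. This is a minimal sketch: the helper name make_instance_conf is hypothetical, and it assumes the stock server.xml still contains the default values port="8005" and port="8080".

```shell
#!/bin/sh
# Hypothetical helper: copy web.xml from $CATALINA_HOME/conf into an
# instance's conf directory and write a server.xml with the default
# shutdown (8005) and HTTP (8080) ports rewritten.
# Arguments: CATALINA_HOME, CATALINA_BASE, shutdown port, HTTP port.
make_instance_conf() {
    home=$1; base=$2; shutdown_port=$3; http_port=$4
    mkdir -p "$base/conf" &&
    cp "$home/conf/web.xml" "$base/conf/" &&
    sed -e "s/port=\"8005\"/port=\"$shutdown_port\"/" \
        -e "s/port=\"8080\"/port=\"$http_port\"/" \
        "$home/conf/server.xml" > "$base/conf/server.xml"
}

# Example usage, matching the ports listed above:
#   make_instance_conf /var/tomcat /var/tomcat1 8005 8085
#   make_instance_conf /var/tomcat /var/tomcat2 8006 8086
```

Editing the two files by hand works just as well; the function only makes the step repeatable if you add more instances.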

Start Tomcat instances and deploy GeoServer applications

Once configured, we’re ready to start up the servlet containers. We could do this by running a series of commands from the terminal, but since these steps often need to be repeated it is more practical to write a small script. Create two scripts: /var/tomcat1/tomcat1.sh and /var/tomcat2/tomcat2.sh. These scripts will be identical except for the value of the $CATALINA_BASE variable.

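#!/bin/sh
# Shared Tomcat binaries, and this instance's own directory
# (use /var/tomcat2 in tomcat2.sh)
export CATALINA_HOME=/var/tomcat
export CATALINA_BASE=/var/tomcat1
export CATALINA_TMPDIR=$CATALINA_BASE/temp
# Adjust to the JRE location on your system
export JRE_HOME=/usr/lib/jvm/java-6-openjre/jre
# Core Tomcat JARs (classpath entries are separated by ":" on Linux)
export CLASSPATH=$CATALINA_HOME/bin/bootstrap.jar:$CATALINA_HOME/bin/tomcat-juli.jar

# Start this Tomcat instance
$CATALINA_HOME/bin/catalina.sh start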

These scripts will perform the following functions:

  • Define the location of the Tomcat binaries in $CATALINA_HOME
  • Set the location for the current container in $CATALINA_BASE and $CATALINA_TMPDIR
  • Define the location of the JRE for Tomcat to use (we’re assuming Java is installed)
  • Define the CLASSPATH to the core Tomcat JARs
  • Export each environment variable so they are available to the Tomcat start-up calls
  • Run the Catalina control script to start the Tomcat service

With the files created, make them executable and run both /var/tomcat1/tomcat1.sh and /var/tomcat2/tomcat2.sh to configure and start the two services. To confirm that the services started correctly, you can tail the catalina.out files in /var/tomcat1/logs and /var/tomcat2/logs. Next, copy the geoserver.war file (or unpacked application directory) into each of the /var/tomcat1/webapps and /var/tomcat2/webapps directories. The applications should deploy automatically. Confirm that you have two GeoServers running independently by browsing to each one on its respective port (8085 and 8086 in this example).

Two GeoServers
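
The start-up and deployment steps above might look something like this in practice. The helper name deploy_war is hypothetical, and the location of geoserver.war in the usage comment is an assumption; the other paths and ports come from this walk-through.

```shell
#!/bin/sh
# Hypothetical helper: copy one WAR file into the webapps directory of
# every instance directory listed after it.
deploy_war() {
    war=$1
    shift
    for base in "$@"; do
        cp "$war" "$base/webapps/" || return 1
    done
}

# Example usage (run as root):
#   chmod +x /var/tomcat1/tomcat1.sh /var/tomcat2/tomcat2.sh
#   /var/tomcat1/tomcat1.sh && /var/tomcat2/tomcat2.sh
#   tail -n 20 /var/tomcat1/logs/catalina.out /var/tomcat2/logs/catalina.out
#   deploy_war ./geoserver.war /var/tomcat1 /var/tomcat2
#   curl -sI http://localhost:8085/geoserver/ | head -n 1
#   curl -sI http://localhost:8086/geoserver/ | head -n 1
```

The two curl calls are a command-line alternative to opening each GeoServer in a browser; both should return an HTTP status line once the applications have finished deploying.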

Install and configure load balancer / proxy server

Install a web server that is capable of acting as a load balancer in front of the cluster. In this example we use HAProxy installed on Ubuntu Linux using standard package management. That said, there are many other tools you can use as a front-end load balancer / proxy server. One example would be Apache HTTP Server with mod_proxy_http and mod_proxy_balancer. Microsoft IIS can also sit in front of Tomcat instances using Network Load Balancing (NLB) or Application Request Routing (ARR). Configure HAProxy to act as a load balancer using the following sample in /etc/haproxy/haproxy.cfg. Stop HAProxy, copy / edit the config file in place, and then restart.

global
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000
    log 127.0.0.1 local0
    log 127.0.0.1 local7 debug

frontend http-in
    bind *:80
    default_backend GeoServer

backend GeoServer
    balance roundrobin
    server GeoServer1 localhost:8085 maxconn 32 check
    server GeoServer2 localhost:8086 maxconn 32 check
    ##option httpchk
    ##option forwardfor

listen admin
    bind *:8080
    stats enable

(Note that these are very basic configurations. Your system administrator will have a better idea of what is the norm for your organization.) There are a few ways you can confirm that HAProxy is working and balancing the clustered back-ends as anticipated. 1) Browse to the HAProxy admin page at http://<server>:8080/haproxy?stats:

HAProxy stats

In the figure above we see that HAProxy recognizes the GeoServer backend with two members, GeoServer1 and GeoServer2. 2) You might also make a noticeable configuration change to one of the GeoServers and observe that requests to the single proxy URL are answered by all members of the backend. For example, changing a single SLD file on one instance and then repeatedly requesting a layer that uses that style through HAProxy confirms that both of our GeoServers are handling requests.

Conflicting styles from two different GeoServers through HAProxy
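
The first check can also be done from the command line. HAProxy serves the same statistics as CSV when ";csv" is appended to the stats URI; the sketch below reads that CSV and prints the status of each entry. The helper name summarize_stats is hypothetical, and the column names it looks up (pxname, svname, status) are taken from the CSV header row.

```shell
#!/bin/sh
# Hypothetical helper: read HAProxy's CSV statistics on stdin and print
# "proxy/server status" for each row. Column positions are looked up from
# the header row rather than hard-coded.
summarize_stats() {
    awk -F, '
        NR == 1 {
            sub(/^# /, "", $1)          # the header row starts with "# "
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        NF > 1 { print $(col["pxname"]) "/" $(col["svname"]), $(col["status"]) }
    '
}

# Example usage:
#   curl -s 'http://<server>:8080/haproxy?stats;csv' | summarize_stats
```

Because the sample configuration enables health checks on both servers, a stopped Tomcat instance should show up here with a status other than UP.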

A Common Data Directory

Normally, GeoServer data directories in a cluster will be identical in content. In this example we will set a common data directory for all cluster members using a context parameter in the web.xml file of each GeoServer in our cluster. (You can also specify the GeoServer data directory location in several other ways.) First, make a new directory for your shared GeoServer data directory.

# mkdir /geoserver_data

Copy a template data directory into the new location. This step is optional. As long as the base data directory exists, GeoServer will create the basic configurations it needs if they’re not found on start-up. It’s just a bit more painless for this example if they are where we expect them.

# cp -r /var/tomcat1/webapps/geoserver/data/* /geoserver_data

Stop the two GeoServer / Tomcat instances so we can reconfigure our data directory locations, either by killing the java processes (by pid) that our GeoServers are running under, or (in a more sophisticated installation) using a service. Update the web.xml file in each web application’s WEB-INF directory. This file typically lives at $CATALINA_BASE/webapps/geoserver/WEB-INF/web.xml. Make four changes to each file:

  • Specify the new location of the GEOSERVER_DATA_DIR
  • Specify a GEOSERVER_LOG_LOCATION for this particular instance to log to. This will avoid collisions with the other GeoServer nodes in the cluster writing to the same location. For example, set to /geoserver_data/logs/geoserver_tomcat1.log
  • Set GWC_DISKQUOTA_DISABLED to true. This will avoid collisions with the other GeoServer nodes’ GWCs writing disk use information to common locations.
  • Set GWC_METASTORE_DISABLED to true. This will avoid collisions with the other GeoServer nodes’ GWCs writing cache status information to common locations.
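
As a rough illustration, the four settings take the form of standard servlet context-param elements. The sketch below only prints those elements so they can be pasted inside the web-app element of each instance’s web.xml; the helper names context_param and instance_params are hypothetical, and the values are the ones used in this example.

```shell
#!/bin/sh
# Hypothetical helper: print one <context-param> element for web.xml.
context_param() {
    printf '<context-param>\n  <param-name>%s</param-name>\n  <param-value>%s</param-value>\n</context-param>\n' "$1" "$2"
}

# Print the four parameters for one instance (e.g. tomcat1 or tomcat2).
# Only the log file name differs between instances.
instance_params() {
    instance=$1
    context_param GEOSERVER_DATA_DIR /geoserver_data
    context_param GEOSERVER_LOG_LOCATION "/geoserver_data/logs/geoserver_$instance.log"
    context_param GWC_DISKQUOTA_DISABLED true
    context_param GWC_METASTORE_DISABLED true
}

# Example usage:
#   instance_params tomcat1
#   instance_params tomcat2
```

If a web.xml already contains an entry for one of these parameters (possibly commented out), edit that entry rather than adding a second one.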

Extending These Examples

This document is intended only as an example of setting up GeoServer in a clustered configuration. It is designed to move users towards a more scalable GeoServer installation, and might not be suitable for production in all environments. Future versions and alternate scenarios will take into consideration:

  • Scaling GeoServer horizontally (on multiple machines)
  • Hybrids of vertically and horizontally scaled instances
  • Better ways to manage and configure multiple servlet containers (Tomcat) and web applications (GeoServer)

What sort of GeoServer clustering environments are you interested in setting up? Let us know in the comments below.
