Blog.openshift.com

Build your own Google Maps (and more) with GeoServer on OpenShift

2014-02-10

Greetings Shifters! Today we are going to continue our spatial series by bringing up GeoServer on OpenShift and connecting it to our PostGIS database. By the end of the post you will have your own map tile server, KML server (to show data in Google Earth), or remote GIS server.

The team at GeoServer has put together a nice short explanation of GeoServer and then a really detailed list. If you want commercial support, Boundless will give you a commercial release and/or support for all your corporate needs. Today, though, I am only going to focus on the FOSS bits.

Getting started

There are two ways to run GeoServer. They ship a version that includes a Jetty container so you can just unzip and run it on your local machine. They also ship just a WAR file (a pre-packaged Java web application) which you can drop into a pre-existing Java application server and it "just runs". Since OpenShift comes with several flavors of application servers, we are just going to use the WAR file.

I made a GitHub repo with all the code you need to run this so you can actually just do:

This is using version 2.4.2 (released on Nov 19, 2013), which I downloaded from the GeoServer download page. But this only brings up a GeoServer instance. What I would recommend is to create a scalable application with Tomcat 7 and PostgreSQL 9.2 - that way we can plug GeoServer into PostGIS. To do this we carry out the app creation in two stages:
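The two stages can be sketched roughly as follows. This is a minimal sketch and not necessarily the exact commands from the repo: the app name is whatever you pass in, jbossews-2.0 and postgresql-9.2 are assumed to be the rhc cartridge names for Tomcat 7 and PostgreSQL 9.2, and the commands are wrapped in a hypothetical helper function so nothing runs until you call it.

```shell
# Sketch only: create a scalable Tomcat 7 app, then add PostgreSQL 9.2.
# create_geoserver_app is a hypothetical helper, not part of rhc.
create_geoserver_app() {
  app="$1"
  # Stage 1: scalable (-s) application on Tomcat 7 (JBoss EWS 2.0)
  rhc app create "$app" jbossews-2.0 -s || return 1
  # Stage 2: add the PostgreSQL 9.2 cartridge to the same application
  rhc cartridge add postgresql-9.2 -a "$app"
}
```

Calling create_geoserver_app geoserver would then create an app named geoserver.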

By using the -s flag we now use two gears for our application - one for Tomcat and the HAProxy load balancer and another for the PostgreSQL. In this way Tomcat and PostgreSQL are not competing for CPU and memory, which might be tight in 512 megs of RAM for PostgreSQL and a Java Application server.

To enable PostGIS, please follow the directions in the previous blog post I wrote. There is also another post on how to load data into PostGIS - please be sure to ignore its instructions on how to set up PostGIS (those instructions are for PostGIS 1.5). Here is a further post about getting data into your PostGIS database.

How to install Geoserver on OpenShift

Let's show the steps I went through to build this quickstart.

We just cd'ed into the git repository created with the rhc app create command above.

First delete pom.xml since we will not be doing Maven builds - we are just deploying a WAR file.

Next, we need to set up the directory that GeoServer uses for its configuration files and data files. By default, GeoServer wants to write these files into the exploded WAR directory. On OpenShift we don't have write permissions in that directory, but luckily we can tell GeoServer to use a different directory. We do this by passing a configuration option to Tomcat.


We do this by cd'ing into the .openshift/action_hooks and creating a file titled pre_start_jbossews-2.0. There is an open bug to prepopulate this directory with all the valid action hooks but until then you need to create and edit the file. The contents of this file will be executed before Tomcat 7 (JBoss EWS 2.0) is started. Add the following line:

This line uses CATALINA_OPTS to point the GeoServer data directory at a directory called geoserver_data inside the OpenShift data dir. We have write permissions to this directory.
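A minimal sketch of what such a line could look like is shown here. It assumes GeoServer's documented GEOSERVER_DATA_DIR Java system property and the OPENSHIFT_DATA_DIR environment variable, which OpenShift sets with a trailing slash; the exact line in the repo may differ.

```shell
# .openshift/action_hooks/pre_start_jbossews-2.0 (sketch)
# Pass the data directory to GeoServer as a Java system property.
# OPENSHIFT_DATA_DIR is provided by OpenShift and ends with a slash.
export CATALINA_OPTS="-DGEOSERVER_DATA_DIR=${OPENSHIFT_DATA_DIR}geoserver_data"
```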

Now we need to quickly SSH into our application. Note, you only need to do this if you DON'T want to use the example data sets in the github repository I provide. There is a script in that repo that will create this directory automatically if it is not present. Since we are in the git repo on our local machine and we used the rhc command line tools we can ssh and execute the one command by just doing:

Next we take the GeoServer WAR file and add it to the webapps directory in our git repository. Then we rename the file ROOT.war (capitalization matters). Any WAR files in this directory are deployed to Tomcat on OpenShift. This will make the GeoServer application available as the root application. If you want your Tomcat to host other apps as well, it may make more sense to leave the file named geoserver.war. The side effect of this would be that all the URLs below would start with http://yourapp-yourdomain.rhcloud.com/geoserver

If you don't want all the sample data sets and map styles you can skip the next step. For the repository I built I added it because I want it to be easier for the beginner to get themselves acquainted with all the functionality of GeoServer.

Finally, in order to have the sample datasets and styles installed, we need to unpack a portion of the WAR file, put it in the git repo, and write a short script to copy it over to the GeoServer data directory on application creation.

Here is the little bash script I wrote. It tests for the presence of one of the directories that we unpacked. If the directory is not there, the script copies the data over; otherwise it does nothing. I put this script in the .openshift/action_hooks directory and titled it pre_start.
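The logic of such a script can be sketched like this. It is a sketch under assumptions, not the script from the repo: the helper name copy_sample_data, the geoserver_data directory at the top of the git repo, and the workspaces directory used as the marker are all illustrative choices.

```shell
# .openshift/action_hooks/pre_start (sketch)
# copy_sample_data SRC DEST: copy the unpacked sample data from SRC to DEST
# unless DEST already contains a "workspaces" directory (assumed marker).
copy_sample_data() {
  src="$1"
  dest="$2"
  if [ ! -d "$dest/workspaces" ]; then
    # First start: create the data directory and copy everything over
    mkdir -p "$dest" && cp -r "$src"/. "$dest"/
  fi
  # Otherwise the data is already in place, so do nothing
}

# Only run on the gear, where OpenShift sets these variables
if [ -n "$OPENSHIFT_REPO_DIR" ] && [ -n "$OPENSHIFT_DATA_DIR" ]; then
  copy_sample_data "${OPENSHIFT_REPO_DIR}geoserver_data" "${OPENSHIFT_DATA_DIR}geoserver_data"
fi
```

Testing for a marker directory keeps the hook safe to run on every start: changes you make through the web console are not overwritten by the sample data on the next restart.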

When we are done editing this file we still need to set the execute flag on this file. If you are on any system but MS Windows you can do:

If you are on Windows (or are chmod averse) you can use git from the command line:
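Both routes can be sketched in one small helper. The function name mark_executable is hypothetical; the two commands inside it, chmod +x and git update-index --chmod=+x, are the standard ways to set the execute bit on disk and in git's index respectively.

```shell
# Sketch: mark a hook script as executable.
# mark_executable is a hypothetical helper name.
mark_executable() {
  f="$1"
  # Unix-like systems: set the execute bit on the file itself
  chmod +x "$f" || return 1
  # Any platform: record the execute bit in git's index, if inside a repo.
  # The file must be tracked first, hence the git add.
  if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    git add "$f" && git update-index --chmod=+x "$f"
  fi
}
```

From the top of the repo this would be called as mark_executable .openshift/action_hooks/pre_start.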

And with that I was finished with all the prep work I needed to do on my local machine. Make sure you are at the top of the git repo, and then the final steps on the local machine are:

Deployment

If you paid attention to the size of the geoserver.war file you noticed it is slightly over 58 megs, which is huge for a WAR file. This has several consequences:

The git push will take a while - helps to have a fast connection

The deploy of the WAR file will take a while

It will take minutes for the application to fully start

All of these factors mean you should get up from the computer while this runs and go stretch your legs for a few minutes (or if you work at home like me - go put up a load of laundry). Then when you get back you will see a log output that contains errors:

Don't freak out - this is due to the long time it takes to deploy the WAR - everything should be fine.

Managing your Geoserver instance

Almost all the administration you will want to do of Geoserver happens through the web console. The console can be found at the following URL
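GeoServer normally serves its web console under /web/ below the application's context path, so the URL can be sketched as below. The helper name console_url and the yourapp and yourdomain values are placeholders; the context path is empty when the WAR is deployed as ROOT.war and /geoserver when it is left as geoserver.war.

```shell
# Sketch: print the GeoServer web console URL for an OpenShift app.
# Arguments: app name, domain, optional context path ("" or "/geoserver").
console_url() {
  echo "http://$1-$2.rhcloud.com${3}/web/"
}

console_url yourapp yourdomain
```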

Again, you will need to wait a while for the pages to compile and render for the first time. When it is finished you should now see a page that looks like this:

The first thing we need to do is change the admin password. Please login with the fields in the upper right corner highlighted in red. The username will be admin and the password will be geoserver - which is the first thing we will change.

On the logged in screen you will see a bunch of warnings, the last of which talks about the need to change the admin password. You can take care of those other warnings when you have more time but please click on the link for changing the password.

On the page that comes up please change the password to something more secure and then scroll to the bottom and click save.

Previewing layers

If you look on the left side of the page (even if you are not logged in) there is the option to select previews:

Once you click on that link you will see a page that lets you preview all the data sets you have published:

The red box outlines the icons that show you the "type" of the dataset. The grid is for raster images and the rest of the icons are for points, lines, and polygons.

The green box outlines the links you can click to get a simple web map or a KML link. The map is just a simple OpenLayers example but you can also click the three line icon on the top left of the map and it will expose an "advanced" option to play around with how the map images are delivered. Here is the fully exposed tiger:tiger_roads:

The blue box outlines all the possible result formats you can get. For example, for vector data you can pick GeoJSON and get back a JSON file containing the data in that dataset. Your web application can use the URL behind that link to pull the JSON feed directly into a web page.
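As a rough illustration, a GeoJSON feed URL is a WFS GetFeature request with a JSON output format. The sketch below assumes GeoServer is deployed as ROOT.war, uses the sample layer tiger:tiger_roads, and treats the host name as a placeholder; the helper name geojson_url is hypothetical, and the link in your own preview page may carry extra parameters.

```shell
# Sketch: build a WFS GetFeature URL that returns a layer as GeoJSON.
# Arguments: base URL of the GeoServer app, layer name (workspace:layer).
geojson_url() {
  echo "$1/wfs?service=WFS&version=1.0.0&request=GetFeature&typeName=$2&outputFormat=application/json"
}

geojson_url "http://yourapp-yourdomain.rhcloud.com" "tiger:tiger_roads"
# To fetch the feed, pass the URL to curl:
#   curl -s "$(geojson_url http://yourapp-yourdomain.rhcloud.com tiger:tiger_roads)"
```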

Have fun looking at all the sample data sets and formats. But when you are done let's go ahead and add a connection to some of our PostGIS datasets.

Connecting to PostGIS

Making sure there is spatial data in your database

Remember way back when we created this application, we made it scalable and added PostgreSQL to it. If you followed along and added the datasets from the other blog posts then you are all set. If not, I have made life easier for you. I have added two SQL files to the base of the repository that will add PostGIS to your database and then import datasets for the roads and the building footprints in Santa Cruz County, California.

It is important to load Streets_final.pg.sql since it also includes the call that sets up PostGIS in the database. So, as above, you are going to SSH into your gear and then run the commands below (make sure you are in your local git repo to get the SSH command to work):
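On the gear, loading the files comes down to running psql once per file. The helper below is a sketch under assumptions: load_sql is a hypothetical name, the connection values are read from the OPENSHIFT_POSTGRESQL_DB_* variables that the PostgreSQL cartridge is assumed to set, and the database is assumed to be named after the app (OPENSHIFT_APP_NAME).

```shell
# Sketch: load SQL files into the app's PostgreSQL database from the gear.
# load_sql is a hypothetical helper; pass the files in the order to load them.
load_sql() {
  for f in "$@"; do
    PGPASSWORD="$OPENSHIFT_POSTGRESQL_DB_PASSWORD" psql \
      -h "$OPENSHIFT_POSTGRESQL_DB_HOST" \
      -p "$OPENSHIFT_POSTGRESQL_DB_PORT" \
      -U "$OPENSHIFT_POSTGRESQL_DB_USERNAME" \
      -d "$OPENSHIFT_APP_NAME" \
      -f "$f" || return 1
  done
}
```

For example, load_sql "${OPENSHIFT_REPO_DIR}Streets_final.pg.sql" would load the streets file, and the building footprints file can be passed as a second argument.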

You now have two spatial data sets loaded into PostGIS. Now let's add those data sets to be served up through GeoServer. When we are done they will be available to look at through the preview page.

Adding the data sets to GeoServer

Back in the web console, logged in as admin, you will notice on the left side there is a list for Workspaces, Stores, and Layers. In GeoServer, a workspace is a logical container for all the "data" related to a project. Some of the data could be in a DB, some could be in flat files - the workspace allows you to group them. Then from Stores you choose which data to publish as layers. Once a layer is published you can expose it to the outside world AND it shows up in the preview. So let's get started publishing our new data sets:

First, click on Workspaces and then click on Add new workspace at the top of the page. In the fields put Name = awesome, Namespace URI = http://whatever.com, check Default Workspace, and then click submit. This brings us back to the workspace menu.

Second, click on Stores on the left side and then click on Add new store at the top of the page. You now see the list of all the default data stores that GeoServer understands. Go ahead and click on the first PostGIS link.

Set the workspace to awesome, set "Data Source Name" to SCC_PostGIS, and put anything you want in the description. To get the information to fill in the connection parameters you need to SSH into your gear again and grab some environment variables.
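One way to list them once you are on the gear is sketched below. It assumes the PostgreSQL cartridge exposes its settings as variables prefixed with OPENSHIFT_POSTGRESQL_DB_ (host, port, username, and password); check your own environment if the names differ.

```shell
# Run on the gear: list the PostgreSQL connection variables set by OpenShift
env | grep '^OPENSHIFT_POSTGRESQL_DB_' | sort
```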

Your output should look something like this (values changed of course):

You need to use these values to fill in the connection parameters. Here is a screen shot of what my screen looks like:

You can leave everything else as the default, scroll to the bottom of the page, and click save. This brings us right to the next page we want to use...

Third, we should be looking at a page that shows the datasets you put in the PostGIS database. Go ahead and click the Publish link on the line with sccobldgfoot. On the next page there are only two required actions, but if you want, feel free to change the name and description and all that other fun stuff. What you HAVE to do is scroll down to the section titled "Bounding Boxes" and click the first link titled "Compute from data", which should automagically compute the native bounding box from your data. Then click the link "Compute from native bounds", which will automagically fill in the Lat/Lon bounding box. Go ahead and click save.

You should now be looking at the list of layers. If you want to publish the roads you just need to click on the Add a new resource link at the top of the page and choose awesome:SCC_PostGIS, and you will be able to see the streets data with the Publish link. Follow the same process as in the previous paragraph.

Finally, when you are all done publishing your layers, go ahead and click on layer preview. Your new layers will be in the list. Please remember there are a lot of polygons in the building footprints and in a small gear we don't have much memory to spare, so it may take a few seconds to render at the top level.

Conclusion

My goal for today's post was to introduce you to GeoServer, show you how to run it (along with PostGIS) on OpenShift, and make the process easy for you to get started. I did not really talk about all the great things that GeoServer can do, but at this point you have a normal functioning GeoServer instance to play with. From here some good next steps would be:

Start reading the GeoServer documentation; you could skip most of the early stuff and start with the web administration interface

Play with some of the features in a GeoServer sub-project, GeoWebCache, which is the software that caches the generated map images under the hood. Please remember in the free tier you only have 1 Gig of disk space available (including your git repo) so be conservative with how many tiles you cache.

If you really want to start map serving I would highly recommend upgrading to the silver plan and creating medium gears for your application (both PostGIS and Tomcat love memory). I would also immediately grab the extra disk space so you can put more data into your PostgreSQL database and so the Tomcat gear can host a larger map cache.

In a future post, I will cover other cool things to do with the geospatial platform we just spun up. You now have the capability to stand up a geospatial db (PostGIS) and a geospatial application server (GeoServer). Now all the fun stuff can begin and we get to do it all for free on OpenShift.
