Planet.haskell.org

Yesod Web Framework: Yesod hosting with Docker and Kubernetes

2015-12-14

About a month ago, yesodweb.com hosting was unstable for a few days. The reason
was that I was in the midst of moving yesodweb.com (and a few other sites) to
new hosting. I went through a few iterations, and wanted to describe how the
hosting now works, what I like, and some pain points to watch out for.

The end result of this is a Docker/Kubernetes deployment, consisting of a
single Docker image containing various sites (six currently), and an extra
application that reverse proxies to the appropriate one based on virtual host.
But let's step through it bit by bit.

Stack's Docker support

The Stack build tool supports using Docker in two different ways, both of which
are leveraged by this setup:

1. Using a Docker build image to provide build tools (like GHC), system
libraries, and optionally Haskell libraries, and performing the build within
such a container. This isolates your build from most host-specific
configurations, and grants you immediate access to many tools (like PostgreSQL
client libraries) without modifying your host.

2. Generating a Docker image from a base image that provides the necessary
system libraries, adding the generated executables and any additional files
requested (such as configuration files and static resources like CSS and
JavaScript).

What's really nice about this setup vs a more standard Docker image generation
approach is that our generated runtime image (from (2)) does not include any
build-specific tools. This makes our images lighter-weight, and avoids having
unnecessary code in production (which is good from a security standpoint).

Just as nice is how little Stack configuration it takes to make all of this
happen. Consider the following stack.yaml file:
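
A minimal sketch of such a file, assuming the image and docker sections of
Stack's configuration as they existed at the time. The resolver, base image,
and added directories are illustrative assumptions, not necessarily the values
used for yesodweb.com:

```yaml
# Illustrative stack.yaml; every value except the image name is an assumption.
resolver: lts-3.14            # assumed snapshot, pick the one your project uses

# Build inside a Docker container that provides GHC and system libraries.
docker:
  enable: true

# Describe the runtime image produced by `stack image container`.
image:
  container:
    name: snoyberg/yesodweb   # name of the generated image
    base: fpco/stack-run      # assumed base image with runtime system libraries
    add:
      # local directory: path inside the image (both hypothetical)
      config: /app/config
      static: /app/static
```

The build tools live only in the build container, so nothing from the docker
section ends up in the image described under image.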

With this in place, running stack image container will generate the
snoyberg/yesodweb image, which I can then push to whatever Docker registry I
want using normal Docker commands.

For more information on Stack's Docker support, see the Docker integration
page.

Mega-repo

I initially deployed each of my sites as a separate deployment. However, for
various resource-related reasons (disk space, number of machines), I decided to
try out deploying the six sites as a single deployment. I'm not convinced yet
that this is a great idea, but it's certainly working in practice. The result
is my snoyman-webapps repo. As a
short snippet:

This file has a few important things to note:

A submodule for each site in the deployment, inside the sites directory

The Kubernetes configuration (discussed below) in the kube directory

A reverse proxying web application for running the sites and serving
appropriate content based on virtual host.

Let's jump into that last one right away.

Reverse proxying

Probably the most instructive file in this program is the webapps.yaml config
file. It shows that the web app is capable of:

Running child applications (the six sites I mentioned)

Reverse proxying to the appropriate applications

Performing simple redirects between domain names

In theory this code could be turned into something standalone, but for now it's
really custom-tailored to my needs here.

The biggest downside with this approach is that (without Server Name
Indication, or SNI) it doesn't support TLS connections. I chose sites for this
that are not served
over TLS currently, and do not handle sensitive information (e.g., no password
collection). Upgrading to have SNI support and using something like Let's
Encrypt would be a fun upgrade in the future.

Kubernetes configuration

Kubernetes uses YAML files for configuration. I'm not going to jump into the
syntax of the config files or the overarching model Kubernetes uses for running
your applications. If you're unfamiliar and interested, I recommend reading
the Kubernetes docs.

Secrets

Three of the sites I'm hosting have databases, and therefore the database
credentials need to be securely provided to the apps during deployment.
Kubernetes provides a nice mechanism for this: secrets. You specify some
(base64-encoded) content in a YAML file, and then you can mount a virtual
filesystem for your apps to access the data from. Let's have a look at the
haskellers.com (scrubbed) secrets file:
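
As a sketch of the general shape such a file takes, assuming the v1 Secret
resource. The secret name and the key name are hypothetical, and the value is
a placeholder rather than real credentials:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: haskellers-secret        # hypothetical name
data:
  # Each key becomes a file name inside the mounted directory.
  # Values must be base64-encoded; this one decodes to "placeholder".
  postgresql.yml: cGxhY2Vob2xkZXI=
```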

I have a separate secrets config for each site. The decision was made mostly
for historical reasons, since the sites were originally hosted separately. I
still like to keep them separate though, since it's easy to put the secrets
into different subdirectories, as we'll see next.

Replication controller

There's a lot of content in the replication controller config. I'll strip it
down just a bit:
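
A minimal sketch of what a stripped-down config along these lines might look
like, assuming the v1 ReplicationController resource. The label, volume, and
secret names are hypothetical; the mount path, the PORT variable, and port 3000
are the ones described in this post:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: snoyman-webapps
spec:
  replicas: 1                          # assumed; a single small node suffices
  selector:
    app: snoyman-webapps               # hypothetical label
  template:
    metadata:
      labels:
        app: snoyman-webapps
    spec:
      volumes:
        - name: haskellers-secret-volume     # hypothetical volume name
          secret:
            secretName: haskellers-secret    # hypothetical secret name
      containers:
        - name: webapps
          image: snoyberg/snoyman-webapps:latest
          env:
            - name: PORT               # port the reverse proxy listens on
              value: "3000"
          ports:
            - containerPort: 3000      # same port, declared to Kubernetes
          volumeMounts:
            - name: haskellers-secret-volume
              mountPath: /app/haskellers/config/db
              readOnly: true
```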

The interesting stuff:

We mount the haskellers.com secret at /app/haskellers/config/db, where the
app itself expects it

The webapps app needs to know what port to listen on, so we tell it via the
PORT environment variable

We also tell Kubernetes that the application is listening on port 3000

The ports each application listens on internally are irrelevant to us: the
webapps app handles those for us automatically

Service (load balancer)

The load balancing service is quite short and idiomatic:
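
A sketch of such a service, assuming the v1 Service resource with a
LoadBalancer type. The selector label is hypothetical and has to match
whatever labels the replication controller puts on its pods; the external
port 80 is likewise an assumption:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: snoyman-webapps
spec:
  type: LoadBalancer           # ask the cloud provider for an external IP
  selector:
    app: snoyman-webapps       # hypothetical label, must match the pods
  ports:
    - port: 80                 # assumed external HTTP port
      targetPort: 3000         # port the webapps app listens on
```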

Updates

When it comes time to make updates to one of these sites, I do the following:

1. Change that site's repo and commit

2. Update the submodule reference for snoyman-webapps

3. Run stack image container && docker push snoyberg/snoyman-webapps (see the
stack.yaml file for details)

4. Perform a rolling update with Kubernetes: kubectl rolling-update snoyman-webapps --image=snoyberg/snoyman-webapps:latest

Some notes:

The rolling-update is lackluster in Kubernetes; it can fail to work for a
variety of reasons I have yet to fully understand. My biggest advice: when
possible, create single-container replication controllers.

I always just push and use the latest image. For more control/reliability, I recommend tagging Docker images with the Git SHA1 of the commit being built from. I'm lazy for these sites, but for client deployments at FP Complete, we always follow this practice.

Google Container Engine

I started off this whole project as a way to evaluate Kubernetes. Based on
that, I started hosting this on Google Container Engine instead of fiddling
with configuring AWS to host Kubernetes myself. Overall, I'm happy with how it
turned out. I had a few ugly issues, though:

Running out of disk space due to the large number of Docker images (likely the primary motivation for moving to a single Docker image for all the sites).

All my sites went down one day when my account switched over from the trial to non-trial. I don't remember getting an email warning me about this, which would have been nice.

An n1-standard-1 instance has been plenty to support all of these sites, which
is nice (yay lightweight Haskell/Yesod!). That said, at FP Complete, we host
our stuff on AWS, and have had a pretty good experience running Kubernetes
there.

Conclusion

Overall, I find the Docker/Kubernetes deployment workflow quite pleasant to
work with. I may find more hiccups over time, but for now, I'd strongly
recommend considering it for your own deployments, especially if you're using
tooling like Stack that makes it so easy to create Docker images.