Code.activestate.com

Doozer: Distributed Configuration Used by Heroku and Stackato

2012-08-15

I've blogged before about Heroku Buildpack support in Stackato, but there's another Heroku-backed innovation under the hood in Stackato. In version 2.0 we added a new system for managing state in a Stackato cluster, and this system (exposed by the new 'kato' command) uses a data store called Doozer.

Doozer allows Stackato administrators to monitor and update server components running in a cluster. This is different from NATS, the message bus also used in Stackato, which handles distribution of user applications and allocation of services.

Q & A with Phil Whelan

Some of the details of how Doozer works in Stackato get a bit over my head, so I've asked Phil Whelan a few questions about it. Phil, along with Sridhar Ratnakumar, did most of the implementation work for Doozer in Stackato.

Here's what Phil had to say about our use of Doozer in Stackato:

Why did Stackato need a data store like Doozer?

Stackato uses Doozer for distributed configuration and knowing when processes are up and running.

Prior to using Doozer, Stackato used YAML files on the local disk of every node for configuration. This meant that to update the configuration you had to log in to each node and manually update the configuration file for each process. We have a "cloud controller" that needed to be able to update any configuration on the cluster. This cloud controller is used by our web console and we wanted to give administrators the ability to configure their cluster directly from that one location, rather than having to log in to each machine.

Once a process has read its configuration from Doozer, and is fully up and running, it watches Doozer for the configuration changes that relate to it. This is a Doozer mechanism for pushing changes to clients that have registered interest in a particular branch of the configuration tree.
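To make the watch idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the Doozer protocol or any real client library: the `ConfigTree` class, its `watch` and `set` methods, and the `/config/...` paths are all made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

Callback = Callable[[str, str], None]


class ConfigTree:
    """Toy in-memory stand-in for a Doozer-style configuration tree."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        self._watchers: List[Tuple[str, Callback]] = []

    def watch(self, prefix: str, callback: Callback) -> None:
        # Register interest in every path under the given branch.
        self._watchers.append((prefix, callback))

    def set(self, path: str, value: str) -> None:
        # Store the value, then push the change to matching watchers.
        self._values[path] = value
        for prefix, callback in self._watchers:
            if path.startswith(prefix):
                callback(path, value)


tree = ConfigTree()
seen: List[Tuple[str, str]] = []

# A hypothetical "router" process only cares about its own branch.
tree.watch("/config/router/", lambda path, value: seen.append((path, value)))

tree.set("/config/router/port", "8080")     # pushed to the router's watcher
tree.set("/config/dea/max_memory", "4096")  # different branch, not pushed

print(seen)
```

The point of the sketch is the direction of the data flow: the process does not poll, it registers a branch once and changes are pushed to it.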

This connection to Doozer, for monitoring configuration changes, doubles as an "ephemeral node", basically just another configuration value which contains the identity of the process. The ephemeral node disappears when the process disappears, so any other process watching who is connected will immediately be notified of this "configuration" change.
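A rough sketch of that behaviour, again as a toy in-memory model in Python rather than Doozer's actual implementation. The `Tree` and `Connection` classes, the `/eph/` path prefix and the process name are assumptions chosen for the example.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, str]  # (kind, path), where kind is "set" or "del"


class Tree:
    """Toy configuration tree that notifies watchers of sets and deletes."""

    def __init__(self) -> None:
        self.values: Dict[str, str] = {}
        self.watchers: List[Tuple[str, Callable[[str, str], None]]] = []

    def watch(self, prefix: str, callback: Callable[[str, str], None]) -> None:
        self.watchers.append((prefix, callback))

    def _notify(self, kind: str, path: str) -> None:
        for prefix, callback in self.watchers:
            if path.startswith(prefix):
                callback(kind, path)

    def set(self, path: str, value: str) -> None:
        self.values[path] = value
        self._notify("set", path)

    def delete(self, path: str) -> None:
        if self.values.pop(path, None) is not None:
            self._notify("del", path)


class Connection:
    """A client connection that owns one ephemeral node for its lifetime."""

    def __init__(self, tree: Tree, process_id: str) -> None:
        self.tree = tree
        self.path = f"/eph/{process_id}"  # hypothetical path layout
        tree.set(self.path, process_id)   # node appears when we connect

    def close(self) -> None:
        # In this sketch, a crash or disconnect ends up here.
        self.tree.delete(self.path)       # node disappears with the process


tree = Tree()
events: List[Event] = []
tree.watch("/eph/", lambda kind, path: events.append((kind, path)))

conn = Connection(tree, "router-1")
conn.close()

print(events)
```

Anything watching the `/eph/` branch sees a process arrive and leave as two ordinary changes to the tree, which is why no separate liveness channel is needed.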

What other technologies were considered?

We looked at Apache ZooKeeper, which originated as a subproject of Apache Hadoop and is well tested in large clusters.

We decided on Doozerd because it seemed like a lighter-weight solution and was better suited for the virtual machine environment that Stackato runs in. Apache ZooKeeper has a strong focus on persistence and its guidelines are very strict about running it on dedicated physical disks.

Doozerd does not have disk persistence, so we have to build that into our solution.

How is it different from NATS? Why couldn't Stackato just use that for cluster role management?

When we started to look at Doozer, some of us had lofty ideas that it might replace NATS altogether, but this is definitely not the case. NATS and Doozer play two very different roles within a Stackato cluster.

NATS is used for communication between Stackato components and can do so at high volume and high complexity. It does not provide any storage.

Doozer is not about communication, unless you consider notifications of process and configuration changes as communication. It's about distributed storage.

We do not want to overload Doozer with too much information and too many responsibilities, such as storing information about user applications. It is designed specifically for small data sets and does that job well. At its core, Doozer uses the Paxos algorithm between nodes to ensure that all Doozerd servers agree on what the latest configuration is.

The 'kato' command was implemented first in Python, then in Ruby. Why is that?

At ActiveState, our team covers a broad range of programming languages and we choose the best tools for the job. We really like Python and when we started out writing kato it seemed like a good choice. One reason was the libraries available for integrating with Supervisord [also used in Stackato], which itself is written in Python. Maybe I'll save Supervisord for another post...

Most of the Stackato components are written in Ruby and we found we were starting to replicate more and more of the same functionality in both Python and Ruby. It made sense to jump tracks with kato to Ruby while it was still young and reduce our workload.

Doozerd is written in Go, so it is likely there will be more activity around this language in the near future.

How were the Python and Ruby client libraries? Any issues?

I have not worked with Python for Doozer myself, though hopefully someone on the team will add a comment below.

With Ruby, there are two client libraries, "fraggle" and "fraggle-block" (are you spotting the Fraggle Rock theme yet?).

"fraggle" is based around EventMachine and is an asynchronous client library for Doozer. "fraggle-block" is a blocking client. I think the guys at Heroku only use the asynchronous version of these two, since fraggle-block was a bit rusty when I picked it up and it took some love to get it working again. I had no issues with the asynchronous client for Ruby.

Did you have to make any changes to Doozer itself?

Yes. We made changes to add the "ephemeral nodes", so that we could tell which processes were connected to Doozer and see them disappear from Doozer if they died or went away for any other reason.

What's your favorite Doozer feature?

The "watchers". Having configuration changes pushed to watchers is very powerful. A process using a watcher can reconfigure itself in real-time without needing a restart. We even have the potential to push changes to the browser to update the Stackato Console in real-time. Lastly, because we model the state of processes (via ephemeral nodes) as configuration values we can use watchers to take real-time action in a similar way when we see changes in process state.

My second favourite feature is "revisions". Every change made to the Doozer configuration tree results in a new revision. You can even request the entire configuration tree at a particular revision (as long as the revision is not too old). This adds real robustness to reading and watching configuration. You can be sure you never miss a beat as long as you keep track of which revision you were at last. This is used heavily when we read the entire configuration for a process and then watch for changes to that process's configuration. We want to be sure that no updates have been made between these two calls.
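The read-then-watch pattern can be sketched as follows. This is a toy Python model under stated assumptions, not Doozer's API: `RevisionedStore`, `snapshot` and `changes_since` are invented names, and unlike Doozer the sketch keeps its full history forever.

```python
from typing import Dict, List, Optional, Tuple

Change = Tuple[int, str, str]  # (revision, path, value)


class RevisionedStore:
    """Toy store in which every write produces a new revision number."""

    def __init__(self) -> None:
        self.rev = 0
        self.log: List[Change] = []

    def set(self, path: str, value: str) -> int:
        self.rev += 1
        self.log.append((self.rev, path, value))
        return self.rev

    def snapshot(self, rev: Optional[int] = None) -> Tuple[int, Dict[str, str]]:
        # Rebuild the tree as it looked at `rev` (latest if omitted).
        rev = self.rev if rev is None else rev
        tree: Dict[str, str] = {}
        for change_rev, path, value in self.log:
            if change_rev <= rev:
                tree[path] = value
        return rev, tree

    def changes_since(self, rev: int) -> List[Change]:
        # Everything a watcher starting just after `rev` would receive.
        return [change for change in self.log if change[0] > rev]


store = RevisionedStore()
store.set("/config/router/port", "8080")   # revision 1

# Step 1: the process reads its whole configuration and notes the revision.
read_rev, config = store.snapshot()

# A change lands after the read but before the watch is set up.
store.set("/config/router/port", "9090")   # revision 2

# Step 2: watch from the remembered revision, so the change is not lost.
missed = store.changes_since(read_rev)
for _, path, value in missed:
    config[path] = value

print(read_rev, missed, config)
```

Because the watch starts from the revision of the read rather than from "now", the update made between the two calls is replayed instead of silently dropped.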
