2015-07-17

Written by: Ryan Hallisey

Today in the cloud space, much of the buzz in the market stems from Docker and from support for launching containers on top of an existing platform. However, what is often overlooked is the use of Docker to improve deployment of the infrastructure platforms themselves; in other words, the ability to ship your cloud in containers.



Ian Main and I took hold of a project within the OpenStack community to explore this overlooked use case: Project Kolla. Being one of the founding members and core developers for the project, I figured we should start by using Kolla’s containers to get this work off the ground. We began by deploying containers one by one in an attempt to get a functioning stack. Unfortunately, not all of Kolla’s containers were in great shape, and they were being deployed by Kubernetes. We decided to get the containers working first, then deal with how they’re managed later. In the short term, we used a bash script to launch our containers, but it got messy because Kubernetes had been opening up ports to the host and declaring environment variables for the containers, and we needed to do the same ourselves. Eventually, we upgraded the design to use an environment file that was populated by a script, which proved to be more effective. This design was adopted by Kolla and is still being used today[1].
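As a rough illustration of the environment-file approach, the sketch below writes a handful of settings to a file that can then be passed to `docker run --env-file`. The `write_env_file` helper, the variable names, and the default address are hypothetical placeholders, not the ones Kolla actually used.

```shell
#!/bin/sh
# Hypothetical sketch: generate an environment file for `docker run --env-file`.
# Every variable name and default below is an illustrative assumption.

# write_env_file ENV_FILE [HOST_IP]
write_env_file() {
    env_file="$1"
    host_ip="${2:-127.0.0.1}"

    # One KEY=value pair per line, which is the format --env-file expects.
    cat > "$env_file" <<EOF
KEYSTONE_PUBLIC_SERVICE_HOST=$host_ip
GLANCE_API_SERVICE_HOST=$host_ip
MARIADB_SERVICE_HOST=$host_ip
RABBITMQ_SERVICE_HOST=$host_ip
EOF
}

# Usage (not run here):
#   write_env_file openstack.env 192.0.2.10
#   docker run -d --env-file=openstack.env <image>
```

Keeping the settings in one generated file means every container is launched with the same values, instead of repeating `-e` flags in each `docker run` line.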

With our setup script in place, we started a hierarchical descent through the OpenStack services, starting with MariaDB, RabbitMQ, and Keystone. Kolla’s containers were in great shape for these three services, and we were able to get them working relatively quickly. Glance was next, and it proved to be quite a challenge. We quickly learned that the Glance API container and Keystone were causing one another to fail.



The culprit was that the Glance API and Keystone containers were racing to see which could create the admin user first. Oddly enough, these containers worked under Kubernetes. I then realized that Kubernetes restarts containers until they succeed, which masked the race condition we were seeing. Because we were launching the containers ourselves, without Kubernetes, we got around the problem by making Glance and the rest of the services wait for Keystone to be active before they start. Later, we pushed this design into Kolla and learned that Docker has a restart flag that will force containers to restart if there is an error[2]. We added the restart flag to our design so that containers would be independent of one another.
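The wait-then-start idea can be sketched as a small retry loop. This is a minimal illustration rather than Kolla's actual code: the `wait_for` helper, the attempt count, and the Keystone URL and `start_glance_api` command in the usage comment are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: block until a readiness check succeeds.

# wait_for ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it exits 0, sleeping DELAY seconds between tries.
# Returns 0 on success, 1 if every attempt failed.
wait_for() {
    attempts="$1"
    delay="$2"
    shift 2

    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage (not run here; the URL is an assumed Keystone endpoint and
# start_glance_api stands in for whatever launches the dependent service):
#   wait_for 30 2 curl -sf http://127.0.0.1:5000/ && start_glance_api
```

Docker's restart policy covers the complementary case: a container that exits with an error is started again, so a service that loses the race simply retries.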

The most challenging service to containerize was Nova. Nova presented a unique challenge not only because it was made up of the largest number of containers, but also because it required the use of super privileged containers. We started off using Kolla’s containers, but quickly learned that many components were missing. Most significantly, the Nova Compute and Libvirt containers were not mounting the correct host directories, exposing us to one of the biggest hurdles when containerizing Nova: persistent data, and making sure instances still exist after you kill the container. For that to work, Nova Compute and Libvirt needed to mount /var/lib/nova and /var/lib/libvirt from the host into the container. That way, the data for the instances is stored on the host and not in the container[3].

echo Starting nova compute
docker run -d --privileged \
    --restart=always \
    -v /sys/fs/cgroup:/sys/fs/cgroup \
    -v /var/lib/nova:/var/lib/nova \
    -v /var/lib/libvirt:/var/lib/libvirt \
    -v /run:/run \
    -v /etc/libvirt/qemu:/etc/libvirt/qemu \
    --pid=host --net=host \
    --env-file=openstack.env kollaglue/centos-rdo-nova-compute-nova:latest

A second issue we encountered when trying to get the Nova Compute container working was that we were using an outdated version of Nova. The Nova Compute container was using Fedora 20 packages, while the other services were using Fedora 21. This was our first taste of doing an upgrade using containers. To fix the problem, all we had to do was change where Docker pulled the packages from and rebuild the container, effectively a one-line change in the Dockerfile:

FROM fedora:20
MAINTAINER Kolla Project (https://launchpad.net/kolla)

to

FROM fedora:21
MAINTAINER Kolla Project (https://launchpad.net/kolla)



OpenStack services have independent lifecycles, which makes it difficult to perform rolling upgrades and downgrades. Containers can bridge this gap by providing an easy way to handle upgrading and downgrading your stack.
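To make the one-line upgrade concrete, the sketch below rewrites the `FROM` line of a Dockerfile and leaves everything else alone. The `set_base_image` helper is an illustrative assumption, not part of Kolla; the rebuild itself would be an ordinary `docker build`.

```shell
#!/bin/sh
# Hypothetical sketch: point a Dockerfile at a new base image.

# set_base_image DOCKERFILE NEW_BASE
# Replaces the image named on the first FROM line with NEW_BASE.
set_base_image() {
    dockerfile="$1"
    new_base="$2"
    tmp_file="$dockerfile.tmp"

    # awk rewrites only the first FROM line; all other lines pass through.
    awk -v base="$new_base" '
        !seen && toupper($1) == "FROM" { print "FROM " base; seen = 1; next }
        { print }
    ' "$dockerfile" > "$tmp_file" && mv "$tmp_file" "$dockerfile"
}

# Usage (not run here):
#   set_base_image Dockerfile fedora:21 && docker build -t <image> .
```

Because the base image is the only thing that changes, going back to the previous release is the same operation with the old tag.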

Once we completed our maintenance on the Kolla containers, we turned our focus to TripleO[4]. TripleO is a project in the OpenStack community that aims to install and manage OpenStack. The name TripleO means OpenStack on OpenStack: it deploys a so-called undercloud and uses that OpenStack setup to deploy an overcloud, also known as the user cloud.

Our goal was to use the undercloud to deploy a containerized overcloud on bare metal. In our design, we chose to deploy our overcloud on top of Red Hat Enterprise Linux Atomic Host[5]. Atomic is a bare-bones Red Hat Enterprise Linux-based operating system that is designed to run containers. This was a perfect fit because it’s a bare and simple environment with a nice set of tools for launching containers.

[heat-admin@t1-oy64mfeu2t3-0-zsjhaciqzvxs-controller-twdtywfbcxgh ~]$ atomic --help
Atomic Management Tool

positional arguments:
  {host,info,install,stop,run,uninstall,update}
                        commands
    host                execute Atomic host commands
    info                display label information about an image
    install             execute container image install method
    stop                execute container image stop method
    run                 execute container image run method
    uninstall           execute container image uninstall method
    update              pull latest container image from repository

optional arguments:
  -h, --help            show this help message and exit

Next, we had help from Rabi Mishra in creating a Heat hook that would allow Heat to orchestrate container deployment. Since we were on Red Hat Enterprise Linux Atomic Host, the hook ran in a container and started the Heat agents, allowing Heat to communicate with Docker[6]. Now we had all the pieces we needed.

To integrate our container work with TripleO, it was best for us to copy Puppet’s overcloud deployment implementation and apply our work to it. For our environment, we used devtest, the TripleO developer environment, and started to build a new Heat template. One of the biggest differences between using containers and using Puppet was that Puppet required a lot of setup and configuration to make sure dependencies were resolved and services were properly configured. We didn’t need any of that. With Puppet, the dependency list looked like[7]:

puppetlabs-apache
puppet-ceph

44 packages later…

puppet-openstack_extras
puppet-tuskar

With Docker, we were able to replace all of that with:

atomic install kollaglue/centos-rdo-

We were able to use a majority of the existing environment, but now starting services was significantly simplified.

Unfortunately, we were unable to get results for some time because we struggled to deploy a bare metal Red Hat Enterprise Linux Atomic Host instance. After consulting Lucas Gomes on Red Hat’s Ironic (bare metal deployment service) team, we learned that there was an easier way to accomplish what we were trying to do. He pointed us in the direction of a new feature in Ironic that added support for full image deployment[8]. Although there was a bug in Ironic when using the new feature, we fixed it and started to see our Red Hat Enterprise Linux Atomic Host running. Now that we were past this, we could finally create images and add users, but Nova Compute and Libvirt didn’t work. The problem was that Red Hat Enterprise Linux Atomic Host wasn’t loading the kernel modules for KVM. On top of that, Libvirt needed proper permissions to access /dev/kvm and wasn’t getting them.

#!/bin/sh
chmod 660 /dev/kvm
chown root:kvm /dev/kvm
echo "Starting libvirtd."
exec /usr/sbin/libvirtd
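The script above handles the permissions on /dev/kvm; the kernel-module half of the problem is not shown. As a hedged sketch of one way to address it (an illustration only, not necessarily the fix that landed in Kolla), the helper below picks the KVM module that matches the CPU's virtualization flag so it can be loaded with `modprobe` before Libvirt starts. The `kvm_module_for` name and its optional cpuinfo path argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: choose the KVM kernel module for this CPU.

# kvm_module_for [CPUINFO_PATH]
# Prints kvm_intel if the CPU advertises vmx, kvm_amd if it advertises svm.
# Returns 1 when neither flag is present (no hardware virtualization).
kvm_module_for() {
    cpuinfo="${1:-/proc/cpuinfo}"

    if grep -qw vmx "$cpuinfo"; then
        echo kvm_intel
    elif grep -qw svm "$cpuinfo"; then
        echo kvm_amd
    else
        return 1
    fi
}

# Usage (not run here; needs root on the host):
#   module=$(kvm_module_for) && modprobe "$module"
```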

Upon fixing these issues, we could finally spawn instances. Later, these changes were adopted by Kolla because they represented a unique case that could cause Libvirt to fail[9].

To summarize, we created a containerized OpenStack solution inside the TripleO installer project, using the containers from the Kolla project. We mirrored the TripleO workflow by using the undercloud (management cloud) to deploy most of the core services in the overcloud (user cloud), but now those services are containerized. The services we used were Keystone, Glance, and Nova, with services like Neutron, Cinder, and Heat soon to follow. Our new solution uses Heat (the orchestration service) to deploy the containerized OpenStack services onto Red Hat Enterprise Linux Atomic Host, and it can plug right into the TripleO-heat-templates. Normally, Puppet is used to deploy an overcloud, but now we’ve proven you can use containers. What’s really unique about this is that you can now shop for your configuration in the Docker Registry instead of having to go through Puppet to set up your services. This allows you to pull down a container where your services come with the configuration you need. Through our work, we have shown that containers are an alternative deployment method within TripleO that can simplify deployment and add choice about how your cloud is installed.

The benefits of running your cloud in containers are the same as those of using Docker for a regular application: reliability, portability, and easy lifecycle management. With containers, lifecycle management greatly improves on TripleO’s existing solution. Upgrading and downgrading an OpenStack service becomes far simpler, creating faster turnaround times so that your cloud is always running the latest and greatest. Ultimately, this solution provides an additional method within TripleO to manage the cloud’s upgrades and downgrades, supplementing the solution TripleO currently offers.

Overall, integrating with TripleO works really well because OpenStack provides powerful services to assist in container deployment and management. Specifically, TripleO is advantageous because of services like Ironic (the bare metal provisioning service) and Heat (the orchestration service), which provide a strong management backbone for your cloud. Containers are also an integral piece of this system, as they provide a simple and granular way to perform lifecycle management for your cloud. From my work, it is clear that the cohesive relationship between containers and TripleO can create a new and improved avenue for deploying the cloud and getting it working the way that you see fit.

TripleO is a fantastic project, and with the integration of containers I’m hoping to energize and continue building the community around the project. Using our integration as proof of the project’s capabilities, we have shown that TripleO provides an excellent management infrastructure underneath your cloud that allows projects to be properly managed and to grow.

[1] https://github.com/stackforge/kolla/commit/dcb607d3690f78209afdf5868dc3158f2a5f4722
[2] https://docs.docker.com/reference/commandline/cli/#restart-policies
[3] https://github.com/stackforge/kolla/blob/master/docker/nova-compute/nova-compute-data/Dockerfile#L4-L5
[4] https://www.rdoproject.org/Deploying_RDO_using_Instack
[5] http://www.projectatomic.io/
[6] https://github.com/rabi/heat-templates/blob/boot-config-atomic/hot/software-config/heat-docker-agents/Dockerfile
[7] http://git.openstack.org/cgit/openstack/TripleO-puppet-elements/tree/elements/puppet-modules/source-repository-puppet-modules
[8] https://blueprints.launchpad.net/ironic/+spec/whole-disk-image-support
[9] https://github.com/stackforge/kolla/commit/08bd99a50fcc48539e69ff65334f8e22c4d25f6f
