Editor's note: this post is from Karl Matthias and Sean P. Kane, authors of "Docker Up & Running," a guide to quickly learn how to use Docker to create packaged images for easy management, testing, and deployment of software.
At the Python Developers Conference in Santa Clara, California, on March 15th, 2013, with no pre-announcement and little fanfare, Solomon Hykes, the founder and CEO of dotCloud, gave a 5-minute lightning talk in which he first introduced the world to a brand new tool for Linux called Docker. Docker was a response to the hardships of shipping software at scale in a fast-paced world, and it takes an approach that makes it easy to map organizational processes to the principles of DevOps.
The capabilities of the typical software engineering company have often not kept pace with the quickly evolving expectations of the average technology user. Users today expect fast, reliable systems with continuous improvements, ease of use, and broad integrations. Many in the industry see the principles of DevOps as a giant leap toward building organizations that meet the challenges of delivering high quality software in today’s market. Docker is aimed at these challenges.
While ostensibly a virtualization platform, Docker is far more than that. It spans a few crowded segments of the industry that include technologies like KVM, Xen, Mesos, Capistrano, Puppet, Ansible, Chef, and many more. It’s interesting to note that these products represent virtualization, deployment, and configuration management tools, yet Docker is simultaneously disrupting all of them. Each of the individual technologies in this list is generally acclaimed for its ability to improve productivity. Docker is generating a great deal of excitement specifically because it sits right in the middle of some of the most enabling technologies of the last decade.
If you were to do a feature-by-feature comparison of Docker and the reigning champion in any of those specific technology segments, Docker would very likely look like an unthreatening competitor. What truly sets Docker apart is its positioning to become a foundational technology that can easily support a DevOps-inspired workflow across the whole lifecycle of an application.
The challenges
In traditional deployment workflows there are quite a few required tasks that contribute significantly to the pain felt by teams and increase the overall risk inherent in software projects. Some of the specific problems that Docker can mitigate include:
Needing to install and manage software dependencies at every step in the process.
Compiling software multiple times throughout testing, packaging, and deployment.
Packaging, testing and distributing every single software dependency.
Carefully managing the potential conflicts with other software installed on a system.
Supporting deployment to multiple Linux distributions.
The high potential for conflict that arises from broadly distributed ownership of the development and deployment process.
Among others...
Docker brings to the table a workflow to help organizations overcome some of these challenges.
There are also things that Docker is not a great fit for. If any of these are serious requirements in your environment for a majority of your applications, then Docker may not be a good fit for your needs:
Running legacy applications that are not highly available.
Running non-Linux applications or jobs.
There are also some common misconceptions about what Docker is. These are a few things that Docker does not provide:
A replacement for your configuration management tool.
An out-of-the-box Platform as a Service (PaaS).
An enterprise virtualization platform.
A private cloud platform.
The Docker workflow
A major problem in incorporating DevOps successfully into a company's processes is that many people have no idea where to start. Tools are often incorrectly presented as the solution to what are fundamentally process problems. Adding virtualization, automated testing, deployment tools, or configuration management suites to the environment often just changes the nature of the problems without really solving them.
It would be easy to dismiss Docker as just another tool making unattainable promises about fixing your business processes, but that would be selling it short. Where Docker’s power truly meets the road is in the way that its natural workflow allows applications to travel through their whole lifecycle from conception to retirement, within one ecosystem. That workflow is often opinionated, but it follows a path that simplifies the adoption of some of the core principles of DevOps. It encourages development teams to understand the whole lifecycle of their application, and allows operations teams to support a much wider variety of applications on the same runtime environment.
Minimizing deployment artifacts
Docker alleviates the pain induced by sprawling deployment artifacts. It does this by making it incredibly easy to create a single artifact, called an image file, that contains everything your Linux application requires to run, within a protected runtime environment called a container. Containers can then be easily deployed on modern Linux distributions. Developers using Windows and OS X systems can develop with Docker by using native client tools to manage a Docker daemon on Linux-based virtual machines or physical hardware.
Leveraging Docker allows software developers to create Docker images that, starting with the very first proof of concept release, can be run locally, tested with automated tools, and deployed into integration or production environments without ever rebuilding them. This means that it is easy to ensure that the application and underlying operating system that a developer tested on their laptop are exactly the same as what gets deployed into production. Nothing needs to be recompiled or repackaged during the complete workflow, which significantly lowers the normal risks inherent in many deployment processes. It also means that a single build step replaces a typically error-prone process that involves compiling and packaging multiple complex components for distribution.
Docker images also simplify the installation and configuration of an application by ensuring that every single piece of software that an application requires to run on a modern Linux kernel is contained in the image, with nothing else that might cause dependency conflicts in many environments. This makes it trivial to run multiple applications that rely on different versions of core system software on the exact same server.
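To make the "build once, deploy the same artifact everywhere" idea concrete, here is a minimal Python sketch. It is a toy model, not how Docker itself is implemented: the `Image` class and the `build` and `promote` helpers are hypothetical names invented for this illustration, and the stage names and dependency strings are made up. It only shows that identifying an artifact by a hash of its contents lets you check that every environment runs exactly the same bytes, and that a rebuild with a different dependency is detectably a different artifact.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Toy stand-in for a built image: a name plus the bytes of everything it bundles."""
    name: str
    contents: bytes

    @property
    def digest(self) -> str:
        # Content-derived identifier: identical bytes always give the same digest.
        return "sha256:" + hashlib.sha256(self.contents).hexdigest()


def build(name: str, app_code: bytes, dependencies: bytes) -> Image:
    """The single build step: bundle the application and its dependencies once."""
    return Image(name=name, contents=dependencies + app_code)


def promote(image: Image, stages: list[str]) -> dict[str, str]:
    """Record which digest runs in each stage; nothing is rebuilt along the way."""
    return {stage: image.digest for stage in stages}


# Hypothetical application code and dependency list.
image = build("example-app", b"print('hello')", b"libfoo-1.2;libbar-3.4;")
deployed = promote(image, ["laptop", "integration", "production"])

# Every stage runs the identical artifact.
same_everywhere = len(set(deployed.values())) == 1

# Rebuilding against a different dependency version yields a different artifact,
# which is exactly the drift that promoting one image avoids.
rebuilt = build("example-app", b"print('hello')", b"libfoo-1.3;libbar-3.4;")
drift_detected = rebuilt.digest != image.digest

print(same_everywhere, drift_detected)
```

The point of the sketch is the comparison at the end: when the artifact is never rebuilt between stages, "what I tested" and "what is in production" can be checked for equality rather than assumed.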
Optimizing storage & retrieval
Docker leverages file system layers to allow containers to be built from a composite of multiple images. This saves considerable disk space by allowing multiple containers to be based on the same lower-level OS image, layering on the application image, and then utilizing a copy-on-write process to copy files into a new top layer only when they are modified by the running application. It also significantly reduces the amount of data transferred during a deployment, because only the layers that have changed are shipped to the servers.
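The layering and copy-on-write behavior described above can be sketched in a few lines of Python. This is a simplified model built on the standard library's `collections.ChainMap`, not Docker's actual storage driver: the layers are plain dictionaries mapping paths to contents, and the `start_container` helper and all file names are hypothetical. It shows two containers sharing the same read-only layers while each write lands only in that container's own top layer.

```python
from collections import ChainMap

# Read-only image layers, modeled as path -> content mappings (made-up data).
os_layer = {"/etc/os-release": "example-linux", "/bin/sh": "shell-binary"}
app_layer = {"/app/server": "app-v1", "/app/config": "port=8080"}


def start_container(*image_layers: dict) -> ChainMap:
    """Return a container view: a fresh writable layer on top of shared layers."""
    # ChainMap looks keys up from the first map to the last, and writes only
    # ever touch the first map, so the empty dict acts as the writable top
    # layer. The image layers are reversed so the topmost one is searched first.
    return ChainMap({}, *reversed(image_layers))


web1 = start_container(os_layer, app_layer)
web2 = start_container(os_layer, app_layer)

# Modifying a file in one container changes only that container's top layer.
web1["/app/config"] = "port=9090"

print(web1["/app/config"])       # port=9090
print(web2["/app/config"])       # port=8080
print(app_layer["/app/config"])  # port=8080
print(web1["/bin/sh"])           # shell-binary
```

Both containers reference the very same `os_layer` and `app_layer` objects, which is the disk-space saving in miniature: the shared layers exist once, and each container only pays for what it changes.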
To support image retrieval, Docker leverages a repository called the registry for hosting your images. While not revolutionary on the face of it, the registry actually helps split team responsibilities clearly along the lines embraced by DevOps principles. Developers can build their application, test it, ship the final image to the registry, and deploy the image to the production environment, while the operations team can focus on building excellent deployment and cluster management tooling that pulls from the registry, runs reliably, and ensures environmental health. This enables both teams to focus on what they do best without constantly getting in each other's way.
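As a rough illustration of the registry's role as the hand-off point between the two teams, the following Python sketch models a registry as an in-memory store. It is an assumption-laden toy, not the real registry protocol: the `Registry` class, its `push` and `pull` methods, and the `example-app` tags are all invented for this example. Developers push images, hosts pull them, and in both directions only layers that are not already present get transferred.

```python
import hashlib


def layer_digest(content: bytes) -> str:
    """Identify a layer by a hash of its contents."""
    return hashlib.sha256(content).hexdigest()


class Registry:
    """Toy in-memory registry: layers stored by digest, images as digest lists."""

    def __init__(self) -> None:
        self.layers: dict[str, bytes] = {}
        self.manifests: dict[str, list[str]] = {}

    def push(self, tag: str, layers: list[bytes]) -> int:
        """Store an image; return how many layers actually had to be uploaded."""
        uploaded = 0
        digests = []
        for content in layers:
            digest = layer_digest(content)
            if digest not in self.layers:
                self.layers[digest] = content
                uploaded += 1
            digests.append(digest)
        self.manifests[tag] = digests
        return uploaded

    def pull(self, tag: str, local_cache: dict[str, bytes]) -> int:
        """Fetch an image into a host's cache; return how many layers were downloaded."""
        downloaded = 0
        for digest in self.manifests[tag]:
            if digest not in local_cache:
                local_cache[digest] = self.layers[digest]
                downloaded += 1
        return downloaded


registry = Registry()
base = b"os files"  # shared lower-level layer (made-up content)

# Developer side: two releases that share the same base layer.
uploaded_v1 = registry.push("example-app:1", [base, b"app build 1"])
uploaded_v2 = registry.push("example-app:2", [base, b"app build 2"])

# Operations side: a host pulls both releases in turn.
host_cache: dict[str, bytes] = {}
pulled_v1 = registry.pull("example-app:1", host_cache)
pulled_v2 = registry.pull("example-app:2", host_cache)

print(uploaded_v1, uploaded_v2, pulled_v1, pulled_v2)  # 2 1 2 1
```

In this model the second release costs one layer in each direction instead of two, because the base layer is already in the registry and already on the host.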
The payoff
As teams become more confident with Docker and the workflow it encourages, the realization dawns that containers create an incredibly powerful abstraction layer between all of their software components and the underlying operating system. Done correctly, organizations can begin to move away from the legacy need to create custom physical servers or virtual machines for most applications, and instead deploy fleets of identical Docker hosts that can then be used as a large pool of resources to dynamically deploy their applications to, with an ease that was never before possible.
When these process changes are successful, the cultural impact within a software engineering organization can be dramatic. Developers gain more ownership of their complete application stack, including many of the smallest details that would typically be handled by a completely different group. Operations teams are simultaneously freed from trying to package and deploy complicated dependency trees with little or no detailed knowledge of the application.
In a well-designed Docker workflow, developers compile and package the application, which makes it much easier for them to become more operationally focused and ensure that their application is running properly in all environments, without being concerned about significant changes introduced to the application environment by the operations teams. At the same time, operations teams are freed from spending most of their time supporting the application and can focus on creating a robust and stable platform for the application to run on. This dynamic creates a very healthy environment where teams have clearer ownership and responsibilities in the application delivery process, and friction between the teams is significantly decreased.
Getting the process right has a huge benefit to both the company and the customers. With organizational friction removed, software quality is improved, processes are streamlined, and code ships to production faster. This all helps free the organization to spend more time providing a satisfying customer experience and delivering directly to the broader business objectives. A well-implemented Docker-based workflow can greatly help organizations achieve those goals.
This post is part of our ongoing exploration into end-to-end optimization.