2016-08-14

It seems to me that the prevailing mental model among users of container
technology [1] right now is that a container is a tiny little virtual machine.
It’s like a machine in the sense that it is provisioned and deprovisioned by
explicit decisions, and we talk about “booting” containers. We configure it
sort of like we configure a machine: dropping a bunch of files into a volume,
setting some environment variables.

In my mind though, a container is something fundamentally different than a
VM. Rather than coming from the perspective of “let’s take a VM and make it
smaller so we can do cool stuff” - get rid of the kernel, get rid of fixed
memory allocations, get rid of emulated memory access and instructions, so we
can provision more of them at higher density... I’m coming at it from the
opposite direction.

For me, containers are “let’s take a program and make it bigger so we can
do cool stuff”. Let’s add in the whole user-space filesystem so it’s got all
the same bits every time, so we don’t need to worry about library management,
so we can ship it around from computer to computer as a self-contained unit.
Awesome!

Of course, there are other ecosystems that figured this out a really long time
ago, but having it as a commodity within the most popular server deployment
environment has changed things.

Of course, an individual container isn’t a whole program. That’s why we need
tools like compose to put containers
together into a functioning whole. This makes a container not just a program,
but rather, a part of a program. And of course, we all know what the smaller
parts of a program are called:

Functions. [2]

A container of course is not the function itself; the image is the function.
A container itself is a function call.

Perceived through this lens, it becomes apparent that Docker is missing some
pretty important information. As a tiny VM, it has all the parts you need: it
has an operating system (in the docker build), the ability to boot and reboot
(docker run), instrumentation (docker inspect), debugging (docker exec), and
so on. As a really big function, it’s strangely anemic.

Specifically: in every programming language worth its salt, we have a type
system; some mechanism to identify what parameters a function will take, and
what return value it will have.

You might find this weird coming from a Python person, a language where a
bare, unannotated function definition is considered an acceptable level of
type documentation by some [3]; there’s no requirement to say what a, b, and c
are. However, just because the type system is implicit, that doesn’t mean it’s
not there, even in the text of the program. Let’s consider, from reading a
tiny example, what we can discover.
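The snippet below is a reconstruction, since the original isn’t reproduced
here; its body is chosen only so that it stays consistent with the
observations that follow:

```python
def foo(a, b, c):
    return a.x(b.d(c))
```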

foo takes 3 arguments, their names are “a”, “b”, and “c”, and it returns a
value.

Somewhere else in the codebase there’s an object with an x method, which
takes a single argument and also returns a value.

The type of <unknown>.x’s argument is the same as the return type of
another method somewhere in the codebase, <unknown-2>.d

And so on, and so on. At runtime, each of these arguments takes on a specific,
concrete value, with a type, and if you set a breakpoint and single-step into
it with a debugger, you can see each of those types very easily. Also at
runtime you will get TypeError exceptions telling you exactly what was wrong
with what you tried to do at a number of points, if you make a mistake.
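For instance, taking the reconstructed foo above (still just an illustrative
sketch), even calling it incorrectly from a Python 3 REPL produces an
immediate, specific complaint:

```python
>>> foo(1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() missing 1 required positional argument: 'c'
```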

The analogy to containers isn’t exact; inputs and outputs aren’t obviously in
the shape of “arguments” and “return values”, especially since containers tend
to be long-running; but nevertheless, a container does have inputs and outputs
in the form of env vars, network services, and volumes.

Let’s consider the “foo” of docker, which would be the middle tier of a 3-tier
web application (cribbed from a real live example):
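(The actual Dockerfile isn’t reproduced here; the sketch below is a stand-in
with the same overall shape. The base image, build steps, and entrypoint are
guesses on my part, but the three volume declarations match the inputs
discussed next.)

```dockerfile
FROM python:2.7
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
VOLUME /clf
VOLUME /site
VOLUME /etc/ssl/private
ENTRYPOINT ["python", "/clf/run.py"]
```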

In this file, we can only see three inputs, which are filesystem locations:
/clf, /site, and /etc/ssl/private. How is this different from our Python
example, written in a language with supposedly “no type information”?

The image has no metadata explaining what might go in those locations, or
what roles they serve. We have no way to annotate them within the
Dockerfile.

What services does this container need to connect to in order to get its job
done? What hostnames will it connect to, what ports, and what will it expect
to find there? We have no way of knowing. It doesn’t say. Any errors about
the failed connections will come in a custom format, possibly in logs, from
the application itself, and not from docker.

What services does this container export? It could have used an EXPOSE
line to give us a hint, but it doesn’t need to; and even if it did, all
we’d have is a port number.

What environment variables does its code require? What format do they need
to be in?

We do know that we could look in requirements.txt to figure out what
libraries are going to be used, but in order to figure out what the service
dependencies are, we’re going to need to read all of the code in all of them.

Of course, the one way that this example is unrealistic is that I deleted all
the comments explaining all of those things. Indeed, best practice these days
would be to include comments in your Dockerfiles, and include example compose
files in your repository, to give users some hint as to how these things all
wire together.

This sort of state isn’t entirely uncommon in programming languages. In fact,
in this popular GitHub project you can see that large programs written in
assembler in the 1960s included exactly this sort of documentation convention:
huge front-matter comments in English prose.

That is the current state of the container ecosystem. We are at the “late ’60s
assembly language” stage of orchestration development. It would be a huge
technological leap forward to be able to communicate our intent
structurally.

When you’re building an image, you’re building it for a particular purpose.
You already pretty much know what you’re trying to do and what you’re going to
need to do it.

When instantiated, the image is going to consume network services. This is
not just a matter of hostnames and TCP ports; those services need to be
providing a specific service, over a specific protocol. A generic
reverse proxy might be able to handle an arbitrary HTTP endpoint, but an API
client needs that specific API. A database admin tool might be OK with just
“it’s a database” but an application needs a particular schema.

It’s going to consume environment variables. But not just any variables;
the variables have to be in a particular format.

It’s going to consume volumes. The volumes need to contain data in a
particular format, readable and writable by a particular UID.

It’s also going to produce all of these things; it may listen on a network
service port, provision a database schema, or emit some text that needs to
be fed back into an environment variable elsewhere.

Here’s a brief sketch of what I want to see in a Dockerfile to allow me to
express this sort of thing:
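(None of the directives below exist in Docker today; the syntax and names are
invented purely for illustration, to match the requirements listed just
afterward.)

```
FROM python:2.7
# ... build steps as before ...

# a service this image must reach, located via environment variables
REQUIRES_SERVICE etcd PROTOCOL etcd ENV_HOST ETCD_HOST ENV_PORT ETCD_PORT

# a service this image must reach at a fixed name and port
REQUIRES_SERVICE postgres PROTOCOL postgresql HOST pgwritemaster.internal PORT 5432

# an environment variable the application reads, and its expected format
REQUIRES_ENVIRONMENT ETCD_PREFIX FORMAT "key prefix for this application's configuration"

# a volume the application writes to, the UID that must own it, and the
# format of the data that will land there
REQUIRES_VOLUME /logs WRITABLE_BY_UID 4321 FORMAT common-log-format
```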

An image thusly built would refuse to run unless:

Somewhere else on its network, there was an etcd host/port known to it, its
host and port supplied via environment variables.

Somewhere else on its network, there was a postgres host, listening on port
5432, with a name-resolution entry of “pgwritemaster.internal”.

An environment variable for the etcd configuration was supplied.

A writable volume for /logs was supplied, owned by user ID 4321, where it
could write common-log-format logs.

There are probably a lot of flaws in the specific syntax here, but I hope you
can see past that, to the broader point that the software inside a container
has precise expectations of its environment, and that we presently have no way
of communicating those expectations beyond writing a Melvilleian essay in each
Dockerfile’s comments, beseeching those who would run the image to give it what
it needs.

Why bother with this sort of work, if all the image can do with it is “refuse
to run”?

First and foremost, today, the image effectively won’t run. Oh, it’ll start
up, and it’ll consume some resources, but it will break when you try to do
anything with it. What this metadata will allow the container runtime to do is
to tell you why the image didn’t run, and give you specific, actionable, fast
feedback about what you need to do in order to fix the problem. You won’t have
to go groveling through logs, which is always especially hard if the back-end
service you forgot to properly connect to was the log aggregation service. So
this will be an order-of-magnitude speed improvement on initial deployments and
development-environment setups for utility containers. Whole applications
typically already come with a compose file, of course, but ideally applications
would be built out of functioning self-contained pieces and not assembled one
custom container at a time.

Secondly, if there were a strong tooling standard for providing this metadata
within the image itself, it might become possible for infrastructure service
providers (like, ahem, my employer) to automatically detect and satisfy service
dependencies. Right now, if you have a database as a service that lives outside
the container system in production, but within the container system in
development and test, there’s no way for the orchestration layer to say “good
news, everyone! you can find the database you need here: ...”.

My main interest is in allowing open source software developers to give service
operators exactly what they need, so the upstream developers can get useful bug
reports. There’s a constant tension where volunteer software developers find
themselves fielding bug reports where someone deployed their code in a weird
way, hacked it up to support some strange environment, built a derived
container that had all kinds of extra junk in it to support service discovery
or logging or some such, and so they don’t want to deal with the support load
that that generates. Both people in that exchange are behaving reasonably. The
developers gave the ops folks a container that runs their software to the best
of their abilities. The service vendors made the minimal modifications they
needed to have the container become a part of their service fabric. Yet we
arrive at a scenario where nobody feels responsible for the resulting artifact.

If we could just say what it is that the container needs in order to really
work, in a way which was precise and machine-readable, then it would be clear
where the responsibility lies. Service providers could just run the container
unmodified, and they’d know very clearly whether or not they’d satisfied its
runtime requirements. Open source developers - or even commercial service
vendors! - could say very clearly what they expected to be passed in, and when
they got bug reports, they’d know exactly how their service should have
behaved.

[1] Which mostly but not entirely just means “docker”; it’s weird, of course,
because there are pieces that docker depends on and tools that build upon
docker which are part of this, but docker remains the nexus.

[2] Yes yes, I know that they’re not really functions, Tristan; they’re
subroutines, but that’s the word people use for “subroutines” nowadays.

[3] Just to be clear: no it isn’t. Write a damn docstring, or at least some
type annotations.
