Major.io

Launch secure LXC containers on Fedora 20 using SELinux and sVirt

2014-04-22

Getting started with LXC is a bit awkward and I’ve assembled this guide for anyone who wants to begin experimenting with LXC containers in Fedora 20. As an added benefit, you can follow almost every step shown here when creating LXC containers on Red Hat Enterprise Linux 7 Beta (which is based on Fedora 19).

You’ll need a physical machine or a VM running Fedora 20 to get started. (You could put a container in a container, but things get a little dicey with that setup. Let’s just avoid talking about nested containers for now. No, really, I shouldn’t have even brought it up. Sorry about that.)

Prep Work

Start by updating all packages to the latest versions available:

Verify that SELinux is in enforcing mode by running getenforce. If you see Disabled or Permissive, get SELinux into enforcing mode with a quick configuration change:

I recommend installing setroubleshoot-server to make it easier to find the root cause of AVC denials:

Reboot now. This will ensure that SELinux comes up in enforcing mode (verify that with getenforce after reboot) and it ensures that auditd starts up sedispatch (for setroubleshoot).

Install management libraries and utilities

Let’s grab libvirt along with LXC support and a basic NAT networking configuration.

Launch libvirtd via systemd and ensure that it always comes up on boot. This step will also adjust firewalld for your containers and ensure that dnsmasq is serving up IP addresses via DHCP on your default NAT network.

Bootstrap our container

Installing packages into the container’s filesystem will take some time.

This step fills in the filesystem with the necessary packages to run a Fedora 20 container. We now need to tell libvirt about the container we’ve just created.

At this point, libvirt will know enough about the container to start it and you’ll be connected to the console of the container! We need to adjust some configuration files within the container to use it properly. Detach from the console with CTRL-].

Let’s stop the container so we can make some adjustments.

Get the container ready for production

If your package installation inside the container isn’t done yet, you must stop here and wait. Use screen -r to check on it and wait for yum to finish its work.

When the packages are done installing, chroot into your container and set a root password

We will be logging in as root via the console occasionally and we need to allow that access.

Since we will be using our NAT network with our auto-configured dnsmasq server (thanks to libvirt), we can configure a simple DHCP setup for eth0:

Using ssh makes the container a lot easier to manage, so let’s ensure that it starts when the container boots. (You could do this via systemctl after logging in at the console, but I’m lazy.)

Launch!

Cross your fingers and launch the container.

You’ll be attached to the console during boot but don’t worry, hold down CTRL-] to get back to your host prompt. Check the dnsmasq leases to find your container’s IP address and you can login as root over ssh.

Security

After logging into your container via ssh, check the process labels within the container:

You’ll notice something interesting if you run getenforce now within the container — SELinux is disabled. Actually, it’s not really disabled. The processing of SELinux policy is done on the host. The container isn’t able to see what’s going on outside of its own files and processes. The libvirt documentation for LXC hints at the importance of this isolation:

A suitably configured UID/GID mapping is a pre-requisite to making containers secure, in the absence of sVirt confinement.

In the absence of the “user” namespace being used, containers cannot be considered secure against exploits of the host OS. The sVirt SELinux driver provides a way to secure containers even when the “user” namespace is not used. The cost is that writing a policy to allow execution of arbitrary OS is not practical. The SELinux sVirt policy is typically tailored to work with an simpler application confinement use case, as provided by the “libvirt-sandbox” project.

This leads to something really critical to understand:

Containers don’t contain

Dan Walsh has a great post that goes into the need for sVirt and the protections it can provide when you need to be insulated from potentially dangerous virtual machines or containers. If a user is root inside a container, they’re root on the host as well. (There’s an exception: UID namespaces. But let’s not talk about that now. Oh great, first it was nested containers and now I brought up UID namespaces. Sorry again.)

Dan’s talk about securing containers hasn’t popped up on the Red Hat Summit presentations page quite yet but here are some notes that I took and then highlighted:

Containers don’t contain. The kernel doesn’t know about containers. Containers simply use kernel subsystems to carve up namespaces for applications.

Containers on Linux aren’t complete. Don’t compare directly to Solaris zones yet.

Running containers without Mandatory Access Control (MAC) systems like SELinux or AppArmor opens the door for full system compromise via untrusted applications and users within containers.

Using MAC gives you one extra barrier to keep a malicious container from getting higher levels of access to the underlying host. There’s always a chance that a kernel exploit could bypass MAC but it certainly raises the level of difficulty for an attacker and allows server operators extra time to react to alerts.

Launch secure LXC containers on Fedora 20 using SELinux and sVirt is a post from: Major Hayden's blog.

Thanks for following the blog via the RSS feed. Please don't copy my posts or quote portions of them without attribution.