I tweeted a link today about running ceph inside of Docker, something that I would like to give a shot (mostly because I want to learn Docker more than I do, and I’ve never played with ceph, and it has a lot of interesting stuff going on):
I got to thinking about it, and realized that I haven’t written much about Docker, or even containers in general.
Containers are definitely the new hotness. Kubernetes just released 1.0 today, Docker has taken the world by storm, and here I am, still impressed by my Vagrant-fu and thinking that digital watches are a pretty neat idea. What happened? Containers? But, what about virtualization? Aren’t containers virtualization? Sort of? I mean, what is a container, anyway?
Lets start there before we get in too deep, alright?
UNIX (and by extension, Linux) has, for a long time, had a pretty cool command called ‘chroot‘. The chroot command allows you to point at an arbitrary directory and say “I want that directory to be the root (/) now”. This is useful if you had a particular process or user that you wanted to cordon off from the rest of the system, for example.
This is a really big advantage over virtual machines in several ways. First, it’s not very resource intensive at all. You aren’t emulating any hardware, you aren’t spinning up an entire new kernel, you’re just moving execution over to another environment (so executing another shell), plus any services that the new environment needs to have running. It’s also very fast, for the same reason. A VM may take minutes to spin up completely, but a lightweight chroot can be done in seconds.
It’s actually pretty easy to build a workable chroot environment. You just need to install all of the things that need to exist for a system to work properly. There’s a good instruction set on geek.co.il, but it’s a bit outdated (from 2010), so here’s a quick set of updated instructions.
Just go to Digital Ocean and spin up a quick CentOS 7 box (512MB is fine) and follow along:
# rpm –rebuilddb –root=/var/tmp/chroot/
# cd /var/tmp/chroot/
# wget http://mirror.centos.org/centos/7/os/x86_64/Packages/centos-release-7-1.1503.el7.centos.2.8.x86_64.rpm
# rpm -i –root=/var/tmp/chroot –nodeps ./centos-release-7-1.1503.el7.centos.2.8.x86_64.rpm
# yum –installroot=/var/tmp/chroot install -y rpm-build yum
# cp /var/tmp/chroot/etc/skel/.??* /var/tmp/chroot/root/
At this point, there’s a pretty functional CentOS install in /var/tmp/chroot/ which you can see if you do an ‘ls’ on it.
Lets check just to make sure we are where we think we are:
Now do the magic:
# chroot /var/tmp/chroot /bin/bash -l
How much storage are we paying for this with? Not too much, all things considered:
[[email protected] tmp]# du -hs
Not everything is going to be functional, which you’ll quickly discover. Linux expects things like /proc/ and /sys/ to be around, and when they aren’t, it gets grumpy. The obvious problem is that if you extend /proc/ and /sys/ into the chrooted environment, you expose the outside OS to the container…probably not what you were going for.
It turns out to be very hard to secure a straight chroot. BSD fixed this in the kernel by using jails, which go a lot farther in isolating the instance, but for a long time, there wasn’t an equivalent in Linux, which made escaping from a chroot jail something of a hobby for some people. Fortunately for our purposes, new tools and techniques were developed for Linux that went a long way to fixing this. Unfortunately, a lot of the old tools that you and I grew up with aren’t future compatible and are going to have to go away.
Enter the concept of cgroups. Cgroups are in-kernel walls that can be erected to limit resources for a group of processes. They also allow for prioritization, accounting, and control of processes, and paved the way for a lot of other features that containers take advantage of, like namespaces (think cgroups, but for networks or processes themselves, so each container gets its own network interface, say, but doesn’t have the ability to spy on its neighbors running on the same machine, or where each container can have its own PID 0, which isn’t possible with old chroot environments).
You can see that with containers, there are a lot of benefits, and thanks to modern kernel architecture, we lose a lot of the drawbacks we used to have. This is the reason that containers are so hot right now. I can spin up hundreds of docker containers in the time that it takes my fastest Vagrant image to boot.
Docker. I keep saying Docker. What’s Docker?
Well, it’s one tool for managing containers. Remember all of the stuff we went through above to get the chroot environment set up? We had to make a directory, force-install an RPM that could then tell yum what OS to install, then we had to actually have yum install the OS, and then we had to set up root’s skel so that we had aliases and all of that stuff. And after all of that work, we didn’t even have a machine that did anything. Wouldn’t it be nice if there was a way to just say the equivalent of “Hey! Make me a new machine!”? Enter docker.
Docker is a tool to manage containers. It manages the images, the instances of them, and overall, does a lot for you. It’s pretty easy to get started, too. In the CentOS machine you spun up above, just run
# yum install docker -y
# service docker start
Docker will install, and then it’ll start up the docker daemon that runs in the background, keeping tabs on instances.
At this point, you should be able to run Docker. Test it by running the hello-world instance:
[[email protected] ~]# docker run hello-world
Unable to find image ‘hello-world:latest’ locally
latest: Pulling from docker.io/hello-world
a8219747be10: Pull complete
91c95931e552: Already exists
docker.io/hello-world:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Status: Downloaded newer image for docker.io/hello-world:latest
Usage of loopback devices is strongly discouraged for production use. Either use `–storage-opt dm.thinpooldev` or use `–storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
Hello from Docker.
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the “hello-world” image from the Docker Hub.
(Assuming it was not already locally available.)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
For more examples and ideas, visit:
The message is pretty explanatory, but you can see how it understood that you asked for something that it didn’t immediately know about, searched the repository at the Docker Hub to find it, pulled it down, and ran it.
There’s a wide world of Docker out there (start with the Docker docs, for instance, then maybe read Nathan LeClaire’s blog, among others).
And Docker is just the beginning. Docker’s backend uses something called “libcontainer” (actually now or soon to be runc, of the Open Container format), but it migrated to that from LXC, another set of tools and API for manipulating the kernel to create containers. And then there’s Kubernetes, which you can get started with on about a dozen platforms pretty easily.
Just remember to shut down and destroy your Digital Ocean droplet when you’re done, and in all likelihood, you’ve spent less than a quarter to have a fast machine with lots of bandwidth to serve as a really convenient learning lab. This is why I love Digital Ocean for stuff like this!
In wrapping this up, I feel like I should add this: Don’t get overwhelmed. There’s a TON of new stuff out there, and there’s new stuff coming up constantly. Don’t feel like you have to keep up with all of it, because that’s not possible to do while maintaining the rest of your life’s balance. Pick something, and learn how it works. Don’t worry about the rest of it – once you’ve learned something, you can use that as a springboard to understanding how the next thing works. They all sort of follow the same pattern, and they’re all working toward the same goal of rapidly-available short-lived service providing instances. They use different formats, backends, and so on, but it’s not important to master all of them, and feeling overwhelmed is a normal response to a situation like this.
Just take it one step at a time and try to have fun with it. Learning should be enjoyable, so don’t let it get you down. Comment below and let me know what you think of Docker (or Kubernetes, or whatever your favorite new tech is).