An In-Depth Look at Container Technology, the Next Big Thing in Tech
Containers and Containers as a Service (CaaS) may very well be the next big thing in tech. At the very least, container-based software development and deployment is exciting, cost-effective, and rapidly growing, and it is changing how DevOps teams work.
CaaS is built on the orchestration of containers, which at a very high level can be thought of as fully portable, isolated operating system environments running a set of applications. Docker is a very popular open-source container platform that is attempting to define a container standard, but it is not the only one.
Containers offer huge advantages to companies, developers, and system administrators with respect to software development, distribution, and deployment. The benefits include simpler, more usable tooling, along with improved consistency, reliability, efficiency, cost savings, and scalability. Scalability is especially important, since customer expectations grow as new cloud-based software applications gain traction, and revenue will be lost if proper scaling, high availability, and performance are not addressed.
Another major benefit is the ability to move software applications easily and reliably between environments: an application can be developed in a container on a developer’s local laptop, then deployed and run on any server environment that supports containers. These environments can include physical or virtual testing, staging, and production systems, as well as public or private clouds.
Container technologies can also dramatically shorten the time required between developing, testing, and deploying running applications and services. They are lightweight, not resource intensive, and allow for virtualizing a single application and its dependencies as opposed to virtualizing an entire operating system and machine.
Docker is the most well-known and talked-about container solution at the moment, but there are others; another notable effort is CoreOS and its rkt container runtime. Most production- or enterprise-level CaaS offerings are still a work in progress or in a beta state. Google is currently working on an open-source orchestration project called Kubernetes, which is compatible with Docker and looks very promising.
What Are Containers and How Does Container Technology Work?
Containers are built on a few key Linux operating system kernel features, which provide performance and isolation benefits for different processes running on the same virtual or physical machine.
Containers are essentially self-contained runtime environments, which include an operating system’s user space, user-added files, metadata, applications and their dependencies, configuration files, and so on. Dependency and reliability issues caused by variations in the operating system and underlying infrastructure are essentially abstracted away (i.e., removed) by “containerizing” an application.
Under the surface, the primary Linux features utilized are namespaces, cgroups (or control groups), and union mounts, which when combined help provide security, isolation, and resource allocation and sharing capabilities. Namespaces are used to provide isolation to running applications, and to give an application the impression that it has its own instances of certain global system resources. Control groups are used to configure and allocate certain kernel resources to a running application, including CPU, RAM, I/O, networking, and so on.
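As a concrete sketch of control groups in action, Docker exposes cgroup limits directly as flags on `docker run`. The following assumes a host with the Docker daemon installed; the image name and limit values are just examples:

```shell
# Sketch: run a container with cgroup-enforced resource limits
# (assumes a running Docker daemon; "nginx" is just an example image).
# --memory caps RAM via the memory cgroup, --cpus limits CPU time,
# and --pids-limit caps how many processes the container may create.
docker run -d --name limited-nginx \
  --memory 256m --cpus 0.5 --pids-limit 100 \
  nginx
```
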
Union mounts are implemented through union file systems, which are a very lightweight and fast type of file system. Examples include UnionFS and AUFS. The major advantage of union file systems is that they allow multiple file systems to be transparently overlaid into a single coherent file system.
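On Linux, an overlay mount of the kind these union file systems provide can be sketched directly. This requires root and an overlayfs-capable kernel, and the /tmp paths below are purely illustrative:

```shell
# Sketch: overlay a read-only lower directory with a writable upper one
# (requires root and Linux overlayfs support; paths are examples).
mkdir -p /tmp/lower /tmp/upper /tmp/work /tmp/merged
echo "from the lower layer" > /tmp/lower/base.txt
mount -t overlay overlay \
  -o lowerdir=/tmp/lower,upperdir=/tmp/upper,workdir=/tmp/work \
  /tmp/merged
# Files written under /tmp/merged land in /tmp/upper;
# /tmp/lower is never modified.
```
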
Docker combines these components to build a wrapper (or container format) called libcontainer, but also supports traditional Linux container technologies such as LXC. The sandboxed process environment afforded by libcontainer or LXC is what enables consistent and reliable portable deployment of containers across environments.
Containers can also be linked and/or run in parallel to scale an application in near real time and in a very cost-effective way. Many very large web vendors (e.g., Google and Twitter) are built on container technologies and scale this way.
How Does Docker Work as a Container Solution?
Solutions based on container technologies typically consist of multiple components. Docker as a complete solution consists of the open source container virtualization platform itself (i.e., Docker), and a public hub called Docker Hub for sharing and managing Docker containers.
The Docker platform consists of a server application called the Docker daemon, which mounts and runs containers, and the Docker client, which interacts with both the hub and the daemon. The client communicates with the daemon via sockets or a RESTful API, and is used to instruct the daemon to create, run, and stop containers. The client and daemon can run on the same system or communicate remotely. It’s possible to use a single client to control a distributed solution with multiple servers running the Docker daemon, to control multiple containers running in parallel on a single Docker daemon server, or both.
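The daemon’s REST API can be exercised directly; for example, on a host where the daemon listens on its default Unix socket (the socket path may vary by installation):

```shell
# Sketch: query the Docker daemon's REST API over its Unix socket
# (assumes a running daemon with the socket at the default path).
curl --unix-socket /var/run/docker.sock http://localhost/version
# The docker CLI issues the same kind of request under the hood:
docker version
```
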
Docker, at the element level, consists of images, containers, and registries. Containers are actually running instances of something called an image, and are the run component of Docker. They are used to run applications and can be run, started, stopped, moved, and deleted via an API. Images are read-only templates and represent the complete operating environment, applications and their dependencies, configuration data, and launch instructions.
Images are layered using a union file system. All images are created from a base image, usually obtained from Docker Hub. Docker builds images by executing a set of instructions, each of which adds a new layer on top of the base image. The instructions executed when building an image are contained in a text file called a Dockerfile. Images are therefore the build component of Docker.
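A minimal Dockerfile illustrates this layering; each instruction below produces a new image layer. The base image, package, and file names are just examples, and the `COPY` step assumes an `app.py` exists in the build context:

```dockerfile
# Start from a base image layer (example: official Ubuntu image on Docker Hub).
FROM ubuntu:20.04
# Each instruction below adds a new read-only layer on top of the base.
RUN apt-get update && apt-get install -y python3
COPY app.py /opt/app/app.py
# Launch instructions are recorded in the image metadata.
CMD ["python3", "/opt/app/app.py"]
```

Building with `docker build -t myapp .` replays these instructions, caching each layer so that unchanged steps are not rebuilt.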
When running a Docker container, namespaces and the resulting isolation layer are created and represent system resources such as networking, file system, users and groups, processes, mount points, communications, and hostname. Docker also uses control groups to allocate system resources and set up limits and constraints if required. The goal is to be a good isolated multi-tenant citizen on the host machine while only using the resources necessary to run the application.
A container is then created in the file system and a read-write layer is mounted on top of the read-only image layer. Next, Docker creates a network interface that allows the container to talk to the local host, while also allocating an IP address. Finally, the container runs the target application, while connecting and logging to standard input, output, and error. This allows for application monitoring and interaction.
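All of the steps above are triggered by a single `docker run`; a hedged sketch, where the image name and ports are example values:

```shell
# Sketch: run a containerized web server, mapping a host port to it
# (assumes a running Docker daemon; "nginx" and the ports are examples).
docker run -d --name web -p 8080:80 nginx
# Attach to the container's captured stdout/stderr for monitoring.
docker logs web
# Stop and remove the container when done.
docker stop web && docker rm web
```
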
The registries or hubs are the distribution component of Docker, and are public or private stores used for uploading or downloading images. A hub, or Docker Hub in the case of Docker’s official public hub, is basically a repository of user-supplied images that people can use as a basis for a container, or upload new images to.
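Interacting with a registry typically follows a pull/tag/push cycle; a sketch, where “myaccount” is a hypothetical Docker Hub account:

```shell
# Sketch: pull a public image, retag it, and push it to a registry
# (assumes a Docker daemon and registry credentials; "myaccount" is hypothetical).
docker pull ubuntu:20.04
docker tag ubuntu:20.04 myaccount/ubuntu:custom
docker push myaccount/ubuntu:custom
```
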
Docker also offers container versioning, which essentially mimics the functionality of source code versioning. This allows committing, reversal, diff’ing, and other git-like capabilities for images and containers. One could therefore create a base image to create and extend other images from. This versioning and ancestry allows the user to treat their application’s infrastructure as a managed application.
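These git-like operations map onto concrete commands; for instance (the container and image names below are examples):

```shell
# Sketch: version-control-style operations on a running container
# (assumes a container named "web" exists; names are example values).
docker diff web                      # show filesystem changes vs. the image
docker commit web myaccount/web:v2   # snapshot the changes as a new image
docker history myaccount/web:v2      # inspect the image's layer ancestry
```
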
Containers Versus Virtual Machines
A single running virtual machine represents a portable package that contains an entire operating system and installed applications. They’re useful for allocating chunks of hardware resources and for isolation and security. Physical servers run VMs by using a hypervisor (bare-metal or hosted). VMs offer significant security advantages as compared to containers, allow for using multiple operating systems, and are well suited for running multiple applications in a single instance.
There are some noteworthy downsides to VMs. Packages can be very large, often measured in gigabytes. It also typically takes a VM a relatively long time to boot the underlying operating system and start running guest applications. VMs require configuration and management at a level that most developers would rather not worry about: each individual VM brings its own operating system upgrades, patches, licenses, and so on. VMs also introduce issues with utilization and capacity planning.
Lastly, provisioning a VM is not a very cost-effective or scalable way to run a single application or service. VMs are not as easily scalable as containers, and are not particularly well suited to the complexities of continuous, near real-time scaling of distributed systems. Most applications neither require nor use the dedicated resources of a dedicated virtual server, nor do they need a separate operating system and license.
Comparatively, containers provide much of the same flexibility afforded by VMs, but with substantial usability, scalability, distribution, and deployment benefits as well. In addition, users do not need to worry about OS upgrades, patches, security, licenses, and so on. Containers basically run as processes on the host machine and share the operating system kernel, virtualizing at the operating system level rather than at the hardware level as a hypervisor does.
Multiple containers can run on a single machine and operating system, share underlying kernel resources, and maintain secure isolation from one another. The shared image layers are read-only, while each container mounts its own writable layer on top.
Compared to VMs, containers are very lightweight and fast. They run without a hypervisor and can make system calls and establish interface connections without emulation overhead. The layered image format also keeps them small: an image can be modified simply by adding layers, as opposed to completely rebuilding a VM, and distribution only requires shipping the new layers rather than the entire image, which makes it very fast and easy.
Containers as individual packaged units are usually very small and measured in terms of tens of MBs. Instance size is therefore not a concern. In terms of startup speed, containers do not require an operating system boot and are running almost instantly when started.
In terms of scalability, containers can be added as demand or load on the system grows, and traffic can be routed accordingly. You can typically have many more containerized instances of a server application on a given piece of hardware as compared to individual virtual machines running an application in parallel.
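A crude sketch of this scale-out pattern using plain Docker commands follows; the image and ports are example values, and a real deployment would typically use an orchestrator such as Kubernetes plus a load balancer to route traffic:

```shell
# Sketch: run three instances of the same containerized service in parallel,
# each bound to a different host port; a load balancer would route across them.
# (Assumes a running Docker daemon; "nginx" and the ports are examples.)
for port in 8081 8082 8083; do
  docker run -d --name "web-$port" -p "$port:80" nginx
done
```
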
It’s worth noting that security and isolation are very important considerations. Containers are known to be less isolated and secure than their VM counterparts, due to the resource sharing discussed above and their direct interaction with the underlying OS kernel. VMs, on the other hand, run on top of a hypervisor, which presents a much smaller attack surface than a full kernel.
It’s also worth noting that containers and virtual machines are not mutually exclusive. Containers can be used along with virtualization if you prefer not to actively manage the hardware, and leverage some of the security benefits associated with VMs.
Ultimately, containers are much more lightweight, cost-effective, and less resource intensive as compared to VMs. Containers are optimized and better used for deployment of applications, whereas VMs are more about deployment of machines. Since many more containers can be run on a single VM, the potential cost savings due to reduced hardware, power, and cooling requirements can be huge.
Containers are a very exciting technology and potential game changer that has many useful applications and can solve lots of problems. They have potential to be the next big thing in tech.
One of the big questions is whether container technology will replace server virtualization in general. That’s yet to be determined, but security limitations will be at the forefront in deciding it. That said, the need to make containers as secure as VMs is well known and actively being worked on. Also, container technologies and their supporting tooling are relatively new and less mature than those surrounding virtualization.
In addition, containers and virtualization can be complementary and leveraged together to obtain the benefits of both. Containers can be run in virtual machines, which helps increase isolation and security while enjoying the many benefits of container technology. Virtualization is also useful for managing hardware infrastructure, which most people would rather use software to do.
Either way, container technologies aren’t going away and will only become more prevalent over time. This is a technology that you’ll definitely want to keep your eye on.