rkt is a new container runtime for applications, intended to meet the most demanding production requirements of security, efficiency and composability. rkt is also an implementation of the emerging Application Container (appc) specification, an open specification defining how applications can be run in containers. Today we are announcing the next major release of rkt, v0.5, with a number of new features that bring us closer to these goals, and we want to give an update on the upcoming roadmap for the rkt project.
appc v0.5 - introducing pods
This release of rkt updates to the latest version of the appc spec, which introduces pods. Pods encapsulate a group of Application Container Images and describe their runtime environment, serving as a first-class unit for application container execution.
Pods are a concept recently popularised by Google's Kubernetes project. The idea emerged from the recognition of a powerful, pervasive pattern in deploying applications in containers, particularly at scale. The key insight is that, while one of the main value propositions of containers is for applications to run in isolated and self-contained environments, it is often useful to co-locate certain "helper" applications within a container. These applications have an intimate knowledge of each other - they are designed and developed to work co-operatively - and hence can share the container environment without conflict, yet still be isolated from interfering with other application containers on the same system.
A classic example of a pod is service discovery using the sidekick model, wherein the main application process serves traffic, and the sidekick process uses its knowledge of the pod environment to register the application in the discovery service. The pod links together the lifecycle of the two processes and ensures they can be jointly deployed and constrained in the cluster.
Another simple example is a database co-located with a backup worker. In this case, the backup worker could be isolated from interfering with the database's work - through memory, I/O and CPU limits applied to the process - but when the database process is shut down the backup process will terminate too. By making the backup worker an independent application container, and making pods the unit of deployment, we can reuse the worker for backing up data from a variety of applications: SQL databases, file stores or simple log files.
This is the power that pods provide: they encapsulate a self-contained, deployable unit that still provides granularity (for example, per-process isolators) and facilitates advanced use cases. Bringing pods to rkt enables it to natively model a huge variety of application use cases, and integrate tightly with cluster-level orchestration systems like Kubernetes.
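To make the database-plus-backup-worker example concrete, here is a minimal sketch of a pod as a data structure. The `App` and `Pod` classes, the `backup_pod` helper, the image names and the isolator values are all hypothetical and chosen for illustration; this is not the appc pod manifest schema or rkt's internal model.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    """One application in a pod (hypothetical model, not the appc schema)."""
    name: str
    image: str
    # Per-app resource limits, e.g. {"memory": "128M", "cpu": "250m"}.
    isolators: dict = field(default_factory=dict)
    running: bool = False


@dataclass
class Pod:
    """A group of apps that are deployed, started and stopped as one unit."""
    apps: list

    def start(self) -> None:
        for app in self.apps:
            app.running = True

    def stop(self) -> None:
        # Shared lifecycle: stopping the pod stops every app in it.
        for app in self.apps:
            app.running = False


def backup_pod(target_name: str, target_image: str) -> Pod:
    """Pair any data-holding app with the same reusable backup worker."""
    worker = App(
        name="backup-worker",
        image="example.com/backup-worker",  # hypothetical image name
        isolators={"memory": "128M", "cpu": "250m"},  # illustrative limits
    )
    return Pod(apps=[App(name=target_name, image=target_image), worker])
```

The same worker definition is reused whether the target is a SQL database or a log directory, while the limits stay attached to the worker alone and the lifecycle is shared by the whole pod.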
On modern Linux systems, rkt now uses overlayfs by default when running application containers. This provides immense benefits to performance and efficiency: start times for large containers will be much faster, and multiple pods using the same images will consume less disk space and can share page cache entries.
If overlayfs is not supported on the host operating system, rkt gracefully degrades back to the previous behaviour of extracting each image at runtime. This behaviour can also be triggered with the new --no-overlay flag.
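The choice between overlayfs and extraction can be sketched roughly as follows. This is an illustration under stated assumptions, not rkt's actual detection logic: the function names are hypothetical, the check only looks at the text of /proc/filesystems (which lists a filesystem only once its kernel module is loaded), and the mount arguments follow the standard `lowerdir`/`upperdir`/`workdir` options of the Linux overlay filesystem.

```python
def overlay_listed(proc_filesystems: str) -> bool:
    """Return True if the text of /proc/filesystems lists overlayfs."""
    for line in proc_filesystems.splitlines():
        fields = line.split()
        # Lines look like "nodev\toverlay" or "\text4"; the type is last.
        if fields and fields[-1] in ("overlay", "overlayfs"):
            return True
    return False


def overlay_mount_args(image_rootfs, upper, work, target):
    """Build mount(8) arguments for an overlay over a read-only image tree."""
    options = f"lowerdir={image_rootfs},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, target]


def choose_strategy(proc_filesystems: str, no_overlay: bool = False) -> str:
    """Pick "overlay" when available and not disabled, else "extract"."""
    if no_overlay or not overlay_listed(proc_filesystems):
        return "extract"
    return "overlay"
```

Because the image tree is only ever the read-only lower layer, several pods can share one copy of it on disk, and each pod's writes land in its own upper directory.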
Another improvement behind the scenes is the introduction of a tree cache for rkt's local image storage. When storing ACIs in its local database (for example, after pulling them from a remote repository using rkt fetch), rkt will now store the expanded root filesystem of the image on disk. This means that when pods that reference this image are subsequently started (via rkt run), the pod filesystem can be created almost instantaneously in the case of overlayfs - or, without overlayfs, by using a simple copy instead of needing to expand the image again from its compressed format.
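The idea of expanding an image once and reusing the result can be sketched like this. It is a simplified illustration, not rkt's store implementation: the function names and directory layout are hypothetical, and a plain tar archive stands in for an ACI.

```python
import os
import shutil
import tarfile


def render_tree(image_tar: str, store_dir: str, image_id: str) -> str:
    """Expand an image archive into the store once; reuse it afterwards."""
    tree = os.path.join(store_dir, image_id, "rootfs")
    if not os.path.isdir(tree):
        tmp = tree + ".tmp"
        shutil.rmtree(tmp, ignore_errors=True)  # clear any half-finished run
        os.makedirs(tmp)
        with tarfile.open(image_tar) as archive:
            # Assumes a trusted archive; real code must validate member paths.
            archive.extractall(tmp)
        os.rename(tmp, tree)  # publish only after extraction completed
    return tree


def prepare_pod_rootfs(tree: str, pod_dir: str) -> str:
    """Non-overlay path: give the pod its own copy of the cached tree."""
    rootfs = os.path.join(pod_dir, "rootfs")
    shutil.copytree(tree, rootfs, symlinks=True)
    return rootfs
```

The second and later calls to `render_tree` for the same image ID return immediately, so the cost of decompression is paid at fetch time rather than on every run. With overlayfs, the returned tree would be used directly as the lower layer and no copy would be needed.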
To facilitate simultaneous use of the tree store by multiple rkt invocations, file-based locking has been added to ensure images that are in use cannot be removed. Future versions of rkt will expose more advanced capabilities to manage images in the store.
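One common way to implement this kind of protection is with advisory `flock` locks: every user of a tree holds a shared lock, and removal requires an exclusive lock that cannot be granted while any shared lock is held. The sketch below shows that general technique on Linux; the function names and the use of a separate lock file are assumptions, and it is not a description of rkt's actual locking code.

```python
import fcntl
import os


def lock_in_use(lock_path: str) -> int:
    """Take a shared lock; keep the returned fd open while the tree is used."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.flock(fd, fcntl.LOCK_SH)
    return fd


def try_lock_for_removal(lock_path: str):
    """Return an fd holding an exclusive lock, or None if the tree is in use."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

Any number of running pods can hold the shared lock at once. Closing the file descriptor (or the process exiting) releases the lock, so a crashed invocation does not leave an image permanently pinned.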
stage1 from source
When executing application containers, rkt uses a modular approach (described in the architecture documentation) to support swappable, alternative execution environments. The default stage1 that we develop with rkt itself is based on systemd, but alternative implementations can leverage different technologies like KVM-based virtual machines to execute applications.
In earlier versions of rkt, the pre-bundled stage1 was assembled from a copy of the CoreOS Linux distribution image. We have been working hard to decouple this process to make it easier to package rkt for different operating systems and in different build environments. In rkt 0.5, the default stage1 is now constructed from source code, and over the next few releases we will make it easier to build alternative stage1 images by documenting and stabilizing the ABI.
"Rocket", "rocket", "rkt"?
This release also sees us standardizing on a single name for all areas of the project - the command-line tool, filesystem names and Unix groups, and the title of the project itself. Instead of "rocket", "Rocket", or "rock't", we now simply use "rkt".
rkt is a young project and the last few months have seen rapid changes to the codebase. As we look towards rkt 0.6 and beyond, we will focus on making it possible to depend on rkt to roll forward from version to version without breaking working setups. Several pieces of work are needed to make this happen, including reaching the initial stable version (1.0) of the appc spec, implementing functional testing, stabilizing the on-disk formats, and implementing schema upgrades for the store. We realize that stability is vital for people considering using rkt in production environments, and this will be a priority in the next few releases. The goal is to make it possible for a user who was happily using rkt 0.6 to upgrade to rkt 0.7 without having to remove their downloaded ACIs or configuration files.