Earlier this year we released rkt, the container engine by CoreOS, in its first stable version, and since then the project development has continued apace. Release after release, rkt keeps proceeding towards its goal of providing a stable and minimal container runtime. Here, we will take a small tour of the project’s status and some of its more notable changes since 1.0.
Many users have shown how much they care about rkt by providing invaluable feedback and use cases, as well as proposing new features. Based on this feedback, we have several noteworthy changes to highlight. These are just a small selection of the changes tracked in the changelogs that accompany every release.
On the standardization front, the Open Container Initiative (OCI) is progressing towards its 1.0 release, and work is well underway in rkt to fully support this important milestone. We’ve started preliminary work on handling OCI images directly, and rkt fetch can already understand the new format as currently drafted.
A lot of progress has been made on VM-powered containers. The KVM stage1 flavor is now a mature alternative to traditional namespace-based engines. The current implementation is based on LKVM, but a parallel effort based on QEMU is ongoing.
Most notably on the user experience front, two additional subcommands have been introduced to manage the full lifecycle of a pod, namely to perform stop and export operations. rkt stop can be used to cleanly stop running pods, while rkt export is capable of exporting modifications to container images after a pod has exited. Those two subcommands offer a homogeneous and quick way to perform operations that previously required multiple steps and additional tools on the host.
Finally, some scenarios may require running specific pods with direct access to system resources and without limits on host interactions. For those cases requiring privileged pods, several flags have been added to the --insecure-options set to let users disable specific isolators that are otherwise enabled by default.
We have applied hardening best practices based on feedback from security experts in the community and from analyst reports. Based on this feedback, we kept implementing and enabling, by default, new isolation layers for containers. Those security features have been steadily introduced and improved in every new release. Below are the ones currently available to users in rkt 1.14:
Applications in a pod now reside in dedicated mount namespaces, on top of the existing pod namespace, and these can optionally be made read-only to prevent runtime modification of application images.
Users can fine-tune the set of Linux capabilities granted to individual applications in a pod. It is possible to specify either a set of capabilities to retain or a set to remove. By default, applications run with a restricted set of capabilities, which can be overridden at runtime via the --caps-retain and --caps-remove switches. Further details on this are available in the capabilities guide.
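The retain-or-remove semantics described above can be modeled in a few lines. The following is only a sketch in Python, not rkt's implementation: the default set shown is an illustrative placeholder rather than rkt's actual default, the function name `effective_caps` is hypothetical, and treating the two options as mutually exclusive is an assumption based on the "either ... or" wording.

```python
# Illustrative placeholder only; rkt defines its own default capability set.
DEFAULT_CAPS = frozenset({
    "CAP_CHOWN",
    "CAP_KILL",
    "CAP_NET_BIND_SERVICE",
    "CAP_SETGID",
    "CAP_SETUID",
})


def effective_caps(default=DEFAULT_CAPS, retain=None, remove=None):
    """Return the capability set an application would run with.

    retain: if given, the application keeps exactly these capabilities.
    remove: if given, these capabilities are dropped from the default set.
    Passing both is rejected (an assumption of this sketch).
    """
    if retain is not None and remove is not None:
        raise ValueError("specify either retain or remove, not both")
    if retain is not None:
        return frozenset(retain)
    if remove is not None:
        return frozenset(default) - frozenset(remove)
    return frozenset(default)
```

In this model, a retain set replaces the default entirely, while a remove set only subtracts from it, so removing a capability that is not in the default set has no effect.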
It is now possible to apply custom-defined seccomp filters to specific applications. This allows the user to define a set of system calls to forbid (blacklist) or to allow (whitelist). By default, rkt whitelists a standard set of system calls and offers some predefined sets to mimic the behavior of other runtimes. This can be overridden at runtime via the --seccomp switch. More information and usage examples are available in the seccomp guide.
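The difference between the two filter modes can be illustrated with a small decision function. This is a conceptual sketch in Python only: real seccomp filtering is enforced by the kernel, and the function name `is_allowed`, the mode strings, and the syscall sets below are hypothetical examples rather than rkt's actual defaults.

```python
def is_allowed(syscall, mode, syscalls):
    """Decide whether a system call passes a filter.

    mode "whitelist": only the listed system calls are allowed.
    mode "blacklist": every system call except the listed ones is allowed.
    """
    if mode == "whitelist":
        return syscall in syscalls
    if mode == "blacklist":
        return syscall not in syscalls
    raise ValueError(f"unknown filter mode: {mode!r}")


# Hypothetical example sets, chosen only for illustration.
allowed_only = {"read", "write", "exit"}
forbidden = {"reboot", "mount"}

print(is_allowed("read", "whitelist", allowed_only))   # True
print(is_allowed("mount", "whitelist", allowed_only))  # False
print(is_allowed("read", "blacklist", forbidden))      # True
print(is_allowed("mount", "blacklist", forbidden))     # False
```

A whitelist denies anything not explicitly listed, which is why it is the more conservative default: a newly introduced system call stays blocked until someone opts in.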
Other additions include enhanced support for user namespaces, setting the no-new-privilege bit on applications, and further tightening of cgroup parameters. Most of these features follow along with new security knobs available as systemd properties, which are further documented here.
A side effect of rkt's composability is that the project as a whole relies on and benefits from a healthy Linux ecosystem. We achieved many of the improvements described above by reusing and extending existing components, like systemd-nspawn and libseccomp.
We believe that this too is an important aspect of our projects: by exercising and adapting existing security components, we help highlight and fix existing issues, which translates to improved security for everybody. For anybody interested in more details about this kind of integration, we will address the topic in depth at the upcoming systemd.conf 2016 in Berlin.
rkt aims at providing a focused container engine that can be relied upon in production environments, leaving other tasks (such as cluster orchestration and node management) to external components. For these reasons, we try to focus on building and stabilizing a core set of features while following a regular release calendar. Stable releases happen with a bi-weekly cadence, and our release pilots take care of ensuring the quality of every proposed change. You are all invited to help with the launch of rkt 1.15.0, which is scheduled soon!
We encourage everybody to join the community and get involved with rkt, a central component of the CoreOS and Tectonic platforms. Get involved on the rkt-dev mailing list or on the #rkt-dev Freenode IRC channel, by filing GitHub issues, or by contributing code and fixes to the project.
Interested in getting started with rkt? Check out these resources:
Interested in helping CoreOS secure the Internet? Join us! We’re hiring engineers in New York, Berlin, and San Francisco.