Today Kubernetes, the production-grade container orchestration manager, celebrates its first birthday. At CoreOS we've been heavily involved with Kubernetes since well before v1.0, and we're thrilled to celebrate its success today. In the last few months, we have been contributing to a new area of Kubernetes that we're very excited about: cluster federation.
Kubernetes 1.3, released earlier this month, introduced support for cluster federation. This feature enables businesses to efficiently and cost-effectively deploy and manage applications across cloud providers and physical data centers. Federation creates a mechanism for multi-cluster geographical replication, keeping the most critical services running even in the face of regional connectivity or data center failures. Let's dive into the federated control plane architecture, and examine a few potential use cases.
A cluster of clusters
In pursuit of high availability and performance, modern deployments are regularly scaled out beyond the confines of a single data center or cloud region. By introducing the federation control plane, Kubernetes v1.3 takes its first steps towards being a multi-region orchestration framework.
Cluster federation is architecturally very similar to a Kubernetes cluster. There is a federation API server presenting a standard Kubernetes API and storing state in etcd. However, unlike a normal Kubernetes control plane, which manages compute nodes, the federation control plane manages entire clusters.
Just as the Kubernetes control plane orchestrates workloads across a set of nodes, the federation control plane orchestrates workloads across a set of Kubernetes clusters.
A standard add-on for Kubernetes clusters is kube-dns, which provides an in-cluster DNS server that can resolve Kubernetes services by name. Services take a number of instances of a containerized application, called pods in Kubernetes, and place them behind a single addressable load balancer. So, let's say you have a service named mysql that exposes a set of running MySQL pods. Other applications in the cluster can simply address the mysql service and let the DNS subsystem take care of finding the service address.
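As a rough illustration of that lookup, here is a minimal Python sketch. It assumes the service lives in the default namespace and that the cluster uses the default cluster.local DNS domain; the helper names service_dns_name and resolve_service are ours, not part of any Kubernetes client library, and the resolution step only succeeds when run from a pod inside a cluster with kube-dns.

```python
import socket


def service_dns_name(service, namespace="default", cluster_domain="cluster.local"):
    """Build the fully qualified in-cluster DNS name for a service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve_service(service, namespace="default", port=3306):
    """Ask the cluster DNS for the service address (only works inside a cluster)."""
    name = service_dns_name(service, namespace)
    infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    # Each entry ends with a sockaddr tuple whose first element is the IP address.
    return sorted({info[4][0] for info in infos})


# From a pod in the same namespace, the short name "mysql" is normally enough;
# the fully qualified form below works from any namespace.
fqdn = service_dns_name("mysql")
```

Calling resolve_service("mysql") from inside a pod would return the service address that kube-dns publishes for the mysql service.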
In Kubernetes v1.3, the new federation/v1beta1 API extends this DNS-based service discovery model across cluster boundaries. Sitting "behind" the on-cluster DNS system in the DNS resolution chain, service federation can make use of public DNS records to allow pods to resolve service names transparently across clusters.
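The lookup order described above can be modeled in a few lines of Python. This is only a sketch of the idea, not how kube-dns or the federation control plane is implemented: plain dictionaries stand in for the on-cluster DNS and the public DNS records, and the service names and addresses are made up for illustration.

```python
def resolve(name, local_records, federated_records):
    """Return (address, source), trying the local cluster before the federation."""
    if name in local_records:
        return local_records[name], "local"
    if name in federated_records:
        return federated_records[name], "federated"
    raise LookupError(f"no record for {name}")


# Hypothetical records: "cache" runs in this cluster, "mysql" only in another one.
local = {"cache": "10.3.0.12"}
federated = {"cache": "203.0.113.10", "mysql": "203.0.113.20"}

cache_answer = resolve("cache", local, federated)  # answered by the local cluster
mysql_answer = resolve("mysql", local, federated)  # falls through to the federation
```

The pod asking for mysql does not need to know which cluster answered, which is the transparency the federated model is after.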
This architecture is compatible with using either public routes or some form of private WAN to interconnect the external service interfaces of the various federated clusters.
Looking ahead: Worldwide ability to deploy applications with federation in Kubernetes
Federation is entirely new in Kubernetes v1.3 and is currently part of the beta API, which means that there is work to be done before it's ready for production. At CoreOS, we're working closely with the community to expand and stabilize the feature set in the coming releases so businesses can efficiently deploy and dynamically scale applications across the world thanks to Kubernetes and what we like to call Google's Infrastructure for Everyone Else (GIFEE).
Here is a preview of some exciting possibilities that Kubernetes federation aims to deliver.
The federation control plane is able to assign pods to any registered Kubernetes cluster. In coming releases, the federation scheduler will decide how the workload is distributed among clusters in the federation. This opens up a host of possibilities in orchestrating a global workload across clusters in pursuit of efficiency, reliability and performance.
With global scheduling, you could:
Split work evenly across clusters
Maximize an inexpensive-to-run cluster's workload, and route the overflow to a more expensive-to-run cluster as needed
Schedule workloads to clusters based on demand in that cluster's geographic region at particular times of day to deliver higher bandwidth and lower latency connections to end users
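To make the first two policies concrete, here is a small Python sketch. It is not the federation scheduler, which is still being designed; the function names, cluster names, and capacities are all hypothetical.

```python
def split_evenly(replicas, clusters):
    """Spread replicas as evenly as possible; earlier clusters absorb the remainder."""
    base, extra = divmod(replicas, len(clusters))
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(clusters)}


def fill_cheapest_first(replicas, capacity_by_cost):
    """Fill clusters in order of cost, spilling the overflow to the next one.

    capacity_by_cost is a list of (cluster, capacity) pairs, cheapest first.
    """
    placement = {}
    remaining = replicas
    for name, capacity in capacity_by_cost:
        placed = min(remaining, capacity)
        placement[name] = placed
        remaining -= placed
    if remaining:
        raise ValueError(f"{remaining} replicas do not fit in any cluster")
    return placement


even = split_evenly(10, ["us", "eu", "asia"])
overflow = fill_cheapest_first(12, [("on-prem", 8), ("cloud", 10)])
```

In this example the on-prem cluster is filled to its capacity of 8 replicas and only the remaining 4 land on the more expensive cloud cluster.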
Automated recovery from regional/datacenter failures
Kubernetes is built to automatically deal with failures on a per-machine basis; one of the central motivations behind cluster federation is to automatically recover from entire clusters failing.
The federation control plane could be redundantly deployed across clusters, allowing it to orchestrate recovery from regional outages that take entire clusters out of service. The federation control plane could detect cluster unavailability and re-distribute the failed cluster's workload among the remaining clusters in the federation.
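A simplified sketch of that redistribution step might look like the following. It assumes the control plane already knows which cluster has failed and ignores capacity limits; the redistribute function and the cluster names are hypothetical.

```python
def redistribute(placement, failed):
    """Move the failed cluster's replicas onto the surviving clusters, round-robin."""
    survivors = sorted(name for name in placement if name != failed)
    if not survivors:
        raise RuntimeError("no healthy clusters left")
    result = {name: placement[name] for name in survivors}
    for i in range(placement.get(failed, 0)):
        result[survivors[i % len(survivors)]] += 1
    return result


before = {"us": 4, "eu": 3, "asia": 3}
after = redistribute(before, failed="eu")
```

The total replica count is preserved: the three replicas that were running in the failed cluster are spread across the two that remain.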
Avoid vendor lock-in
Complex cloud deployments often use provider-specific features and services to orchestrate resources and workload. The result is that large swaths of the deployment are locked to a specific cloud provider.
Kubernetes is an open source orchestration framework that provides a consistent interface for expressing scalable deployments to any supported compute provider. The federation layer contributes to that goal, providing a unified interface for dynamically migrating workloads between providers as conditions such as price, service level or personal preference change.
With features like federation, businesses will be able to efficiently manage their software deployments around the globe, while being able to save on operating costs by dynamically optimizing deployments across resource providers and regions. This is an essential step on the journey to GIFEE.
Experiment with federation in Kubernetes
We are also actively working on integrating Kubernetes Federation with Tectonic, the enterprise Kubernetes solution. Check in over the next few weeks for updates on this effort. In the meantime, get started with Kubernetes by:
Signing up for a private group workshop by CoreOS
Meeting the CoreOS team at one of the Kubernetes birthday (#k8sbday) celebrations
Federation is under very active development. If you're interested in participating, check out the Federation SIG and the list of open GitHub issues and PRs. If you are more experienced with Kubernetes, you can get started with cluster federation by reading the beta docs. Check out the Federation Operator's Manual for details on building your first cluster federation.