Migrating applications, clusters, and Kubernetes to etcd v3

July 27, 2016 · By Hongchao Deng

etcd v3.0 was released recently. It introduces the v3 API, which provides transactions, continuous event delivery, multi-versioned key-value storage, and more. Besides those features, the etcd3 server delivers sustained high performance, as shown in various benchmarks.

This post explains how to migrate etcd clusters from v2 to v3, including the new etcdctl migrate subcommand for offline data migration. We also discuss migrating a Kubernetes cluster to use etcd v3. We distinguish between upgrading, which only replaces binaries with the latest versions, and migrating, which changes client applications and data to take advantage of new etcd v3 features.

Migrate clients

The etcd v2 client can only talk to the etcd v2 API. To use the new v3 API, we need to change the application’s code to adopt the v3 client. In general, migrating to etcd v3.0 has two steps:

  1. Migrate client applications. Client code using the v2 API needs to be rewritten to use the v3 API.
  2. Migrate data. We cover this in detail below.

Here are some notable differences between the v2 and v3 APIs:

  • Transactions: Client applications should use transactions to replace compare and swap (CAS) and compare and delete (CAD) atomic operations.

  • Flat key space: There is no more directory hierarchy. Keys are managed as a sorted map. Listing or watching a directory-like grouping of keys can be done with a range query or a name prefix match.

  • Leases: Leases replace expiring time to live (TTL) keys. Keys that previously relied on a TTL should instead be attached to a lease. When a lease expires, the keys attached to it are removed.
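To make the flat key space point concrete, here is a minimal Python sketch of how a directory-like listing can be expressed as a range query over sorted keys. It is a model of the idea, not etcd client code: the helper names prefix_range_end and list_prefix and the sample keys are hypothetical, and an in-memory sorted list stands in for the v3 store.

```python
import bisect


def prefix_range_end(prefix: bytes) -> bytes:
    """Return the smallest key that sorts after every key starting with prefix.

    An empty result means "no upper bound" (empty prefix or all 0xff bytes).
    """
    end = bytearray(prefix)
    # Walk backwards to find the last byte that can be incremented.
    for i in reversed(range(len(end))):
        if end[i] < 0xFF:
            end[i] += 1
            return bytes(end[: i + 1])
    return b""


def list_prefix(sorted_keys, prefix: bytes):
    """Return all keys that start with prefix using a [prefix, end) range scan."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    end = prefix_range_end(prefix)
    hi = bisect.bisect_left(sorted_keys, end) if end else len(sorted_keys)
    return sorted_keys[lo:hi]


# Hypothetical keys; "/apple" shares leading characters but is not under "/app/".
keys = sorted([b"/app/a", b"/app/b/c", b"/apple", b"/db/x"])
print(list_prefix(keys, b"/app/"))  # [b'/app/a', b'/app/b/c']
```

Because the range is half-open, everything "under" /app/ is returned in one scan, with no directory objects involved.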

Migrate data

Upgrading your cluster to etcd v3.0 is as easy as it was for previous releases. A rolling upgrade from etcd v2.3+ is all it takes. After a cluster upgrades to v3.0+, both the etcd v2 and etcd v3 APIs will be available. Client applications can still access original data via the v2 API.

However, upgrading etcd itself is not enough to take advantage of the new v3 API. In etcd v3.0, there are actually two data stores running under the hood, one for v2 and one for v3. After the rolling upgrade, the original data will be preserved in v2, and v2 API requests will only change the v2 data. The v3 data store begins empty. We need to migrate the data from v2 to v3 in order to serve it over the v3 API. Dividing the two API versions in this way allowed the v3 semantics to change to support all of the new features.

Migration process

etcd data can be migrated either offline or online. Offline migration is recommended; it is simpler and less prone to interaction errors.

Before beginning offline data migration, ensure the following prerequisites are met:

  • etcd and etcdctl are both version 3.0 or later.

  • The etcd data directory is not being used by an active etcd server.

Before beginning the migration, check that all cluster members have the same state:

  1. Wait at least one second to allow etcd servers to finish internal sync.

  2. Run the command ETCDCTL_API=3 etcdctl --endpoints ${all_endpoints} -w table endpoint status; ${all_endpoints} is a comma-delimited string of all endpoint host:port pairs, e.g., “host1:port1,host2:port2,host3:port3”.

  3. The output from this command will look like:

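+------------------+------------------+--------------+---------+-----------+-----------+------------+
|     ENDPOINT     |        ID        |   VERSION    | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------+------------------+--------------+---------+-----------+-----------+------------+
| etcd-test-1:2379 | 2a3d833935d9d076 | 3.0.0-beta.0 | 25 kB   | false     |       415 |        995 |
| etcd-test-2:2379 | a83a3258059fee18 | 3.0.0-beta.0 | 25 kB   | true      |       415 |        995 |
| etcd-test-3:2379 | 22a9f2ddf18fee5f | 3.0.0-beta.0 | 25 kB   | false     |       415 |        995 |
+------------------+------------------+--------------+---------+-----------+-----------+------------+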

Make sure the RAFT INDEX differs by no more than 1 across all endpoints. For example, indexes of 945, 944, 945 are considered converged. Indexes of 945, 944, 943, where the highest and lowest values differ by 2, are not. If the cluster has not yet converged, wait until it has.
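The convergence check can also be scripted. The following Python sketch parses the table output and applies the "no more than 1 apart" rule. The function names raft_indexes and is_converged are hypothetical, and the parser assumes RAFT INDEX is the last column, as in the sample output above.

```python
SAMPLE = """
|     ENDPOINT     |        ID        |   VERSION    | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
| etcd-test-1:2379 | 2a3d833935d9d076 | 3.0.0-beta.0 | 25 kB   | false     |       415 |        995 |
| etcd-test-2:2379 | a83a3258059fee18 | 3.0.0-beta.0 | 25 kB   | true      |       415 |        995 |
| etcd-test-3:2379 | 22a9f2ddf18fee5f | 3.0.0-beta.0 | 25 kB   | false     |       415 |        995 |
"""


def raft_indexes(table_text: str) -> dict:
    """Map each endpoint to its RAFT INDEX, read from `-w table` output."""
    indexes = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and any +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "ENDPOINT":
            continue  # skip the header row
        # Assumption: RAFT INDEX is the last column.
        indexes[cells[0]] = int(cells[-1])
    return indexes


def is_converged(indexes: dict, max_spread: int = 1) -> bool:
    """True when the highest and lowest RAFT INDEX differ by at most max_spread."""
    values = list(indexes.values())
    return bool(values) and max(values) - min(values) <= max_spread


print(is_converged(raft_indexes(SAMPLE)))  # True
```

In practice the text would come from the etcdctl command in step 2 rather than a hard-coded sample, and the check would be repeated until it passes.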

etcdctl migrate

etcdctl migrate is the official tool for offline migration. It parses the on-disk files of etcd v2 data and sequentially transforms keys to the v3 format. We recommend that users back up their data before running the migrate subcommand.

etcdctl migrate expects the following options:

ETCDCTL_API=3 ./etcdctl migrate --data-dir=${data_dir} --transformer=${program_path}

data-dir is the data directory that holds the original v2 data. transformer is optional and names a custom transform program.

By default, etcdctl migrate will convert the Key, Value, CreatedIndex, ModifiedIndex of v2 node keys into Key, Value, CreateRevision, ModRevision of v3 keys, and will ignore directory keys. There is more than one way to transform v2 API keys into v3 API keys. Users can specify their own custom transform program with the transformer option. The specified transform program must meet the following specification:

  • Input (stdin) to the transform program is a stream of JSON-encoded nodes, ending with an EOF.

  • Output (stdout) from the transform program is a stream of protobuf-encoded KVs, ending with an EOF.

This custom transformer example shows how to create custom transformer programs to convert from v2 to v3 formats. The transformation it performs is the same as etcdctl migrate’s default.
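To illustrate the default mapping described above, here is a rough Python sketch of the decoding and field-mapping half of a transformer. It rests on several assumptions: the JSON field names (key, value, dir, createdIndex, modifiedIndex) are taken from the v2 API's JSON representation of nodes, the function names are hypothetical, and the protobuf serialization step is deliberately left out. A real transformer must write protobuf-encoded KVs to stdout, so this is not a drop-in program.

```python
import json


def decode_nodes(stream_text: str):
    """Yield each JSON object from a stream of concatenated JSON values."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(stream_text)
    while pos < length:
        if stream_text[pos].isspace():
            pos += 1  # tolerate whitespace between objects
            continue
        node, pos = decoder.raw_decode(stream_text, pos)
        yield node


def to_v3_kv(node: dict):
    """Map a v2 node to v3-style KV fields; directory nodes are ignored."""
    if node.get("dir"):
        return None
    return {
        "key": node["key"],
        "value": node.get("value", ""),
        "create_revision": node["createdIndex"],
        "mod_revision": node["modifiedIndex"],
    }


# Hypothetical input: one directory node followed by one key node.
stream = (
    '{"key":"/app","dir":true,"createdIndex":3,"modifiedIndex":3}'
    '{"key":"/app/a","value":"1","createdIndex":4,"modifiedIndex":7}'
)
kvs = [kv for kv in map(to_v3_kv, decode_nodes(stream)) if kv is not None]
print(kvs)
```

A custom transformer would change to_v3_kv, for example to rewrite key names, while keeping the stdin and stdout contract from the specification.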

If the migration succeeds, etcdctl prints:

finished transforming keys

Repeat this process on each etcd server’s data directory to complete an offline migration to the etcd v3 API.

Case study: Kubernetes

The Kubernetes cluster orchestrator’s API Server has a storage abstraction layer for persisting data. By default, this storage layer uses an etcd2 cluster. The new Kubernetes v1.3 release includes support developed by CoreOS for storing data in an etcd3 cluster, using the v3 API. In Kubernetes 1.3, the storage-backend option controls the etcd version used:

./kube-apiserver … --storage-backend=etcd3 ...
Kubernetes v1.3 features configurable storage backends

With Kubernetes 1.3 or later, you can migrate cluster data and start using etcd v3.

Taking the etcd service completely down for a migration will mean the Kubernetes API Server is also offline. To limit downtime for the cluster, we add a load-balancer in front of the API Server and use it to screen everything but read requests, so that no data changes can be made. The Kubernetes cluster can continue basic operation while we migrate etcd data protected in this fashion.
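The screening rule at the load balancer can be pictured as a simple predicate on each request. The Python sketch below is only an illustration of that idea and not a load balancer configuration: the names READ_ONLY_METHODS and allow_request are hypothetical, and it assumes that the HTTP method alone is enough to tell reads from writes, which may not hold for every API endpoint.

```python
# Assumption: these HTTP methods never modify data.
READ_ONLY_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def allow_request(method: str, read_only: bool) -> bool:
    """Decide whether the load balancer should forward a request.

    While read_only is True, only read requests pass. Once the migration is
    finished and read_only is False, everything passes.
    """
    if not read_only:
        return True
    return method.upper() in READ_ONLY_METHODS


# During the migration window writes are rejected, reads still pass.
for method in ("GET", "POST", "DELETE"):
    print(method, allow_request(method, read_only=True))
```

Flipping read_only back to False corresponds to the final step of the process, when the load balancer is reconfigured to allow both reads and writes.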

Now that we have the cluster in a read-only mode, we copy the etcd data directories and begin the migration. After all etcd nodes are migrated, we start a new etcd3 cluster with the migrated data. Finally, we start a new Kubernetes API Server that has the storage-backend option set to etcd3. Once this is done, we can reconfigure the load balancer to route requests to the new API Server, and to allow all requests, both read and write.

Migrating the Kubernetes control plane behind a protective load balancer

This process allows for zero-downtime upgrades of Kubernetes, with the API limited to read requests while the data is migrated. In cases where the uptime of the Kubernetes API is of less importance, you can simply take down the API server, upgrade etcd and migrate its data, and restart the API server with the storage-backend option set to etcd3.

Notes on online migration

If the application cannot tolerate any downtime, data can be migrated online. If you are interested, check out the online migration documentation.

Running Kubernetes with etcd3

This post is a deep dive into the options available for migrating existing data from v2 to v3. If you are new to etcd or starting up a new Kubernetes cluster, don't panic! New users of either software can largely ignore this post and start with the v3 API directly, without worrying about data migration.

If you are learning Kubernetes and would like guidance from CoreOS experts, contact us to set up a training.

Try out Tectonic Starter for a complete enterprise Kubernetes solution from CoreOS.


Special thanks to Timothy St. Clair (@timothysc) who originally shared the idea of disabling writes at the load balancer layer for zero-downtime Kubernetes migrations.