A distributed, reliable key-value store for the most critical data of a distributed system.

Overview

etcd is a distributed key-value store that provides a reliable way to store data across a cluster of machines. It's open source and available on GitHub. etcd gracefully handles leader elections during network partitions and tolerates machine failure, including failure of the leader.

Your applications can read data from and write data to etcd. A simple use case is to store database connection details or feature flags in etcd as key-value pairs. These values can be watched, allowing your app to reconfigure itself when they change.

Advanced uses take advantage of etcd's consistency guarantees to implement database leader election or distributed locking across a cluster of workers.
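As a rough sketch of the locking idea, the example below uses a small in-memory class as a stand-in for a store with atomic "create only if absent" and "delete only if the value matches" operations. The class, function names and key path are hypothetical and exist only for illustration; etcd's v2 API exposes comparable conditions (prevExist, prevValue), but this is not etcd's client API.

```python
class AtomicStore:
    """In-memory stand-in for a store with atomic conditional writes (hypothetical)."""

    def __init__(self):
        self._data = {}

    def create_if_absent(self, key, value):
        """Set key only if it does not exist; return True on success."""
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete_if_equals(self, key, expected):
        """Delete key only if it currently holds `expected`; return True on success."""
        if key not in self._data or self._data[key] != expected:
            return False
        del self._data[key]
        return True


def try_acquire(store, lock_key, worker_id):
    """A worker holds the lock if it was the one to create the key."""
    return store.create_if_absent(lock_key, worker_id)


def release(store, lock_key, worker_id):
    """Only the current holder can release the lock."""
    return store.delete_if_equals(lock_key, worker_id)
```

The first worker to create the key wins, and every other worker sees the attempt fail until the holder releases it. In a real deployment the lock key would normally also carry a TTL so that a crashed holder does not block the others forever.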


Projects using etcd

etcd is the backend for service discovery and stores cluster state and configuration
etcd stores cluster state and configuration and provides a global lock service
Including projects built on etcd, client bindings and more.

Simple Interface

Read and write values with curl and other HTTP libraries
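To illustrate the HTTP interface, the sketch below builds (but does not send) requests against etcd's v2 keys API using only Python's standard library. The endpoint http://127.0.0.1:2379 and the example key are assumptions; adjust them for your cluster.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumed client endpoint of a local etcd member; change for your cluster.
ETCD_BASE = "http://127.0.0.1:2379"


def key_url(key):
    """Return the v2 keys API URL for a key such as '/config/app1/db_host'."""
    return f"{ETCD_BASE}/v2/keys/{quote(key.lstrip('/'))}"


def build_write(key, value):
    """Build a PUT request that sets `key` to `value` (form-encoded body)."""
    body = urlencode({"value": value}).encode("utf-8")
    return Request(
        key_url(key),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_read(key):
    """Build a GET request that reads `key`."""
    return Request(key_url(key), method="GET")
```

Passing either request to urllib.request.urlopen would perform the call against a running etcd member.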

Key/Value Storage

Store data in directories, similar to a file system

Figure: a /config directory containing app1 and app2.
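The /config layout with app1 and app2 can be pictured as nested dictionaries. The sketch below is a plain in-memory illustration of how slash-separated keys form a directory tree; the key names under app1 and app2 are made up for the example, and this is not how etcd stores data internally.

```python
def build_tree(keys):
    """Nest flat key paths into dictionaries, one level per path segment."""
    root = {}
    for key in keys:
        node = root
        for part in key.strip("/").split("/"):
            node = node.setdefault(part, {})
    return root


def list_dir(tree, path):
    """Return the sorted child names directly under `path`."""
    node = tree
    for part in [p for p in path.strip("/").split("/") if p]:
        node = node[part]
    return sorted(node)


# Hypothetical keys for two applications.
keys = [
    "/config/app1/db_host",
    "/config/app1/feature_x",
    "/config/app2/db_host",
]
tree = build_tree(keys)
```

Here list_dir(tree, "/config") returns the two application directories, much like listing a folder in a file system.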

Watch for Changes

Watch a key or directory for changes and react to the new values
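In the v2 keys API, a watch is an HTTP GET that blocks until the key changes, requested with wait=true (plus recursive=true for a directory and an optional waitIndex to resume from a known point). The sketch below only constructs such a URL; the endpoint is an assumed local address.

```python
from urllib.parse import quote, urlencode

# Assumed client endpoint of a local etcd member; change for your cluster.
ETCD_BASE = "http://127.0.0.1:2379"


def watch_url(key, recursive=False, wait_index=None):
    """Build the long-poll URL for watching a key or directory."""
    params = {"wait": "true"}
    if recursive:
        # Also report changes to keys below this directory.
        params["recursive"] = "true"
    if wait_index is not None:
        # Resume from a known index so no change is missed between polls.
        params["waitIndex"] = str(wait_index)
    return f"{ETCD_BASE}/v2/keys/{quote(key.lstrip('/'))}?{urlencode(params)}"
```

An application would typically request this URL in a loop, react to each returned change, and then watch again.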


Optional SSL client cert authentication

Benchmarked at 1000s of writes/s per instance

Optional TTLs for key expiration

Properly distributed via Raft protocol
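To make the TTL feature listed above concrete, here is a toy in-memory store whose keys disappear once their time to live has elapsed. It only illustrates the behavior and is not etcd's implementation; the class name, the injected clock and the example keys are all hypothetical. With the v2 HTTP API, the TTL is passed as a ttl form field (in seconds) alongside the value.

```python
import time


class TTLStore:
    """Toy in-memory store with expiring keys (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = {}      # key -> (value, expiry time or None)

    def set(self, key, value, ttl=None):
        """Store a value; with `ttl` (seconds) the key expires later."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key):
        """Return the value, or None if the key is missing or expired."""
        value, expires_at = self._data.get(key, (None, None))
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]
            return None
        return value
```

A service that registers itself with a TTL and refreshes the key periodically is removed automatically if it stops refreshing, which is why TTLs are a common building block for service discovery.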

Technical Overview

etcd is written in Go, which has excellent cross-platform support, small binaries and a great community behind it. Communication between etcd machines is handled via the Raft consensus algorithm.

Latency from the etcd leader is the most important metric to track, and the built-in dashboard has a view dedicated to it. In our testing, severe latency introduces instability within the cluster because Raft is only as fast as the slowest machine in the majority. You can mitigate this issue by properly tuning the cluster. etcd has been pre-tuned on cloud providers with highly variable networks.
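The "slowest machine in the majority" point can be worked through with a simplified model. The sketch below assumes a write commits once the leader plus enough followers to form a majority have acknowledged it, and it ignores disk and processing time; the function names and latency figures are illustrative assumptions, not measurements.

```python
def quorum_size(cluster_size):
    """Smallest number of machines that forms a majority."""
    return cluster_size // 2 + 1


def failures_tolerated(cluster_size):
    """Machines that can fail while a majority remains available."""
    return cluster_size - quorum_size(cluster_size)


def commit_latency_ms(follower_rtts_ms):
    """Estimate commit latency from the leader's round trips to its followers."""
    cluster_size = len(follower_rtts_ms) + 1     # followers plus the leader
    acks_needed = quorum_size(cluster_size) - 1  # the leader counts toward the quorum
    if acks_needed == 0:
        return 0.0
    # The commit waits for the slowest follower among the fastest `acks_needed`.
    return sorted(follower_rtts_ms)[acks_needed - 1]
```

In a five-machine cluster with follower round trips of 2, 3, 4 and 150 ms, the estimate is 3 ms because the single slow follower is outside the majority. With round trips of 2, 90, 120 and 150 ms it rises to 90 ms, since the majority now has to include a slow machine.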

More Information

Presentation: How Raft Works
Figure: a leader and four followers. Logs are replicated to each follower in the cluster.

Securing etcd

etcd should not be exposed outside of the CoreOS cluster. The recommended way to secure your entire cluster (and etcd) is to use a physical firewall, EC2 Security Groups or a similar feature to restrict all traffic unless allowed. Communication within the cluster can be secured with client certificates. Access control lists (ACLs) are supported to restrict users from reading or updating parts of the keyspace.

If you're running containers that are used for load balancing or caching, consider exposing only those containers instead of all containers.

Using etcd with Docker containers

Docker containers can read, write and listen to etcd over the docker0 network interface. With these three actions you can construct extremely sophisticated orchestration that runs whenever etcd values change.

A common example of listening for changes is to reconfigure an upstream proxy when a new container of an application is started.
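As a minimal sketch of that proxy example, the function below turns the set of registered backends into an nginx-style upstream block. The key layout under /services/web, the addresses and the choice of nginx syntax are assumptions for illustration; a real setup would fetch the backends from etcd, write the file, and reload the proxy whenever a watch reports a change.

```python
def render_upstream(name, backends):
    """Render an nginx-style upstream block from {etcd key: 'host:port'}."""
    lines = [f"upstream {name} {{"]
    # Sort so the output is stable and only changes when the backends change.
    for address in sorted(backends.values()):
        lines.append(f"    server {address};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical registrations, one key per running container.
backends = {
    "/services/web/web1": "10.0.0.5:8080",
    "/services/web/web2": "10.0.0.6:8080",
}
config = render_upstream("web", backends)
```

Keeping the rendering step a pure function makes it easy to compare the new output with the current file and skip the reload when nothing has changed.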

To keep service registration logic outside of your main codebase, "sidekick" units can be run next to the main systemd unit. Sidekicks will be scheduled by fleet onto the same machine as the main unit and will stop if the main unit stops for any reason.