Edit: This post has been updated to reflect the project name change from rudder to flannel.
As we have previously blogged, Kubernetes, the container cluster manager, works great with CoreOS to distribute a workload across your entire cluster. To make it easier to find your services, Kubernetes does away with port mapping and assigns a unique IP address to each pod. This works well on Google Compute Engine, where each host is assigned a /24 for use by individual pods. Things are not as easy on other cloud providers, where a host cannot get an entire subnet to itself. flannel aims to solve this problem by creating an overlay mesh network that provisions a subnet to each server.
While flannel was originally designed for Kubernetes, it is a generic overlay network that can be used as a simple alternative to existing software defined networking solutions.
How it works
An overlay network is first configured with an IP range and the size of the subnet for each host. For example, one could configure the overlay to use 10.100.0.0/16 and each host to receive a /24 subnet. Host A could then receive 10.100.5.0/24 and host B could get 10.100.18.0/24. flannel uses etcd to maintain a mapping between allocated subnets and real host IP addresses. For the data path, flannel uses UDP to encapsulate IP datagrams to transmit them to the remote host. We chose UDP as the transport protocol for its ease of passing through firewalls. For example, AWS Classic cannot be configured to pass IPoIP or GRE traffic as its security groups only support TCP/UDP/ICMP.
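To make the addressing scheme concrete, here is a minimal sketch using Python's standard `ipaddress` module. It is not flannel code; it only shows how a 10.100.0.0/16 overlay range splits into per-host /24 subnets and how a pod address maps back to the subnet that owns it. The pod address `10.100.5.7` is a made-up value for illustration.

```python
import ipaddress

# Values taken from the example in the text.
OVERLAY = ipaddress.ip_network("10.100.0.0/16")
HOST_PREFIX = 24

# Every /24 that can be carved out of the /16 overlay range.
candidates = list(OVERLAY.subnets(new_prefix=HOST_PREFIX))

print(len(candidates))   # number of host subnets available in the overlay
print(candidates[5])     # the subnet host A received in the example
print(candidates[18])    # the subnet host B received in the example

# A pod address belongs to exactly one host subnet.
pod_ip = ipaddress.ip_address("10.100.5.7")  # hypothetical pod address
owner = next(net for net in candidates if pod_ip in net)
print(owner)
```

With a /16 overlay and /24 host subnets there is room for 256 hosts, each with 256 addresses; a different split is just a matter of changing the two prefix lengths.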
Download and build flannel from GitHub: https://github.com/coreos/flannel.
flannel uses etcd to store both configuration data and subnet assignments. Upon startup, a flannel daemon retrieves the configuration and a list of subnets already in use. It then selects an available subnet at random and attempts to register it by creating a key in etcd. For example, consider the following configuration:
Overlay network range: 10.100.0.0/16
Size of subnet for each host: /24
and current registrations stored as keys under /coreos.com/network/subnets:
10.100.5.0-24
10.100.13.0-24
10.100.17.0-24
10.100.18.0-24
flannel might pick 10.100.15.0/24 and format the key as 10.100.15.0-24. It will then attempt to create this key. If it succeeds, it has acquired a subnet lease for 24 hours. flannel uses etcd TTLs on keys to enforce lease expiration. Should the key creation fail, another host has acquired the same subnet first, and flannel will retry the registration procedure. An hour before its lease expires, flannel extends the lease by updating the key.
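The registration procedure can be sketched as follows. This is an illustration of the logic described above, not flannel's implementation: `FakeStore` is a hypothetical in-memory stand-in for etcd that offers only an atomic create-if-absent with a TTL, the function names are invented for this example, the host IPs are placeholders from a documentation range, and the `/coreos.com/network/subnets` prefix is left out of the keys for brevity.

```python
import ipaddress
import random
import time

LEASE_TTL = 24 * 3600   # 24-hour lease, as described in the text
RENEW_MARGIN = 3600     # renew one hour before the lease expires


class FakeStore:
    """Hypothetical in-memory stand-in for etcd (create-if-absent plus TTL)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp)

    def _purge(self, now):
        # Drop keys whose TTL has run out, as etcd would.
        self._data = {k: v for k, v in self._data.items() if v[1] > now}

    def keys(self, now=None):
        now = time.time() if now is None else now
        self._purge(now)
        return set(self._data)

    def create(self, key, value, ttl, now=None):
        """Create the key only if it is absent; return True on success."""
        now = time.time() if now is None else now
        self._purge(now)
        if key in self._data:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def update(self, key, value, ttl, now=None):
        """Reset the TTL of an existing key (lease renewal)."""
        now = time.time() if now is None else now
        self._purge(now)
        if key not in self._data:
            return False
        self._data[key] = (value, now + ttl)
        return True


def subnet_to_key(net):
    """Format 10.100.15.0/24 as the key name 10.100.15.0-24."""
    return f"{net.network_address}-{net.prefixlen}"


def acquire_lease(store, overlay, host_prefix, host_ip, max_attempts=10):
    """Pick a free subnet at random and try to register it, retrying on conflict."""
    for _ in range(max_attempts):
        in_use = store.keys()
        free = [n for n in overlay.subnets(new_prefix=host_prefix)
                if subnet_to_key(n) not in in_use]
        if not free:
            raise RuntimeError("no free subnets left in the overlay range")
        choice = random.choice(free)
        if store.create(subnet_to_key(choice), host_ip, LEASE_TTL):
            return choice
        # Another host registered the same subnet first; try again.
    raise RuntimeError("could not acquire a subnet lease")


def needs_renewal(expiry, now):
    """True once the lease is within RENEW_MARGIN seconds of expiring."""
    return now >= expiry - RENEW_MARGIN


store = FakeStore()
for taken in ("10.100.5.0-24", "10.100.13.0-24",
              "10.100.17.0-24", "10.100.18.0-24"):
    store.create(taken, "203.0.113.1", LEASE_TTL)  # placeholder host IP

overlay = ipaddress.ip_network("10.100.0.0/16")
lease = acquire_lease(store, overlay, 24, "203.0.113.9")
print(subnet_to_key(lease))  # a randomly chosen free subnet
```

The important property is that creation is atomic: two hosts may pick the same subnet, but only one create succeeds, so the loser simply picks again. The TTL means a host that disappears releases its subnet without any cleanup step.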
The value stored for each key contains the real IP of the host. flannel keeps an eye (via etcd directory watch) on all entries stored in /coreos.com/network/subnets and uses this information to maintain a routing table. To perform the encapsulation, flannel utilizes Universal TAP/TUN devices (in TUN mode) and proxies the IP fragments between the TUN device and a UDP socket.
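The data path can be illustrated with a small simulation. The sketch below is not flannel code and does not open a real TUN device, which would require elevated privileges. Instead it builds a fake IPv4 packet by hand (with a zeroed checksum), looks up the destination host from a routing table derived from subnet keys, and sends the packet unchanged as the payload of a UDP datagram over loopback. All function names are invented for this example, `192.0.2.10` and `127.0.0.1` are stand-ins for real host IPs, and the UDP port is whatever ephemeral port the operating system hands out.

```python
import ipaddress
import socket
import struct


def key_to_subnet(key):
    """Turn a key name such as 10.100.18.0-24 back into a network object."""
    addr, prefix = key.rsplit("-", 1)
    return ipaddress.ip_network(f"{addr}/{prefix}")


def build_routes(entries):
    """Map each registered subnet to the real IP of the host that owns it."""
    return {key_to_subnet(k): host for k, host in entries.items()}


def fake_ipv4_packet(src, dst, payload):
    """Build a minimal 20-byte IPv4 header plus payload (checksum left at 0)."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload),   # version/IHL, TOS, total length
        0, 0,                         # identification, flags/fragment offset
        64, 17, 0,                    # TTL, protocol (17 = UDP), checksum
        ipaddress.ip_address(src).packed,
        ipaddress.ip_address(dst).packed,
    )
    return header + payload


def inner_destination(packet):
    """Read the destination address field of the IPv4 header (bytes 16..19)."""
    return ipaddress.ip_address(packet[16:20])


def next_hop(routes, packet):
    """Return the real host IP for the packet's overlay destination, if any."""
    dst = inner_destination(packet)
    for net, host in routes.items():
        if dst in net:
            return host
    return None


# Routing table as it might be built from the watched etcd entries.
routes = build_routes({
    "10.100.5.0-24": "192.0.2.10",   # stand-in for host A's real IP
    "10.100.18.0-24": "127.0.0.1",   # stand-in for host B's real IP
})

# In the real data path this packet would be read from the TUN device.
packet = fake_ipv4_packet("10.100.5.7", "10.100.18.9", b"hello")

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # ephemeral port chosen by the OS
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
host = next_hop(routes, packet)
sender.sendto(packet, (host, port))  # the whole IP packet is the UDP payload

received, _ = receiver.recvfrom(65535)
sender.close()
receiver.close()

# The remote side would now write `received` to its own TUN device.
print(inner_destination(received))
```

Because the inner packet travels as an opaque UDP payload, anything between the two hosts only ever sees ordinary UDP traffic between their real addresses.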
To find out how much of a performance penalty flannel introduces, we ran both latency and bandwidth tests on an AWS m3.medium VM. We used the [qperf](https://www.openfabrics.org/downloads/qperf/) tool to gather the measurements:
|               | Without flannel | With flannel |
|---------------|-----------------|--------------|
| UDP Latency   | 133 us          | 201 us       |
| TCP Bandwidth | 47.8 MB/sec     | 47.2 MB/sec  |
As can be seen from the results, flannel introduces a non-trivial latency penalty (68 us, or roughly 51% more than the baseline) but has almost no effect on bandwidth (a drop of 0.6 MB/sec, or about 1.3%).
flannel is still in the early phases of development and should be considered experimental. Apart from stabilizing current functionality, future plans include supporting additional encapsulations such as IPsec. Moreover, even for environments where a host can be assigned a routable subnet, we would like to extend flannel to perform the subnet allocation across the cluster.