Upgrade Guide

This document shows how to safely upgrade the operator to a desired version while preserving the cluster's state and data whenever possible. It is assumed that the preexisting cluster is configured to create and store backups to persistent storage. See the backup config guide for details.

Backup safety precaution:

First create a backup of your current cluster before starting the upgrade process. See the backup service guide on how to create a backup.

In the case of an upgrade failure you can restore your cluster to the previous state from the previous backup. See the spec examples on how to do that.

Upgrade operator deployment

An in-place update can be performed when the upgrade is compatible, i.e., the operator can be upgraded without affecting the cluster.

To upgrade an operator deployment, change the container image field (spec.template.spec.containers.image) via an in-place update.

Change the image field to quay.io/coreos/etcd-operator:vX.Y.Z where vX.Y.Z is the desired version.

$ kubectl edit deployment/etcd-operator
# make the image change in your editor then save and close the file
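
For scripted upgrades, the same image change could be made without opening an editor. The following is a minimal sketch, assuming the deployment and its container are both named etcd-operator (as in the deployment specs shown later in this guide). The helper names operator_image and upgrade_operator are hypothetical and not part of the operator.

```shell
#!/bin/sh
# Hypothetical helpers; not part of etcd-operator itself.

# Print the full image reference for a given operator version, e.g. v0.6.1.
operator_image() {
  printf 'quay.io/coreos/etcd-operator:%s\n' "$1"
}

# Update the image in place, then wait for the rollout to finish.
# Assumes the deployment and its container are both named "etcd-operator".
upgrade_operator() {
  namespace=$1
  version=$2
  kubectl -n "$namespace" set image deployment/etcd-operator \
    "etcd-operator=$(operator_image "$version")" &&
  kubectl -n "$namespace" rollout status deployment/etcd-operator
}

# Example: upgrade_operator <namespace> vX.Y.Z
```

Nothing runs until upgrade_operator is called, so the helpers can be sourced and inspected first.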

Incompatible upgrade

In the case of an incompatible upgrade, the process requires restoring a new cluster from backup. See the incompatible upgrade guide for more information.

v0.6.0 -> v0.6.1

In v0.6.1+ the operator no longer creates the storage class specified by --pv-provisioner by default. This behavior is controlled by the new flag --create-storage-class, which defaults to false.

Note: If your cluster does not have the backup policy described below, you can simply upgrade the operator to the v0.6.1 image.

The affected backup policy has storageType=PersistentVolume but leaves pv.storageClass unset. For example:

spec:
  backup:
    backupIntervalInSecond: 30
    maxBackups: 5
    pv:
      storageClass: ""
      volumeSizeInMB: 512
    storageType: PersistentVolume

If your cluster has the above backup policy, complete the following steps before upgrading the operator image to v0.6.1.

  • Confirm the name of the storage class for a given cluster:

    kubectl -n <namespace> get pvc -l=etcd_cluster=<cluster-name> -o yaml | grep storage-class
    
  • Edit your etcd cluster spec by changing the spec.backup.pv.storageClass field to the name of the existing storage class from the previous step.
  • Wait for the backup sidecar to be updated.
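
As an alternative to editing the cluster spec by hand, the storage class field could be set with a JSON merge patch. This is only a sketch: it assumes the cluster custom resource is addressable as etcdcluster in your installation, and the helper names storage_class_patch and set_backup_storage_class are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; not part of etcd-operator itself.

# Print a JSON merge patch that sets spec.backup.pv.storageClass.
storage_class_patch() {
  printf '{"spec":{"backup":{"pv":{"storageClass":"%s"}}}}\n' "$1"
}

# Apply the patch to the cluster's custom resource.
# Assumption: the resource type is addressable as "etcdcluster".
set_backup_storage_class() {
  namespace=$1
  cluster=$2
  storage_class=$3
  kubectl -n "$namespace" patch etcdcluster "$cluster" --type=merge \
    -p "$(storage_class_patch "$storage_class")"
}

# Example: set_backup_storage_class <namespace> <cluster-name> <storage-class-name>
```

Printing the patch with storage_class_patch before applying it is a cheap way to confirm the field path matches your cluster spec.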

v0.5.x -> v0.6.0

Breaking Change

The v0.6.0 release removes the operator-level S3 flags.

Note: If your cluster is not using the operator-level S3 flags, you only need to recreate the etcd-operator deployment with the v0.6.0 image.

If your cluster uses the operator-level S3 flags for backup and you want to keep using S3 backup in v0.6.0, you need to migrate your cluster to cluster-level backup.

Steps for migration:

  • Create a backup of your current data using the backup service guide.

  • Create a new AWS secret <aws-secret> for cluster level backup:

    $ kubectl -n <namespace> create secret generic <aws-secret> --from-file=$AWS_DIR/credentials --from-file=$AWS_DIR/config

  • Change the backup field in your existing cluster backup spec to enable cluster level backup:

    From

    backup:
      backupIntervalInSecond: 30
      maxBackups: 5
      storageType: "S3"
    

    to

    backup:
      backupIntervalInSecond: 30
      maxBackups: 5
      storageType: "S3"
      s3:
        s3Bucket: <your-s3-bucket>
        awsSecret: <aws-secret> 
    
  • Apply the cluster backup spec: $ kubectl -n <namespace> apply -f <your-cluster-deployment>.yaml

  • Update deployment spec to use etcd-operator:v0.6.0 and remove the dependency on operator level S3 backup flags:

    From

    spec:
      containers:
      - name: etcd-operator
        image: quay.io/coreos/etcd-operator:v0.5.2
        command: 
          - /usr/local/bin/etcd-operator
          - --backup-aws-secret=aws
          - --backup-aws-config=aws
          - --backup-s3-bucket=<your-s3-bucket>
    

    to

    spec:
      containers:
      - name: etcd-operator
        image: quay.io/coreos/etcd-operator:v0.6.0
        command: 
          - /usr/local/bin/etcd-operator
    
  • Apply the updated deployment spec: $ kubectl -n <namespace> apply -f <your-etcd-operator-deployment>.yaml
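
To decide whether this migration applies, one option is to check the operator deployment for the three operator-level S3 flags shown in the deployment spec above. The sketch below reads a manifest from stdin. The function name uses_operator_s3_flags is hypothetical, and the usage comment assumes the deployment is named etcd-operator.

```shell
#!/bin/sh
# Hypothetical helper; not part of etcd-operator itself.

# Succeed (exit 0) if the manifest on stdin still passes any of the
# operator-level S3 flags that v0.6.0 removes.
uses_operator_s3_flags() {
  grep -q -E -e '--backup-(aws-secret|aws-config|s3-bucket)='
}

# Example (assumes the deployment is named etcd-operator):
#   kubectl -n <namespace> get deployment etcd-operator -o yaml \
#     | uses_operator_s3_flags && echo "migrate to cluster-level backup first"
```

The check only matches the flag=value form used in this guide, so a manifest that passes the flags differently would need a looser pattern.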

v0.4.x -> v0.5.x

For any 0.4.x versions, please update to 0.5.0 first.

The 0.5.0 release introduces a breaking change in moving from TPR to CRD. To preserve the cluster state across the upgrade, the cluster must be recreated from backup after the upgrade.

Prerequisite:

Kubernetes cluster version must be 1.7+.
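
The prerequisite can be checked in a script before starting. The following sketch compares a version string against 1.7. It assumes you supply the server version yourself (for example, as reported by kubectl version), and k8s_version_ok is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper; not part of etcd-operator itself.

# Succeed if a Kubernetes version string such as "v1.7.3" is 1.7 or newer.
k8s_version_ok() {
  version=${1#v}            # strip a leading "v", if any
  major=${version%%.*}      # text before the first dot
  rest=${version#*.}        # text after the first dot
  minor=${rest%%.*}         # text up to the next dot
  minor=${minor%%[!0-9]*}   # drop non-numeric suffixes such as "+"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 7 ]; }
}

# Example: k8s_version_ok v1.7.3 && echo "CRDs are available"
```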

Steps to upgrade:

  • Create a backup of your current cluster data. See the backup config guide on how to enable backups and the backup service guide on how to create one.
  • Delete the cluster TPR object. The etcd-operator will delete all resources (pods, services, deployments) associated with the cluster:
    • kubectl -n <namespace> delete cluster <cluster-name>
  • Delete the etcd-operator deployment
  • Delete the TPR
    • kubectl delete thirdpartyresource cluster.etcd.coreos.com
  • Replace the existing RBAC rules for the etcd-operator by editing or recreating the ClusterRole with the new rules for CRD.
  • Recreate the etcd-operator deployment with the 0.5.0 image.
  • Create a new cluster that restores from the backup of the previous cluster. The new cluster CR spec should look as follows:
    apiVersion: "etcd.database.coreos.com/v1beta2"
    kind: "EtcdCluster"
    metadata:
      name: <cluster-name>
    spec:
      size: <cluster-size>
      version: "3.1.8"
      backup:
        # The same backup spec used to save the backup of the previous cluster
        . . .
        . . .
      restore:
        backupClusterName: <previous-cluster-name>
        storageType: <storage-type-of-backup-spec>
    

    The two points of interest in the above CR spec are:

    1. The apiVersion and kind fields have been changed as mentioned in the 0.5.0 release notes.
    2. The spec.restore field needs to be specified according to your backup configuration. See the spec examples guide on how to specify the spec.restore field for your particular backup configuration.

v0.3.x -> v0.4.x

Upgrade to v0.4.0 first. See the release notes of v0.4.0 for notable changes: https://github.com/coreos/etcd-operator/releases/tag/v0.4.0

v0.2.x -> v0.3.x

Prerequisite:

To upgrade to v0.3.x, the current operator version must first be upgraded to v0.2.6+, since versions < v0.2.6 are not compatible with v0.3.x.

Notable changes:

  • Spec.Backup.MaxBackups update:
    • If you have Spec.Backup.MaxBackups < 0, it previously had no effect. It is now rejected, so please remove it.
    • If you have Spec.Backup.MaxBackups == 0, it previously had no effect. It now creates a backup sidecar that allows unlimited backups. If you don't want a backup sidecar, don't set any backup policy.
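
The MaxBackups rules above can be summarized in a small helper for checking existing specs before the upgrade. This is an illustrative sketch only: max_backups_effect is a hypothetical name, and the positive case assumes MaxBackups caps the number of retained backups, which this guide does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper; not part of etcd-operator itself.

# Describe how v0.3.x treats a given Spec.Backup.MaxBackups value.
max_backups_effect() {
  if [ "$1" -lt 0 ]; then
    echo "rejected"        # negative values are no longer accepted
  elif [ "$1" -eq 0 ]; then
    echo "unlimited"       # a backup sidecar is created without a cap
  else
    echo "limited to $1"   # assumption: positive values cap retained backups
  fi
}

# Example: max_backups_effect 0
```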