on a healthy node, find the unhealthy one and remove it from the cluster:
$ etcdctl --ca-file /etc/ssl/etcd/ssl/ca.pem --cert-file /etc/ssl/etcd/ssl/member-master2.pem --key-file /etc/ssl/etcd/ssl/member-master2-key.pem --endpoint "https://127.0.0.1:2379" cluster-health
member 6e8f4db2bc656917 is healthy: got healthy result from https://172.16.10.103:2379
failed to check the health of member c70f7ba744c678a0 on https://172.16.10.107:2379: Get https://172.16.10.107:2379/health: dial tcp 172.16.10.107:2379: getsockopt: connection refused
member c70f7ba744c678a0 is unreachable: [https://172.16.10.107:2379] are all unreachable
member e4434395baae163e is healthy: got healthy result from https://172.16.10.109:2379
cluster is healthy

$ etcdctl --ca-file /etc/ssl/etcd/ssl/ca.pem --cert-file /etc/ssl/etcd/ssl/member-master2.pem --key-file /etc/ssl/etcd/ssl/member-master2-key.pem --endpoint "https://127.0.0.1:2379" member remove c70f7ba744c678a0
Removed member c70f7ba744c678a0 from cluster

$ etcdctl --ca-file /etc/ssl/etcd/ssl/ca.pem --cert-file /etc/ssl/etcd/ssl/member-master2.pem --key-file /etc/ssl/etcd/ssl/member-master2-key.pem --endpoint "https://127.0.0.1:2379" cluster-health
member 6e8f4db2bc656917 is healthy: got healthy result from https://172.16.10.103:2379
member e4434395baae163e is healthy: got healthy result from https://172.16.10.109:2379
cluster is healthy
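optional: the same TLS flags repeat in every command here, so a small wrapper saves typing. the sketch below is not part of the original procedure; `ectl` and `unreachable_members` are made-up helper names, and the parsing assumes the cluster-health output looks exactly like the lines shown above.

```shell
# Hypothetical wrapper: etcdctl with the cert paths and endpoint used in
# this walkthrough. Adjust the paths for the node you are on.
ectl() {
    etcdctl \
        --ca-file /etc/ssl/etcd/ssl/ca.pem \
        --cert-file /etc/ssl/etcd/ssl/member-master2.pem \
        --key-file /etc/ssl/etcd/ssl/member-master2-key.pem \
        --endpoint "https://127.0.0.1:2379" \
        "$@"
}

# Hypothetical helper: read cluster-health output on stdin and print the ID
# of every member reported as unreachable, i.e. lines of the form
#   member <id> is unreachable: ...
unreachable_members() {
    awk '$1 == "member" && $3 == "is" && $4 ~ /^unreachable/ { print $2 }'
}
```

with both defined, `ectl cluster-health | unreachable_members` prints the ID to pass to `ectl member remove`. check the ID by eye before removing anything.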
on the unhealthy node, stop etcd and delete the contents of its data dir:
$ systemctl stop etcd
$ rm -rf /var/lib/etcd/member/snap/*
$ rm -rf /var/lib/etcd/member/wal/*
on the healthy node, add a new member with the same details as before:
$ etcdctl --ca-file /etc/ssl/etcd/ssl/ca.pem --cert-file /etc/ssl/etcd/ssl/member-master2.pem --key-file /etc/ssl/etcd/ssl/member-master2-key.pem --endpoint "https://127.0.0.1:2379" member add etcd1 https://172.16.10.107:2380
Added member named etcd1 with ID 962641bc9caa06aa to cluster
ETCD_NAME="etcd1"
ETCD_INITIAL_CLUSTER="etcd3=https://172.16.10.103:2380,etcd1=https://172.16.10.107:2380,etcd2=https://172.16.10.109:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
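on the unhealthy node, check that its etcd config matches the values printed by member add (in particular ETCD_INITIAL_CLUSTER_STATE="existing"), then start etcd: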
$ systemctl start etcd
on a healthy node, monitor the cluster status:
$ etcdctl --ca-file /etc/ssl/etcd/ssl/ca.pem --cert-file /etc/ssl/etcd/ssl/member-master2.pem --key-file /etc/ssl/etcd/ssl/member-master2-key.pem --endpoint "https://127.0.0.1:2379" member list
6e8f4db2bc656917: name=etcd3 peerURLs=https://172.16.10.103:2380 clientURLs=https://172.16.10.103:2379 isLeader=false
962641bc9caa06aa[unstarted]: peerURLs=https://172.16.10.107:2380
e4434395baae163e: name=etcd2 peerURLs=https://172.16.10.109:2380 clientURLs=https://172.16.10.109:2379 isLeader=true

$ etcdctl --ca-file /etc/ssl/etcd/ssl/ca.pem --cert-file /etc/ssl/etcd/ssl/member-master2.pem --key-file /etc/ssl/etcd/ssl/member-master2-key.pem --endpoint "https://127.0.0.1:2379" member list
6e8f4db2bc656917: name=etcd3 peerURLs=https://172.16.10.103:2380 clientURLs=https://172.16.10.103:2379 isLeader=false
962641bc9caa06aa: name=etcd1 peerURLs=https://172.16.10.107:2380 clientURLs=https://172.16.10.107:2379 isLeader=false
e4434395baae163e: name=etcd2 peerURLs=https://172.16.10.109:2380 clientURLs=https://172.16.10.109:2379 isLeader=true

$ etcdctl --ca-file /etc/ssl/etcd/ssl/ca.pem --cert-file /etc/ssl/etcd/ssl/member-master2.pem --key-file /etc/ssl/etcd/ssl/member-master2-key.pem --endpoint "https://127.0.0.1:2379" cluster-health
member 6e8f4db2bc656917 is healthy: got healthy result from https://172.16.10.103:2379
member 962641bc9caa06aa is healthy: got healthy result from https://172.16.10.107:2379
member e4434395baae163e is healthy: got healthy result from https://172.16.10.109:2379
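instead of re-running member list by hand, the wait can be scripted. this is only a sketch: `wait_for_members` and `POLL_INTERVAL` are made-up names, and it assumes a member that has not joined yet shows up with the `[unstarted]` marker as in the first listing above.

```shell
# Hypothetical helper: run the given command (expected to print a member
# list) up to <attempts> times, and succeed once it runs cleanly and its
# output no longer contains an "[unstarted]" member.
# Usage: wait_for_members <attempts> <command> [args...]
wait_for_members() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        # A failing command counts as "not ready", not as success.
        if out=$("$@") && ! printf '%s\n' "$out" | grep -q '\[unstarted\]'; then
            return 0
        fi
        attempts=$((attempts - 1))
        # POLL_INTERVAL is an assumed knob: seconds between tries, default 5.
        [ "$attempts" -gt 0 ] && sleep "${POLL_INTERVAL:-5}"
    done
    return 1
}
```

for example, `wait_for_members 12 etcdctl <the same flags as above> member list` polls for roughly a minute with the default interval and returns non-zero if the member is still unstarted at the end.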
yay.