Federation

Overview

AMKO uses federation to replicate AMKO configuration to a set of member clusters. This ensures a seamless recovery of AMKO configuration during disasters.

Federation Set

The set of clusters registered on AMKO is considered a federation set. One of the clusters in the federation set must be designated as the leader, and all the other clusters are designated as followers.

Responsibilities

  • Leader Cluster: Is responsible for distributing the AMKO configuration to all the follower clusters in the federation set.
    AMKO in the leader cluster is also responsible for creating and updating GSLB Services on the Avi Controller.

  • Follower Cluster: AMKO in a follower cluster runs passively and, unlike the leader, does not carry out federation activities or GslbService sync operations.

After a disaster, the user must manually pick one of the erstwhile follower clusters and designate it as the new leader. The user must ensure that only a single AMKO is designated as the leader at any given point in time.

Note: At any point in time, ensure that there is only one leader AMKO.

Federated Objects

The following objects are federated from the leader cluster to the followers:

  • GSLBConfig
  • GlobalDeploymentPolicy or GDP

Add/Update/Delete events of these objects are federated to the follower clusters.

Flow

Assume the following topology (each cluster is a Kubernetes/OpenShift cluster):

  • Cluster 1 in Site 1
  • Cluster 2 in Site 2
  • Cluster 3 in Site 3

[Figure: Federation]

Each site has an Avi Controller deployed. Site 1’s Avi Controller is chosen as the GSLB leader, and the controllers in all the other sites are GSLB followers. AMKO is deployed on all three sites. Cluster 1 (in site 1) is marked as the leader; clusters 2 and 3 are marked as followers.

On adding or updating the GSLBConfig and GDP objects in cluster 1, AMKO’s federator in cluster 1 federates the changes to these objects to all the follower clusters.
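One way to confirm that the objects have reached the followers is to list them in each follower cluster. The function below is a minimal sketch, not an official AMKO tool. It assumes that each follower is reachable through a kubeconfig context (cluster2 and cluster3 are hypothetical names), that the objects live in the avi-system namespace, and that the GDP resource can be addressed as globaldeploymentpolicy.

```shell
#!/usr/bin/env bash
# List the federated objects (GSLBConfig and GDP) in each follower cluster.
# Arguments: kubeconfig context names of the follower clusters (assumed names).
show_federated_objects() {
  local context
  for context in "$@"; do
    echo "== ${context} =="
    # Resource names are assumptions; adjust them if your CRDs are registered differently.
    kubectl --context "$context" -n avi-system get gslbconfig,globaldeploymentpolicy
  done
}
```

For the topology above, this could be called as show_federated_objects cluster2 cluster3.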

AMKOCluster CRD to control federation

A CRD called AMKOCluster governs the federation. A typical AMKOCluster object for a leader AMKO appears as shown below:


apiVersion: amko.vmware.com/v1alpha1
kind: AMKOCluster
metadata:
  name: amkocluster-sample
  namespace: avi-system
spec:
  isLeader: true
  clusterContext: cluster1
  version: 1.6.1
  clusters:
  - cluster1
  - cluster2
status:
  conditions:
  - status: valid AMKOCluster object
    type: current AMKOCluster Validation
  - status: all cluster clients fetched
    type: member cluster initialisation
  - status: validated all member clusters
    type: member cluster validation
  - status: federated to all valid clusters successfully
    type: GSLBConfig Federation
  - status: federated to all valid clusters successfully
    type: GDP Federation

Here,

  • namespace: The namespace of this object is avi-system.

  • isLeader: Specify whether the AMKO in the current cluster is the leader. By default, this is set to false. If set to false, AMKO will not sync any objects to the Avi Controller, and the AMKO federator will not federate the objects to the member clusters.

  • clusterContext: Specify the current cluster’s context. Providing the wrong cluster context can cause undefined behavior.

  • version: This is the current cluster’s AMKO version. If installed via helm, this field gets automatically populated.

  • clusters: The list of member clusters on which federation will be performed. The current cluster, if present in this list, is ignored.

  • status: Indicates the current state of federation.
    The following types are reflected in the status:

    • current AMKOCluster Validation: Indicates the validity of the current AMKOCluster object.
    • member cluster initialisation: Indicates whether the cluster contexts given in spec.clusters were fetched and initialised from the gslb-config-secret secret. If a member cluster given in spec.clusters is not found in the gslb-config-secret, this step would fail.
    • member cluster validation: The federator validates all the member clusters in the spec.clusters list and indicates a success/error. Validation includes some sanity checks, version mismatch checks, leader checks etc.
    • GSLBConfig Federation: The federator indicates whether it was able to federate the GSLBConfig object to all the clusters in spec.clusters successfully.
    • GDP Federation: The federator indicates whether it was able to federate the GDP/GlobalDeploymentPolicy object to all the clusters in spec.clusters successfully.

If Helm is used to deploy AMKO, this Custom Resource will be installed, and these values have to be provided via values.yaml.
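The sample object shown earlier is for a leader. For a follower, the object would differ mainly in isLeader and clusterContext. The function below is a sketch that prints such a manifest; the object name, the context names, and the choice to list the same member clusters on the follower are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Print an AMKOCluster manifest for a follower cluster.
# Arguments: $1 = this cluster's context, $2 = AMKO version,
#            remaining = member cluster contexts for spec.clusters.
follower_manifest() {
  local context="$1" version="$2"
  shift 2
  cat <<EOF
apiVersion: amko.vmware.com/v1alpha1
kind: AMKOCluster
metadata:
  name: amkocluster-sample
  namespace: avi-system
spec:
  isLeader: false
  clusterContext: ${context}
  version: ${version}
  clusters:
EOF
  local member
  for member in "$@"; do
    printf '  - %s\n' "$member"
  done
}
```

When AMKO is not installed through Helm, the output could be applied with, for example, follower_manifest cluster2 1.6.1 cluster1 cluster2 | kubectl --context cluster2 apply -f -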

Notes:

  • The federation set can be a subset of the overall member cluster set used for GSLB.
  • AMKO has to be deployed on all clusters in the federation set.
  • spec.clusterContext must contain the current cluster’s context.
  • spec.version is compared against the versions of all member clusters. All AMKO clusters must have the same version as the leader cluster. The federation logic will not work if there’s a version mismatch.
  • spec.clusters contains the federation cluster set.
  • Only one AMKO can be the leader; all other AMKOs have to be followers. If there are two leaders at any point, federation stops and the error is written to the AMKOCluster’s status.
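Because validation and federation errors are written to the AMKOCluster status, it can help to print the conditions in a compact form. The following is a sketch under the assumption that the object is read through a kubeconfig context; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Print each status condition of an AMKOCluster object as "type: status".
# Arguments: $1 = kubeconfig context, $2 = AMKOCluster object name.
show_federation_status() {
  kubectl --context "$1" -n avi-system get amkocluster "$2" \
    -o jsonpath='{range .status.conditions[*]}{.type}{": "}{.status}{"\n"}{end}'
}
```

For the sample object above, show_federation_status cluster1 amkocluster-sample would print one line per condition type, such as the GSLBConfig Federation and GDP Federation entries.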

Disasters and Recovery

If the cluster hosting the leader AMKO goes down, the federation of config objects stops. However, at this point, all other clusters participating in federation are already synced with the up-to-date configuration of the erstwhile leader. Hence, switching to a new AMKO leader does not require any manual recovery of the AMKO config objects.

Assume that there are 3 sites with one cluster in each of them:

  • Cluster 1 in Site 1
  • Cluster 2 in Site 2
  • Cluster 3 in Site 3

A disaster can affect either an entire site or just the cluster in that site. A site failure means that both the Kubernetes/OpenShift cluster and the Avi Controller in that site are down, whereas a cluster failure means that only the Kubernetes/OpenShift cluster is down.

[Figure: Federation]

Consider the scenario where the site that hosted both the leader AMKO and the Avi GSLB leader fails. At this point, you can:

  • Select a follower site to be the new Avi GSLB leader.

  • Choose a follower AMKO to be the new leader, and follow the steps given below on the cluster where this follower AMKO is deployed:

    1. Edit the GSLBConfig object and change the leader IP address:

        $ kubectl edit gslbconfig -n avi-system gc-1
        # Set the field spec.gslbLeader.controllerIP to the new leader's IP address

    2. Set the isLeader field in the AMKOCluster object to true on this cluster:

        $ kubectl edit amkocluster amkocluster-federation -n avi-system
        # Set the field spec.isLeader to true

This restarts the new leader AMKO. After the restart, the new leader takes over the responsibilities of the previous leader.
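The two steps above can also be applied non-interactively with kubectl patch, which may be convenient in a recovery runbook. The function below is a sketch rather than a documented AMKO procedure. It assumes kubeconfig contexts are used to reach the cluster, and the function and argument names are hypothetical. A merge patch is used because custom resources do not support strategic merge patches.

```shell
#!/usr/bin/env bash
# Promote the AMKO in one cluster to leader.
# Arguments: $1 = context of the cluster being promoted,
#            $2 = IP address of the new Avi GSLB leader controller,
#            $3 = GSLBConfig object name, $4 = AMKOCluster object name.
promote_to_leader() {
  local context="$1" leader_ip="$2" gslbconfig="$3" amkocluster="$4"
  # Step 1: point the GSLBConfig at the new GSLB leader controller.
  kubectl --context "$context" -n avi-system patch gslbconfig "$gslbconfig" \
    --type merge \
    -p "{\"spec\":{\"gslbLeader\":{\"controllerIP\":\"${leader_ip}\"}}}" || return 1
  # Step 2: mark this cluster's AMKO as the leader.
  kubectl --context "$context" -n avi-system patch amkocluster "$amkocluster" \
    --type merge -p '{"spec":{"isLeader":true}}'
}
```

With the object names used in this article, a call would take the form promote_to_leader cluster2 NEW_LEADER_IP gc-1 amkocluster-federation, where NEW_LEADER_IP stands for the new GSLB leader's address.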

Scenario: Cluster Failure

Consider the scenario where the cluster that hosted the leader AMKO fails.

[Figure: Cluster Failure]

Since the Avi GSLB leader is still active, the user only has to choose a new AMKO leader from among the followers. The steps to recover AMKO and designate the new leader remain the same as shown in the section above. Set the isLeader field to true in the AMKOCluster object on the chosen cluster:


$ kubectl edit amkocluster amkocluster-federation -n avi-system

Old Leader AMKO Boots Up

At any point in time, the architecture allows only a single AMKO leader. Conflicts arising from more than one leader must be resolved manually by the admin. Such a conflict does not have any traffic impact on the existing GslbService objects.
To resolve this situation, the admin must convert one of the leader AMKOs to a follower by setting the spec.isLeader field to false in the AMKOCluster object:


$ kubectl edit amkocluster amkocluster-federation -n avi-system

This is especially important when a cluster that hosted a leader instance of AMKO previously failed. Suppose the user has switched a follower AMKO to be the new leader, and the failed cluster then recovers and brings back the old leader AMKO. In this case, set the old leader to follower.
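To spot this condition, the isLeader field can be read from every cluster in the federation set, and the old leader can then be demoted with a merge patch. The functions below are a sketch: they assume every cluster is reachable through a kubeconfig context and that the AMKOCluster object has the same name in each cluster. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Print the contexts whose AMKOCluster object has spec.isLeader set to true.
# Arguments: $1 = AMKOCluster object name, remaining = kubeconfig contexts.
list_leaders() {
  local name="$1" context
  shift
  for context in "$@"; do
    # An unreachable cluster yields empty output and is not reported.
    if [ "$(kubectl --context "$context" -n avi-system get amkocluster "$name" \
            -o jsonpath='{.spec.isLeader}')" = "true" ]; then
      echo "$context"
    fi
  done
}

# Demote the AMKO in one cluster to follower.
# Arguments: $1 = kubeconfig context, $2 = AMKOCluster object name.
demote_to_follower() {
  kubectl --context "$1" -n avi-system patch amkocluster "$2" \
    --type merge -p '{"spec":{"isLeader":false}}'
}
```

If list_leaders prints more than one context, demote_to_follower can be run against the recovered cluster, for example demote_to_follower cluster1 amkocluster-federation.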

Caveats

Federation is currently a one-way communication from the leader AMKO to the follower AMKOs. The AMKO federator on the leader cluster reacts to the create/update/delete operations on the GSLBConfig and GDP objects on the leader cluster. Modification of these objects on a follower cluster will not prompt the federator on the leader to update these objects.

Date            Change Summary
July 29, 2021   Created the article for Federation (version 1.4.2)