Migrate the router floating IP from Puppet LBs to cloudscale LBaaS

Migrating the router floating IP from the Puppet LBs to a cloudscale LBaaS instance requires some manual steps.

Prerequisites

  • You need to be able to execute the following CLI tools locally: commodore, kapitan, vault, jq, yq (v4), kubectl, curl, git, ssh and docker

  • Admin access to the cluster you want to migrate

  • Admin access to the cluster’s Puppet-managed LBs

  • The cluster is already updated to use component version v9.4.0 or newer

We recommend installing Commodore with uv. See the Commodore installation how-to for details.

Using uv to install Commodore will ensure that commodore and kapitan are available in your $PATH.
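
A minimal sketch of the uv-based installation, assuming Commodore is published on PyPI as syn-commodore (the installation how-to linked above is authoritative for the package name):

    # Install Commodore as an isolated uv tool (PyPI package name is an assumption)
    uv tool install syn-commodore
    # Verify the binary is on your $PATH
    commodore --version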

Setup environment

  1. Access to API

    # For example: https://api.syn.vshn.net
    # IMPORTANT: do NOT add a trailing `/`. Commands below will fail.
    export COMMODORE_API_URL=<lieutenant-api-endpoint>
    export COMMODORE_API_TOKEN=<lieutenant-api-token>
    
    export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<something>
    export TENANT_ID=$(curl -sH "Authorization: Bearer ${COMMODORE_API_TOKEN}" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)
    
    # From https://git.vshn.net/-/profile/personal_access_tokens, "api" scope is sufficient
    export GITLAB_TOKEN=<gitlab-api-token>
    export GITLAB_USER=<gitlab-user-name>
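    
    # Optional sanity check (a sketch): the same Lieutenant endpoint used above
    # should return the cluster; print its display name
    curl -sH "Authorization: Bearer ${COMMODORE_API_TOKEN}" \
      "${COMMODORE_API_URL}/clusters/${CLUSTER_ID}" | jq -r .displayName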
  2. Connect with Vault

    export VAULT_ADDR=https://vault-prod.syn.vshn.net
    vault login -method=ldap username=<your.name>
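    
    # Optional: confirm the login succeeded
    vault token lookup >/dev/null && echo "Vault login OK"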
  3. Fetch the hieradata repo token from Vault

    export HIERADATA_REPO_SECRET=$(vault kv get \
      -format=json "clusters/kv/lbaas/hieradata_repo_token" | jq '.data.data')
    export HIERADATA_REPO_USER=$(echo "${HIERADATA_REPO_SECRET}" | jq -r '.user')
    export HIERADATA_REPO_TOKEN=$(echo "${HIERADATA_REPO_SECRET}" | jq -r '.token')
  4. Set the floaty API key Terraform variable from Vault

    floaty_key=$(vault kv get \
      -format=json "clusters/kv/${TENANT_ID}/${CLUSTER_ID}/floaty")
    export TF_VAR_lb_cloudscale_api_secret=$(echo "${floaty_key}" | jq -r '.data.data.iam_secret')
  5. Compile the cluster catalog to create a local working directory

    commodore catalog compile "${CLUSTER_ID}"
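    
    # The compile creates a local working directory with inventory/, catalog/
    # and dependencies/, which the steps below operate on.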

Deploy cloudscale LBaaS instance and move floating IP

  1. Disable ArgoCD auto sync

    kubectl --as=cluster-admin -n syn patch apps root --type=json \
      -p '[{"op":"replace", "path":"/spec/syncPolicy", "value": {}}]'
  2. Deploy the cloudscale-loadbalancer-controller and update the Terraform config

    pushd inventory/classes/${TENANT_ID}
    
    yq eval -i '.applications += ["cloudscale-loadbalancer-controller"]' ${CLUSTER_ID}.yml
    yq eval -i ".parameters.openshift4_terraform.terraform_variables.allocate_router_vip_for_lb_controller = true" \
      ${CLUSTER_ID}.yml
    
    git commit -a -m "Prepare ingress migration to cloudscale LBaaS for ${CLUSTER_ID}"
    git push
    
    popd
  3. Compile catalog again

    commodore catalog compile ${CLUSTER_ID}
  4. Deploy cloudscale-loadbalancer-controller

    kubectl --as=cluster-admin apply -f catalog/manifests/cloudscale-loadbalancer-controller/00_namespace.yaml
    kapitan refs --reveal --refs-path catalog/refs -f catalog/manifests/cloudscale-loadbalancer-controller/10_secrets.yaml | \
      kubectl --as=cluster-admin apply -f -
    kubectl --as=cluster-admin apply -Rf catalog/manifests/cloudscale-loadbalancer-controller/10_kustomize
    kubectl -n appuio-cloudscale-loadbalancer-controller \
      wait --for condition=available \
      deploy cloudscale-loadbalancer-controller-controller-manager
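    
    # If the wait times out, inspect the controller pods for errors
    kubectl --as=cluster-admin -n appuio-cloudscale-loadbalancer-controller get pods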
  5. Deploy ingress loadbalancer

    # The catalog manifest still lists the router floating IP, which at this
    # point is active on the Puppet LBs. Strip it so the controller doesn't
    # claim the IP before we migrate it below.
    yq 'del(.spec.floatingIPAddresses)' catalog/manifests/cloudscale-loadbalancer-controller/20_loadbalancers.yaml | \
      kubectl --as=cluster-admin apply -f -
  6. Wait for the cluster ingress to become reachable via the loadbalancer

    export LB_IP=""
    while [ "$LB_IP" == "" ]; do
      # Extract the LB's IPv4 VIP; the yq filter drops the IPv6 (2a06:...) address
      export LB_IP=$(kubectl --as=cluster-admin -n appuio-cloudscale-loadbalancer-controller \
        get loadbalancer ingress -oyaml | \
        yq '(.status.virtualIPAddresses[]|select(.address|contains("2a06")|not)|.address)//""')
      if [ "$LB_IP" == "" ]; then
        echo -n "."
        sleep 5
      fi
    done && echo -e "\nLoadbalancer available at ${LB_IP}"
    
    export APPS_DOMAIN=$(kapitan inventory -t cluster --inventory-backend=reclass-rs | \
      yq .parameters.openshift.appsDomain)
    curl --resolve console-openshift-console.${APPS_DOMAIN}:80:${LB_IP} \
      http://console-openshift-console.${APPS_DOMAIN} -vI (1)
    1 This command should return HTTP/1.1 302 Found

Update Terraform state

  1. Configure Terraform environment

    # Variables listed by name only (no `=value`) are passed through from the
    # local environment by `docker run --env-file` in the alias below.
    cat <<EOF > ./terraform.env
    CLOUDSCALE_API_TOKEN
    TF_VAR_ignition_bootstrap
    TF_VAR_lb_cloudscale_api_secret
    TF_VAR_control_vshn_net_token
    GIT_AUTHOR_NAME
    GIT_AUTHOR_EMAIL
    HIERADATA_REPO_TOKEN
    EOF
  2. Setup Terraform

    # Set terraform image and tag to be used
    tf_image=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.image" \
      dependencies/openshift4-terraform/class/defaults.yml)
    tf_tag=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.tag" \
      dependencies/openshift4-terraform/class/defaults.yml)
    
    # Generate the terraform alias
    base_dir=$(pwd)
    alias terraform='touch .terraformrc; docker run -it --rm \
      -e REAL_UID=$(id -u) \
      -e TF_CLI_CONFIG_FILE=/tf/.terraformrc \
      --env-file ${base_dir}/terraform.env \
      -w /tf \
      -v $(pwd):/tf \
      --ulimit memlock=-1 \
      "${tf_image}:${tf_tag}" /tf/terraform.sh'
    
    # The catalog repository's SSH URL, rewritten to scp-like form (git@host:path)
    export GITLAB_REPOSITORY_URL=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r '.gitRepo.url' | sed 's|ssh://||; s|/|:|')
    export GITLAB_REPOSITORY_NAME=${GITLAB_REPOSITORY_URL##*/}
    # Look up the GitLab project ID of the catalog repository
    export GITLAB_CATALOG_PROJECT_ID=$(curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" "https://git.vshn.net/api/v4/projects?simple=true&search=${GITLAB_REPOSITORY_NAME/.git}" | jq -r ".[] | select(.ssh_url_to_repo == \"${GITLAB_REPOSITORY_URL}\") | .id")
    # GitLab-managed Terraform state of the cluster
    export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"
    
    pushd catalog/manifests/openshift4-terraform/
  3. Initialize Terraform

    terraform init \
      "-backend-config=address=${GITLAB_STATE_URL}" \
      "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=username=${GITLAB_USER}" \
      "-backend-config=password=${GITLAB_TOKEN}" \
      "-backend-config=lock_method=POST" \
      "-backend-config=unlock_method=DELETE" \
      "-backend-config=retry_wait_min=5"
  4. Move floating IP Terraform state

    # Setting allocate_router_vip_for_lb_controller=true moved the floating IP
    # resource out of the lb module; align the state with the new address so
    # the next apply doesn't destroy and recreate the IP.
    terraform state mv "module.cluster.module.lb.cloudscale_floating_ip.router_vip[0]" \
      "module.cluster.cloudscale_floating_ip.router_vip[0]"
  5. Get router floating IP from Terraform state

    terraform refresh
    export INGRESS_FLOATING_IP=$(terraform output -raw router_vip)
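    
    # Sanity check: print the floating IP that will be migrated
    echo "Router floating IP: ${INGRESS_FLOATING_IP}"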
  6. Grab LB hostnames from Terraform state

    declare -a LB_FQDNS
    for id in 1 2; do
      LB_FQDNS[$id]=$(terraform state show "module.cluster.module.lb.cloudscale_server.lb[$((id - 1))]" | \
        grep fqdn | awk '{print $2}' | tr -d ' "\r\n')
    done
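    
    # Sanity check: both LB FQDNs should be non-empty
    echo "${LB_FQDNS[@]}"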
  7. Disable Puppet on LBs

    for lb in "${LB_FQDNS[@]}"; do
      ssh $lb sudo puppetctl stop "Migrating router floating IP"
    done
  8. Remove router floating IP from Floaty and restart keepalived on LBs

    for lb in "${LB_FQDNS[@]}"; do
      ssh $lb sudo sed -i "/${INGRESS_FLOATING_IP}/d" /etc/floaty/global.yaml
      ssh $lb sudo systemctl restart keepalived
    done
  9. Run Terraform

    terraform apply
  10. Merge Hieradata MR

    Merge the MR that the Terraform run opened in the hieradata repository. This won't have an immediate effect, since we've disabled Puppet on the LBs.
  11. Switch back to Commodore working directory

    popd

Move floating IP to LBaaS instance

  1. Set router floating IP in cluster config

    pushd inventory/classes/${TENANT_ID}
    
    yq eval -i '.parameters.openshift.cloudscale.ingress_floating_ip_v4 = "'$INGRESS_FLOATING_IP'"' \
      ${CLUSTER_ID}.yml
    
    git commit -a -m "Migrate ingress floating IP to cloudscale LBaaS for ${CLUSTER_ID}"
    git push
    
    popd
  2. Compile and push catalog

    commodore catalog compile ${CLUSTER_ID} --push -i
  3. Enable ArgoCD sync

    kubectl --as=cluster-admin -n syn patch apps root --type=json \
      -p '[{
        "op":"replace",
        "path":"/spec/syncPolicy",
        "value": {"automated": {"prune": true, "selfHeal": true}}
      }]'
  4. Wait until cloudscale floating IP is attached to cloudscale LBaaS instance

    export AUTH_HEADER="Authorization: Bearer ${TF_VAR_lb_cloudscale_api_secret}"
    while [ "$(curl -sH"$AUTH_HEADER" https://api.cloudscale.ch/v1/floating-ips/${INGRESS_FLOATING_IP} | jq -r '.load_balancer')" == "null" ]; do
      echo -n '.'
      sleep 1
    done && echo -e "\nFloating IP attached to LBaaS instance"
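    
    # One-off variant of the same check: show the floating IP's current assignment
    curl -sH "$AUTH_HEADER" "https://api.cloudscale.ch/v1/floating-ips/${INGRESS_FLOATING_IP}" | jq .load_balancer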
  5. Verify that cluster console is accessible

    curl https://console-openshift-console.${APPS_DOMAIN} -vI
  6. Enable Puppet on LBs

    for lb in "${LB_FQDNS[@]}"; do
      ssh $lb sudo puppetctl start
    done
  7. Verify that cluster console is still accessible

    curl https://console-openshift-console.${APPS_DOMAIN} -vI