Upgrade Exoscale clusters from component version v4 to v5
When migrating from component version v4 to v5, the following breaking changes require manual changes to the Terraform state of existing clusters:
- terraform-openshift4-exoscale v3.0.0 removes all uses of deprecated Exoscale Terraform resources.
This guide assumes that you’ve already upgraded component openshift4-terraform to a v5 version for the cluster you’re migrating.
Prerequisites
You need to be able to execute the following CLI tools locally:
- docker
- yq (YAML processor, version 4 or newer)
- jq
- vault (Vault CLI)
- terraform (Terraform CLI >= 1.3.0)
- python3 (>= 3.8.0)
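To confirm that all tools are available in suitable versions before you start, you can run an optional sanity check:

```bash
docker --version
yq --version         # should report version 4.x or newer
jq --version
vault version
terraform version    # should report 1.3.0 or newer
python3 --version    # should report 3.8.0 or newer
```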
Set up environment
- Access to API

```bash
# For example: https://api.syn.vshn.net
# IMPORTANT: do NOT add a trailing `/`. Commands below will fail.
export COMMODORE_API_URL=<lieutenant-api-endpoint>
export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<something>
export TENANT_ID=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" \
  ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)
# From https://git.vshn.net/-/profile/personal_access_tokens, "api" scope is sufficient
export GITLAB_TOKEN=<gitlab-api-token>
export GITLAB_USER=<gitlab-user-name>
```
- Connect with Vault

```bash
export VAULT_ADDR=https://vault-prod.syn.vshn.net
vault login -method=ldap username=<your.name>
```
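To confirm the login succeeded, you can optionally inspect your token:

```bash
# Shows the token's display name, policies and TTL if the login worked
vault token lookup
```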
Update cluster configuration
- Verify that your cluster uses component openshift4-terraform v5. If that’s not the case, update the component version to v5.x before compiling the cluster catalog.
- Update the cluster configuration to use the new Exoscale compute instance type identifiers.

The new exoscale_compute_instance Terraform resource uses different identifiers to specify instance types. We’ve modified the Terraform module to use different variable names for the instance types in the new version. If you haven’t customized instance types for your cluster, you can skip this step.

Update your cluster configuration to replace the following parameters:

  - infra_size → infra_type
  - storage_size → storage_type
  - worker_size → worker_type
  - additional_worker_groups[*].size → additional_worker_groups[*].type
  - if you were using additional_affinity_group_ids to select a dedicated hypervisor: additional_affinity_group_ids → deploy_target_id

The new resource uses identifiers of the form <family>.<size>. For example, Extra-large is now standard.extra-large. See exo compute instance-type list for the list of all types.
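If you need to map an old size name to a new identifier, the exo CLI can list the available types. A quick sketch, assuming you have the exo CLI installed and configured with credentials for the cluster’s organization; the `family` and `size` JSON field names are assumptions about the CLI’s JSON output:

```bash
# List all instance types; the new identifiers have the form <family>.<size>
exo compute instance-type list
# Optionally filter for a specific size, for example "extra-large"
exo compute instance-type list -O json | jq -r '.[] | "\(.family).\(.size)"' | grep extra-large
```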
- Compile the cluster catalog to create a local working directory:

```bash
commodore catalog compile "${CLUSTER_ID}"
```
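The compiled catalog contains the Terraform configuration that the following steps operate on. Optionally, verify that the working directory exists:

```bash
ls catalog/manifests/openshift4-terraform/
```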
Migrate Terraform state
- Fetch the hieradata repo token from Vault

```bash
export HIERADATA_REPO_TOKEN=$(vault kv get \
  -format=json "clusters/kv/lbaas/hieradata_repo_token" | jq -r '.data.data.token')
```

You need the hieradata repo token so that Terraform can access the LB hieradata Git repo.
- Get an Exoscale unrestricted API key for the cluster’s organization from the Exoscale Console.

```bash
export EXOSCALE_API_KEY=EXO...   # Replace with your API key
export EXOSCALE_API_SECRET=...   # Replace with your API secret
```

You need the Exoscale credentials so that Terraform can refresh the state.
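To verify the credentials before running Terraform, you can make a simple read-only call with the exo CLI, which picks up the EXOSCALE_API_KEY and EXOSCALE_API_SECRET environment variables. An optional check, assuming the exo CLI is installed:

```bash
# A read-only call that fails fast if the key or secret is wrong
exo zone list
```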
- Set up Git author information and create the Terraform environment file. The Git author info is required so that Terraform can commit the hieradata changes.

```bash
export GIT_AUTHOR_NAME=$(git config --global user.name)
export GIT_AUTHOR_EMAIL=$(git config --global user.email)
cat <<EOF > ./terraform.env
EXOSCALE_API_KEY
EXOSCALE_API_SECRET
HIERADATA_REPO_TOKEN
GIT_AUTHOR_NAME
GIT_AUTHOR_EMAIL
EOF
```
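Note that an env file containing bare variable names makes `docker run --env-file` pass the values through from your current shell environment. A small optional check, assuming all five variables were exported above:

```bash
# Warn about any variable that's empty or unset before Terraform runs in Docker
for v in EXOSCALE_API_KEY EXOSCALE_API_SECRET HIERADATA_REPO_TOKEN \
         GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL; do
  [ -n "${!v}" ] || echo "WARNING: $v is not set"
done
```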
- Set up Terraform

Prepare the Terraform execution environment:

```bash
# Set terraform image and tag to be used
tf_image=$(\
  yq eval ".parameters.openshift4_terraform.images.terraform.image" \
  dependencies/openshift4-terraform/class/defaults.yml)
tf_tag=$(\
  yq eval ".parameters.openshift4_terraform.images.terraform.tag" \
  dependencies/openshift4-terraform/class/defaults.yml)

# Generate the terraform alias
base_dir=$(pwd)
alias terraform='docker run -it --rm \
  -e REAL_UID=$(id -u) \
  --env-file ${base_dir}/terraform.env \
  -w /tf \
  -v $(pwd):/tf \
  --ulimit memlock=-1 \
  "${tf_image}:${tf_tag}" /tf/terraform.sh'

export GITLAB_REPOSITORY_URL=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r '.gitRepo.url' | sed 's|ssh://||; s|/|:|')
export GITLAB_REPOSITORY_NAME=${GITLAB_REPOSITORY_URL##*/}
export GITLAB_CATALOG_PROJECT_ID=$(curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" "https://git.vshn.net/api/v4/projects?simple=true&search=${GITLAB_REPOSITORY_NAME/.git}" | jq -r ".[] | select(.ssh_url_to_repo == \"${GITLAB_REPOSITORY_URL}\") | .id")
export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"

pushd catalog/manifests/openshift4-terraform/
```
The migration script doesn’t use the terraform alias configured here. Please make sure that you’ve got a Terraform binary available locally (see also section Prerequisites).

Initialize Terraform:

```bash
terraform init \
  "-backend-config=address=${GITLAB_STATE_URL}" \
  "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
  "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
  "-backend-config=username=${GITLAB_USER}" \
  "-backend-config=password=${GITLAB_TOKEN}" \
  "-backend-config=lock_method=POST" \
  "-backend-config=unlock_method=DELETE" \
  "-backend-config=retry_wait_min=5"
```
- Migrate the state

Run the migration script:

```bash
./migrate-state-exoscale-v4-v5.py
```
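Optionally, you can spot-check the migrated state before planning. A hedged check, assuming the migration replaces the deprecated exoscale_compute resources with exoscale_compute_instance ones:

```bash
# Compute instances should now be tracked as exoscale_compute_instance;
# no deprecated exoscale_compute resources should remain in the state.
terraform state list | grep 'exoscale_compute\.' \
  || echo "no deprecated exoscale_compute resources left"
```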
Verify the state using plan:

```bash
terraform plan
```
You can expect the following changes:

  - The managed Floaty Exoscale access key will be created.
  - The LB hieradata will be updated to use the new Floaty access key.
  - All compute instances will be updated to use their FQDN instead of their hostname for field name. Despite what the web console claims, this change doesn’t require the instances to be restarted.
  - Field private_network_ids of all compute instances is added.
  - The admin SSH key resource is recreated.
- Apply the changes:

```bash
terraform apply
```
- Merge the hieradata MR.
- Run Puppet on the cluster’s LBs, so that you’re using the new Floaty API key managed by Terraform:

```bash
for id in 0 1; do
  lb_fqdn=$(terraform state show "module.cluster.module.lb.exoscale_domain_record.lb[$id]" | \
    grep hostname | cut -d'=' -f2 | tr -d ' "\r\n')
  echo "${lb_fqdn}"
  ssh "${lb_fqdn}" sudo puppetctl run
done
```
- Fetch and then remove the old Floaty API key from Vault:

```bash
OLD_FLOATY_KEY=$(vault kv get -format=json \
  clusters/kv/${TENANT_ID}/${CLUSTER_ID}/floaty | \
  jq -r '.data.data.iam_key')
vault kv delete clusters/kv/${TENANT_ID}/${CLUSTER_ID}/floaty
```
- Revoke the old Floaty access key

IMPORTANT: Don’t revoke the old Floaty API key before you’ve ensured that the new API key has been rolled out on the LBs. Otherwise, Floaty won’t be able to migrate the Elastic IPs between the two LBs until you roll out the new key.

Print out the Terraform-managed key:

```bash
NEW_FLOATY_KEY=$(terraform state show "module.cluster.module.lb.exoscale_iam_access_key.floaty" | \
  grep id | cut -d'=' -f2 | tr -d ' "\r\n')
echo "Terraform-managed key: ${NEW_FLOATY_KEY}"
```

Revoke the old key:

```bash
exo iam access-key revoke "${OLD_FLOATY_KEY}"
```
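To double-check that only the Terraform-managed key remains, you can list the organization’s access keys. A hedged verification step, assuming the exo CLI is configured with the credentials exported earlier:

```bash
# The old key should no longer appear; the Terraform-managed key should
exo iam access-key list
```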