CiliumShadowRangeNodeMissing

Please consider opening a PR to improve this runbook if you gain new information about causes of the alert, or how to debug or resolve the alert.

Overview

This alert fires if the expected and actual counts of nodes that are configured to host an Egress Gateway shadow range differ for more than 5 minutes.

This usually indicates that the egress gateway shadow range config wasn’t updated when nodes were replaced.

Steps for debugging

  1. Check the Egress Gateway shadow range configmap for the expected nodes:

    kubectl -n cilium get cm eip-shadow-ranges -oyaml | yq '.data|keys'

  2. Check the cluster for the actual nodes:

    kubectl get nodes
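
The two checks above can be combined into a rough comparison. The following is a minimal sketch, not part of the official tooling: it assumes that the keys of the `eip-shadow-ranges` configmap are plain node names, and it uses `jq` (an assumption, the commands above use `yq`) to print one key per line. The function names `missing_nodes` and `check_shadow_range_nodes` are hypothetical.

```shell
#!/bin/sh
# Sketch: list nodes that the shadow range configmap refers to but that
# no longer exist in the cluster. Assumes configmap keys are node names.

# missing_nodes EXPECTED_FILE ACTUAL_FILE
# Prints every line of EXPECTED_FILE that does not appear in ACTUAL_FILE.
missing_nodes() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
  # grep exits non-zero when it prints nothing, which is not an error here.
  grep -Fxv -f "$2" "$1" || true
}

# check_shadow_range_nodes (hypothetical helper; requires kubectl and jq)
check_shadow_range_nodes() {
  expected=$(mktemp) && actual=$(mktemp) || return 1

  # Expected nodes: one configmap key per line.
  kubectl -n cilium get cm eip-shadow-ranges -o json \
    | jq -r '.data | keys[]' > "$expected"

  # Actual nodes: bare node names, one per line.
  kubectl get nodes --no-headers \
    -o custom-columns=NAME:.metadata.name > "$actual"

  echo "Configured in eip-shadow-ranges but not present in the cluster:"
  missing_nodes "$expected" "$actual"

  rm -f "$expected" "$actual"
}
```

After sourcing the snippet, running `check_shadow_range_nodes` prints any configured node names that have no matching node. Names listed there are likely left over from replaced nodes and are candidates for the remediation described in the next section.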

Remediation

Update the egress gateway shadow range config in the cluster’s Project Syn config to map the shadow ranges to the updated set of nodes (usually infra nodes for VSHN Managed OpenShift).