Alert rule: EspejoteOutOfMemoryEvents

Please consider opening a PR to improve this runbook if you gain new information about causes of the alert, or how to debug or resolve the alert. Click "Edit this Page" in the top right corner to create a PR directly on GitHub.

Overview

This alert fires if Espejote pods are crashing due to OOM (Out Of Memory) events.

This usually indicates either a cache misconfiguration or that Espejote needs more memory.
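
To confirm that the pods are actually being OOM-killed, check the last termination reason of the Espejote containers. A minimal sketch, assuming Espejote runs in the syn-espejote namespace (adjust to where Espejote is deployed in your cluster):

kubectl -n syn-espejote get pods \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'

Containers that were killed due to memory pressure report OOMKilled as the last termination reason.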

Steps for debugging

Check cache metrics

Run the following queries against the cluster's Prometheus to find the ManagedResources with the largest caches:

topk(10, sum by (exported_namespace, managedresource, name) (espejote_cache_size_bytes))
topk(10, sum by (exported_namespace, managedresource, name) (espejote_cached_objects))

Check if the cache size can be reduced by selecting fewer objects, for example with label selectors or other filters. All cache options can be listed with the kubectl explain managedresource.spec.context.resource and kubectl explain managedresource.spec.triggers.watchResource commands.
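
As a minimal sketch, here is a context entry that caches only ConfigMaps matching a label selector instead of all ConfigMaps. The resource name, namespace, labels, and apiVersion are illustrative; verify the exact field names with the kubectl explain commands above:

apiVersion: espejote.io/v1alpha1
kind: ManagedResource
metadata:
  name: example
  namespace: example-namespace
spec:
  context:
    - name: configmaps
      resource:
        apiVersion: v1
        kind: ConfigMap
        # Caching only labelled ConfigMaps keeps the cache small.
        labelSelector:
          matchLabels:
            app.kubernetes.io/managed-by: example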

If triggers and context use the same selectors, the cache can be shared between them by using .spec.triggers.watchContextResource.
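
A sketch of such a shared cache, assuming watchContextResource references a context entry by name (verify the exact structure with kubectl explain managedresource.spec.triggers.watchContextResource):

spec:
  context:
    - name: configmaps
      resource:
        apiVersion: v1
        kind: ConfigMap
        labelSelector:
          matchLabels:
            app.kubernetes.io/managed-by: example
  triggers:
    - name: configmap-changes
      # Reuses the cache of the "configmaps" context entry instead of creating a second watch.
      watchContextResource:
        name: configmaps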

Increase memory limits

If the cache cannot be reduced, or reducing it is not sufficient, increase the memory limits of the Espejote deployment in the Project Syn hierarchy, for example:

parameters:
  espejote:
    resources:
      espejote:
        limits:
          memory: 1Gi