Alert rule: APPUiOCloudReportingMissingData

Please consider opening a PR to improve this runbook if you gain new information about causes of the alert, or how to debug or resolve the alert. Click "Edit this Page" in the top right corner to create a PR directly on GitHub.

Overview

APPUiO Cloud Reporting is missing data to link either categories, tenants or products to their representation in the ERP, or there might be inconsistencies in the data that can cause wrong amounts billed in the invoices. The next invoice run might fail.

This alert only fires for the last week (7 days) of each month.

Steps for debugging

The mapping should be done automatically.

Inform the product owner of APPUiO Cloud. The situation will usually be handled by the product owner. In absence of the product owner, inform Aldebaran. Use the following instructions to resolve the situation should neither be available.

Identify if data is missing or inconsistent.

There are two possible cases and you might see one or both of them.

  1. Identify the missing fields from the pod log

    kubectl -n appuio-cloud-reporting logs $(kubectl -n appuio-cloud-reporting get job -l cron-job-name=check-missing --field-selector status.successful=0 -o name | tail -n1) --all-containers

    If the pod has been deleted, you can also use this Kibana URL.

  2. Look at the last line of the log.

    5 missing, 7 inconsistent entries found.
  3. Continue with the steps for the missing data or the inconsistent data where the count is greater than 0.

Inconsistent data

Inconsistent fact entries can be caused by changes to the during time ranges of the corresponding dimension after facts were already created. The entries should be deleted and re-created.

  1. Connect to the database

    Connect with the billing database.

    INSTANCE_NS=$(kubectl -n appuio-cloud-reporting get vshnpostgresql cloud-reporting-db-prod -o jsonpath='{.status.instanceNamespace}')
    kubectl -n ${INSTANCE_NS} port-forward svc/primary-service 5432 &
    
    DB_USER=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_USER}' | base64 --decode)
    DB_NAME=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_DB}' | base64 --decode)
    export PGPASSWORD=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_PASSWORD}' | base64 --decode)
    
    psql -U "${DB_USER}" -w -h localhost "${DB_NAME}"
  2. Delete the inconsistent fact entries

    begin;
    
    with to_delete as (
      select facts.id from facts
        inner join <TABLE> on facts.<FOREIGN KEY> = <TABLE>.id (1)
        inner join date_times on facts.date_time_id = date_times.id
        where false = <TABLE>.during @> date_times.timestamp
    )
    delete from facts where id in (select id from to_delete);
    1 Replace <TABLE> with the table name of the dimension as shown in the log. Replace <FOREIGN KEY> with the foreign key of the dimension. It’s the dimension name in singular with _id appended. queries becomes query_id for example.
  3. Commit the changes

    commit;
  4. See Alert rule: APPUiOCloudReportingDatabaseBackfillingFailed on how to backfill the database in the time range of the deleted facts.

Missing data

  1. Identify the missing fields from the pod log

    kubectl -n appuio-cloud-reporting logs $(kubectl -n appuio-cloud-reporting get job -l cron-job-name=check-missing --field-selector status.successful=0 -o name | tail -n1) --all-containers

    If the pod has been deleted, you can also use this Kibana URL.

  2. Lookup the missing identifiers

    Use the content of the field source as a starter. Use portal.appuio.cloud to look up the organization, and then use control.vshn.net to identify a matching customer. Once a matching customer was found, look it up in erp.vshn.net. Search for the customer’s accounting contact. It’s named <customer name>, Accounting. Fall back to the main customer record if no dedicated accounting contact could be found. The ID you need is the Odoo Partner ID.

  3. Complete the data

    Connect with the billing database.

    Connect with the billing database.

    INSTANCE_NS=$(kubectl -n appuio-cloud-reporting get vshnpostgresql cloud-reporting-db-prod -o jsonpath='{.status.instanceNamespace}')
    kubectl -n ${INSTANCE_NS} port-forward svc/primary-service 5432 &
    
    DB_USER=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_USER}' | base64 --decode)
    DB_NAME=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_DB}' | base64 --decode)
    export PGPASSWORD=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_PASSWORD}' | base64 --decode)
    
    psql -U "${DB_USER}" -w -h localhost "${DB_NAME}"

    Update the records.

    UPDATE tenants set target = <ID from ERP> where id = <UUID from pod log>
  4. Trigger a new check

    name="check-missing-manual-$($(command -v gdate || command -v date) +%s)"
    kubectl --as cluster-admin -n appuio-cloud-reporting create job --from=cronjob/check-missing "${name}"
    kubectl -n appuio-cloud-reporting logs "job/${name}"

    Continue with step 2 should there be more data missing.

  5. Clean up

    Remove all failed check-missing jobs to clear the alert.

    kubectl -n appuio-cloud-reporting delete $(kubectl -n appuio-cloud-reporting get job -l cron-job-name=check-missing --field-selector status.successful=0 -o name)