Alert rule: APPUiOCloudReportingMissingData
Please consider opening a PR to improve this runbook if you gain new information about causes of the alert, or how to debug or resolve the alert. Click "Edit this Page" in the top right corner to create a PR directly on GitHub. |
Overview
APPUiO Cloud Reporting is missing data to link either categories, tenants or products to their representation in the ERP, or there might be inconsistencies in the data that can cause wrong amounts billed in the invoices. The next invoice run might fail.
This alert only fires for the last week (7 days) of each month. |
Steps for debugging
The mapping should be done automatically. Check APPUiOCloudReportingDatabaseTenantMappingFailed first. |
Inform the product owner of APPUiO Cloud. The situation will usually be handled by the product owner. In absence of the product owner, inform Aldebaran. Use the following instructions to resolve the situation should neither be available. |
Identify if data is missing or inconsistent.
There are two possible cases and you might see one or both of them.
-
Identify the missing fields from the pod log
kubectl -n appuio-cloud-reporting logs $(kubectl -n appuio-cloud-reporting get job -l cron-job-name=check-missing --field-selector status.successful=0 -o name | tail -n1) --all-containers
If the pod has been deleted, you can also use this Kibana URL.
-
Look at the last line of the log.
5 missing, 7 inconsistent entries found.
-
Continue with the steps for the missing data or the inconsistent data where the count is greater than 0.
Inconsistent data
Inconsistent fact entries can be caused by changes to the during
time ranges of the corresponding dimension after facts were already created.
The entries should be deleted and re-created.
-
Connect to the database
Connect with the billing database.
INSTANCE_NS=$(kubectl -n appuio-cloud-reporting get vshnpostgresql cloud-reporting-db-prod -o jsonpath='{.status.instanceNamespace}') kubectl -n ${INSTANCE_NS} port-forward svc/primary-service 5432 & DB_USER=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_USER}' | base64 --decode) DB_NAME=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_DB}' | base64 --decode) export PGPASSWORD=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_PASSWORD}' | base64 --decode) psql -U "${DB_USER}" -w -h localhost "${DB_NAME}"
-
Delete the inconsistent fact entries
begin; with to_delete as ( select facts.id from facts inner join <TABLE> on facts.<FOREIGN KEY> = <TABLE>.id (1) inner join date_times on facts.date_time_id = date_times.id where false = <TABLE>.during @> date_times.timestamp ) delete from facts where id in (select id from to_delete);
1 Replace <TABLE>
with the table name of the dimension as shown in the log. Replace<FOREIGN KEY>
with the foreign key of the dimension. It’s the dimension name in singular with_id
appended.queries
becomesquery_id
for example. -
Commit the changes
commit;
-
See Alert rule: APPUiOCloudReportingDatabaseBackfillingFailed on how to backfill the database in the time range of the deleted facts.
Missing data
-
Identify the missing fields from the pod log
kubectl -n appuio-cloud-reporting logs $(kubectl -n appuio-cloud-reporting get job -l cron-job-name=check-missing --field-selector status.successful=0 -o name | tail -n1) --all-containers
If the pod has been deleted, you can also use this Kibana URL.
-
Lookup the missing identifiers
Use the content of the field
source
as a starter. Use portal.appuio.cloud to look up the organization, and then use control.vshn.net to identify a matching customer. Once a matching customer was found, look it up in erp.vshn.net. Search for the customer’s accounting contact. It’s named<customer name>, Accounting
. Fall back to the main customer record if no dedicated accounting contact could be found. The ID you need is theOdoo Partner ID
. -
Complete the data
Connect with the billing database.
Connect with the billing database.
INSTANCE_NS=$(kubectl -n appuio-cloud-reporting get vshnpostgresql cloud-reporting-db-prod -o jsonpath='{.status.instanceNamespace}') kubectl -n ${INSTANCE_NS} port-forward svc/primary-service 5432 & DB_USER=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_USER}' | base64 --decode) DB_NAME=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_DB}' | base64 --decode) export PGPASSWORD=$(kubectl -n appuio-cloud-reporting get secret/reporting-db-prod-cred -o jsonpath='{.data.POSTGRESQL_PASSWORD}' | base64 --decode) psql -U "${DB_USER}" -w -h localhost "${DB_NAME}"
Update the records.
UPDATE tenants set target = <ID from ERP> where id = <UUID from pod log>
-
Trigger a new check
name="check-missing-manual-$($(command -v gdate || command -v date) +%s)" kubectl --as cluster-admin -n appuio-cloud-reporting create job --from=cronjob/check-missing "${name}" kubectl -n appuio-cloud-reporting logs "job/${name}"
Continue with step 2 should there be more data missing.
-
Clean up
Remove all failed check-missing jobs to clear the alert.
kubectl -n appuio-cloud-reporting delete $(kubectl -n appuio-cloud-reporting get job -l cron-job-name=check-missing --field-selector status.successful=0 -o name)