✦ Sample Prompt
Normalize every Prometheus alert rule file across our repos to the alerting standard.
For each rule file (`alerts/*.yaml`, `monitoring/*.yml`, `*-rules.yaml`, PrometheusRule CRDs):
1. Normalize the `severity` label to one of: `critical`, `warning`, `info`.
   Map legacy values: `P1`/`page` → `critical`, `P2`/`ticket` → `warning`, `P3` → `info`.
2. Ensure every rule has these annotations:
   - `summary` (keep existing if present)
   - `runbook_url` (fill from the runbook catalog; use `TODO-runbook-url` if unknown)
3. Ensure every rule has these labels:
   - `team` and `service`, pulled from the service catalog mapping provided
4. Remove deprecated routing labels (`pagerduty_service`, `routing_key_v1`) and replace
   them with the new `routing_key` label using the supplied mapping.
5. Do not change `expr`, `for`, or any threshold values.
Skip files that already conform.
The Problem
Alert rules drift fast. One team uses `severity: critical`, another uses `severity: P1`, a third forgets to label severity at all. Half the alerts have no runbook URL, and the threshold for "CPU high" varies by 20 points across teams. When an alert fires at 3 a.m., the on-call engineer has no idea whether to page, escalate, or ignore.
Bringing alert rules in line with a single standard means touching every Prometheus rule file in every repo: annotations, labels, severities, runbook links, and routing keys. Doing it by hand takes weeks, and the rules drift away from the standard again as soon as you turn your back.
What Tidra Does
- Locates every Prometheus rule file across repos (`alerts/`, `monitoring/`, `*-rules.yaml`)
- Normalizes `severity` labels to the approved enum (`critical`, `warning`, `info`)
- Adds missing `runbook_url` annotations and `team` and `service` labels, using the catalog as the source of truth
- Removes deprecated routing labels and replaces them with the new routing key scheme
- Opens one PR per repo with a diff summary and a link to the alerting standard doc
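As a rough illustration of the per-rule steps listed above, here is a minimal Python sketch that normalizes one rule already parsed from YAML into a dict. It is not Tidra's implementation. The function name `normalize_rule` and its parameters (`team`, `service`, `runbook_url`, `routing_map`) are hypothetical stand-ins for the catalog lookups and routing mapping; only the severity mapping and label names come from the sample prompt.

```python
from copy import deepcopy

# Legacy-to-standard severity mapping, taken from the sample prompt.
SEVERITY_MAP = {
    "P1": "critical", "page": "critical",
    "P2": "warning", "ticket": "warning",
    "P3": "info",
}
DEPRECATED_ROUTING_LABELS = ("pagerduty_service", "routing_key_v1")


def normalize_rule(rule, team, service, runbook_url=None, routing_map=None):
    """Return a normalized copy of one alert rule (a dict parsed from YAML).

    All arguments after `rule` are hypothetical inputs standing in for the
    service catalog, the runbook catalog, and the routing-key mapping.
    """
    routing_map = routing_map or {}
    out = deepcopy(rule)  # never mutate the caller's rule
    labels = out.setdefault("labels", {})
    annotations = out.setdefault("annotations", {})

    # 1. Map legacy severities; values that already conform are left alone.
    if labels.get("severity") in SEVERITY_MAP:
        labels["severity"] = SEVERITY_MAP[labels["severity"]]

    # 2. Fill runbook_url only when missing; an existing summary is untouched.
    annotations.setdefault("runbook_url", runbook_url or "TODO-runbook-url")

    # 3. Add ownership labels without overwriting existing ones.
    labels.setdefault("team", team)
    labels.setdefault("service", service)

    # 4. Swap deprecated routing labels for routing_key when a mapping exists.
    #    Assumption: an unmapped legacy label is kept rather than dropped.
    for old in DEPRECATED_ROUTING_LABELS:
        if old in labels and labels[old] in routing_map:
            labels["routing_key"] = routing_map[labels.pop(old)]

    # 5. expr, for, and thresholds are never read or written here.
    return out
```

Keeping an unmapped legacy routing label in place is a design choice of this sketch, so that a gap in the mapping shows up in review instead of silently removing routing information.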
Before & After
diff
alerts/api.yaml
  - name: api-latency
    rules:
      - alert: HighLatency
        expr: histogram_quantile(0.99, http_request_duration_seconds) > 1
        labels:
-         severity: P1
+         severity: critical
+         team: payments
+         service: api
        annotations:
          summary: "p99 latency over 1s"
+         runbook_url: "https://runbooks.example.com/api/high-latency"
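The prompt's rule that `expr`, `for`, and thresholds must not change can be checked mechanically when reviewing a diff like the one above. The following is a small Python sketch of such a check, assuming each rule has been parsed into a dict; the helper name `changed_protected_fields` is hypothetical and not part of any Tidra or Prometheus API.

```python
# Fields the sample prompt says must never change; thresholds live inside `expr`.
PROTECTED_FIELDS = ("expr", "for")


def changed_protected_fields(before, after):
    """Return the protected fields whose values differ between two rule dicts."""
    return [f for f in PROTECTED_FIELDS if before.get(f) != after.get(f)]


# The HighLatency rule from the example above, before and after normalization.
before = {
    "alert": "HighLatency",
    "expr": "histogram_quantile(0.99, http_request_duration_seconds) > 1",
    "labels": {"severity": "P1"},
    "annotations": {"summary": "p99 latency over 1s"},
}
after = {
    **before,
    "labels": {"severity": "critical", "team": "payments", "service": "api"},
    "annotations": {
        "summary": "p99 latency over 1s",
        "runbook_url": "https://runbooks.example.com/api/high-latency",
    },
}

print(changed_protected_fields(before, after))  # prints []
```

A non-empty result would mean the change touched alerting behaviour and not just metadata, which is a reasonable signal to reject or hand-review that rule.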
Customization Tips
- Severity enum: Update the prompt to match your org’s approved severities (e.g., `page`, `ticket`, `info` instead of `critical`/`warning`).
- Catalog source: Tidra can pull `team` and `service` from OpsLevel, Backstage, Cortex, or a static mapping file; specify which.
- Routing keys: If you’re moving from PagerDuty service keys to a new routing scheme, include the mapping in the prompt.
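If you go with a static mapping file, it can help to validate it before pasting it into the prompt. The sketch below assumes a JSON file with three sections (`severity`, `catalog`, `routing_keys`). This schema, the sample values such as `PD-API`, and the `load_mapping` helper are all hypothetical and shown only for illustration; Tidra does not prescribe this format.

```python
import json

# Hypothetical mapping file contents; the schema is an assumption for illustration.
MAPPING_JSON = """
{
  "severity": {"P1": "page", "P2": "ticket", "P3": "info"},
  "catalog": {"alerts/api.yaml": {"team": "payments", "service": "api"}},
  "routing_keys": {"PD-API": "payments-api"}
}
"""


def load_mapping(text):
    """Parse the mapping and return (data, allowed_severities).

    Raises ValueError if a catalog entry lacks `team` or `service`.
    """
    data = json.loads(text)
    for path, owner in data["catalog"].items():
        missing = {"team", "service"} - set(owner)
        if missing:
            raise ValueError(f"{path}: missing {sorted(missing)}")
    # The approved enum is whatever the legacy values map onto.
    allowed = set(data["severity"].values())
    return data, allowed


data, allowed = load_mapping(MAPPING_JSON)
print(sorted(allowed))  # prints ['info', 'page', 'ticket']
```

Deriving the allowed severities from the mapping keeps the enum and the legacy translations in one place, so changing your org's severities means editing a single section.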