Review alerts and pipeline health

Read the active alerts and connect them to the right Grafana dashboards.

Use this tutorial when the question is no longer "what happened to one request" but "what is unhealthy in the deployment right now".

Open the alerting view your deployment uses.

This may be Grafana alerting, Alertmanager, or both depending on how your company exposes the stack.
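If your deployment exposes Alertmanager directly, the active alerts can also be read over its v2 HTTP API. The following is a minimal sketch, not a description of how this stack is wired: the base URL is a placeholder, and it assumes the endpoint is reachable without extra authentication, which may not hold in your environment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def active_alerts_url(base_url):
    """Build the Alertmanager v2 URL that lists active, unsilenced alerts."""
    query = urlencode({"active": "true", "silenced": "false", "inhibited": "false"})
    return f"{base_url.rstrip('/')}/api/v2/alerts?{query}"


def fetch_active_alerts(base_url, timeout=10):
    """Return the alerts as a list of dicts (labels, annotations, startsAt, ...)."""
    with urlopen(active_alerts_url(base_url), timeout=timeout) as response:
        return json.load(response)


# Hypothetical host; replace with the address your deployment owner gives you.
# alerts = fetch_active_alerts("http://alertmanager.example:9093")
```

A read-only call like this is enough for evidence collection and does not change any alert state.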

Group the active alerts by family before you investigate.

The common families are:

  • infrastructure
  • OTel pipeline
  • gateway request health
  • provider health
  • logger pipeline
  • usage collector
  • token volume anomalies
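Grouping by family can be sketched in a few lines once the alerts are available as dictionaries. This example assumes each alert carries a `family` label; that label name is hypothetical, so substitute whichever label your alert rules actually set.

```python
from collections import defaultdict


def group_by_family(alerts, label="family"):
    """Group alert dicts by a label; alerts without it go to 'unclassified'."""
    groups = defaultdict(list)
    for alert in alerts:
        # "family" is an assumed label name, not one defined by this stack.
        family = alert.get("labels", {}).get(label, "unclassified")
        groups[family].append(alert)
    return dict(groups)


example = [
    {"labels": {"alertname": "ProviderErrorRate", "family": "provider health"}},
    {"labels": {"alertname": "DiskAlmostFull"}},
]
for family, members in group_by_family(example).items():
    print(family, [a["labels"]["alertname"] for a in members])
```

Alerts that land in the fallback group are worth a second look, since they do not map to any dashboard yet.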

Open the first matching dashboard for the alert family.

Use this mapping:

Alert family              Dashboard
Infrastructure            Infrastructure or Containers dashboards
OTel pipeline             Traces -> OTel Pipeline Health
Gateway request health    Gateway Request Dashboard
Provider health           Provider Dashboard
Logger pipeline           Logger Health Dashboard
Usage collector           Usage / Budget Dashboard plus Redis and Postgres health
Token anomalies           Token Usage Dashboard
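If you triage often, the mapping can be kept as a small lookup so the family name leads straight to the dashboards to open. This is an illustrative sketch: the dashboard names are copied from the mapping above, while the function name and the alias for token volume anomalies are assumptions.

```python
# Family names are stored lowercase so lookups are case-insensitive.
DASHBOARDS = {
    "infrastructure": ["Infrastructure or Containers dashboards"],
    "otel pipeline": ["Traces -> OTel Pipeline Health"],
    "gateway request health": ["Gateway Request Dashboard"],
    "provider health": ["Provider Dashboard"],
    "logger pipeline": ["Logger Health Dashboard"],
    "usage collector": ["Usage / Budget Dashboard", "Redis and Postgres health"],
    "token anomalies": ["Token Usage Dashboard"],
}
# Assumption: "token volume anomalies" names the same family as "token anomalies".
DASHBOARDS["token volume anomalies"] = DASHBOARDS["token anomalies"]


def dashboards_for(family):
    """Return the dashboards to open for a family, or [] if it is unknown."""
    return DASHBOARDS.get(family.strip().lower(), [])


print(dashboards_for("Usage collector"))
```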

Check whether the issue is signal loss or real request degradation.

For example:

  • missing traces with healthy requests point to OTel pipeline trouble
  • request 5xx spikes with healthy exporters point to a real runtime failure
  • log drop alerts point to lost evidence, not necessarily failed requests
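The three examples above can be written as a small decision helper. This is only a sketch of the reasoning: the boolean inputs are hypothetical and would come from whatever you read off the dashboards, not from a specific API.

```python
def classify(*, traces_missing=False, requests_healthy=True,
             spike_5xx=False, exporters_healthy=True, log_drops=False):
    """Return a list of findings that separate signal loss from real degradation."""
    findings = []
    if traces_missing and requests_healthy:
        findings.append("signal loss: OTel pipeline trouble")
    if spike_5xx and exporters_healthy:
        findings.append("real degradation: runtime failure")
    if log_drops:
        findings.append("signal loss: evidence lost, requests not necessarily failing")
    return findings


print(classify(traces_missing=True))
print(classify(spike_5xx=True, requests_healthy=False))
```

An empty result means none of the three patterns matched and the alert needs a closer manual look.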

Capture the evidence before escalating or acting.

Record:

  • the alert name and severity
  • the time range
  • the affected provider, service, organisation, or key
  • the dashboard screenshot, trace, or log query you used
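One way to keep the evidence consistent between incidents is a small record that serialises to JSON for the handoff. The field names below are assumptions chosen to mirror the list above, and every value in the example is made up.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AlertEvidence:
    alert_name: str
    severity: str
    time_range_start: str  # ISO 8601 timestamp
    time_range_end: str    # ISO 8601 timestamp
    affected: dict         # e.g. provider, service, organisation, or key
    artifacts: list = field(default_factory=list)  # screenshots, trace IDs, log queries

    def to_json(self):
        """Serialise the record so it can be pasted into a ticket or chat."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Hypothetical values for illustration only.
evidence = AlertEvidence(
    alert_name="ExampleProviderErrorRate",
    severity="warning",
    time_range_start="2024-01-01T10:00:00Z",
    time_range_end="2024-01-01T10:30:00Z",
    affected={"provider": "example-provider"},
    artifacts=["screenshot: provider-dashboard.png"],
)
print(evidence.to_json())
```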

Ownership Guide

Alert family                           Usually owned by
Infrastructure                         Platform or SRE team
OTel pipeline                          Platform or observability owner
Gateway request and provider health    Platform team, often with provider escalation
Token anomalies                        Platform owner and organisation owner together

If you only have read-only access, stop at evidence collection and hand the incident to the deployment owner.
