LGTM Stack

Concepts and investigation workflows for the self-hosted or enterprise LGTM observability stack.

The LGTM stack is the platform observability workspace for Odock deployments that run on your company's infrastructure. It is available when:

  • you self-host Odock
  • your company runs the enterprise edition on its own server or private infrastructure

This section is written for two audiences:

  • organisation users who have been given read-only or investigation access to Grafana, Loki, Tempo, and Prometheus
  • platform operators who manage the deployment and own retention, alerting, and OTEL wiring

If you do not have access to the stack, use Usage Records and Traffic Analytics instead.

What The Stack Adds

The LGTM stack complements the in-product observability surfaces:

  • Prometheus metrics for the gateway and surrounding infrastructure
  • Tempo traces for the request lifecycle
  • Loki logs for structured gateway logging
  • Grafana dashboards and drill-down workflows over all three
  • Alertmanager rules for operational notifications

The stack is provisioned by the observability profile in the root Docker Compose. Configuration lives under observability/ in the repository root.

Pick The Right Surface

Question                                                                  | Best place to start
--------------------------------------------------------------------------|--------------------
What happened to one specific request?                                    | Usage Records
How are traffic, latency, tokens, or cost trending for my organisation?   | Traffic Analytics
Is the gateway, provider path, log pipeline, or trace pipeline healthy?   | LGTM stack
Do I need metrics, traces, and logs in one investigation workflow?        | LGTM stack

Usage records are the product-facing audit trail. The LGTM stack is the operational evidence layer around the gateway itself.

Why A Separate Stack

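Surface                           | Source of truth                                              | Primary audience
----------------------------------|--------------------------------------------------------------|---------------------------------------------------------
Usage Records / Traffic Analytics | Database                                                     | Organisation users
LGTM stack                        | Gateway metrics, traces, logs, and infrastructure exporters  | Organisation users with stack access, platform operators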

The distinction matters because a failed or degraded request can have two very different causes:

  • the request itself was rejected, rerouted, blocked, or priced in a specific way
  • the platform around the request was unhealthy, slow, or partially failing before or after the usage record was written

The first answer comes from the Odock UI. The second answer comes from LGTM.

High-Level Components

For the per-signal view, see Data flow.

Default Ports

Service      | URL
-------------|----------------------
Grafana      | http://127.0.0.1:3001
Prometheus   | http://127.0.0.1:9091
Alertmanager | http://127.0.0.1:9093
Loki         | http://127.0.0.1:3100
Tempo        | http://127.0.0.1:3200
OTLP HTTP    | 127.0.0.1:4318
OTLP gRPC    | 127.0.0.1:4317
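
Once the stack is running, probing each endpoint is a quick way to confirm that the ports above are reachable. The following is a minimal sketch, not part of Odock itself. It assumes the default loopback bindings from the table and the upstream default health paths of each project (Grafana /api/health, Prometheus and Alertmanager /-/healthy, Loki and Tempo /ready). The function names stack_health_urls and check_stack are hypothetical, and a deployment that remaps ports or puts authentication in front of these services will need adjustments.

```shell
#!/bin/sh
# Hypothetical helper: print "<service> <health URL>" for each default port.
# The health paths are the upstream project defaults, assumed here rather
# than taken from the Odock documentation.
stack_health_urls() {
  base="${1:-http://127.0.0.1}"
  cat <<EOF
grafana $base:3001/api/health
prometheus $base:9091/-/healthy
alertmanager $base:9093/-/healthy
loki $base:3100/ready
tempo $base:3200/ready
EOF
}

# Probe every URL and print "ok" or "FAIL" per service.
# Returns non-zero if any service did not answer with a success status.
check_stack() {
  failed=0
  while read -r name url; do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "ok   $name $url"
    else
      echo "FAIL $name $url"
      failed=1
    fi
  done <<EOF
$(stack_health_urls "$@")
EOF
  return "$failed"
}
```

Calling check_stack with no arguments probes the loopback defaults; passing a base URL such as http://obs.internal (a placeholder) probes another host on the same ports. Loki and Tempo can report not ready for a short time after startup, so a single FAIL immediately after starting the stack is not necessarily a fault.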

Start The Stack

If you operate the deployment yourself:

docker compose --profile observability up -d

Run this command from the repository root. To pin environment values, copy observability/.env.example to observability/.env, then either merge its contents into the repository's .env or pass --env-file observability/.env to docker compose.

For the end-to-end deployment path, see Self-host Observability Stack.

What odock-server Emits

By default the gateway:

  • exposes Prometheus metrics on /metrics
  • exports OTLP traces to the OTel Collector at http://otel-collector:4318
  • writes structured logs to stdout so Promtail can forward them to Loki

The Collector also accepts OTLP metrics and logs from other services in the platform and forwards them to Prometheus, Tempo, and Loki. odock-server keeps the simpler /metrics scrape path by default to avoid duplicate time series.

OBSERVABILITY_OTEL_EXPORTER=otlphttp
OBSERVABILITY_OTEL_TRACES_EXPORTER=otlphttp
OBSERVABILITY_OTEL_METRICS_EXPORTER=none
OBSERVABILITY_OTEL_ENDPOINT=http://otel-collector:4318
OBSERVABILITY_SERVICE_NAME=odock-server
OBSERVABILITY_SERVICE_NAMESPACE=odock
OBSERVABILITY_SERVICE_VERSION=dev
OBSERVABILITY_SERVICE_INSTANCE_ID=${HOSTNAME}
OBSERVABILITY_DEPLOYMENT_ENVIRONMENT=production
OBSERVABILITY_SAMPLE_RATE=0.1
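
The sample rate affects how often a trace exists for a given request. The sketch below works through the arithmetic under one explicit assumption: that OBSERVABILITY_SAMPLE_RATE is a sampling probability between 0 and 1 applied per request. This page does not define the variable's semantics, so check OTEL configuration before relying on it. The function name expected_traces is hypothetical.

```shell
#!/bin/sh
# Assumption: OBSERVABILITY_SAMPLE_RATE is a probability in [0, 1].
# Prints the expected number of sampled traces for a given request count.
# Usage: expected_traces <requests> [rate]
expected_traces() {
  requests="$1"
  # Fall back to the environment variable, then to the 0.1 shown above.
  rate="${2:-${OBSERVABILITY_SAMPLE_RATE:-0.1}}"
  awk -v n="$requests" -v r="$rate" 'BEGIN {
    if (r < 0 || r > 1) {
      print "sample rate must be between 0 and 1" > "/dev/stderr"
      exit 1
    }
    printf "%.0f\n", n * r
  }'
}
```

Under that assumption, expected_traces 50000 0.1 prints 5000: roughly one request in ten would carry a trace, so a specific usage record may have no matching trace in Tempo even when the pipeline is healthy.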

See OTEL configuration for the full variable set and Kubernetes wiring.
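
To confirm that the /metrics scrape path is serving data, it can help to list the metric families the endpoint declares. This is a minimal sketch that reads Prometheus text exposition format on standard input and prints each family's name and type. The function name metric_families and the GATEWAY_URL variable in the comment are hypothetical; this page does not state the gateway's listen address.

```shell
#!/bin/sh
# Print "<name> <type>" for every metric family declared on stdin.
# Relies on the "# TYPE <name> <type>" lines of the Prometheus text
# exposition format; sample lines and HELP lines are ignored.
#
# Example (GATEWAY_URL is a placeholder for your gateway address):
#   curl -fsS "$GATEWAY_URL/metrics" | metric_families
metric_families() {
  awk '$1 == "#" && $2 == "TYPE" { print $3, $4 }'
}
```

An empty result from a reachable endpoint suggests the response is not in exposition format, for example because a proxy or login page answered instead of the gateway.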

Concept Pages

  • Data flow: how metrics, traces, and logs move from the gateway to Grafana
  • Metrics: the gateway metric catalog and label policy
  • Traces: span hierarchy, key dimensions, and request correlation
  • Logs: structured logging, safe logging rules, and correlation fields
  • Grafana dashboards: what each provisioned folder and dashboard is for
  • Alerts: the alert families shipped with the stack
  • OTEL configuration: environment variables and Kubernetes wiring for platform owners
