Investigate provider latency or errors

Use this tutorial when users report slow or failing requests and you need to know whether the problem is inside Odock or at the upstream provider edge.

Open Operations -> Provider Dashboard in Grafana.

Set the time range to the incident window before applying any filters.

Check request rate, error rate, and latency for the affected provider.

Look for:

a sudden rise in error ratio
p95 or p99 latency spikes
one provider degrading while others stay healthy
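
As a rough illustration of the signals listed above, the following Python sketch computes a per-provider error ratio and p95 latency from raw request samples. The `Sample` fields and the `summarize` helper are hypothetical names chosen for this example; they are not part of Odock or of the Grafana dashboards, where the same values come from the existing panels.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    provider: str      # hypothetical provider label
    latency_ms: float  # end-to-end latency of one request
    ok: bool           # False when the request failed


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    # Integer ceiling of pct/100 * n, which avoids float rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(samples):
    """Group samples by provider and return error ratio and p95 latency."""
    by_provider = {}
    for sample in samples:
        by_provider.setdefault(sample.provider, []).append(sample)

    summary = {}
    for provider, group in by_provider.items():
        errors = sum(1 for s in group if not s.ok)
        summary[provider] = {
            "error_ratio": errors / len(group),
            "p95_ms": percentile([s.latency_ms for s in group], 95),
        }
    return summary
```

Comparing the summaries of two providers over the same window makes it easier to see whether one is degrading while the others stay healthy.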

Cross-check the Gateway Request Dashboard.

Compare total request latency with gateway overhead. If gateway overhead is low while provider latency is high, the provider path is the more likely cause.
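
The comparison can be sketched in a few lines of Python. This example assumes that gateway overhead is approximately the total request latency minus the time spent waiting on the provider; that definition and the `latency_breakdown` helper are assumptions made for illustration, and the dashboard may compute overhead differently.

```python
def latency_breakdown(total_ms: float, provider_ms: float) -> dict:
    """Split one request's latency into provider time and gateway overhead.

    Assumption: overhead = total latency - provider latency.
    """
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    if not 0 <= provider_ms <= total_ms:
        raise ValueError("provider_ms must be between 0 and total_ms")

    return {
        "gateway_overhead_ms": total_ms - provider_ms,
        # Fraction of the request spent waiting on the provider.
        "provider_share": provider_ms / total_ms,
    }
```

A provider share close to 1 with a small overhead value points toward the provider path, which matches the reasoning above.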

Drill into traces or logs for one failing sample.

Open a slow or failing request and inspect gateway.provider.request spans or filtered Loki logs for the same provider and time window.
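
If spans have been exported for offline inspection, picking a sample might look like the sketch below. Only the span name `gateway.provider.request` comes from this page; the dictionary keys (`name`, `provider`, `start`, `duration_ms`, `status`) and the `pick_samples` helper are hypothetical and will differ depending on the trace export format.

```python
from datetime import datetime, timezone

SPAN_NAME = "gateway.provider.request"  # span name mentioned in this tutorial


def pick_samples(spans, provider, start, end, limit=3):
    """Return up to `limit` spans for one provider inside the time window.

    Failing spans come first, then the slowest ones.
    The span dictionary keys are assumptions for this example.
    """
    matches = [
        s for s in spans
        if s["name"] == SPAN_NAME
        and s["provider"] == provider
        and start <= s["start"] <= end
    ]
    # False sorts before True, so spans with status "error" are listed first.
    matches.sort(key=lambda s: (s["status"] != "error", -s["duration_ms"]))
    return matches[:limit]


# Example incident window (hypothetical values).
window_start = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
window_end = datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc)
```

Keeping the provider and time window identical to the dashboard filters keeps the sample comparable with the metrics you already looked at.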

If your role can access Organizations Overview, confirm whether the issue affects one organisation or the whole deployment.

If only one organisation is affected, compare route choice, model choice, plugins, budgets, and traffic shape before escalating.
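
One way to structure that comparison is to diff the settings of the affected organisation against a healthy one. The sketch below is a minimal example; the setting keys shown in the usage note are hypothetical and do not reflect an actual Odock configuration schema.

```python
def diff_settings(affected: dict, healthy: dict) -> dict:
    """Return settings that differ, as {key: (affected_value, healthy_value)}.

    Keys missing on one side are reported with None for that side.
    """
    keys = sorted(set(affected) | set(healthy))
    return {
        key: (affected.get(key), healthy.get(key))
        for key in keys
        if affected.get(key) != healthy.get(key)
    }
```

For example, calling `diff_settings({"route": "r1", "model": "m2"}, {"route": "r1", "model": "m1"})` reports only the `model` key, which narrows down what to check before escalating.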

How To Interpret What You See

Signal	Interpretation
High provider latency, low gateway overhead	Provider-side slowness
High gateway overhead, normal provider latency	Gateway plugin, safety, routing, or internal dependency issue
One provider erroring, others healthy	Isolated upstream incident
All providers slow at once	Broader platform, network, or infrastructure issue
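
The table can also be read as a small decision procedure. The Python sketch below encodes the four rows; the `ProviderState` type, the boolean inputs, and the order in which the rules are checked are assumptions made for this example, not behaviour defined by Odock.

```python
from dataclasses import dataclass


@dataclass
class ProviderState:
    slow: bool      # provider latency is elevated
    erroring: bool  # provider error ratio is elevated


def interpret(providers: dict, gateway_overhead_high: bool) -> str:
    """Map observed signals to the interpretations in the table above."""
    if not providers:
        raise ValueError("at least one provider is required")

    slow = [name for name, s in providers.items() if s.slow]
    erroring = [name for name, s in providers.items() if s.erroring]

    # Every provider slow at the same time.
    if len(providers) > 1 and len(slow) == len(providers):
        return "Broader platform, network, or infrastructure issue"

    # Overhead is high while provider latency looks normal.
    if gateway_overhead_high and not slow:
        return "Gateway plugin, safety, routing, or internal dependency issue"

    # Exactly one provider erroring and all others fully healthy.
    others_healthy = all(
        not s.slow and not s.erroring
        for name, s in providers.items()
        if name not in erroring
    )
    if len(erroring) == 1 and len(providers) > 1 and others_healthy:
        return "Isolated upstream incident"

    # Provider latency is high while gateway overhead is low.
    if slow and not gateway_overhead_high:
        return "Provider-side slowness"

    return "No single row matches; keep investigating"
```

Real incidents often show mixed signals, so treat the result as a starting hypothesis rather than a diagnosis.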

If the metrics suggest a platform issue or an exporter problem, continue with "Review alerts and pipeline health".
