Architecture

How Odock separates management configuration from runtime LLM and MCP traffic execution.

Odock separates management from runtime execution.

odock-ui is the management plane. It is a Next.js application that writes organisations, users, teams, providers, models, MCP servers, virtual API keys, budgets, quotas, and policies into Postgres, and presents usage views on top of the recorded data.

odock-server is the runtime gateway. It receives LLM and MCP traffic, reads configuration from Postgres, keeps hot-path state in Redis, enforces governance, calls upstream providers or MCP servers, and records usage.

Postgres is the source of truth. Redis is the runtime acceleration and coordination layer for authentication cache entries, model and MCP metadata, rate-limit policies and counters, smart-routing policy state, SafetySec sessions, usage collection, and cache invalidation.
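The split between a source of truth and an acceleration layer can be illustrated with a small cache-aside lookup. This is a minimal sketch, not Odock's implementation: the class name ReadThroughCache, the TTL value, and the key format are assumptions, and plain dictionaries stand in for Postgres and Redis.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ReadThroughCache:
    """Cache-aside lookup: serve from the cache, fall back to the source of truth."""

    def __init__(self, load_from_source: Callable[[str], Optional[dict]],
                 ttl_seconds: float = 60.0) -> None:
        self._load = load_from_source      # stand-in for a Postgres query
        self._ttl = ttl_seconds            # assumed TTL, not an Odock default
        self._entries: Dict[str, Tuple[float, dict]] = {}  # stand-in for Redis
        self.source_reads = 0              # counts trips to the source of truth

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]                # cache hit: no database round trip
        self.source_reads += 1
        value = self._load(key)            # cache miss: read the source of truth
        if value is not None:
            self._entries[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry so the next get() reloads it from the source."""
        self._entries.pop(key, None)
```

In this sketch a changed row stays invisible to readers until the entry expires or invalidate is called, which is the gap that explicit cache invalidation closes.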

The optional observability stack collects metrics, logs, traces, dashboards, and alerts around the gateway and supporting infrastructure. See Docker Compose for the service list and LGTM Stack for the observability details.

Management Plane

The UI owns configuration workflows. Operators use it to manage identity, access, provider setup, model setup, MCP setup, routing, budgets, quotas, and usage review.

Configuration changes are persisted through Prisma into Postgres. The UI wraps Prisma with cache-invalidation hooks for mutating operations on runtime-sensitive models such as API keys, model access, organisations, teams, models, providers, provider keys, and MCP servers.
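The idea of wrapping persistence with invalidation hooks can be sketched as follows. The real UI does this around Prisma in a Next.js application; the Python below only illustrates the pattern. The class InvalidatingStore, the operation names, the model identifiers, and the command shape are all hypothetical, and a dictionary stands in for Postgres.

```python
from typing import Callable, Dict, Optional, Tuple

# Runtime-sensitive models named in the text; the identifier spellings are illustrative.
RUNTIME_SENSITIVE = {
    "api_key", "model_access", "organisation", "team",
    "model", "provider", "provider_key", "mcp_server",
}
MUTATING_OPS = {"create", "update", "delete"}


class InvalidatingStore:
    """Persist a mutation first, then emit an invalidation command for it."""

    def __init__(self, send_invalidation: Callable[[dict], None]) -> None:
        self._rows: Dict[Tuple[str, str], dict] = {}  # stand-in for Postgres
        self._send = send_invalidation

    def execute(self, model: str, op: str, record_id: str,
                data: Optional[dict] = None) -> Optional[dict]:
        key = (model, record_id)
        if op == "read":
            return self._rows.get(key)     # reads never trigger invalidation
        if op not in MUTATING_OPS:
            raise ValueError(f"unsupported operation: {op}")
        if op == "delete":
            self._rows.pop(key, None)
        else:
            self._rows[key] = dict(data or {})
        # Invalidate only after the write, so a reload sees the new state.
        if model in RUNTIME_SENSITIVE:
            self._send({"model": model, "id": record_id})
        return self._rows.get(key)
```

Ordering matters in this sketch: sending the command before the write completes would let a gateway reload and re-cache the old row.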

After a relevant mutation, the UI sends invalidation commands to odock-server at /v1/internal/cache/invalidate. The gateway validates the shared secret, publishes the commands to the Redis invalidation channel, and each gateway instance deletes the affected Redis keys. On the next request, the gateway reloads fresh state from Postgres and repopulates Redis.
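The gateway side of this flow (validate the shared secret, publish, let every instance delete the affected keys) can be sketched with in-process stand-ins. Everything named here is an assumption for illustration: the Channel class replaces Redis pub/sub, each instance holds a local dictionary in place of Redis keys, and the 401 and 202 status codes are not taken from Odock.

```python
import hmac
from typing import Callable, Dict, List


class Channel:
    """In-process stand-in for a Redis pub/sub channel."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[List[str]], None]] = []

    def subscribe(self, handler: Callable[[List[str]], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, keys: List[str]) -> None:
        for handler in list(self._subscribers):
            handler(keys)                  # fan out to every subscribed instance


class GatewayInstance:
    """One gateway process that drops cached keys when told to."""

    def __init__(self, channel: Channel) -> None:
        self.cache: Dict[str, dict] = {}   # stand-in for the cached Redis keys
        channel.subscribe(self._on_invalidate)

    def _on_invalidate(self, keys: List[str]) -> None:
        for key in keys:
            self.cache.pop(key, None)


def handle_invalidate(presented_secret: str, expected_secret: str,
                      keys: List[str], channel: Channel) -> int:
    """Handle an invalidation request and return an HTTP-style status code."""
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(presented_secret.encode(), expected_secret.encode()):
        return 401
    channel.publish(keys)
    return 202
```

Publishing through a channel rather than deleting keys directly in the handler is what lets a request that reaches one instance take effect on all of them.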

This is why management writes are visible to runtime traffic without restarting the gateway. For the resources the UI manages, see Organisations, Users, Teams, Virtual API Keys, Providers, Models, MCP Servers, Budgets, and Quotas.

Gateway Architecture

odock-server is the gateway. The executable is cmd/gateway; it wires configuration, storage clients, repositories, auth, rate limiting, SafetySec, plugins, smart routing, provider clients, MCP cache, budget enforcement, usage collection, cache invalidation, and the HTTP server.

At runtime, the gateway accepts client traffic, routes it through the HTTP server, authenticates the virtual API key, applies access and rate-limit policy, runs security and cost controls, executes plugins, calls the selected provider or MCP server, records usage, and emits telemetry.

Important gateway modules:

| Module | Responsibility |
| --- | --- |
| internal/httpserver | HTTP routes, middleware, provider-compatible endpoints, unified LLM endpoint, MCP endpoint, health checks, metrics endpoint, and internal cache invalidation route. |
| internal/auth | Virtual API key authentication and Redis-backed positive/negative auth caching. |
| internal/storage | Postgres and Redis clients plus repositories that mirror the Prisma-backed schema. |
| internal/modelcache and internal/mcpcache | Redis-backed model, model-access, provider-key, and MCP server lookup caches. |
| internal/ratelimit | Redis-backed rate-limit policy resolution, policy cache, and request gates. |
| internal/smartrouting | Organisation-level routing enablement and per-API-key routing policy evaluation. |
| internal/safetysec | Request and response security phases for prompt injection, jailbreak, sensitive-data, and leakage checks. |
| internal/plugin and internal/plugins | Plugin chain execution before routing, before upstream calls, after upstream calls, and after responses. |
| internal/budgetenforcer | Budget and quota reservation, enforcement, and settlement. |
| internal/usage | Usage event collection, token and cost normalization, and usage record persistence. |
| internal/observability | Prometheus metrics, tracing helpers, context attributes, and gateway instrumentation. |
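The positive/negative auth caching attributed to internal/auth can be illustrated with a short sketch. It is not the gateway's code: the class AuthCache, both TTL values, and the choice to key the cache by a SHA-256 digest of the presented key are assumptions, and a dictionary stands in for Redis.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class AuthCache:
    """Cache successful and failed key lookups, with a shorter TTL for failures."""

    def __init__(self, lookup_by_digest: Callable[[str], Optional[dict]],
                 positive_ttl: float = 300.0, negative_ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup_by_digest    # stand-in for a Postgres query
        self._positive_ttl = positive_ttl  # assumed value
        self._negative_ttl = negative_ttl  # assumed value
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Optional[dict]]] = {}
        self.db_lookups = 0

    def authenticate(self, presented_key: str) -> Optional[dict]:
        # Assumption: cache by digest so the raw key is never stored.
        digest = hashlib.sha256(presented_key.encode()).hexdigest()
        now = self._clock()
        entry = self._entries.get(digest)
        if entry is not None and entry[0] > now:
            return entry[1]                # None here is a cached rejection
        self.db_lookups += 1
        record = self._lookup(digest)
        ttl = self._positive_ttl if record is not None else self._negative_ttl
        self._entries[digest] = (now + ttl, record)
        return record
```

Caching rejections keeps repeated requests with an unknown key from reaching the database each time, while the shorter negative TTL limits how long a newly created key could be refused.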

For deeper pages, see Routing, Guardrails, Security Engine, Plugin Architecture, Plugin Lifecycle, and Usage Monitoring.
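As one more illustration of the module list, the reserve, enforce, and settle cycle named for internal/budgetenforcer might look roughly like the sketch below. The class BudgetLedger, the integer cost units, and the exception type are hypothetical; the real enforcer's data model is not described on this page.

```python
from typing import Dict


class BudgetExceeded(Exception):
    """Raised when a reservation would push usage past the limit."""


class BudgetLedger:
    """Track spent and reserved amounts in integer units (for example, cents)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.spent = 0
        self.reserved = 0
        self._holds: Dict[int, int] = {}
        self._next_hold_id = 0

    def reserve(self, estimated_cost: int) -> int:
        """Hold an estimated cost before the upstream call; return a hold id."""
        if self.spent + self.reserved + estimated_cost > self.limit:
            raise BudgetExceeded("reservation would exceed the budget")
        self._next_hold_id += 1
        self._holds[self._next_hold_id] = estimated_cost
        self.reserved += estimated_cost
        return self._next_hold_id

    def settle(self, hold_id: int, actual_cost: int) -> None:
        """Release the hold and record what the request actually cost."""
        estimated = self._holds.pop(hold_id)
        self.reserved -= estimated
        self.spent += actual_cost
```

Reserving before the call means concurrent requests count against the limit while they are still in flight, and settling afterwards replaces the estimate with the real cost.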

Request Lifecycle

Every runtime request starts in the HTTP server middleware chain. Non-LLM routes, such as health checks and key validation, leave the LLM lifecycle early. LLM routes continue through concurrency, method, rate-limit, SafetySec, auth, decode, plugin, budget, routing, provider, response, and usage stages.
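The stage ordering above can be pictured as a chain in which any stage may stop the request. The sketch below is illustrative only: the stage names follow the list in the text, while the paths in NON_LLM_PATHS, the run_request function, and the status codes are hypothetical. It also assumes that a rejected request skips all later stages, which the text does not state.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Stage order as listed in the text.
LLM_STAGES: List[str] = [
    "concurrency", "method", "rate_limit", "safetysec", "auth", "decode",
    "plugin", "budget", "routing", "provider", "response", "usage",
]

# Hypothetical paths for health checks and key validation.
NON_LLM_PATHS = {"/healthz", "/keys/validate"}

# A check returns None to continue or an HTTP-style status code to stop.
Check = Callable[[], Optional[int]]


def run_request(path: str,
                checks: Optional[Dict[str, Check]] = None) -> Tuple[int, List[str]]:
    """Run a request through the chain; return the status and the stages visited."""
    trace: List[str] = []
    if path in NON_LLM_PATHS:
        trace.append("non_llm_handler")    # leaves the LLM lifecycle early
        return 200, trace
    for stage in LLM_STAGES:
        trace.append(stage)
        check = (checks or {}).get(stage)
        if check is not None:
            status = check()
            if status is not None:
                return status, trace       # assumed: later stages are skipped
    return 200, trace
```

For example, run_request("/v1/chat", {"auth": lambda: 401}) stops at the auth stage, so no provider call is made for an unauthenticated request.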

The provider path serves both Odock's unified LLM endpoint and provider-compatible endpoints, which follow the request shapes of the providers currently supported. For calling patterns, see Unified Multi Model Endpoint Call and Native Models Call. For MCP behavior, see MCP Servers.

Observability

The observability stack is optional for self-hosting, but the gateway is designed to emit telemetry when it is enabled.

odock-server exposes Prometheus metrics on /metrics, sends traces through OTLP HTTP, and writes structured logs to stdout. Prometheus stores metrics, Loki stores logs, Tempo stores traces, and Grafana reads all three for dashboards and incident investigation.
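To make the metric and log outputs concrete, the sketch below renders a counter in the Prometheus text exposition format and writes one JSON log line per event. It is a simplified illustration: the metric name gateway_requests_total, its labels, and the log field names are hypothetical, and a production service would normally use a Prometheus client library rather than formatting lines by hand.

```python
import json
import sys
from collections import Counter
from typing import TextIO


class RequestCounter:
    """A labelled counter rendered in Prometheus text exposition format."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._values: Counter = Counter()

    def inc(self, route: str, status: str) -> None:
        self._values[(route, status)] += 1

    def render(self) -> str:
        """Return the body a /metrics handler could serve for this counter."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for (route, status), value in sorted(self._values.items()):
            lines.append(f'{self.name}{{route="{route}",status="{status}"}} {value}')
        return "\n".join(lines) + "\n"


def log_event(message: str, stream: TextIO = sys.stdout, **fields: object) -> None:
    """Write one structured log record as a single JSON line."""
    stream.write(json.dumps({"msg": message, **fields}, sort_keys=True) + "\n")
```

One JSON object per line on stdout is a shape that a log collector can parse and ship without extra configuration in the application itself.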

Gateway metrics cover requests, provider calls, cache lookups, rate-limit decisions, routing decisions, budget decisions, provider-key decrypt behavior, token usage, usage collection, and cache invalidation. Traces can include auth, routing, provider calls, usage collection, rate limits, budgets, plugins, and safety phases.

For setup and retention, see Self-host Observability Stack. For metrics, logs, traces, dashboards, alerts, and OTEL environment variables, see LGTM Stack. For persisted usage and audit records in Postgres, see Usage Monitoring.
