Configure IP policy, request limits, concurrency, payload caps, and token reservations.

Policies And Rate Limits

Rate limiting is enforced by odock-server in staged gates. Policies can be attached at several scopes and are resolved per request.

Policy Scopes

The resolver builds a policy snapshot in this order:

Global policy from RATELIMIT_GLOBAL_POLICY.
Organisation policy.
Team policy.
API key policy.
Model policy.
MCP server policy for MCP requests.

Each scope can contribute rate-limit rules and IP allow/block lists.
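
The resolution order above can be sketched as follows. This is a minimal illustration, not the odock-server resolver: the scope labels and the build_snapshot helper are hypothetical, and since this page does not describe how overlapping rules from different scopes are combined, the sketch only collects contributions in order.

```python
# Scope order as listed above. The labels are illustrative, not server identifiers.
SCOPE_ORDER = ["global", "organisation", "team", "api_key", "model", "mcp_server"]


def build_snapshot(policies_by_scope: dict, is_mcp_request: bool) -> list:
    """Return (scope, policy) pairs in resolution order, skipping unset scopes."""
    snapshot = []
    for scope in SCOPE_ORDER:
        if scope == "mcp_server" and not is_mcp_request:
            continue  # MCP server policy only contributes to MCP requests
        policy = policies_by_scope.get(scope)
        if policy:
            snapshot.append((scope, policy))
    return snapshot
```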

Policy Shape

{
  "policies": {
    "ip": {
      "allowlist": ["10.0.0.0/8"],
      "blocklist": ["203.0.113.0/24"]
    },
    "ratelimit": {
      "requests": {
        "per_second": 20,
        "per_minute": 1200,
        "burst": 50
      },
      "tokens": {
        "per_minute": 120000
      },
      "concurrency": {
        "max": 40,
        "lease_ttl_seconds": 30
      },
      "payload": {
        "max_request_bytes": 2097152,
        "max_tokens": 8192
      }
    }
  }
}

All numeric values must be non-negative. Omitted values mean no limit for that field.
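
A policy document can be checked against these two rules before it is saved. The helpers below are a hypothetical client-side sketch (find_negative_values and limit are not part of odock-server); they only assume the JSON shape shown above.

```python
def find_negative_values(node, path=""):
    """Recursively collect dotted paths of negative numeric values."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            problems += find_negative_values(value, child)
    elif isinstance(node, bool):
        pass  # bool is a subclass of int in Python; it is not a limit value
    elif isinstance(node, (int, float)) and node < 0:
        problems.append(path)
    return problems


def limit(policy, *keys):
    """Look up a limit; None means the field was omitted, i.e. no limit."""
    node = policy
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node
```

For example, limit(policy, "policies", "ratelimit", "tokens", "per_minute") returns None when the policy sets no token limit.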

Enforcement Stages

Pre-auth gate

Runs before API key authentication.

Purpose:

apply global/IP policy early;
block known bad clients before database/auth work.

Early gate

Runs after auth and initial policy resolution, before body decode and final model normalization.

Enforces:

payload size guardrail,
max token guardrail when known,
concurrent request leases,
request burst,
requests per second,
requests per minute.
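
The early-gate checks can be pictured as a sequence that stops at the first failure. The sketch below is an assumption-heavy illustration: the request and counters dictionaries are hypothetical stand-ins for whatever state the server keeps, and the pairing of each check with a deny code is inferred from the code names listed under Deny Responses.

```python
from typing import Optional


def early_gate(limits: dict, request: dict, counters: dict) -> Optional[str]:
    """Return a deny code, or None if the request passes every check."""
    payload = limits.get("payload", {})
    max_bytes = payload.get("max_request_bytes")
    if max_bytes is not None:
        size = request.get("content_length")
        if size is None:
            return "payload_size_unknown"  # assumed: size needed but not declared
        if size > max_bytes:
            return "payload_too_large"

    # max_tokens may not be known yet, because the body is not decoded here.
    max_tokens = payload.get("max_tokens")
    requested = request.get("max_tokens")
    if max_tokens is not None and requested is not None and requested > max_tokens:
        return "max_tokens_exceeded"

    requests = limits.get("requests", {})
    checks = [
        ("concurrency_exceeded", counters["in_flight"], limits.get("concurrency", {}).get("max")),
        ("burst_exceeded", counters["burst_used"], requests.get("burst")),
        ("rps_exceeded", counters["last_second"], requests.get("per_second")),
        ("rpm_exceeded", counters["last_minute"], requests.get("per_minute")),
    ]
    for code, used, cap in checks:
        if cap is not None and used >= cap:  # an omitted cap means no limit
            return code
    return None
```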

Final gate

Runs after the request is decoded, model config is applied, and model class/provider/stream metadata is known.

Enforces:

tokens per minute reservation based on estimated input tokens plus requested max_tokens;
model or MCP scoped token policy.
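
The reservation arithmetic is simple enough to show directly. This is a sketch under the assumption that the reservation is compared against the tokens already counted in the current minute; reserve_tokens and its parameters are hypothetical names, and the way the server estimates input tokens is not covered here.

```python
def reserve_tokens(used_this_minute, per_minute_limit, estimated_input, max_tokens):
    """Return (allowed, reserved). A limit of None means no token limit."""
    reserved = estimated_input + max_tokens
    if per_minute_limit is not None and used_this_minute + reserved > per_minute_limit:
        return False, 0  # would surface as tpm_exceeded
    return True, reserved
```

With a limit of 20000 tokens per minute and 10000 already used, a request with 1000 estimated input tokens and max_tokens of 4096 reserves 5096 tokens and is allowed. The same request with 15000 already used would exceed the limit and be denied.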

Post-flight

Runs after the request completes, or after it is rejected once a reservation has already been made.

Reconciles:

concurrency lease release;
token reservation refund based on actual usage.
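
A rough picture of the reconciliation step, assuming the refund is the reserved amount minus the tokens actually used. The Budget class and post_flight function are hypothetical, and clamping the refund at zero when usage exceeds the reservation is an assumption of this sketch.

```python
class Budget:
    """Illustrative in-memory counters. The real state lives in the gateway."""

    def __init__(self, in_flight=0, tokens_reserved=0):
        self.in_flight = in_flight
        self.tokens_reserved = tokens_reserved


def post_flight(budget, reserved, actual_used):
    """Release the concurrency lease and refund unused reserved tokens."""
    budget.in_flight -= 1  # lease release
    refund = max(reserved - actual_used, 0)  # assumed: never a negative refund
    budget.tokens_reserved -= refund
    return refund
```

For a request that reserved 5096 tokens and used 1200, the refund is 3896 and only the 1200 tokens actually used stay counted.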

Deny Responses

Requests blocked by a rate limit receive HTTP 429 with reason metadata and, when available, retry information. Common deny codes include:

payload_size_unknown
payload_too_large
max_tokens_exceeded
concurrency_exceeded
burst_exceeded
rps_exceeded
rpm_exceeded
tpm_exceeded
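
On the client side, a 429 is usually handled by waiting before retrying. The sketch below assumes the retry information arrives as a standard Retry-After header expressed in seconds; this page does not specify the exact header or body field, so treat both the header name and the retry_delay helper as assumptions.

```python
def retry_delay(status, headers, default=1.0):
    """Seconds to wait before retrying, or None if the response is not a 429."""
    if status != 429:
        return None
    value = headers.get("Retry-After")
    try:
        return max(float(value), 0.0)
    except (TypeError, ValueError):
        # Header missing or not a number of seconds (for example an HTTP date).
        return default
```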

Client IP Handling

The gateway computes the client IP from the request. If HTTP_TRUST_PROXY_HEADERS=true, forwarded headers are trusted only when they come from a proxy inside one of the configured proxy CIDRs.

Configure:

HTTP_TRUST_PROXY_HEADERS=true
HTTP_TRUSTED_PROXY_CIDRS=172.16.0.0/12,192.168.0.0/16,10.0.0.0/8
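
The effect of these two settings can be sketched as follows. This is an illustration of the general trusted-proxy technique, not the gateway's code: it assumes the forwarded header is X-Forwarded-For and that the client is the right-most hop that is not itself a trusted proxy. The client_ip function and its parameters are hypothetical.

```python
import ipaddress


def client_ip(peer_ip, forwarded_for, trust_proxy_headers, trusted_cidrs):
    """Pick the client IP, honouring X-Forwarded-For only from trusted proxies."""
    if not trust_proxy_headers or not forwarded_for:
        return peer_ip
    networks = [ipaddress.ip_network(c.strip()) for c in trusted_cidrs.split(",")]

    def trusted(ip):
        addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
        return any(addr in net for net in networks)

    if not trusted(peer_ip):
        return peer_ip  # header came from an untrusted sender: ignore it
    # Walk right to left, skipping hops that are themselves trusted proxies.
    for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
        try:
            if not trusted(hop):
                return hop
        except ValueError:
            return peer_ip  # malformed header: fall back to the socket address
    return peer_ip
```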

Caching

Policy cache keys live in Redis. The UI can invalidate:

API key policy cache,
organisation policy cache,
team policy cache,
model policy cache,
MCP policy cache.

If cache invalidation is not configured, a policy change takes effect only once the cached entry expires at the end of the cache TTL.

Practical Policy Examples

Strict API key:

{
  "policies": {
    "ratelimit": {
      "requests": { "per_minute": 60, "burst": 10 },
      "tokens": { "per_minute": 20000 },
      "concurrency": { "max": 5 },
      "payload": { "max_tokens": 4096, "max_request_bytes": 1048576 }
    }
  }
}

Internal network only:

{
  "policies": {
    "ip": {
      "allowlist": ["10.0.0.0/8", "192.168.0.0/16"],
      "blocklist": []
    }
  }
}
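
To show how an allowlist like this could be evaluated, here is a minimal sketch using Python's ipaddress module. The semantics are assumptions, not documented behaviour: a blocklist match always denies, a non-empty allowlist admits only matching addresses, and an empty allowlist admits everything. The ip_allowed helper is hypothetical.

```python
import ipaddress


def ip_allowed(ip, allowlist, blocklist):
    """Evaluate an address against CIDR allow and block lists."""
    addr = ipaddress.ip_address(ip)

    def matches(cidrs):
        return any(addr in ipaddress.ip_network(c) for c in cidrs)

    if matches(blocklist):
        return False  # assumed: blocklist takes precedence
    return not allowlist or matches(allowlist)
```

Under these assumptions, with the policy above a request from 10.1.2.3 passes and one from 198.51.100.7 is rejected.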
