Policies And Rate Limits
Configure IP policy, request limits, concurrency, payload caps, and token reservations.
Rate limiting is enforced by odock-server in staged gates. Policies can be attached at several scopes and are resolved per request.
Policy Scopes
The resolver builds a policy snapshot in this order:
- Global policy from RATELIMIT_GLOBAL_POLICY.
- Organisation policy.
- Team policy.
- API key policy.
- Model policy.
- MCP server policy for MCP requests.
Each scope can contribute rate-limit rules and IP allow/block lists.
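The text above gives the resolution order but not how overlapping values from different scopes are combined. As a rough sketch only, the snippet below assumes the tightest numeric limit wins when two scopes set the same field. That merge rule and the `merge_limits` helper are hypothetical and are not taken from odock-server.

```python
def merge_limits(scopes: list) -> dict:
    """Fold rate-limit dicts in resolution order (global first, most specific last).

    ASSUMPTION: when two scopes set the same numeric field, the smaller
    (stricter) value is kept. The real resolver may behave differently.
    """
    merged: dict = {}
    for scope in scopes:
        for key, value in scope.items():
            if isinstance(value, dict):
                # Recurse into nested groups such as "requests" or "tokens".
                merged[key] = merge_limits([merged.get(key, {}), value])
            elif key in merged:
                merged[key] = min(merged[key], value)
            else:
                merged[key] = value
    return merged


global_policy = {"requests": {"per_minute": 1200, "burst": 50}}
api_key_policy = {"requests": {"per_minute": 60}, "concurrency": {"max": 5}}
snapshot = merge_limits([global_policy, api_key_policy])
```

Under this assumed rule, the API key's `per_minute` of 60 overrides the global 1200, while the global `burst` carries through because the key does not set one.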
Policy Shape
{
"policies": {
"ip": {
"allowlist": ["10.0.0.0/8"],
"blocklist": ["203.0.113.0/24"]
},
"ratelimit": {
"requests": {
"per_second": 20,
"per_minute": 1200,
"burst": 50
},
"tokens": {
"per_minute": 120000
},
"concurrency": {
"max": 40,
"lease_ttl_seconds": 30
},
"payload": {
"max_request_bytes": 2097152,
"max_tokens": 8192
}
}
}
}
All numeric values must be non-negative. Omitted values mean no limit for that field.
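The two rules above (non-negative numbers, omitted means unlimited) can be checked before a policy is submitted. The following is a minimal client-side sketch. The helper names `find_negative_limits` and `limit_or_none` are hypothetical, and the server's own validation may differ.

```python
from typing import Any, List


def find_negative_limits(node: Any, path: str = "policies") -> List[str]:
    """Return dotted paths of numeric fields whose value is negative."""
    problems: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            problems.extend(find_negative_limits(value, f"{path}.{key}"))
    elif isinstance(node, (int, float)) and not isinstance(node, bool):
        if node < 0:
            problems.append(path)
    # Lists of CIDR strings and other non-numeric values are ignored here.
    return problems


def limit_or_none(policies: dict, *keys: str) -> Any:
    """Look up a nested limit; None means the field was omitted (no limit)."""
    node: Any = policies
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


policies = {
    "ip": {"allowlist": ["10.0.0.0/8"]},
    "ratelimit": {
        "requests": {"per_second": 20, "burst": -1},
        "tokens": {"per_minute": 120000},
    },
}
bad_fields = find_negative_limits(policies)
```

Here `bad_fields` flags the negative `burst`, and `limit_or_none(policies, "ratelimit", "concurrency", "max")` returns `None` because no concurrency limit was set.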
Enforcement Stages
Pre-auth gate
Runs before API key authentication.
Purpose:
- apply global/IP policy early;
- block known bad clients before database/auth work.
Early gate
Runs after auth and initial policy resolution, before body decode and final model normalization.
Enforces:
- payload size guardrail,
- max token guardrail when known,
- concurrent request leases,
- request burst,
- requests per second,
- requests per minute.
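To illustrate how a concurrency limit with `max` and `lease_ttl_seconds` might behave, here is a small in-memory sketch. The `LeasePool` class is hypothetical, and the real gateway's storage and expiry mechanics are not described in this document. The sketch only shows the idea that a lease is held per in-flight request and expires by itself if it is never released.

```python
import itertools
from typing import Dict, Optional


class LeasePool:
    """In-memory stand-in for a concurrency limiter with expiring leases."""

    def __init__(self, max_leases: int, lease_ttl_seconds: float) -> None:
        self.max_leases = max_leases
        self.ttl = lease_ttl_seconds
        self._expiry: Dict[int, float] = {}  # lease id -> expiry timestamp
        self._ids = itertools.count(1)

    def acquire(self, now: float) -> Optional[int]:
        # Drop leases whose holder never released them (e.g. a crashed worker).
        self._expiry = {k: exp for k, exp in self._expiry.items() if exp > now}
        if len(self._expiry) >= self.max_leases:
            return None  # a gateway would deny with concurrency_exceeded here
        lease_id = next(self._ids)
        self._expiry[lease_id] = now + self.ttl
        return lease_id

    def release(self, lease_id: int) -> None:
        # Releasing an unknown or already expired lease is a no-op.
        self._expiry.pop(lease_id, None)


pool = LeasePool(max_leases=2, lease_ttl_seconds=30)
first = pool.acquire(now=0.0)
second = pool.acquire(now=1.0)
third = pool.acquire(now=2.0)  # pool is full at this point
```

Timestamps are passed in explicitly so the behaviour is easy to reason about. In a real deployment the clock and the lease table would be shared across gateway instances.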
Final gate
Runs after the request is decoded, model config is applied, and model class/provider/stream metadata is known.
Enforces:
- tokens per minute reservation based on estimated input tokens plus requested max_tokens;
- model or MCP scoped token policy.
Post-flight
Runs after the request completes or is rejected after reservation.
Reconciles:
- concurrency lease release;
- token reservation refund based on actual usage.
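The reserve-then-refund flow described by the final gate and post-flight stages can be sketched as below. `TokenBudget` is a hypothetical, single-window model. Window rollover, shared storage, and the way input tokens are estimated are all left out, and none of this is taken from the odock-server source.

```python
from typing import Optional


class TokenBudget:
    """Tracks tokens counted against one per-minute window."""

    def __init__(self, per_minute: int) -> None:
        self.per_minute = per_minute
        self.reserved = 0

    def reserve(self, estimated_input_tokens: int, max_tokens: int) -> Optional[int]:
        # Final gate: reserve the worst case, i.e. input estimate plus max_tokens.
        want = estimated_input_tokens + max_tokens
        if self.reserved + want > self.per_minute:
            return None  # a gateway would deny with tpm_exceeded here
        self.reserved += want
        return want

    def reconcile(self, reservation: int, actual_tokens: int) -> int:
        # Post-flight: hand back whatever part of the reservation was not used.
        refund = max(0, reservation - actual_tokens)
        self.reserved -= refund
        return refund


budget = TokenBudget(per_minute=20000)
reservation = budget.reserve(estimated_input_tokens=1500, max_tokens=4096)
rejected = budget.reserve(estimated_input_tokens=10000, max_tokens=8192)
refund = budget.reconcile(reservation, actual_tokens=2100)
```

Reserving the worst case up front keeps concurrent requests from jointly overshooting the budget. The refund then returns the unused headroom once actual usage is known.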
Deny Responses
Rate-limit blocks return HTTP 429 with reason metadata and retry information when available. Common deny codes include:
- payload_size_unknown
- payload_too_large
- max_tokens_exceeded
- concurrency_exceeded
- burst_exceeded
- rps_exceeded
- rpm_exceeded
- tpm_exceeded
Client IP Handling
The gateway computes client IP from the request. If HTTP_TRUST_PROXY_HEADERS=true, forwarded headers are trusted only for configured proxy CIDRs.
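As an illustration of why the trusted-CIDR restriction matters, the sketch below resolves a client IP from a forwarded chain. It assumes the forwarded header is `X-Forwarded-For` with comma-separated hops and that the chain is walked from the nearest hop outward. Both are assumptions, as are the function names, and the gateway's actual header handling is not specified here.

```python
import ipaddress
from typing import List, Optional


def parse_cidrs(value: str) -> List:
    """Parse a comma-separated CIDR list such as HTTP_TRUSTED_PROXY_CIDRS."""
    return [ipaddress.ip_network(p.strip()) for p in value.split(",") if p.strip()]


def is_trusted(addr: str, trusted_cidrs: List) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in trusted_cidrs)


def client_ip(peer_addr: str, forwarded_for: Optional[str],
              trust_proxy_headers: bool, trusted_cidrs: List) -> str:
    """Pick the first address in the chain that was not reported by a trusted proxy."""
    if not trust_proxy_headers or not forwarded_for:
        return peer_addr
    candidate = peer_addr
    for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
        # Only believe the next hop if the current candidate is a trusted proxy.
        if not is_trusted(candidate, trusted_cidrs):
            break
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: stop and keep the last good address
        candidate = hop
    return candidate


trusted = parse_cidrs("172.16.0.0/12,192.168.0.0/16,10.0.0.0/8")
resolved = client_ip("10.0.0.5", "198.51.100.7, 10.0.0.9", True, trusted)
```

A request that arrives directly from an untrusted address keeps its socket address, so a client cannot spoof its IP just by sending its own forwarded header.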
Configure:
HTTP_TRUST_PROXY_HEADERS=true
HTTP_TRUSTED_PROXY_CIDRS=172.16.0.0/12,192.168.0.0/16,10.0.0.0/8
Caching
Policy cache keys live in Redis. The UI can invalidate:
- API key policy cache,
- organisation policy cache,
- team policy cache,
- model policy cache,
- MCP policy cache.
If cache invalidation is not configured, a policy change takes effect only once the cached entry's TTL expires.
Practical Policy Examples
Strict API key:
{
"policies": {
"ratelimit": {
"requests": { "per_minute": 60, "burst": 10 },
"tokens": { "per_minute": 20000 },
"concurrency": { "max": 5 },
"payload": { "max_tokens": 4096, "max_request_bytes": 1048576 }
}
}
}
Internal network only:
{
"policies": {
"ip": {
"allowlist": ["10.0.0.0/8", "192.168.0.0/16"],
"blocklist": []
}
}
}
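To make the effect of an `ip` policy concrete, the sketch below evaluates a client address against the allowlist and blocklist. It assumes that the blocklist takes precedence and that an empty or missing allowlist means "allow everything not blocked". Neither rule is stated in this document, and `ip_allowed` is a hypothetical helper.

```python
import ipaddress


def ip_allowed(client_ip: str, ip_policy: dict) -> bool:
    """Evaluate an address against an "ip" policy block.

    ASSUMPTIONS: blocklist wins over allowlist; an empty or missing
    allowlist does not restrict anything.
    """
    ip = ipaddress.ip_address(client_ip)
    blocklist = [ipaddress.ip_network(c) for c in ip_policy.get("blocklist", [])]
    allowlist = [ipaddress.ip_network(c) for c in ip_policy.get("allowlist", [])]
    if any(ip in net for net in blocklist):
        return False
    if allowlist:
        return any(ip in net for net in allowlist)
    return True


internal_only = {
    "allowlist": ["10.0.0.0/8", "192.168.0.0/16"],
    "blocklist": [],
}
inside = ip_allowed("10.1.2.3", internal_only)
outside = ip_allowed("203.0.113.5", internal_only)
```

With the internal-network-only policy, `inside` is `True` and `outside` is `False`.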