Guardrail Modules
Understand the guardrail module families and the design problem each one solves.
Odock guardrails are modular. Each module family focuses on a different class of runtime risk, and the gateway evaluates those modules at the point in the lifecycle where the right context exists.
The design goal is practical defense in depth. A single large "security check" would be hard to reason about, hard to tune, and easy to bypass by accident. Smaller modules make each decision explainable: one module answers traffic-shape questions, another answers access questions, another answers cost questions, and another answers prompt or response safety questions.
Policy Context
Policy context is the effective set of user-facing policies that apply to the request. It can include organisation, team, API key, model, and MCP settings, depending on which resources the request touches.
This context lets Odock answer questions such as:
- Is this request coming from an expected network?
- Is this application allowed to use this model or tool server?
- Is the payload within the configured envelope?
- Does this workload have enough token, quota, or budget capacity?
- Are there resource-specific rules for this model or MCP server?
IP Policy
IP policy limits where traffic may originate. It can be evaluated early because the origin is part of the request envelope and does not require model output or tool execution.
Use IP rules for network boundaries, such as allowing only office, cloud, or workload networks. They are not a substitute for API key authentication, access grants, or model/MCP governance.
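As a rough illustration of an origin check, the sketch below tests a source address against a list of allowed networks using Python's standard `ipaddress` module. The function name `ip_allowed`, the `ALLOWED_NETWORKS` list, and the CIDR ranges are hypothetical and are not Odock configuration; treating an unparseable origin as denied is also an assumption.

```python
import ipaddress

# Hypothetical allowlist; these CIDR ranges are illustrative only.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),  # e.g. an office egress range
    ipaddress.ip_network("10.20.0.0/16"),    # e.g. a workload network
]


def ip_allowed(source_ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True if source_ip falls inside any allowed network."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        # Assumption: an origin that cannot be parsed is denied.
        return False
    return any(addr in net for net in networks)
```

The check needs only the request envelope, which is why it can run before authentication, access grants, or any model call.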
Ratelimit Modules
Ratelimit modules protect the gateway, upstream providers, and shared organisational capacity from sudden spikes or sustained overuse.
| Limit | Meaning |
|---|---|
| Requests per second | Controls short spikes. |
| Requests per minute | Controls sustained request volume. |
| Burst | Allows a short surge without making the normal rate unlimited. |
| Concurrency | Limits simultaneous in-flight work. |
Use request limits for workloads where traffic shape matters even before token usage or cost is known.
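One common way to combine a steady rate with a burst allowance is a token bucket. The sketch below is a generic version of that technique, not a description of how Odock implements its limits; `TokenBucket` and its fields are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate_per_second: float  # steady refill rate
    burst: int              # bucket capacity: the largest short surge allowed
    tokens: float = 0.0
    last_refill: float = 0.0

    def __post_init__(self):
        # Start with a full bucket so an initial surge up to `burst` is allowed.
        self.tokens = float(self.burst)

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_second)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_second=2, burst=3)
surge = [bucket.allow(0.0) for _ in range(4)]  # three pass, the fourth is denied
later = bucket.allow(0.5)                      # half a second refills one token
```

The burst value bounds the surge while the refill rate bounds the long-run average, so neither setting makes the other unlimited.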
Payload Limits
Payload limits protect the gateway and upstream services from unexpectedly large input or output envelopes.
| Limit | Meaning |
|---|---|
| Max request bytes | Stops oversized request bodies. |
| Max tokens per request | Stops unusually large requested output. |
Payload limits are intentionally separated from content safety. A request can be safe in content but too large, or small but unsafe. These are different problems and should be controlled by different modules.
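A minimal sketch of an envelope check follows, assuming the two limits from the table above. The names `PayloadLimits` and `check_payload` and the denial reason strings are hypothetical. The check looks only at sizes and never inspects content.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PayloadLimits:
    max_request_bytes: int
    max_tokens_per_request: int


def check_payload(
    body: bytes,
    requested_max_tokens: Optional[int],
    limits: PayloadLimits,
) -> Optional[str]:
    """Return a denial reason, or None when the payload fits the envelope."""
    if len(body) > limits.max_request_bytes:
        return "request_too_large"
    if (
        requested_max_tokens is not None
        and requested_max_tokens > limits.max_tokens_per_request
    ):
        return "max_tokens_exceeded"
    return None


limits = PayloadLimits(max_request_bytes=1024, max_tokens_per_request=4096)
```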
Concurrency
Concurrency guardrails answer: "How much work can this scope have in flight at once?"
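This is different from requests per minute (RPM). RPM controls how many requests arrive over a time window. Concurrency controls how many are active at the same time. A handful of long-running completions can stay well under an RPM limit and still exhaust a concurrency limit.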
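The in-flight idea can be sketched as a counter that is incremented when work starts and decremented when it finishes. `ConcurrencyGuard` is a hypothetical name, and this single-process version only illustrates the concept; a gateway running on several nodes would need shared state.

```python
import threading


class ConcurrencyGuard:
    """Tracks how many requests a scope currently has in flight."""

    def __init__(self, max_in_flight: int):
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Admit the request only if a slot is free."""
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        """Free the slot when the request completes or fails."""
        with self._lock:
            self._in_flight = max(0, self._in_flight - 1)


guard = ConcurrencyGuard(max_in_flight=2)
```

Unlike a rate limit, nothing here depends on time: a slot is returned only when the work ends.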
Token Limits
Token limits answer: "How much model capacity can this scope consume?"
Use them for LLM traffic where request count alone is not enough. Ten tiny requests and ten large completions should not have the same policy impact. Token-aware modules let Odock reason about the expected model workload and reconcile that expectation with the usage evidence produced by the provider.
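One way to picture the estimate-then-reconcile idea is a ledger that reserves expected tokens up front and replaces the reservation with the provider-reported figure afterwards. This is a sketch under that assumption, not Odock's accounting code; `TokenLedger`, `reserve`, and `reconcile` are hypothetical names.

```python
class TokenLedger:
    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0      # tokens confirmed by provider usage reports
        self.reserved = 0  # tokens held for requests still in flight

    @property
    def remaining(self) -> int:
        return self.limit - self.used - self.reserved

    def reserve(self, estimated_tokens: int) -> bool:
        """Hold the expected workload if it fits in the remaining capacity."""
        if estimated_tokens > self.remaining:
            return False
        self.reserved += estimated_tokens
        return True

    def reconcile(self, estimated_tokens: int, actual_tokens: int) -> None:
        """Swap the estimate for the usage the provider actually reported."""
        self.reserved -= estimated_tokens
        self.used += actual_tokens


ledger = TokenLedger(limit=1000)
first = ledger.reserve(600)    # fits
second = ledger.reserve(600)   # would exceed the limit while the first is held
ledger.reconcile(600, 250)     # provider reported fewer tokens than estimated
```

After reconciliation the unused part of the estimate becomes available again, so a conservative estimate does not permanently shrink the scope's capacity.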
Access Grants
Access grants are hard runtime boundaries:
- Model access decides which virtual API keys can call a model.
- MCP access decides which virtual API keys can call an MCP server.
An organisation user can create a model or MCP server without automatically making it callable. The API key still needs the explicit grant.
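The default-deny behaviour can be illustrated with a simple membership test. The grant tuple layout, the `can_call` function, and the identifiers below are all hypothetical; the point is only that a resource existing does not make it callable.

```python
from typing import Set, Tuple

# Hypothetical grant shape: (api_key_id, resource_kind, resource_id)
Grant = Tuple[str, str, str]


def can_call(grants: Set[Grant], api_key_id: str, kind: str, resource_id: str) -> bool:
    """A call is permitted only when an explicit grant exists."""
    return (api_key_id, kind, resource_id) in grants


grants: Set[Grant] = {
    ("key-analytics", "model", "model-a"),
    ("key-analytics", "mcp", "search-server"),
}
```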
See Model access grants and MCP access grants.
MCP Tool Guardrails
MCP servers expose tools, so Odock adds tool-level governance:
| Control | Behavior |
|---|---|
| Allowed tools | When set, only listed tools may run. |
| Blocked tools | Listed tools are denied. |
| Semantic filter | Applies configured content rules to MCP payloads. |
| Transport/auth config | Controls how Odock connects to the upstream MCP server. |
Use MCP rules when a server exposes mixed-risk tools, such as read-only search plus write-capable file or repository operations.
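The allowed and blocked lists could be evaluated along these lines. `ToolPolicy`, `tool_permitted`, and the tool names are hypothetical, and the precedence shown (a blocked tool is denied even if it also appears in the allowed list) is an assumption, since the table does not state how the two lists interact.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ToolPolicy:
    allowed_tools: Optional[FrozenSet[str]] = None  # None means "not set"
    blocked_tools: FrozenSet[str] = frozenset()


def tool_permitted(policy: ToolPolicy, tool_name: str) -> bool:
    # Assumption: the block list takes precedence over the allow list.
    if tool_name in policy.blocked_tools:
        return False
    # When an allow list is set, only listed tools may run.
    if policy.allowed_tools is not None:
        return tool_name in policy.allowed_tools
    return True


read_only = ToolPolicy(
    allowed_tools=frozenset({"search", "read_file"}),
    blocked_tools=frozenset({"write_file"}),
)
```

With a policy like `read_only`, a server that exposes both search and write-capable tools can be used for lookups without exposing the write operations.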
For more MCP detail, see MCP Security.
Budgets And Quotas
Budgets and quotas are not ratelimit modules, but they are guardrails because they can prevent traffic that would exceed a cost or usage boundary.
- Budgets protect spend.
- Quotas protect usage counts, token counts, or other period-based metrics.
- Lifecycle-aware accounting helps concurrent requests respect the same remaining boundary.
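To show why accounting has to follow the request lifecycle, the sketch below lets ten concurrent requests place a hold against the same budget before any of them has finished. The class `BudgetAccount`, its methods, and the amounts are hypothetical, and this is a single-process illustration rather than Odock's implementation.

```python
import threading
from decimal import Decimal


class BudgetAccount:
    def __init__(self, limit: Decimal):
        self.limit = limit
        self.spent = Decimal("0")  # settled cost
        self.held = Decimal("0")   # estimated cost of in-flight requests
        self._lock = threading.Lock()

    def hold(self, estimate: Decimal) -> bool:
        """Place a hold at request start; deny if it would cross the limit."""
        with self._lock:
            if self.spent + self.held + estimate > self.limit:
                return False
            self.held += estimate
            return True

    def settle(self, estimate: Decimal, actual: Decimal) -> None:
        """At request end, release the hold and record the real cost."""
        with self._lock:
            self.held -= estimate
            self.spent += actual


def admitted_count() -> int:
    """Ten concurrent requests of 0.30 each compete for a budget of 1.00."""
    account = BudgetAccount(Decimal("1.00"))
    outcomes = []

    def worker():
        outcomes.append(account.hold(Decimal("0.30")))

    threads = [threading.Thread(target=worker) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(outcomes)
```

Because holds count against the boundary as soon as a request starts, only three of the ten requests are admitted, whichever order the threads run in. Checking only settled spend would have admitted all ten.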
Security Engine And Plugins
The Security Engine handles prompt and response safety. Ratelimit modules handle request-aware, network-aware, and token-aware traffic guardrails inside the staged ratelimit engine.
Read Security Engine and Ratelimit modules next.