ODOCK.AI

Guardrail Modules

Understand the guardrail module families and the design problem each one solves.

Odock guardrails are modular. Each module family focuses on a different class of runtime risk, and the gateway evaluates those modules at the point in the lifecycle where the right context exists.

The design goal is practical defense in depth. A single large "security check" would be hard to reason about, hard to tune, and easy to bypass by accident. Smaller modules make each decision explainable: one module answers traffic-shape questions, another answers access questions, another answers cost questions, and another answers prompt or response safety questions.
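
As a rough illustration of why small modules keep decisions explainable, the sketch below runs named checks in order and reports which one denied the request. Everything here (`evaluate`, `ip_check`, `size_check`, the request dictionary shape) is hypothetical and is not Odock's actual API.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical shape: a check returns None to pass, or a short reason to deny.
Check = Callable[[Dict], Optional[str]]


def evaluate(request: Dict, modules: List[Tuple[str, Check]]) -> Tuple[bool, Optional[str]]:
    """Run modules in order and stop at the first denial."""
    for name, check in modules:
        reason = check(request)
        if reason is not None:
            # The module name is part of the answer, so the denial is explainable.
            return False, f"{name}: {reason}"
    return True, None


def ip_check(request: Dict) -> Optional[str]:
    # Illustrative only: a single allowed origin.
    return None if request.get("ip") == "203.0.113.7" else "origin not allowed"


def size_check(request: Dict) -> Optional[str]:
    # Illustrative only: a 1 KiB body limit.
    return "body too large" if request.get("bytes", 0) > 1024 else None


MODULES = [("ip_policy", ip_check), ("payload_limits", size_check)]

allowed, reason = evaluate({"ip": "203.0.113.7", "bytes": 4096}, MODULES)
print(allowed, reason)
```

Because each module returns its own reason, a denied request points at one specific policy instead of a generic security failure.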

Policy Context

Policy context is the effective set of user-facing policies that apply to the request. It can include organisation, team, API key, model, and MCP settings depending on what the request is doing.

This context lets Odock answer questions such as:

  • Is this request coming from an expected network?
  • Is this application allowed to use this model or tool server?
  • Is the payload within the configured envelope?
  • Does this workload have enough token, quota, or budget capacity?
  • Are there resource-specific rules for this model or MCP server?
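
One way to picture an "effective" policy is as a merge across scopes. The sketch below assumes that, for numeric limits, the most restrictive configured value wins. That merge rule, the `effective_limits` function, and the field names are illustrative assumptions; the source does not describe how Odock resolves overlapping scopes.

```python
from typing import Dict, Iterable, Optional


def effective_limits(scopes: Iterable[Dict[str, Optional[int]]]) -> Dict[str, int]:
    """Merge numeric limits from several scopes, keeping the smallest value.

    Assumption: None means "not configured at this scope" and is skipped.
    """
    effective: Dict[str, int] = {}
    for scope in scopes:
        for name, value in scope.items():
            if value is None:
                continue
            effective[name] = min(value, effective.get(name, value))
    return effective


# Hypothetical settings at three scopes.
organisation = {"requests_per_minute": 600, "max_request_bytes": 1_000_000}
team = {"requests_per_minute": 300}
api_key = {"requests_per_minute": None, "max_tokens_per_request": 2048}

merged = effective_limits([organisation, team, api_key])
print(merged)
```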

IP Policy

IP policy limits where traffic may originate. It can be evaluated early because the origin is part of the request envelope and does not require model output or tool execution.

Use IP rules for network boundaries, such as allowing only office, cloud, or workload networks. They are not a substitute for API key authentication, access grants, or model/MCP governance.
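
A network boundary check of this kind can be sketched with Python's standard `ipaddress` module. The allowlist contents and the `ip_allowed` function are hypothetical, and the CIDR ranges are example networks, not values Odock ships with.

```python
import ipaddress

# Hypothetical allowlist of permitted origin networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),  # e.g. an office egress range
    ipaddress.ip_network("10.20.0.0/16"),    # e.g. a workload network
]


def ip_allowed(client_ip: str, allowed=ALLOWED_NETWORKS) -> bool:
    """Return True when client_ip falls inside any allowed network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # an unparseable origin is denied
    return any(address in network for network in allowed)


print(ip_allowed("203.0.113.7"))   # inside the first range
print(ip_allowed("198.51.100.1"))  # outside both ranges
```

The check needs only the request origin, which is why it can run before any model or tool work starts.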

Ratelimit Modules

Ratelimit modules protect the gateway, upstream providers, and shared organisational capacity from sudden spikes or sustained overuse.

| Limit | Meaning |
| --- | --- |
| Requests per second | Controls short spikes. |
| Requests per minute | Controls sustained request volume. |
| Burst | Allows a short surge without making the normal rate unlimited. |
| Concurrency | Limits simultaneous in-flight work. |

Use request limits for workloads where traffic shape matters even before token usage or cost is known.
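
The relationship between a steady rate and a burst allowance is commonly modelled as a token bucket. The sketch below is a generic version of that technique, not Odock's implementation; the class name and parameters are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate_per_second`, holds at most `burst`."""

    def __init__(self, rate_per_second: float, burst: int, now: float = 0.0):
        self.rate = float(rate_per_second)
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial surge is allowed
        self.last = now

    def allow(self, now: float) -> bool:
        """Consume one token if available; `now` is a timestamp in seconds."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_second=2, burst=3)
surge = [bucket.allow(0.0) for _ in range(4)]  # three pass, the fourth is denied
later = bucket.allow(0.5)                      # half a second refills one token
print(surge, later)
```

The burst value bounds the surge, while the refill rate keeps the long-run average at the configured limit.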

Payload Limits

Payload limits protect the gateway and upstream services from unexpectedly large input or output envelopes.

| Limit | Meaning |
| --- | --- |
| Max request bytes | Stops oversized request bodies. |
| Max tokens per request | Stops unusually large requested output. |

Payload limits are intentionally separated from content safety. A request can have safe content but be too large. A request can also be small but unsafe. These are different problems and should be controlled by different modules.
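
A size check of this kind does not need to inspect content at all. The sketch below assumes a hypothetical `PayloadLimits` configuration and a request that optionally carries a requested maximum output token count; the names and reason strings are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PayloadLimits:
    # Hypothetical field names mirroring the two limits in the table above.
    max_request_bytes: int
    max_tokens_per_request: int


def check_payload(body: bytes, requested_max_tokens: Optional[int],
                  limits: PayloadLimits) -> Optional[str]:
    """Return a denial reason, or None when the envelope is within limits."""
    if len(body) > limits.max_request_bytes:
        return "request_too_large"
    if requested_max_tokens is not None and requested_max_tokens > limits.max_tokens_per_request:
        return "max_tokens_too_large"
    return None


limits = PayloadLimits(max_request_bytes=1024, max_tokens_per_request=512)
print(check_payload(b"x" * 2048, 100, limits))
print(check_payload(b"hello", 4096, limits))
print(check_payload(b"hello", 100, limits))
```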

Concurrency

Concurrency guardrails answer: "How much work can this scope have in flight at once?"

This is different from requests per minute (RPM). RPM controls how many requests arrive over a time window. Concurrency controls how many are active at the same time.
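
The distinction can be sketched as a counter that goes up when work starts and down when it finishes, independent of any time window. `ConcurrencyLimiter` is a hypothetical name, and this is a generic in-process version rather than Odock's implementation.

```python
import threading


class ConcurrencyLimiter:
    """Caps the number of in-flight requests for one scope."""

    def __init__(self, max_in_flight: int):
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Claim a slot without blocking; False means the scope is full."""
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        """Free a slot when the request completes or fails."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1


limiter = ConcurrencyLimiter(max_in_flight=2)
first = limiter.try_acquire()
second = limiter.try_acquire()
third = limiter.try_acquire()   # denied: two requests are still active
limiter.release()               # one request finishes
fourth = limiter.try_acquire()  # a slot is free again
print(first, second, third, fourth)
```

A slow upstream can exhaust concurrency even at a low request rate, which is why the two limits are configured separately.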

Token Limits

Token limits answer: "How much model capacity can this scope consume?"

Use them for LLM traffic where request count alone is not enough. Ten tiny requests and ten large completions should not have the same policy impact. Token-aware modules let Odock reason about the expected model workload and reconcile that expectation with the usage evidence produced by the provider.
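
Reconciling an expectation with provider-reported usage can be pictured as a reserve-then-settle pattern. The sketch below is a simplified, single-threaded illustration; `TokenWindow` and its methods are hypothetical, and the way Odock estimates or reconciles tokens is not specified in this page.

```python
class TokenWindow:
    """Tracks token usage for one scope within one accounting window."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0      # tokens confirmed by provider usage reports
        self.reserved = 0  # estimates held for requests still in flight

    def reserve(self, estimated_tokens: int) -> bool:
        """Hold capacity for a request before it is sent upstream."""
        if self.used + self.reserved + estimated_tokens > self.limit:
            return False
        self.reserved += estimated_tokens
        return True

    def reconcile(self, estimated_tokens: int, actual_tokens: int) -> None:
        """Replace the estimate with the usage the provider reported."""
        self.reserved -= estimated_tokens
        self.used += actual_tokens

    def remaining(self) -> int:
        return self.limit - self.used - self.reserved


window = TokenWindow(limit=1000)
accepted = window.reserve(600)   # estimate for a large completion
rejected = window.reserve(500)   # would exceed the limit while the first is in flight
window.reconcile(600, 350)       # the provider reports fewer tokens than estimated
print(accepted, rejected, window.remaining())
```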

Access Grants

Access grants are hard runtime boundaries:

  • model access decides which virtual API keys can call a model
  • MCP access decides which virtual API keys can call an MCP server

An organisation user can create a model or MCP server without automatically making it callable. The API key still needs the explicit grant.
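
The "explicit grant" rule amounts to a default-deny lookup. The sketch below uses a hypothetical in-memory grant set and made-up identifiers; it only illustrates that a resource existing is not enough for a key to call it.

```python
# Hypothetical grant records: (api_key_id, resource_type, resource_id).
GRANTS = {
    ("key_app_a", "model", "model-small"),
    ("key_app_a", "mcp", "docs-search"),
}

# Resources that exist in the organisation, whether or not anything is granted.
RESOURCES = {("model", "model-small"), ("model", "model-large"), ("mcp", "docs-search")}


def can_call(api_key_id: str, resource_type: str, resource_id: str) -> bool:
    """Default deny: the resource must exist and the key must hold a grant."""
    if (resource_type, resource_id) not in RESOURCES:
        return False
    return (api_key_id, resource_type, resource_id) in GRANTS


print(can_call("key_app_a", "model", "model-small"))  # granted
print(can_call("key_app_a", "model", "model-large"))  # exists, but no grant
print(can_call("key_app_b", "mcp", "docs-search"))    # a different key has no grant
```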

See Model access grants and MCP access grants.

MCP Tool Guardrails

MCP servers expose tools, so Odock adds tool-level governance:

| Control | Behavior |
| --- | --- |
| Allowed tools | When set, only listed tools may run. |
| Blocked tools | Listed tools are denied. |
| Semantic filter | Applies configured content rules to MCP payloads. |
| Transport/auth config | Controls how Odock connects to the upstream MCP server. |

Use MCP rules when a server exposes mixed-risk tools, such as read-only search plus write-capable file or repository operations.
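
The allowed and blocked tool controls can be sketched as a small decision function. Two points here are assumptions, not documented behaviour: a blocked entry takes precedence over an allowed entry, and an allowlist that is set but empty denies every tool. The function and tool names are hypothetical.

```python
from typing import Optional, Set


def tool_allowed(tool: str, allowed: Optional[Set[str]], blocked: Set[str]) -> bool:
    """Decide whether an MCP tool call may run.

    Assumption: blocked wins over allowed. `allowed=None` means no allowlist is set.
    """
    if tool in blocked:
        return False
    if allowed is not None:
        return tool in allowed
    return True


# A server that mixes read-only search with write-capable operations.
allowed_tools = {"search_docs", "read_file"}
blocked_tools = {"write_file", "delete_repo"}

print(tool_allowed("search_docs", allowed_tools, blocked_tools))
print(tool_allowed("write_file", allowed_tools, blocked_tools))
print(tool_allowed("list_issues", allowed_tools, blocked_tools))
print(tool_allowed("list_issues", None, blocked_tools))
```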

For more MCP detail, see MCP Security.

Budgets And Quotas

Budgets and quotas are not ratelimit modules, but they are guardrails because they can prevent traffic that would exceed a cost or usage boundary.

  • Budgets protect spend.
  • Quotas protect usage counts, token counts, or other period-based metrics.
  • Lifecycle-aware accounting helps concurrent requests respect the same remaining boundary.

See Budgets and Quotas.

Security Engine And Plugins

The Security Engine handles prompt and response safety. Ratelimit modules handle request-aware, network-aware, and token-aware traffic guardrails inside the staged ratelimit engine.

Read Security Engine and Ratelimit modules next.
