Native Models call

Call Odock through the provider-compatible endpoint shapes that are currently available.

Native endpoints keep the provider family's API shape. Use them when you already have a provider-compatible client and want Odock to sit between that client and the upstream provider.

The native route determines the provider family. The model still names an Odock model configured in your organisation. At runtime, Odock resolves that model to its upstream slug, provider base URL, timeout, and encrypted provider key.

Native endpoint rules:

  • The request uses an Odock virtual API key.
  • The requested model must exist in the organisation.
  • The virtual API key must have access to the model.
  • The model's provider family must match the native endpoint.
  • Provider-specific payloads are preserved where the endpoint supports passthrough.
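The rules above can be read as an ordered series of checks. The following is a minimal sketch of that reading, not Odock's implementation: the `ModelConfig` and `VirtualKey` records, the function name, and every rejection label except `provider_not_allowed` (the code named under Native Error Behavior) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelConfig:
    """Hypothetical record for a model configured in an organisation."""
    name: str
    provider_family: str  # e.g. "openai", "anthropic", "gemini", "vllm"


@dataclass
class VirtualKey:
    """Hypothetical record for an Odock virtual API key."""
    organisation: str
    allowed_models: set[str] = field(default_factory=set)


def check_native_request(
    key: Optional[VirtualKey],
    org_models: dict[str, ModelConfig],
    model_name: str,
    endpoint_family: str,
) -> Optional[str]:
    """Return None when every rule passes, else a short rejection label."""
    if key is None:
        return "missing_virtual_key"        # illustrative label
    model = org_models.get(model_name)
    if model is None:
        return "model_not_found"            # illustrative label
    if model_name not in key.allowed_models:
        return "model_not_allowed_for_key"  # illustrative label
    if model.provider_family != endpoint_family:
        return "provider_not_allowed"       # code named on this page
    return None
```

For example, a model configured for the Anthropic family passes on the Anthropic endpoint and is rejected with `provider_not_allowed` when sent to an OpenAI-compatible endpoint.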

Endpoints by Provider

The tables below, grouped by provider family, list the currently available or compatible endpoint shapes. They are examples of supported native integrations, and the list can grow as new providers are added.

OpenAI

| Method | Endpoint | Use for | SDK base URL | Streaming |
| --- | --- | --- | --- | --- |
| POST | /v1/chat/completions | Chat Completions | https://api.odock.ai/v1 | Yes |
| POST | /v1/responses | Responses API | https://api.odock.ai/v1 | Yes |
| POST | /v1/embeddings | Embeddings | https://api.odock.ai/v1 | No |
| POST | /v1/images/generations | Image generation | https://api.odock.ai/v1 | No |
| POST | /v1/images/edits | Image editing multipart requests | https://api.odock.ai/v1 | No |
| POST | /v1/images/variations | Image variation multipart requests | https://api.odock.ai/v1 | No |

OpenAI-compatible endpoints are the best fit when you want the smallest migration. Set the OpenAI SDK base_url to Odock and replace the upstream provider key with the Odock virtual API key.

Anthropic

| Method | Endpoint | Use for | SDK base URL | Streaming |
| --- | --- | --- | --- | --- |
| POST | /v1/messages | Anthropic Messages | https://api.odock.ai | Yes |

The Anthropic SDK appends /v1/messages itself, so its base URL should not include /v1.

Gemini

| Method | Endpoint | Use for | Client base URL | Streaming |
| --- | --- | --- | --- | --- |
| POST | /v1beta/models/{model}:generateContent | Gemini content generation | https://api.odock.ai | No |
| POST | /v1beta/models/{model}:streamGenerateContent | Gemini streaming generation | https://api.odock.ai | Yes |

Gemini-compatible callers can authenticate with Authorization: Bearer ..., x-goog-api-key, or the ?key= query parameter. Headers are preferred for server-side applications.
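The three authentication options can be compared side by side with a small request builder. This is a sketch using only the standard library. The function name and the `auth` argument are illustrative assumptions, and the request is built but never sent.

```python
import json
import urllib.parse
import urllib.request

GATEWAY_ROOT = "https://api.odock.ai"


def gemini_request(api_key: str, model: str, prompt: str,
                   auth: str = "bearer") -> urllib.request.Request:
    """Build (but do not send) a generateContent request with the chosen auth style."""
    url = f"{GATEWAY_ROOT}/v1beta/models/{model}:generateContent"
    headers = {"Content-Type": "application/json"}
    if auth == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    elif auth == "header":
        headers["x-goog-api-key"] = api_key
    elif auth == "query":
        # The key ends up in the URL, so it can leak into logs and proxies.
        url += "?" + urllib.parse.urlencode({"key": api_key})
    else:
        raise ValueError(f"unknown auth style: {auth}")
    body = json.dumps(
        {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    ).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen` would send it. The query-parameter form puts the key in the URL, which is one reason to prefer headers on the server side.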

vLLM

| Method | Endpoint | Use for | Client base URL | Streaming |
| --- | --- | --- | --- | --- |
| GET | /v1/vllm/models | vLLM model listing | https://api.odock.ai | No |
| POST | /v1/vllm/chat/completions | vLLM chat completions | https://api.odock.ai | Yes |
| POST | /v1/vllm/completions | vLLM completions | https://api.odock.ai | Yes |
| POST | /v1/vllm/responses | vLLM responses | https://api.odock.ai | Yes |
| POST | /v1/vllm/embeddings | vLLM embeddings | https://api.odock.ai | No |
| POST | /v1/vllm/audio/transcriptions | vLLM audio transcriptions | https://api.odock.ai | No |
| POST | /v1/vllm/audio/translations | vLLM audio translations | https://api.odock.ai | No |
| POST | /v1/vllm/tokenize | vLLM tokenize | https://api.odock.ai | No |
| POST | /v1/vllm/detokenize | vLLM detokenize | https://api.odock.ai | No |
| POST | /v1/vllm/pooling | vLLM pooling | https://api.odock.ai | No |
| POST | /v1/vllm/classify | vLLM classify | https://api.odock.ai | No |
| POST | /v1/vllm/score | vLLM score | https://api.odock.ai | No |
| POST | /v1/vllm/rerank | vLLM rerank | https://api.odock.ai | No |

vLLM endpoints forward raw vLLM-compatible payloads after Odock resolves the model and governance context.
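Because the base URL differs by provider family (the OpenAI SDK expects the /v1 suffix, while the Anthropic, Gemini, and vLLM clients use the gateway root), it can help to keep that rule in one place. The helper below is a hypothetical convenience, not part of any Odock SDK. It only encodes the base URLs listed for each provider family above.

```python
GATEWAY_ROOT = "https://api.odock.ai"

# Families whose clients append their own versioned path to the gateway root.
ROOT_FAMILIES = {"anthropic", "gemini", "vllm"}


def sdk_base_url(family: str, gateway_root: str = GATEWAY_ROOT) -> str:
    """Return the base URL to configure for a provider family's client."""
    root = gateway_root.rstrip("/")
    if family == "openai":
        # The OpenAI SDK expects the /v1 prefix in its base_url.
        return f"{root}/v1"
    if family in ROOT_FAMILIES:
        return root
    raise ValueError(f"unknown provider family: {family}")
```

A call such as `sdk_base_url("anthropic")` returns the gateway root without /v1, which avoids building a doubled /v1/v1/messages path.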

OpenAI-Compatible Calls

# Chat Completions with the OpenAI Python SDK
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["ODOCK_API_KEY"],
    base_url=os.environ.get("ODOCK_BASE_URL", "https://api.odock.ai/v1"),
)

response = client.chat.completions.create(
    model=os.environ.get("ODOCK_MODEL", "gpt-4.1-mini"),
    messages=[{"role": "user", "content": "Write a short status update."}],
    temperature=0.3,
    max_tokens=120,
)

print(response.choices[0].message.content)

# Chat Completions with curl
curl "${ODOCK_BASE_URL:-https://api.odock.ai/v1}/chat/completions" \
  -H "Authorization: Bearer $ODOCK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"${ODOCK_MODEL:-gpt-4.1-mini}"'",
    "messages": [
      {"role": "user", "content": "Write a short status update."}
    ],
    "temperature": 0.3,
    "max_tokens": 120
  }'

# Responses API with the OpenAI Python SDK
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["ODOCK_API_KEY"],
    base_url=os.environ.get("ODOCK_BASE_URL", "https://api.odock.ai/v1"),
)

response = client.responses.create(
    model=os.environ.get("ODOCK_MODEL", "gpt-4.1-mini"),
    input=[{"role": "user", "content": "Summarize the gateway flow."}],
    temperature=0.3,
    max_output_tokens=160,
)

print(response.output_text)

# Embeddings with the OpenAI Python SDK
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["ODOCK_API_KEY"],
    base_url=os.environ.get("ODOCK_BASE_URL", "https://api.odock.ai/v1"),
)

response = client.embeddings.create(
    model=os.environ.get("ODOCK_EMBEDDING_MODEL", "text-embedding-3-small"),
    input="Odock records usage for model traffic.",
    encoding_format="float",
)

print(len(response.data[0].embedding))
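The chat completions endpoint is listed as streaming-capable. The sketch below shows one way to collect text from a streamed response. It assumes Odock passes through the OpenAI-style server-sent-event framing unchanged (`data:` lines carrying JSON chunks, ending with a `[DONE]` sentinel), which this page does not state explicitly. The function name is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # "delta" may be missing or null on some chunks.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

With an HTTP client that exposes the response body line by line, joining the yielded pieces rebuilds the full completion. The OpenAI SDK handles this framing itself when `stream=True` is passed, so the parser is mainly useful for raw HTTP callers.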

Anthropic-Compatible Calls

# Anthropic Messages with the Anthropic Python SDK
import os
from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ["ODOCK_API_KEY"],
    base_url=os.environ.get("ODOCK_GATEWAY_URL", "https://api.odock.ai"),
)

message = client.messages.create(
    model=os.environ.get("ODOCK_MODEL", "claude-sonnet-4-5"),
    max_tokens=160,
    temperature=0.3,
    messages=[
        {"role": "user", "content": "Explain model access in Odock."}
    ],
)

print(message.content[0].text)

# Anthropic Messages with curl
curl "${ODOCK_GATEWAY_URL:-https://api.odock.ai}/v1/messages" \
  -H "Authorization: Bearer $ODOCK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"${ODOCK_MODEL:-claude-sonnet-4-5}"'",
    "max_tokens": 160,
    "temperature": 0.3,
    "messages": [
      {"role": "user", "content": "Explain model access in Odock."}
    ]
  }'

Gemini-Compatible Calls

# Gemini generateContent with the Google GenAI SDK
import os

from google import genai
from google.genai import types

client = genai.Client(
    api_key=os.environ["ODOCK_API_KEY"],
    http_options=types.HttpOptions(
        # The Google GenAI SDK appends /v1beta, so use the gateway root here.
        base_url=os.environ.get("ODOCK_GATEWAY_URL", "https://api.odock.ai"),
        api_version="v1beta",
    ),
)

response = client.models.generate_content(
    model=os.environ.get("ODOCK_MODEL", "gemini-2.5-flash"),
    contents="Hello from Gemini through my gateway",
)

print(response.text)

# Gemini generateContent over raw HTTP with httpx
import json
import os
from typing import Any

import httpx


BASE_URL = os.getenv("ODOCK_GATEWAY_URL", "https://api.odock.ai").rstrip("/")
API_KEY = os.environ["ODOCK_API_KEY"]
MODEL = os.getenv("ODOCK_MODEL", "gemini-2.5-flash")
PROMPT = os.getenv("ODOCK_PROMPT", "Explain provider credentials in Odock.")


def gemini_payload(prompt: str) -> dict[str, Any]:
    return {
        "contents": [
            {
                "role": "user",
                "parts": [{"text": prompt}],
            }
        ],
        "generationConfig": {
            "temperature": 0.3,
            "maxOutputTokens": 160,
        },
    }


headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

url = f"{BASE_URL}/v1beta/models/{MODEL}:generateContent"

with httpx.Client(timeout=30.0) as client:
    response = client.post(url, headers=headers, json=gemini_payload(PROMPT))

response.raise_for_status()
data = response.json()

text_parts: list[str] = []
for candidate in data.get("candidates", []):
    content = candidate.get("content", {})
    for part in content.get("parts", []):
        text = part.get("text")
        if text:
            text_parts.append(text)

print("".join(text_parts).strip() or "<no text>")

usage = data.get("usageMetadata")
if usage:
    print("usage:", json.dumps(usage))

# Gemini generateContent with curl
curl "${ODOCK_GATEWAY_URL:-https://api.odock.ai}/v1beta/models/${ODOCK_MODEL:-gemini-2.5-flash}:generateContent" \
  -H "Authorization: Bearer $ODOCK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Explain provider credentials in Odock."}]
      }
    ],
    "generationConfig": {
      "temperature": 0.3,
      "maxOutputTokens": 160
    }
  }'

# Gemini streamGenerateContent with curl (-N disables output buffering)
curl -N "${ODOCK_GATEWAY_URL:-https://api.odock.ai}/v1beta/models/${ODOCK_MODEL:-gemini-2.5-flash}:streamGenerateContent" \
  -H "Authorization: Bearer $ODOCK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Give two examples of gateway routing."}]
      }
    ],
    "generationConfig": {
      "temperature": 0.3,
      "maxOutputTokens": 160
    }
  }'

vLLM-Compatible Calls

# vLLM chat completions with curl
curl "${ODOCK_GATEWAY_URL:-https://api.odock.ai}/v1/vllm/chat/completions" \
  -H "Authorization: Bearer $ODOCK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"${ODOCK_MODEL:-llama-3.1-8b-instruct}"'",
    "messages": [
      {"role": "user", "content": "Explain quota enforcement."}
    ],
    "max_tokens": 160,
    "stream": false
  }'

# vLLM model listing with curl
curl "${ODOCK_GATEWAY_URL:-https://api.odock.ai}/v1/vllm/models" \
  -H "Authorization: Bearer $ODOCK_API_KEY"

# vLLM embeddings with curl
curl "${ODOCK_GATEWAY_URL:-https://api.odock.ai}/v1/vllm/embeddings" \
  -H "Authorization: Bearer $ODOCK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"${ODOCK_EMBEDDING_MODEL:-bge-small-en}"'",
    "input": "Embeddings traffic is governed by Odock."
  }'
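The other JSON-based vLLM routes follow the same pattern as the curl calls above. The sketch below builds such requests with the standard library only. The helper name is hypothetical, the path set is taken from the vLLM endpoints listed on this page, and the audio routes are left out on the assumption that they take multipart uploads rather than JSON. The request is built but not sent.

```python
import json
import urllib.request
from typing import Optional

GATEWAY_ROOT = "https://api.odock.ai"

# JSON POST routes under /v1/vllm, as listed on this page (audio routes omitted).
VLLM_JSON_PATHS = {
    "chat/completions", "completions", "responses", "embeddings",
    "tokenize", "detokenize", "pooling", "classify", "score", "rerank",
}


def vllm_request(api_key: str, path: str,
                 payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for a vLLM-compatible Odock route."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if path == "models":
        # Model listing is the only GET route and carries no body.
        return urllib.request.Request(
            f"{GATEWAY_ROOT}/v1/vllm/models", headers=headers, method="GET"
        )
    if path not in VLLM_JSON_PATHS:
        raise ValueError(f"not a JSON vLLM route: {path}")
    headers["Content-Type"] = "application/json"
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        f"{GATEWAY_ROOT}/v1/vllm/{path}", data=body, headers=headers, method="POST"
    )
```

A tokenize call might pass a payload such as `{"model": "llama-3.1-8b-instruct", "prompt": "Explain quota enforcement."}`. That payload shape follows vLLM's tokenizer API as commonly documented and should be treated as an assumption here.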

Native Error Behavior

Native endpoints can return three kinds of errors: gateway-controlled JSON errors, plain-text transport errors, and, once Odock has accepted the request, provider-native upstream errors. The most common native-specific case is 400 provider_not_allowed, which means the model is configured for a different provider family than the endpoint.

For the full format and status-code tables, see Gateway Errors.
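Since an error body may be JSON or plain text, client code benefits from handling both without assuming a schema. The following is a minimal sketch under that assumption. The function name and returned dictionary keys are illustrative, and the check for `provider_not_allowed` is a plain substring match because the exact JSON layout is documented in Gateway Errors, not on this page.

```python
import json


def describe_error(status: int, body: str) -> dict:
    """Classify an error response body as JSON or plain text."""
    try:
        parsed = json.loads(body)
    except ValueError:
        # Plain-text transport errors are not valid JSON.
        return {"status": status, "kind": "plain_text", "detail": body.strip()}
    # Substring match only: the field that carries the code is not assumed.
    mismatch = status == 400 and "provider_not_allowed" in body
    return {
        "status": status,
        "kind": "json",
        "provider_mismatch": mismatch,
        "detail": parsed,
    }
```

When `provider_mismatch` is true, the likely fix is to call the endpoint family that matches the model's configured provider rather than to retry the same request.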
