
# API Reference

The Radicalbit AI Gateway implements the OpenAI API specification — including Chat Completions, Embeddings, and the Responses API — making it compatible with existing OpenAI libraries and tools.

## Base URL

```
http://localhost:9000
```

## Authentication

### API Key Authentication

```bash
curl -H "Authorization: Bearer your-api-key" \
     -H "Content-Type: application/json" \
     http://localhost:9000/v1/chat/completions
```

The gateway accepts API keys via the `Authorization: Bearer` header (sent by OpenAI SDKs and compatible libraries) or the `X-Api-Key` header for direct API calls.
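Both header styles can be set on a plain `urllib.request.Request`. A minimal stdlib sketch (the URL, API key, and route name are placeholders; `urlopen` is deliberately not called, so no live gateway is needed):

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:9000/v1/chat/completions"
API_KEY = "your-api-key"  # placeholder

payload = json.dumps({
    "model": "project-name/route-name",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode("utf-8")

# Option 1: standard Bearer token, as sent by OpenAI SDKs
bearer_req = urllib.request.Request(
    GATEWAY_URL,
    data=payload,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Option 2: X-Api-Key header for direct API calls
apikey_req = urllib.request.Request(
    GATEWAY_URL,
    data=payload,
    headers={
        "X-Api-Key": API_KEY,
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(bearer_req) would actually send the request.
```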

## Endpoints

### Chat Completions

**Endpoint:** `POST /v1/chat/completions`

**Description:** Creates a completion for the provided chat messages.

**Request Body:**

```json
{
  "model": "project-name/route-name",
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 100
}
```

**Parameters:**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | The route in `project-name/route-name` format |
| `messages` | array | Yes | Array of message objects |
| `temperature` | number | No | Sampling temperature (0.0 to 2.0) |
| `max_tokens` | number | No | Maximum tokens to generate |
| `top_p` | number | No | Nucleus sampling parameter |
| `stop` | string/array | No | Stop sequences |
| `presence_penalty` | number | No | Presence penalty (-2.0 to 2.0) |
| `frequency_penalty` | number | No | Frequency penalty (-2.0 to 2.0) |

**Message Object:**

```json
{
  "role": "user|assistant|system",
  "content": "Message content"
}
```

**Response:**

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "project-name/route-name",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
```
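Given a response in this shape, extracting the reply and token usage is a few dictionary lookups. A stdlib-only sketch that parses the sample payload above (no live gateway required):

```python
import json

# Sample Chat Completions response in the shape shown above
raw = """
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "project-name/route-name",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! I'm doing well, thank you for asking."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 10, "completion_tokens": 8, "total_tokens": 18}
}
"""

response = json.loads(raw)
reply = response["choices"][0]["message"]["content"]
finish_reason = response["choices"][0]["finish_reason"]
total_tokens = response["usage"]["total_tokens"]
print(reply)  # → Hello! I'm doing well, thank you for asking.
```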

### Embeddings

**Endpoint:** `POST /v1/embeddings`

**Description:** Creates an embedding for the provided input.

**Request Body:**

```json
{
  "model": "project-name/route-name",
  "input": "The text to embed"
}
```

**Parameters:**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | The route in `project-name/route-name` format |
| `input` | string/array | Yes | Text to embed (string or array of strings) |

**Response:**

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.1, 0.2, 0.3, ...]
    }
  ],
  "model": "project-name/route-name",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}
```
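A common next step is comparing two returned vectors. A minimal cosine-similarity sketch; the vectors below are made-up stand-ins for `response["data"][i]["embedding"]`, not real model output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embedding vectors for illustration only
vec_a = [0.1, 0.2, 0.3]
vec_b = [0.1, 0.2, 0.3]
vec_c = [0.3, -0.2, 0.1]

print(cosine_similarity(vec_a, vec_b))  # identical vectors → ~1.0
print(cosine_similarity(vec_a, vec_c))  # different vectors → lower score
```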

### Responses API

**Endpoint:** `POST /v1/responses`

**Description:** Creates a response using the OpenAI Responses API format (stateless, Phase 1). The gateway translates the request to Chat Completions internally, so all gateway features (guardrails, caching, rate limiting) apply as usual.

> **Tip:** The `model` field works exactly as in Chat Completions: it maps to a route name from your `config.yaml`, not directly to an upstream model name.

> **Warning:** The gateway operates in stateless mode only. Setting `previous_response_id` will return an error. Multi-turn conversations must be managed client-side.

**Request Body:**

```json
{
  "model": "project-name/route-name",
  "input": "What is the capital of France?",
  "instructions": "You are a helpful assistant.",
  "stream": false,
  "temperature": 0.7,
  "max_output_tokens": 200
}
```

The `input` field also accepts a list of message objects:

```json
{
  "model": "project-name/route-name",
  "input": [
    {"role": "user", "content": "What is the capital of France?"}
  ]
}
```

**Parameters:**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | The route in `project-name/route-name` format |
| `input` | string or array | Yes | User prompt or list of message objects |
| `instructions` | string | No | System-level instructions (equivalent to a system message) |
| `stream` | boolean | No | Whether to stream the response |
| `temperature` | number | No | Sampling temperature |
| `top_p` | number | No | Nucleus sampling parameter |
| `max_output_tokens` | integer | No | Maximum output tokens (mapped to `max_completion_tokens`) |
| `tools` | array | No | Function tools to make available |
| `tool_choice` | string or object | No | Tool selection strategy |
| `parallel_tool_calls` | boolean | No | Allow parallel tool calls |
| `user` | string | No | End-user identifier forwarded to the upstream provider |
| `previous_response_id` | string | No | Not supported; returns an error |

**Response:**

```json
{
  "id": "resp_123",
  "object": "response",
  "created_at": 1677652288,
  "model": "project-name/route-name",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 10,
    "output_tokens": 8,
    "total_tokens": 18
  }
}
```
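Because the gateway is stateless, each request must carry the full prior context. A sketch of client-side multi-turn management; the `extract_text` helper and the hardcoded response dict are illustrative, not part of the gateway:

```python
# Conversation history kept by the client, since the gateway is stateless
history = [{"role": "user", "content": "What is the capital of France?"}]

# Hypothetical first-turn response in the Responses API shape shown above
response = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "The capital of France is Paris."}
            ],
        }
    ]
}

def extract_text(resp):
    """Concatenate all output_text parts from a Responses API payload."""
    parts = []
    for item in resp["output"]:
        if item["type"] == "message":
            for block in item["content"]:
                if block["type"] == "output_text":
                    parts.append(block["text"])
    return "".join(parts)

# Append the assistant turn, then the next user turn; `history` then
# becomes the `input` array of the next POST /v1/responses request.
history.append({"role": "assistant", "content": extract_text(response)})
history.append({"role": "user", "content": "And its population?"})
```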

## Error Responses

### Standard Error Format

```json
{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "route_not_found"
  }
}
```

### Common Error Codes

| Code | Description | Solution |
|------|-------------|----------|
| `route_not_found` | Route name not found in configuration | Check the route name in `config.yaml` |
| `model_not_found` | Model not found in route | Verify the model configuration |
| `guardrail_blocked` | Content blocked by a guardrail | Review the guardrail configuration |
| `rate_limit_exceeded` | Rate limit exceeded | Adjust the rate-limiting configuration |
| `token_limit_exceeded` | Token limit exceeded | Adjust the token-limiting configuration |
| `authentication_failed` | Invalid API key | Check the API key format and permissions |
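Clients can dispatch on the `code` field of the standard error format. A minimal sketch that classifies errors into retryable versus configuration problems; the classification policy is a suggestion, not gateway behavior:

```python
import json

# Codes that may succeed on retry once limits reset (a client-side policy choice)
RETRYABLE = {"rate_limit_exceeded", "token_limit_exceeded"}

def classify_error(body):
    """Return ('retry' | 'fix-config' | 'fix-request', message) for a gateway error body."""
    err = json.loads(body)["error"]
    code = err.get("code")
    if code in RETRYABLE:
        return "retry", err["message"]
    if code in {"route_not_found", "model_not_found", "authentication_failed"}:
        return "fix-config", err["message"]
    return "fix-request", err["message"]

sample = (
    '{"error": {"message": "Rate limit exceeded", '
    '"type": "invalid_request_error", "code": "rate_limit_exceeded"}}'
)
print(classify_error(sample))  # → ('retry', 'Rate limit exceeded')
```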