# Framework Examples
The Radicalbit AI Gateway is fully compatible with the OpenAI standard. This means you don't need to change your application code — just point your LLM client to the Gateway by updating three values:
| Parameter | Value |
|---|---|
| `base_url` | Your Gateway URL (e.g., `http://localhost:9000/v1`) |
| `api_key` | Your Gateway API key (generated from the UI) |
| `model` | `project-name/route-name`, i.e. the project and route defined in your `config.yaml` |
The examples below show how to do this with the most common Python frameworks.
## Gateway Configuration

All the framework examples on this page work with the same `config.yaml`. You only need to define one model and one route:
```yaml
chat_models:
  - model_id: gpt-5.1-assistant
    model: openai/gpt-5.1
    credentials:
      api_key: !secret OPENAI_API_KEY
    params:
      temperature: 0.7
      max_tokens: 500

routes:
  my-assistant:
    chat_models:
      - gpt-5.1-assistant
```
Routes are accessed using the format `project-name/route-name`. If your project is called `my-project` and your route is `my-assistant`, you pass `my-project/my-assistant` as the `model` parameter. The Gateway handles the rest.
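To sanity-check the configuration before wiring up a framework, you can list the available routes from the command line. This sketch assumes the Gateway also implements the OpenAI-standard `/v1/models` endpoint; if yours does not, skip straight to the chat examples below.

```bash
# List the routes the Gateway exposes (assumes the standard
# OpenAI-compatible /v1/models endpoint is available).
curl http://localhost:9000/v1/models \
  -H "Authorization: Bearer your-gateway-api-key"
```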
## Chat Completion
- OpenAI SDK
- LangChain
- LlamaIndex
- Instructor
- Haystack
- AutoGen
- LiteLLM
- cURL
The OpenAI Python SDK works out of the box. Pass `base_url` and `api_key` to the client constructor.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

response = client.chat.completions.create(
    model="my-project/my-assistant",  # project-name/route-name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)

print(response.choices[0].message.content)
```
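If your application is async, the same call works through the SDK's `AsyncOpenAI` client. A minimal sketch; nothing Gateway-specific changes:

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

async def main() -> None:
    response = await client.chat.completions.create(
        model="my-project/my-assistant",  # project-name/route-name
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```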
LangChain supports OpenAI-compatible endpoints via `ChatOpenAI`. Set `openai_api_base` and `openai_api_key` to redirect traffic through the Gateway.

```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(
    openai_api_base="http://localhost:9000/v1",
    openai_api_key="your-gateway-api-key",
    model_name="my-project/my-assistant",  # project-name/route-name
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?"),
]

response = llm.invoke(messages)
print(response.content)
```
LangChain chains and agents work the same way: just pass this `llm` instance wherever a language model is expected.

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])

chain = prompt | llm
response = chain.invoke({"input": "What is the capital of France?"})
print(response.content)
```
LlamaIndex supports custom OpenAI-compatible endpoints via the `OpenAI` LLM class.

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    api_base="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
    model="my-project/my-assistant",  # project-name/route-name
)

response = llm.complete("What is the capital of France?")
print(response.text)
```

If your LlamaIndex version validates model names against OpenAI's catalog and rejects the route name, the drop-in `OpenAILike` class (from the `llama-index-llms-openai-like` package) accepts arbitrary model strings.
For chat-style interactions:
```python
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="What is the capital of France?"),
]

response = llm.chat(messages)
print(response.message.content)
```
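To route every LlamaIndex component (query engines, chat engines, indexing) through the Gateway without passing `llm` around explicitly, you can set it as the global default. A minimal sketch:

```python
from llama_index.core import Settings

# Every LlamaIndex component that needs an LLM now uses
# the Gateway-backed model by default.
Settings.llm = llm
```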
Instructor adds structured output parsing on top of the OpenAI SDK. Patch the client after pointing it to the Gateway.
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(
    OpenAI(
        base_url="http://localhost:9000/v1",
        api_key="your-gateway-api-key",
    )
)

class Answer(BaseModel):
    capital: str
    country: str

response = client.chat.completions.create(
    model="my-project/my-assistant",  # project-name/route-name
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    response_model=Answer,
)

print(response.capital)  # Paris
print(response.country)  # France
```
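Because validation happens client-side, instructor can automatically re-ask the model when a response fails Pydantic validation. A sketch using its `max_retries` parameter, reusing the client and model from above:

```python
response = client.chat.completions.create(
    model="my-project/my-assistant",  # project-name/route-name
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    response_model=Answer,
    max_retries=2,  # re-ask the model if Pydantic validation fails
)
```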
Haystack supports OpenAI-compatible endpoints through `OpenAIChatGenerator`. Pass the Gateway URL as `api_base_url`; recent Haystack versions expect the API key wrapped in a `Secret`.

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_base_url="http://localhost:9000/v1",
    api_key=Secret.from_token("your-gateway-api-key"),
    model="my-project/my-assistant",  # project-name/route-name
)

messages = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user("What is the capital of France?"),
]

response = generator.run(messages=messages)
print(response["replies"][0].text)
```
AutoGen supports custom OpenAI-compatible endpoints through its `llm_config`. Any agent that uses `llm_config` will route through the Gateway automatically.

```python
import autogen

llm_config = {
    "config_list": [
        {
            "model": "my-project/my-assistant",  # project-name/route-name
            "api_key": "your-gateway-api-key",
            "base_url": "http://localhost:9000/v1",
        }
    ]
}

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
    code_execution_config=False,  # no local code execution needed here
)

user_proxy.initiate_chat(
    assistant,
    message="What is the capital of France?",
)
```
LiteLLM is a unified interface for multiple LLM providers. Use the `openai/` prefix and pass `api_base` to route through the Gateway.

```python
import litellm

response = litellm.completion(
    model="openai/my-project/my-assistant",  # openai/ prefix + project-name/route-name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    api_base="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

print(response.choices[0].message.content)
```
You can call the Gateway directly over HTTP without any SDK.
```bash
curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-gateway-api-key" \
  -d '{
    "model": "my-project/my-assistant",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
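Streaming works over raw HTTP too: set `"stream": true` and the Gateway returns standard OpenAI server-sent events. The `-N` flag stops curl from buffering the output.

```bash
curl -N http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-gateway-api-key" \
  -d '{
    "model": "my-project/my-assistant",
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": true
  }'
```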
## Streaming

All frameworks above support streaming. Here is an example using the OpenAI SDK; the pattern is identical across frameworks:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

stream = client.chat.completions.create(
    model="my-project/my-assistant",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```
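For comparison, here is the same stream through LangChain, reusing the `ChatOpenAI` instance configured earlier; each chunk carries a content delta:

```python
# llm is the ChatOpenAI instance pointed at the Gateway above.
for chunk in llm.stream("Tell me a short story."):
    print(chunk.content, end="", flush=True)
```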
## Next Steps
- Basic Configuration — Set up your first route
- Advanced Configuration — Add guardrails, caching, and limits to your route
- Guardrails — Protect your application without changing application code