# Framework Examples
The Radicalbit AI Gateway is fully compatible with the OpenAI standard. This means you don't need to change your application code — just point your LLM client to the Gateway by updating three values:
| Parameter | Value |
|---|---|
| `base_url` | Your Gateway URL (e.g., `http://localhost:9000/v1`) |
| `api_key` | Your Gateway API key (generated from the UI) |
| `model` | `project-name/route-name`, i.e. the project and route defined in your `config.yaml` |
The examples below show how to do this with the most common Python frameworks.
## Gateway Configuration

All the framework examples on this page work with the same `config.yaml`. You only need to define one model and one route:
```yaml
chat_models:
  - model_id: gpt-5.1-assistant
    model: openai/gpt-5.1
    credentials:
      api_key: !secret OPENAI_API_KEY
    params:
      temperature: 0.7
      max_tokens: 500

routes:
  my-assistant:
    chat_models:
      - gpt-5.1-assistant
```
Routes are accessed using the format `project-name/route-name`. If your project is called `my-project` and your route is `my-assistant`, you pass `my-project/my-assistant` as the `model` parameter. The Gateway handles the rest.
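To sanity-check the configuration before wiring up a framework, you can list the available routes from the command line. This sketch assumes the Gateway also implements the OpenAI-standard `/v1/models` endpoint; if yours does not, skip straight to the chat examples below.

```bash
# List the routes the Gateway exposes (assumes the standard
# OpenAI-compatible /v1/models endpoint is available).
curl http://localhost:9000/v1/models \
  -H "Authorization: Bearer your-gateway-api-key"
```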
## Chat Completion
- OpenAI SDK
- LangChain
- LlamaIndex
- Instructor
- Haystack
- AutoGen
- LiteLLM
- cURL
The OpenAI Python SDK works out of the box. Pass `base_url` and `api_key` to the client constructor.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

response = client.chat.completions.create(
    model="my-project/my-assistant",  # project-name/route-name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)

print(response.choices[0].message.content)
```
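If your application is async, the same call works through the SDK's `AsyncOpenAI` client. A minimal sketch; nothing Gateway-specific changes:

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

async def main() -> None:
    response = await client.chat.completions.create(
        model="my-project/my-assistant",  # project-name/route-name
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```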
LangChain supports OpenAI-compatible endpoints via `ChatOpenAI`. Set `openai_api_base` and `openai_api_key` to redirect traffic through the Gateway.

```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(
    openai_api_base="http://localhost:9000/v1",
    openai_api_key="your-gateway-api-key",
    model_name="my-project/my-assistant",  # project-name/route-name
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?"),
]

response = llm.invoke(messages)
print(response.content)
```
LangChain chains and agents work the same way: just pass this `llm` instance wherever a language model is expected.

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])

chain = prompt | llm
response = chain.invoke({"input": "What is the capital of France?"})
print(response.content)
```
LlamaIndex supports custom OpenAI-compatible endpoints via the `OpenAI` LLM class.

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    api_base="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
    model="my-project/my-assistant",  # project-name/route-name
)

response = llm.complete("What is the capital of France?")
print(response.text)
```

If your LlamaIndex version validates model names against OpenAI's catalog and rejects the route name, the drop-in `OpenAILike` class (from the `llama-index-llms-openai-like` package) accepts arbitrary model strings.
For chat-style interactions:
```python
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="What is the capital of France?"),
]

response = llm.chat(messages)
print(response.message.content)
```
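To route every LlamaIndex component (query engines, chat engines, indexing) through the Gateway without passing `llm` around explicitly, you can set it as the global default. A minimal sketch:

```python
from llama_index.core import Settings

# Every LlamaIndex component that needs an LLM now uses
# the Gateway-backed model by default.
Settings.llm = llm
```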
Instructor adds structured output parsing on top of the OpenAI SDK. Patch the client after pointing it to the Gateway.
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(
    OpenAI(
        base_url="http://localhost:9000/v1",
        api_key="your-gateway-api-key",
    )
)

class Answer(BaseModel):
    capital: str
    country: str

response = client.chat.completions.create(
    model="my-project/my-assistant",  # project-name/route-name
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    response_model=Answer,
)

print(response.capital)  # Paris
print(response.country)  # France
```
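Because validation happens client-side, instructor can automatically re-ask the model when a response fails Pydantic validation. A sketch using its `max_retries` parameter, reusing the client and model from above:

```python
response = client.chat.completions.create(
    model="my-project/my-assistant",  # project-name/route-name
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    response_model=Answer,
    max_retries=2,  # re-ask the model if Pydantic validation fails
)
```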
Haystack supports OpenAI-compatible endpoints through `OpenAIChatGenerator`. Pass the Gateway URL as `api_base_url`; recent Haystack versions expect the API key wrapped in a `Secret`.

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_base_url="http://localhost:9000/v1",
    api_key=Secret.from_token("your-gateway-api-key"),
    model="my-project/my-assistant",  # project-name/route-name
)

messages = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user("What is the capital of France?"),
]

response = generator.run(messages=messages)
print(response["replies"][0].text)
```
AutoGen supports custom OpenAI-compatible endpoints through its `llm_config`. Any agent that uses `llm_config` will route through the Gateway automatically.

```python
import autogen

llm_config = {
    "config_list": [
        {
            "model": "my-project/my-assistant",  # project-name/route-name
            "api_key": "your-gateway-api-key",
            "base_url": "http://localhost:9000/v1",
        }
    ]
}

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
    code_execution_config=False,  # no local code execution needed here
)

user_proxy.initiate_chat(
    assistant,
    message="What is the capital of France?",
)
```
LiteLLM is a unified interface for multiple LLM providers. Use the `openai/` prefix and pass `api_base` to route through the Gateway.

```python
import litellm

response = litellm.completion(
    model="openai/my-project/my-assistant",  # openai/ prefix + project-name/route-name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    api_base="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

print(response.choices[0].message.content)
```
You can call the Gateway directly over HTTP without any SDK.
```bash
curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-gateway-api-key" \
  -d '{
    "model": "my-project/my-assistant",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
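Streaming works over raw HTTP too: set `"stream": true` and the Gateway returns standard OpenAI server-sent events. The `-N` flag stops curl from buffering the output.

```bash
curl -N http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-gateway-api-key" \
  -d '{
    "model": "my-project/my-assistant",
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": true
  }'
```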
## Streaming

All frameworks above support streaming. Here is an example using the OpenAI SDK; the pattern is identical across frameworks:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="your-gateway-api-key",
)

stream = client.chat.completions.create(
    model="my-project/my-assistant",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```
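For comparison, here is the same stream through LangChain, reusing the `ChatOpenAI` instance configured earlier; each chunk carries a content delta:

```python
# llm is the ChatOpenAI instance pointed at the Gateway above.
for chunk in llm.stream("Tell me a short story."):
    print(chunk.content, end="", flush=True)
```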
## Next Steps
- Basic Configuration — Set up your first route
- Advanced Configuration — Add guardrails, caching, and limits to your route
- Guardrails — Protect your application without changing application code