Fallback Configuration
This page covers fallback configuration for the Radicalbit AI Gateway based on the actual implementation.
Overview
Fallback in the Radicalbit AI Gateway provides automatic failover when primary models fail or become unavailable. The system supports both chat and embedding model fallbacks with configurable target and fallbacks lists.
With the new configuration structure:
- Models are defined at top-level (
chat_models,embedding_models) - Routes reference models by ID (strings)
- Fallback chains always reference model IDs
Fallback Structure
Basic Fallback Configuration (Chat)
chat_models:
- model_id: openai-4o
model: openai/gpt-4o
- model_id: llama3
model: openai/llama3.2:3b
routes:
production:
chat_models:
- openai-4o
- llama3
fallback:
- target: openai-4o
fallbacks:
- llama3
Fallback Fields
Required Fields
target: The primary model ID that will trigger fallbackfallbacks: List of model IDs to use as fallbacks
Optional Fields
type: The fallback type. Useembeddingfor embedding fallbacks.
If omitted, the fallback is treated as chat.
In other words: chat fallback chains do not need
type, while embedding fallback chains should settype: embedding.
Fallback Types
Chat Model Fallback
Used for conversational AI and text generation:
routes:
production:
chat_models:
- gpt-4o
- gpt-4o-mini
- claude-3-sonnet
fallback:
- target: gpt-4o
fallbacks:
- gpt-4o-mini
- claude-3-sonnet
Embedding Model Fallback
Used for text embeddings and vector operations:
routes:
production:
embedding_models:
- openai-embedding
- local-embedding
fallback:
- target: openai-embedding
fallbacks:
- local-embedding
type: embedding
Configuration Examples
Single Fallback
chat_models:
- model_id: gpt-4o
model: openai/gpt-4o
- model_id: gpt-4o-mini
model: openai/gpt-4o-mini
routes:
production:
chat_models:
- gpt-4o
- gpt-4o-mini
fallback:
- target: gpt-4o
fallbacks:
- gpt-4o-mini
Multiple Fallbacks
chat_models:
- model_id: gpt-4o
model: openai/gpt-4o
- model_id: gpt-4o-mini
model: openai/gpt-4o-mini
- model_id: claude-3-sonnet
model: anthropic/claude-3-sonnet
routes:
production:
chat_models:
- gpt-4o
- gpt-4o-mini
- claude-3-sonnet
fallback:
- target: gpt-4o
fallbacks:
- gpt-4o-mini
- claude-3-sonnet
Mixed Model Types (Chat + Embedding)
chat_models:
- model_id: gpt-4o
model: openai/gpt-4o
- model_id: gpt-4o-mini
model: openai/gpt-4o-mini
embedding_models:
- model_id: openai-embedding
model: openai/text-embedding-3-small
- model_id: local-embedding
model: openai/nomic-embed-text
routes:
production:
chat_models:
- gpt-4o
- gpt-4o-mini
embedding_models:
- openai-embedding
- local-embedding
fallback:
- target: gpt-4o
fallbacks:
- gpt-4o-mini
- target: openai-embedding
fallbacks:
- local-embedding
type: embedding
Fallback Behavior
Automatic Failover
When a target model fails:
- The system automatically switches to the first fallback model
- If the first fallback fails, it tries the next one
- This continues until a model succeeds or all fallbacks are exhausted
Fallback Order
Fallbacks are tried in the order specified:
fallback:
- target: gpt-4o
fallbacks:
- gpt-4o-mini # First fallback
- claude-3-sonnet # Second fallback
- local-model # Third fallback
Validation Rules
Model Validation
- The
targetmodel ID must exist in the route model lists (chat_modelsorembedding_models) - All
fallbacksmodel IDs must exist in the same route model list - Fallback models must be of the same type as the target (chat vs embedding)
Type Validation
- Chat model fallbacks can only use chat models
- Embedding model fallbacks can only use embedding models
type: embeddingshould be specified for embedding fallbacks
Production Setup Example
chat_models:
- model_id: gpt-4o
model: openai/gpt-4o
credentials:
api_key: !secret OPENAI_API_KEY
- model_id: gpt-4o-mini
model: openai/gpt-4o-mini
credentials:
api_key: !secret OPENAI_API_KEY
- model_id: claude-3-sonnet
model: anthropic/claude-3-sonnet
credentials:
api_key: !secret ANTHROPIC_API_KEY
routes:
production:
chat_models:
- gpt-4o
- gpt-4o-mini
- claude-3-sonnet
fallback:
- target: gpt-4o
fallbacks:
- gpt-4o-mini
- claude-3-sonnet
Best Practices
Fallback Strategy
- Use models with similar capabilities as fallbacks
- Consider cost implications of fallback models
- Test fallback behavior in non-production environments
Model Selection
- Choose reliable fallback models
- Ensure fallback models have appropriate capacity
- Monitor fallback usage and performance
Configuration
- Use descriptive model IDs for clarity
- Keep fallback chains short and intentional
- Test fallback scenarios regularly
Troubleshooting
Common Issues
- Target Model Not Found: Ensure
targetexists in the route model list - Fallback Model Not Found: Verify all fallback IDs exist in the route model list
- Type Mismatch: Ensure fallback IDs match the
targettype (chat vs embedding)
Debug Configuration
routes:
debug:
chat_models:
- debug-model
fallback:
- target: debug-model
fallbacks:
- debug-fallback
Monitoring
Fallback Metrics
- Track fallback activation frequency
- Monitor fallback model performance
- Measure fallback success rates
Alerting
- Set up alerts for frequent fallback usage
- Monitor fallback model availability
- Track fallback response times
Next Steps
- Model Configuration - Detailed model setup guide
- Advanced Configuration - Enterprise configuration options
- Monitoring - Observability and metrics setup