Fallback Configuration

This page covers fallback configuration for the Radicalbit AI Gateway based on the actual implementation.

Overview

Fallback in the Radicalbit AI Gateway provides automatic failover when primary models fail or become unavailable. The system supports both chat and embedding model fallbacks with configurable target and fallbacks lists.

With the new configuration structure:

Models are defined at top-level (chat_models, embedding_models)
Routes reference models by ID (strings)
Fallback chains always reference model IDs

Fallback Structure

Basic Fallback Configuration (Chat)

chat_models:
  - model_id: openai-4o
    model: openai/gpt-4o

  - model_id: llama3
    model: openai/llama3.2:3b

routes:
  production:
    chat_models:
      - openai-4o
      - llama3
    fallback:
      - target: openai-4o
        fallbacks:
          - llama3

Fallback Fields

Required Fields

target: The primary model ID that will trigger fallback
fallbacks: List of model IDs to use as fallbacks

Optional Fields

type: The fallback type. Use embedding for embedding fallbacks.
If omitted, the fallback is treated as chat.

In other words: chat fallback chains do not need type, while embedding fallback chains should set type: embedding.

Fallback Types

Chat Model Fallback

Used for conversational AI and text generation:

routes:
  production:
    chat_models:
      - gpt-4o
      - gpt-4o-mini
      - claude-3-sonnet
    fallback:
      - target: gpt-4o
        fallbacks:
          - gpt-4o-mini
          - claude-3-sonnet

Embedding Model Fallback

Used for text embeddings and vector operations:

routes:
  production:
    embedding_models:
      - openai-embedding
      - local-embedding
    fallback:
      - target: openai-embedding
        fallbacks:
          - local-embedding
        type: embedding

Configuration Examples

Single Fallback

chat_models:
  - model_id: gpt-4o
    model: openai/gpt-4o
  - model_id: gpt-4o-mini
    model: openai/gpt-4o-mini

routes:
  production:
    chat_models:
      - gpt-4o
      - gpt-4o-mini
    fallback:
      - target: gpt-4o
        fallbacks:
          - gpt-4o-mini

Multiple Fallbacks

chat_models:
  - model_id: gpt-4o
    model: openai/gpt-4o
  - model_id: gpt-4o-mini
    model: openai/gpt-4o-mini
  - model_id: claude-3-sonnet
    model: anthropic/claude-3-sonnet

routes:
  production:
    chat_models:
      - gpt-4o
      - gpt-4o-mini
      - claude-3-sonnet
    fallback:
      - target: gpt-4o
        fallbacks:
          - gpt-4o-mini
          - claude-3-sonnet

Mixed Model Types (Chat + Embedding)

chat_models:
  - model_id: gpt-4o
    model: openai/gpt-4o
  - model_id: gpt-4o-mini
    model: openai/gpt-4o-mini

embedding_models:
  - model_id: openai-embedding
    model: openai/text-embedding-3-small
  - model_id: local-embedding
    model: openai/nomic-embed-text

routes:
  production:
    chat_models:
      - gpt-4o
      - gpt-4o-mini
    embedding_models:
      - openai-embedding
      - local-embedding
    fallback:
      - target: gpt-4o
        fallbacks:
          - gpt-4o-mini
      - target: openai-embedding
        fallbacks:
          - local-embedding
        type: embedding

Fallback Behavior

Automatic Failover

When a target model fails:

The system automatically switches to the first fallback model
If the first fallback fails, it tries the next one
This continues until a model succeeds or all fallbacks are exhausted

Fallback Order

Fallbacks are tried in the order specified:

fallback:
  - target: gpt-4o
    fallbacks:
      - gpt-4o-mini       # First fallback
      - claude-3-sonnet   # Second fallback
      - local-model       # Third fallback

Validation Rules

Model Validation

The target model ID must exist in the route model lists (chat_models or embedding_models)
All fallbacks model IDs must exist in the same route model list
Fallback models must be of the same type as the target (chat vs embedding)

Type Validation

Chat model fallbacks can only use chat models
Embedding model fallbacks can only use embedding models
type: embedding should be specified for embedding fallbacks

Production Setup Example

chat_models:
  - model_id: gpt-4o
    model: openai/gpt-4o
    credentials:
      api_key: !secret OPENAI_API_KEY

  - model_id: gpt-4o-mini
    model: openai/gpt-4o-mini
    credentials:
      api_key: !secret OPENAI_API_KEY

  - model_id: claude-3-sonnet
    model: anthropic/claude-3-sonnet
    credentials:
      api_key: !secret ANTHROPIC_API_KEY

routes:
  production:
    chat_models:
      - gpt-4o
      - gpt-4o-mini
      - claude-3-sonnet
    fallback:
      - target: gpt-4o
        fallbacks:
          - gpt-4o-mini
          - claude-3-sonnet

Best Practices

Fallback Strategy

Use models with similar capabilities as fallbacks
Consider cost implications of fallback models
Test fallback behavior in non-production environments

Model Selection

Choose reliable fallback models
Ensure fallback models have appropriate capacity
Monitor fallback usage and performance

Configuration

Use descriptive model IDs for clarity
Keep fallback chains short and intentional
Test fallback scenarios regularly

Troubleshooting

Common Issues

Target Model Not Found: Ensure target exists in the route model list
Fallback Model Not Found: Verify all fallback IDs exist in the route model list
Type Mismatch: Ensure fallback IDs match the target type (chat vs embedding)

Debug Configuration

routes:
  debug:
    chat_models:
      - debug-model
    fallback:
      - target: debug-model
        fallbacks:
          - debug-fallback

Monitoring

Fallback Metrics

Track fallback activation frequency
Monitor fallback model performance
Measure fallback success rates

Alerting

Set up alerts for frequent fallback usage
Monitor fallback model availability
Track fallback response times

Next Steps

Model Configuration - Detailed model setup guide
Advanced Configuration - Enterprise configuration options
Monitoring - Observability and metrics setup

Overview​

Fallback Structure​

Basic Fallback Configuration (Chat)​

Fallback Fields​

Required Fields​

Optional Fields​

Fallback Types​

Chat Model Fallback​

Embedding Model Fallback​

Configuration Examples​

Single Fallback​

Multiple Fallbacks​

Mixed Model Types (Chat + Embedding)​

Fallback Behavior​

Automatic Failover​

Fallback Order​

Validation Rules​

Model Validation​

Type Validation​

Production Setup Example​

Best Practices​

Fallback Strategy​

Model Selection​

Configuration​

Troubleshooting​

Common Issues​

Debug Configuration​

Monitoring​

Fallback Metrics​

Alerting​

Next Steps​