Caching
This page covers caching configuration and features in the Radicalbit AI Gateway.
Overview
Caching in the Radicalbit AI Gateway improves performance and reduces costs by storing responses for frequently repeated requests. The gateway supports two types of caching:
- Semantic Cache: Intelligent caching based on content similarity
- Exact Cache: Precise response caching to avoid redundancy
The gateway supports both Redis and in-memory caching backends.
Cache Types
Redis Caching
Persistent caching using Redis for distributed deployments:
cache:
redis_host: localhost
redis_port: 6379
In-Memory Caching
Fast caching using local memory for single-instance deployments can be enabled without the Redis setting host.
Route-Level Caching
Basic Caching Configuration
routes:
production:
chat_models:
- model_id: gpt-4o
model: openai/gpt-4o
caching:
enabled: true
ttl: 3600 # Cache for 1 hour
Advanced Caching Options
routes:
production:
chat_models:
- model_id: gpt-4o
model: openai/gpt-4o
caching:
enabled: true
ttl: 7200 # Cache for 2 hours
Cache Configuration
Global Cache Settings
cache:
redis_host: redis-server
redis_port: 6379
Cache Keys
Cache keys are automatically generated based on:
- Route name + Message content hash
Best Practices
TTL Configuration
- Short TTL (300-1800s): For dynamic content
- Medium TTL (1800-7200s): For semi-static content
- Long TTL (7200-86400s): For static content
Memory Management
- Set appropriate TTL values
Performance Optimization
- Enable caching for frequently used models
- Use Redis for distributed deployments
- Monitor cache hit rates
Monitoring
Cache Metrics
- Cache Hit Counter: Number of times the gateway hits the cache
Troubleshooting
Common Issues
- Cache Not Working: Verify cache is enabled and Redis is accessible
- High Memory Usage: Adjust TTL values and monitor cache size
- Redis Connection Issues: Check Redis server status and credentials
Next Steps
- Fallback - Set up automatic failover
- Monitoring - Set up observability and metrics
- API Reference - Complete API documentation