The `auto_cache` flag allows you to explicitly control the caching behavior for your requests on supported providers. This gives you finer-grained control over when a request's response should be cached or retrieved from the cache.
How Auto Cache Works
The `auto_cache` flag is a boolean parameter that can be sent within a custom `requesty` field in your request payload.
"auto_cache": true: This will instruct the router to attempt to cache the response from the provider. If a similar request has been cached previously, it might be served from the cache (depending on the provider’s caching strategy and TTL)."auto_cache": false: This will instruct the router to bypass any automatic caching logic for this specific request and always fetch a fresh response from the provider.- If
auto_cacheis not provided: The router falls back to a default caching behavior which can depend on the origin of the request (e.g., calls from Cline or Roo Code default to caching).
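As a sketch, the flag sits inside the `requesty` object of an otherwise ordinary chat completion request body (the model name here is a placeholder, not a value from these docs):

```json
{
  "model": "<provider/model>",
  "messages": [{ "role": "user", "content": "Hello" }],
  "requesty": { "auto_cache": true }
}
```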
How to Use Auto Cache
To use the `auto_cache` flag, include it within the `requesty` object in your request.
Example with Auto Cache
This example demonstrates how to set the `auto_cache` flag using the OpenAI Python client. The `requesty` field is passed as an additional parameter.
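A minimal sketch of the Python approach is below. The API key, base URL, and model name are placeholders (not values from these docs); the OpenAI Python client (v1+) forwards non-standard parameters through its `extra_body` argument, which is how the custom `requesty` field reaches the router.

```python
# Build the custom requesty field with the auto_cache flag enabled.
extra_body = {"requesty": {"auto_cache": True}}

# Hypothetical usage with the OpenAI Python client -- endpoint and model
# are placeholders you would replace with your own values:
#
# from openai import OpenAI
#
# client = OpenAI(
#     api_key="<REQUESTY_API_KEY>",   # placeholder
#     base_url="<ROUTER_BASE_URL>",   # placeholder router endpoint
# )
# response = client.chat.completions.create(
#     model="<provider/model>",       # e.g. an Anthropic or Gemini model
#     messages=[{"role": "user", "content": "Hello"}],
#     extra_body=extra_body,          # attaches the requesty field
# )

print(extra_body)
```

Setting `"auto_cache": False` in the same dictionary would instead force a fresh response from the provider.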
Important Notes
- Explicit Control: `auto_cache` provides explicit control: `true` attempts to cache, while `false` prevents caching on providers where cache writes incur extra costs.
- Default Behavior: If `auto_cache` is not specified in the `requesty` field, the caching behavior reverts to the defaults described above.
- Provider Support: This flag is respected by providers/models where cache writes incur extra costs, e.g. Anthropic and Gemini.