Requesty’s auto caching stores long system prompts and other repeated content so that providers supporting prompt caching (Anthropic, Gemini) can reuse them, which reduces cost. Cache hits are billed at a fraction of the normal input token cost.
View cache analytics in the Requesty Console.

How Auto Cache Works

The auto_cache flag is a boolean parameter sent within the requesty field in your request payload.
Value         Behavior
true          Instructs the router to cache the response from the provider
false         Bypasses caching for this request (useful when cache writes have extra costs)
Not provided  Falls back to default behavior based on request origin (e.g., Cline and Roo Code default to caching)
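
The three states in the table above can be sketched as a small helper that builds the extra_body argument for the OpenAI SDK. The function name requesty_extra_body is hypothetical and not part of any Requesty library; only the requesty object and its auto_cache key come from this page.

```python
from typing import Optional


def requesty_extra_body(auto_cache: Optional[bool] = None) -> dict:
    """Build the extra_body dict for a request routed through Requesty.

    Hypothetical helper for illustration:
      True  -> ask the router to cache
      False -> bypass caching for this request
      None  -> omit the flag so the origin-based default applies
    """
    if auto_cache is None:
        # Leaving the flag out entirely is what "Not provided" means.
        return {}
    return {"requesty": {"auto_cache": auto_cache}}


print(requesty_extra_body(True))
print(requesty_extra_body(False))
print(requesty_extra_body())
```

The result can be passed directly as extra_body=requesty_extra_body(False) in a client.chat.completions.create call.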

How to Use Auto Cache

Include the auto_cache flag within the requesty object in your request:
import openai

client = openai.OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

system_prompt = "YOUR ENTIRE KNOWLEDGEBASE"

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-5",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    extra_body={
        "requesty": {
            "auto_cache": True
        }
    }
)
print(response.choices[0].message.content)
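
To check whether a request actually hit the cache, one option is to inspect the token usage on the response. The sketch below reads the prompt_tokens_details.cached_tokens field defined by the OpenAI SDK's usage object. Whether Requesty populates this field for every provider is an assumption, not something this page confirms, so the helper falls back to 0 when the field is missing. The helper name and the stand-in usage object are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def cached_prompt_tokens(usage: Any) -> int:
    """Return the number of prompt tokens served from cache, or 0 if unknown.

    Assumes the OpenAI-style layout usage.prompt_tokens_details.cached_tokens;
    any missing level is treated as "no cache information".
    """
    details = getattr(usage, "prompt_tokens_details", None)
    cached = getattr(details, "cached_tokens", None)
    return cached or 0


# Stand-in for response.usage so the example runs without a network call.
fake_usage = SimpleNamespace(
    prompt_tokens=12_000,
    prompt_tokens_details=SimpleNamespace(cached_tokens=11_500),
)
print(cached_prompt_tokens(fake_usage))
```

With a real response, the call would be cached_prompt_tokens(response.usage). The Requesty Console remains the reference for cache analytics.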

Important Notes

  1. Provider Support: The auto_cache flag is respected by providers where cache writes incur extra costs, including Anthropic and Gemini.
  2. Explicit Control: auto_cache gives you explicit control. Set it to true to attempt caching, or to false to prevent caching on providers where cache writes incur extra costs.
  3. Default Behavior: If auto_cache is not specified, caching falls back to defaults based on request origin.
  4. Cost Savings: Cache hits are billed at a fraction of the normal input token cost. This is especially effective for applications with large system prompts or knowledge bases.
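
The trade-off between cache write costs and cache hit savings can be estimated with a little arithmetic. The sketch below is illustrative only: the price and both multipliers are hypothetical placeholders, not figures quoted by Requesty or any provider, so substitute your provider's actual rates.

```python
def estimate_input_cost(
    uncached_tokens: int,
    cache_read_tokens: int,
    cache_write_tokens: int,
    price_per_mtok: float,
    read_multiplier: float,
    write_multiplier: float,
) -> float:
    """Estimate input cost when cached tokens are billed at a multiple of the base price."""
    billable = (
        uncached_tokens
        + cache_read_tokens * read_multiplier
        + cache_write_tokens * write_multiplier
    )
    return billable * price_per_mtok / 1_000_000


# Hypothetical numbers: 100k-token system prompt, 50-token user message.
PRICE = 3.0   # assumed base price per million input tokens
READ = 0.1    # assumed multiplier for cache hits
WRITE = 1.25  # assumed multiplier for cache writes

first_call = estimate_input_cost(50, 0, 100_000, PRICE, READ, WRITE)  # writes the cache
later_call = estimate_input_cost(50, 100_000, 0, PRICE, READ, WRITE)  # reads the cache
no_cache = estimate_input_cost(100_050, 0, 0, PRICE, READ, WRITE)     # caching bypassed

print(f"first call: {first_call:.4f}")
print(f"later call: {later_call:.4f}")
print(f"no cache:   {no_cache:.4f}")
```

Under these assumed rates the first call costs more than an uncached call because of the write surcharge, and every later hit costs far less. That is why setting auto_cache to false can make sense for prompts that are unlikely to be reused.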

Managed Caching

If you want Requesty to manage caching on your behalf, including custom TTL, cache warming, or advanced caching strategies, reach out to [email protected].
Last modified on May 26, 2026