/v1/messages endpoint, you can place cache_control breakpoints on individual content blocks, exactly like the native Anthropic prompt caching API. This gives you fine-grained control over which parts of your request are cached.

View cache analytics in the Requesty Console.
cache_control can be placed on any content block type: system text, user text, images, documents, tool definitions, tool_use, and tool_result.
Python
TypeScript
Multi-turn with tool results
You set the cache breakpoint on the latest user message. Everything up to and including that message is cached and reused on the next turn, so only the new content below the previous breakpoint is processed. In agentic flows with tool calls, placecache_control on the last content block of each turn to cache the conversation prefix:
Best practices
- Place breakpoints on the first and last system prompt blocks to cache combinations of system instructions.
- Place a breakpoint on the last tool definition so your tool schema is cached.
- In multi-turn conversations, place a breakpoint on the last content block of each turn to incrementally cache the conversation history.
- The cached prefix must be at least 1,024 tokens for Anthropic (2,048 for Claude 3.5 Haiku). Content shorter than that will not be cached.