Documentation Index
Fetch the complete documentation index at: https://docs.requesty.ai/llms.txt
Use this file to discover all available pages before exploring further.
Switch from OpenAI in 2 lines
If you're already using the OpenAI SDK, point it at Requesty and you're done. No SDK changes, no new client to learn.
Three steps to your first request
Get your API key
Sign up at app.requesty.ai and create a key on the API Keys page. New accounts include free credits to start routing immediately.
Export it so the snippets below just work:
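For example (the variable name and placeholder value here are illustrative; use whatever convention your snippets expect):

```shell
# Store your Requesty API key in an environment variable so the
# snippets below can read it. Replace the placeholder with the key
# you created in the dashboard.
export REQUESTY_API_KEY="your-api-key-here"
```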
Make your first request
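With the OpenAI SDK, the only changes are `base_url` and `api_key`. To keep this sketch dependency-free it uses Python's standard library instead; the endpoint URL and model slug shown are assumptions, so confirm both in the API Reference and model list before shipping:

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint; confirm in the API Reference.
URL = "https://router.requesty.ai/v1/chat/completions"

payload = {
    "model": "openai/gpt-4o-mini",  # hypothetical slug; swap to any routed model
    "messages": [{"role": "user", "content": "Say hello"}],
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('REQUESTY_API_KEY', '')}",
        "Content-Type": "application/json",
    },
)

# Only hit the network when a key is actually configured.
if os.environ.get("REQUESTY_API_KEY"):
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```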
Expected response
Requesty returns an OpenAI-compatible response with a few extra response headers so you can see which provider served the request and whether it hit cache.
Response body
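The exact body depends on the model, but an OpenAI-compatible chat completion generally has this shape (all values here are illustrative):

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "openai/gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello!" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12 }
}
```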
Response headers
The HTTP-Referer and X-Title headers are optional but recommended. Requesty uses them to tag your requests in analytics: HTTP-Referer identifies your site URL and X-Title gives your app a human-readable name. Both appear in your analytics dashboards so you can filter traffic by origin.
Make it production-ready (bonus)
Two upgrades turn this from a toy into something you can ship. Neither requires new infra.
Add metadata so every request is attributable in analytics. Tag by feature, user, or trace ID to slice spend and latency the way you already think about your product.
Route to a policy instead of a single model. Create a Fallback Policy once, then reference it by name. If the primary model errors or times out, Requesty tries the next one. No retry logic in your app.
Learn more: Request Metadata · Fallback Policies · Load Balancing.
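A quick win on the attribution front is the pair of analytics headers described earlier. A minimal sketch of the two headers to merge into your request headers; the values are examples, not required formats:

```python
# Optional analytics headers: HTTP-Referer tags your site URL,
# X-Title gives your app a human-readable name. Values are examples.
analytics_headers = {
    "HTTP-Referer": "https://example.com",
    "X-Title": "My Example App",
}
```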
Pick a model
Every model lives behind one endpoint. Swap the model field in the request to switch providers. No other code changes, no new SDK, no new auth.
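To make "swap the model, change nothing else" concrete, here is a tiny helper; the model slugs used are hypothetical examples, not confirmed identifiers:

```python
def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload; only `model` varies per provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Hypothetical slugs: switching providers is a one-string change.
a = chat_payload("anthropic/claude-3-5-haiku", "Hi")
b = chat_payload("openai/gpt-4o-mini", "Hi")
assert a["messages"] == b["messages"]  # everything but the model is identical
```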
Frontier
Claude Opus, GPT-5, Gemini 2.5 Pro. Maximum capability for hard tasks.
Fast & cheap
Haiku, GPT-5 mini, Gemini Flash. Sub-second latency, pennies per million tokens.
Open source
Llama, Qwen, DeepSeek. Hosted or bring-your-own endpoints.
What Requesty adds on top
Fallback routing
Auto-reroute failed requests to backup models. No more 5xx surprises.
Auto-caching
Cut costs up to 80% on repeated prompts with zero configuration.
Usage analytics
Track spend, latency, and errors per key, user, model, or project.
Load balancing
Distribute traffic across providers by cost, latency, or custom weights.
Bring your own keys
Use your own provider accounts and keep existing pricing.
Guardrails & RBAC
Content filtering, approved-model lists, and role-based access.
Use your favorite framework
LangChain
Vercel AI SDK
LlamaIndex
Haystack
Pydantic AI
Axios
Requests
Anthropic SDK
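As one concrete example from this list, requests in the Anthropic Messages format go through the endpoint covered in the FAQ below. A standard-library sketch; the auth scheme and model slug are assumptions, so check the Anthropic Agent SDKs guide for the exact setup:

```python
import json
import os
import urllib.request

# Endpoint from the FAQ below; auth header and model slug are assumptions.
URL = "https://router.requesty.ai/anthropic/v1/messages"

payload = {
    "model": "claude-3-5-haiku",  # hypothetical slug
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Say hello"}],
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('REQUESTY_API_KEY', '')}",
        "Content-Type": "application/json",
    },
)

# Only hit the network when a key is actually configured.
if os.environ.get("REQUESTY_API_KEY"):
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["content"][0]["text"])
```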
Common questions
How is Requesty different from calling providers directly?
Direct provider calls give you one model, one auth method, one point of failure. Requesty gives you one endpoint across 300+ models, automatic fallback, shared caching, unified analytics, and one bill. For teams running production AI, that's the difference between a side project and an SLA.
Do I pay a markup on tokens?
No. You pay provider prices, and you can see exact per-request costs in the analytics dashboard. Use Bring Your Own Keys to keep provider discounts and committed-use pricing.
Is my data used for training?
No. Requesty is a pass-through gateway. We don't train on your requests or responses. See our data handling and EU routing for residency options.
Which Anthropic SDK endpoint do I use?
Point the Anthropic SDK at https://router.requesty.ai/anthropic/v1/messages. See the Anthropic Agent SDKs guide for code samples.
Next steps
API Reference
Full endpoint documentation with an interactive playground.
Use with Claude Code
Point Claude Code at Requesty for unified billing and routing.
Configure routing
Set up fallbacks, load balancing, and latency-aware routing.
Join the community
Ask questions, share builds, and meet the team on Discord.