Skip to main content
Requesty normalizes the schema across models and providers through a single API. All inference endpoints use the base URL https://router.requesty.ai/v1.
Download the OpenAPI spec to use with your favorite API client or code generator: Inference API spec

Chat

Generate text completions and conversations using OpenAI Chat Completions, Anthropic Messages, or the Responses API.

Embedding

Create vector embeddings from text for semantic search, similarity matching, and retrieval-augmented generation.

Text to Speech

Convert text into natural-sounding spoken audio with supported TTS models.

Speech to Text

Transcribe audio files into text using speech recognition models.

Images

Generate and edit images using DALL-E, Stable Diffusion, and other image models.

Models

List all available models across providers, with pricing and capability metadata.
All three inference endpoints support web search. Pass the web_search tool and Requesty translates it to the correct provider format (Vertex, Azure, OpenAI, Anthropic, xAI, Perplexity).
EndpointTool definition
POST /v1/chat/completions"tools": [{ "type": "web_search" }]
POST /v1/responses"tools": [{ "type": "web_search" }]
POST /v1/messages"tools": [{ "type": "web_search_20250305", "name": "web_search", "max_uses": 5 }]
See the Web Search guide for full examples, response formats, and streaming details for each endpoint.
Last modified on June 8, 2026