Build AI workflows in n8n with 300+ models through Requesty. Install the Requesty Chat Model community node (
@requesty/n8n-nodes-requesty) and drop it into any AI Agent or Basic LLM Chain. Get strict JSON Schema structured output, native web search, reasoning effort control, and routing policies, all behind one API key.Set up n8n →Use GitHub Copilot Chat in VS Code with 300+ models through Requesty. Add Requesty as a Bring Your Own Key (BYOK) Custom Endpoint provider to get model routing, cost tracking, and fallback policies for Copilot Chat, tools, and MCP servers. Requires VS Code 1.122+.Set up GitHub Copilot →
A comprehensive reference for every router error code — what it means, where it originates, and how to fix it. Plus, you can now group by
status_code in Advanced Analytics to track error trends and costs in your dashboards.Error Codes docs →Pick an exact start and end date for any analytics view. Use relative presets or switch to Date Range mode for full control over the time window.Usage Analytics docs →
Inspect every LLM request in a searchable table with click-to-filter, configurable columns, and pagination. Switch to Traces mode to see multi-step agent runs grouped by
trace_id. Click any row to open a detail panel with the full message timeline, tool call arguments, metadata, guardrail violations, and a model arena for side-by-side comparison.Logs & Traces docs →Download your advanced analytics data as a professional PDF report or CSV spreadsheet. The PDF includes your Requesty logo, date range, and a formatted data table, perfect for sharing cost reports with your team or keeping monthly records.Analytics Exports docs →
Use OpenAI Codex with 300+ models through Requesty. Get model routing, cost tracking, and fallback policies for your Codex coding agent.Set up Codex →
The Get API Key and List API Keys endpoints now include a
group field showing which group each key belongs to, making it easier to manage keys programmatically.API Keys reference →Route requests through the EU endpoint with the newest Gemini models:
vertex/gemini-3.5-flash@eu and vertex/gemini-3.1-flash-lite@eu. Full data residency when combined with the EU endpoint.EU routing docs →Lock down access with Entra ID (Azure AD), Okta, or any OIDC/SAML provider. Members authenticate through your identity provider and land directly in Requesty.Set up SSO →
Create named model allow-lists and attach them to individual API keys or groups. Control exactly which models each key can call without touching your org-wide settings.Create an access list →
Route OpenAI
/v1/responses calls through the gateway with full analytics, fallback, and cost tracking. Custom tool types are supported.Responses API reference →API responses now include a
usage.cost field with the exact dollar amount. For streaming, set stream_options.include_usage to get cost on the final chunk.Cost tracking docs →A new Management API endpoint returns aggregated spend and token counts across your entire org, with the same time filters available on key-level usage.Org Usage API →
The
/v1/models endpoint now returns geolocation data for each model. The Model Library shows EU/US region chips so you can pick the right model before routing.EU routing docs →Connect the Pi coding agent for model routing, cost tracking, and fallback policies across your coding workflows.Set up Pi →
Context length overflows, unsupported image formats, and other provider errors are now translated into plain, actionable messages instead of generic errors.
Two new endpoints:
/v1/audio/speech for text-to-speech and /v1/audio/transcriptions for speech-to-text. Multiple providers including OpenAI and Mistral with automatic fallback.Speech API → · Transcription API →Send image edit requests through
/v1/images/edits with the same multi-provider routing and fallback as generation.Image Edits API →Pass
HTTP-Referer and X-Title headers to label requests by app or site. Filter your analytics dashboard by these values to see cost and latency per integration.Analytics headers docs →Set dollar thresholds on API keys and receive Slack or Microsoft Teams webhooks when spend crosses them. Configurable trigger percentages give you time to act.Configure alerts →
Use Requesty as the backend for the Claude Cowork desktop assistant. All traffic gets unified analytics, cost controls, and model policies.Set up Claude Cowork →
Route OpenCode terminal agent traffic through Requesty. A one-line installer adds analytics tracking to your setup.Set up OpenCode →
The admin panel now lets you set each guardrail policy to Disabled, Report, or Mask individually. A new violations column and detail tab in logs shows exactly what fired.Guardrails docs →
JSON Schema and json_object modes are now available on
/v1/responses, matching the Chat Completions feature set.Structured outputs docs →Azure deployments across European regions are now automatically recognized under the EU filter in the Model Library and routing engine.EU routing docs →
Provider grouping with expand/collapse, region and capability filters, bulk approve/remove, preset quick-filters, and a “New” tab that surfaces recently released models per provider.Manage approved models →
Pick two models, send the same prompt, and see which responds better. The redesigned chat playground also supports image attachments and markdown rendering.
Expandable table rows now show each service account’s API keys, monthly spend, and creator at a glance.Service accounts docs →
The gateway extracts and formats PDF content across providers that support document input. Just include the file in your chat completions request.PDF support docs →
Dedicated model aliases like
coding/ select the right model, provider, and parameters for your workload automatically.Dedicated models docs →Generate embeddings with Google’s latest Gemini Embedding 2.0 model through Requesty, with automatic provider selection.
The analytics dashboard now supports P95 and P99 latency percentiles, pivot tables for multi-dimensional breakdowns, and flexible time ranges including This Week, Month, Quarter, and Year.Usage analytics → · Performance monitoring →
The models list now shows which models support tool calling. Use this to filter models by capability before routing or to build smarter model selection.
Route OpenClaw autonomous agent workloads through Requesty for unified analytics and cost controls.Set up OpenClaw →
Groups can now have their own approved model list, independent of the org-wide setting. Regional model approval handles providers with location-specific deployments correctly.Groups docs →
Set dollar thresholds on your organization and receive Slack notifications when spend crosses them.Alerts docs →
The API keys table is easier to scan with cleaner columns, hover tooltips for long values, and one-click copy. The same improved layout appears in both admin and user views.
Select any two requests in the logs table and open a JSON diff viewer. Added, modified, and removed fields are highlighted with one-click filtering.
Filter the traces page by trace ID, API key name, or user email. Cached percentage is now visible per trace.Session reconstruction docs →
Groups now show spend percentage and budget overrides directly in the table. Admins can adjust limits without navigating away.Groups docs →
Org admins can change member roles at both the organization and group level. Safety checks prevent admins from accidentally demoting themselves.Users and roles docs →
A new column shows how many reasoning tokens each request consumed, giving visibility into model “thinking” costs.
Scope cost, latency, and usage breakdowns to specific API key labels for more targeted reporting.Usage analytics →