- Access 300+ models from OpenAI, Anthropic, Google, Meta, Mistral, Cohere, and many other providers through one API key.
- Enforce a strict JSON Schema server-side for real structured output, rather than relying on prompt engineering.
- Give the model a native web search tool for up-to-date information.
- Tune reasoning effort for reasoning-capable models.
- Apply fallback policies, load balancing, and latency routing to keep your workflows responsive.
- Track and manage your spend in a single location.
Prerequisites
- A running n8n instance (Cloud or self-hosted).
- A Requesty API key from the API Keys page.
Installation
Follow the community nodes installation guide in the n8n documentation. In your n8n instance, go to Settings > Community Nodes and install the Requesty node package.
Credentials
- Sign up at app.requesty.ai.
- Generate an API key on the API Keys page.
- In n8n, create a new Requesty API credential and paste your key.
Usage
The Requesty Chat Model node connects to any model available through Requesty’s unified gateway. Use it anywhere n8n accepts a chat model, such as the AI Agent node, the Basic LLM Chain, or any other AI workflow.
Once your API key is saved, the Model dropdown auto-populates with all available models. You can also set a model ID directly using an expression, for example anthropic/claude-sonnet-4-6 or openai/gpt-5.4. To route through a policy, use the policy/policy-name format, such as policy/reliable-coding. See Routing Policies for how to create a policy that automatically falls back between models.
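The two identifier shapes described above (provider/model and policy/policy-name) can be told apart with a small check before a value is passed into the Model field. The following is a minimal sketch, not part of the node itself: `ModelRef` and `parse_model_id` are hypothetical names introduced only for illustration, and the sketch assumes that every identifier has exactly one prefix before the first slash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRef:
    """Hypothetical container for a parsed model identifier."""
    kind: str    # "policy" or "model"
    prefix: str  # provider name, or the literal "policy"
    name: str    # model name or policy name


def parse_model_id(model_id: str) -> ModelRef:
    """Split 'prefix/name' and classify it as a policy or a direct model."""
    prefix, sep, name = model_id.strip().partition("/")
    if not sep or not prefix or not name:
        raise ValueError(f"expected 'provider/model' or 'policy/name', got {model_id!r}")
    kind = "policy" if prefix == "policy" else "model"
    return ModelRef(kind=kind, prefix=prefix, name=name)


direct = parse_model_id("anthropic/claude-sonnet-4-6")
routed = parse_model_id("policy/reliable-coding")
print(direct.kind, direct.prefix, direct.name)
print(routed.kind, routed.name)
```

A check like this could run in an upstream step of a workflow, so that a malformed identifier fails early instead of reaching the gateway.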
Configuration options
| Option | Default | Description |
|---|---|---|
| Response Format | Text | Text, JSON Object, or JSON Schema (strict structured output) |
| JSON Schema | (example) | The schema the response must match when Response Format is JSON Schema |
| Reasoning Effort | Default | Reasoning level (low, medium, high) for reasoning-capable models |
| Base URL | (gateway) | Override the gateway URL for self-hosted Requesty deployments |
| Enable Web Search | off | Give the model a native web search tool for up-to-date information |
| Web Search Context Size | medium | How much context the web search retrieves per query |
| Sampling Temperature | 0.7 | Controls randomness (0 is deterministic, 2 is very random) |
| Maximum Tokens | unlimited | Maximum number of tokens to generate |
| Top P | 1 | Nucleus sampling probability mass |
| Frequency Penalty | 0 | Penalizes tokens in proportion to how often they have already appeared |
| Presence Penalty | 0 | Penalizes tokens that have already appeared at least once |
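When Response Format is set to JSON Schema, the JSON Schema option needs a schema to enforce. The sketch below builds one in Python and prints it as JSON that could be pasted into that field. It is illustrative only: the helper `strict_object_schema` and the ticket fields are hypothetical, and listing every property as required with `additionalProperties` set to false follows a common convention for strict structured output that this node may or may not require. The example schema shown in the node remains the reference for the exact shape it expects.

```python
import json


def strict_object_schema(properties: dict) -> dict:
    """Build an object schema where every property is required and no extras are allowed."""
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),   # every declared key must be present
        "additionalProperties": False,  # reject keys not declared above
    }


# Hypothetical fields for classifying a support ticket.
ticket_schema = strict_object_schema({
    "category": {"type": "string", "enum": ["billing", "bug", "other"]},
    "urgent": {"type": "boolean"},
    "summary": {"type": "string"},
})

print(json.dumps(ticket_schema, indent=2))
```

Generating the schema from a single properties dictionary keeps the `required` list in sync with the declared fields, which is an easy thing to get wrong when editing the JSON by hand.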
Key features
- 300+ models: Access models from OpenAI, Anthropic, Google, Meta, Mistral, Cohere, and more.
- Responses API: Built on the Responses API, unlocking richer capabilities than plain chat completions.
- Structured output: Enforce a strict JSON Schema server-side (structured outputs).
- Native web search: Let the model search the web for current information (web search).
- Reasoning control: Tune reasoning effort for reasoning-capable models (reasoning).
- Intelligent routing: Automatic fallbacks and load balancing across providers.
- Self-hosted friendly: Point the node at your own Requesty deployment via the Base URL option.