POST /v1/responses

Create response
curl --request POST \
  --url https://router.requesty.ai/v1/responses \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "openai-responses/gpt-5",
  "input": "Tell me a three sentence bedtime story about a unicorn.",
  "instructions": "<string>",
  "max_output_tokens": 2,
  "stream": true,
  "temperature": 1,
  "top_p": 0.5,
  "parallel_tool_calls": true,
  "tool_choice": "auto",
  "tools": [
    {
      "type": "function",
      "name": "<string>",
      "description": "<string>",
      "parameters": {},
      "strict": true
    }
  ],
  "reasoning": {
    "effort": "low",
    "summary": "auto"
  },
  "text": {
    "format": {
      "type": "text",
      "name": "<string>",
      "strict": true,
      "schema": {}
    }
  },
  "include": [
    "<string>"
  ],
  "metadata": {},
  "store": true,
  "truncation": "<string>",
  "user": "<string>"
}
'
{
  "id": "<string>",
  "object": "response",
  "created_at": 123,
  "model": "<string>",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "id": "<string>",
      "status": "in_progress",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "<string>",
          "refusal": "<string>"
        }
      ],
      "summary": "<array>",
      "encrypted_content": "<string>",
      "name": "<string>",
      "arguments": "<string>",
      "call_id": "<string>"
    }
  ],
  "incomplete_details": {
    "reason": "<string>"
  },
  "error": {
    "code": "<string>",
    "message": "<string>"
  },
  "usage": {
    "input_tokens": 123,
    "output_tokens": 123,
    "total_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens_details": {
      "reasoning_tokens": 123
    },
    "cost": 123
  }
}


Send input to an OpenAI-compatible model and receive a response. This endpoint follows the OpenAI Responses API format and supports all OpenAI models that expose the Responses API natively, as well as compatible models from other providers through Requesty’s routing.

Base URL

https://router.requesty.ai/v1/responses

Authentication

The Responses endpoint accepts either OpenAI-style bearer auth or Anthropic-style x-api-key auth. Use whichever your client library expects.
Authorization: Bearer YOUR_REQUESTY_API_KEY
x-api-key: YOUR_REQUESTY_API_KEY

Headers

Header | Required | Description
Authorization | ✅ * | Bearer token with your Requesty key
x-api-key | ✅ * | Your Requesty API key (alternative)
Content-Type | ✅ | Must be application/json
* Provide one of Authorization or x-api-key.
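As a sketch of the two auth styles above, a small helper (hypothetical, not part of any SDK) can build the header set for either form:

```python
# Build request headers for the Responses endpoint using either auth style.
# Both forms carry the same Requesty API key; pick the one your client expects.

def build_headers(api_key: str, style: str = "bearer") -> dict:
    headers = {"Content-Type": "application/json"}
    if style == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"   # OpenAI-style
    elif style == "x-api-key":
        headers["x-api-key"] = api_key                   # Anthropic-style
    else:
        raise ValueError(f"unknown auth style: {style}")
    return headers
```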

Example Request

curl https://router.requesty.ai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_REQUESTY_API_KEY" \
  -d '{
    "model": "openai-responses/gpt-5",
    "input": "Tell me a three sentence bedtime story about a unicorn."
  }'

Using the OpenAI SDK

The Responses endpoint is fully compatible with the official OpenAI SDK. Just point base_url at Requesty:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

response = client.responses.create(
    model="openai-responses/gpt-5",
    input="Tell me a three sentence bedtime story about a unicorn.",
)

print(response.output_text)

Model Selection

You can use any model available in the Model Library. Requesty translates the request shape for non-OpenAI providers automatically.
  • OpenAI Models: openai-responses/gpt-5, openai-responses/gpt-5-mini, openai-responses/gpt-4.1, openai-responses/gpt-4o
  • Anthropic Models: anthropic/claude-sonnet-4-5, anthropic/claude-opus-4
  • Google Models: google/gemini-2.5-pro, google/gemini-2.5-flash
  • Other Providers: mistral/mistral-large-2411, meta/llama-3.3-70b-instruct
To route OpenAI models through their native Responses API (required for full feature parity, including file inputs and the response.* event stream), use the openai-responses/ prefix. The standard openai/ prefix routes through Chat Completions under the hood.
While this endpoint uses the OpenAI Responses format, Requesty automatically handles format conversion for non-OpenAI providers, so you can use any supported model with this endpoint.

Input Formats

The input field accepts either a plain string or an array of input items. Use the array form for multi-turn conversations, tool results, and rich content.

String input

{
	"model": "openai-responses/gpt-5",
	"input": "Write a haiku about routers."
}

Multi-turn input

{
	"model": "openai-responses/gpt-5",
	"input": [
		{ "role": "user", "content": "Hi, my name is John." },
		{ "role": "assistant", "content": "Hello John, nice to meet you." },
		{ "role": "user", "content": "What is my name?" }
	]
}

Instructions

Use the instructions parameter to set a system-level prompt that applies to the entire request. It is equivalent to a system or developer message at the start of the conversation.
{
	"model": "openai-responses/gpt-5",
	"instructions": "You are a helpful assistant that always responds in JSON.",
	"input": "Summarize the weather in Paris today."
}

Streaming

Enable streaming by setting stream: true. The response is delivered as Server-Sent Events using the OpenAI Responses event format (response.created, response.output_text.delta, response.completed, etc.).
{
	"model": "openai-responses/gpt-5",
	"input": "Write a short story.",
	"stream": true
}
No additional parameter is needed to receive usage and cost on streaming requests: the final response.completed event includes the full usage object.

Vision Support

Send images using the input_image content type. You can pass an image URL or a base64 data URL.
{
	"model": "openai-responses/gpt-5",
	"input": [
		{
			"role": "user",
			"content": [
				{ "type": "input_text", "text": "What is in this image?" },
				{
					"type": "input_image",
					"image_url": "https://example.com/image.jpg"
				}
			]
		}
	]
}
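To pass a local image instead of a remote URL, the base64 data URL mentioned above can be built like this (image_data_url is a hypothetical helper, not part of any SDK):

```python
import base64

# Build a base64 data URL suitable for the input_image content type.

def image_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Used in place of a remote URL:
#   {"type": "input_image", "image_url": image_data_url(raw_bytes)}
```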

PDF Support

Send PDFs using the input_file content type. You can provide the PDF as either a base64 data URL or a remote URL.
{
	"model": "openai-responses/gpt-5",
	"input": [
		{
			"role": "user",
			"content": [
				{ "type": "input_text", "text": "Summarize this PDF." },
				{
					"type": "input_file",
					"filename": "document.pdf",
					"file_data": "data:application/pdf;base64,<base64-encoded-pdf-data>"
				}
			]
		}
	]
}
See the PDF Support guide for the full list of supported providers.

Tool Use

Define tools the model may call. The Responses API uses a flatter shape than Chat Completions: name, description, and parameters live at the top level of each tool entry.
{
	"model": "openai-responses/gpt-5",
	"input": "What is the weather like in New York?",
	"tools": [
		{
			"type": "function",
			"name": "get_weather",
			"description": "Get the current weather in a given location",
			"parameters": {
				"type": "object",
				"properties": {
					"location": {
						"type": "string",
						"description": "The city and state, e.g. San Francisco, CA"
					}
				},
				"required": ["location"]
			},
			"strict": true
		}
	]
}
To return a tool result on the next turn, send a function_call_output item in input:
{
	"model": "openai-responses/gpt-5",
	"input": [
		{
			"type": "function_call",
			"name": "get_weather",
			"call_id": "call_abc123",
			"arguments": "{\"location\": \"New York, NY\"}"
		},
		{
			"type": "function_call_output",
			"call_id": "call_abc123",
			"output": "{\"temperature\": 68, \"conditions\": \"sunny\"}"
		}
	]
}
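The round trip above can be sketched as a small loop: execute each function_call item locally and build the follow-up input list. run_tool_calls is a local helper operating on plain dicts shaped like Responses output items; the handler mapping (e.g. get_weather) is your own code.

```python
import json

# Execute function_call output items and build the next-turn input:
# each call is echoed back, followed by its function_call_output.

def run_tool_calls(output_items: list, handlers: dict) -> list:
    follow_up = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue
        args = json.loads(item["arguments"])
        result = handlers[item["name"]](**args)
        follow_up.append(item)  # echo the call back, as shown above
        follow_up.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": json.dumps(result),
        })
    return follow_up
```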

Reasoning

For reasoning-capable models (e.g. openai-responses/gpt-5, openai-responses/o3), configure reasoning effort and the optional summary:
{
	"model": "openai-responses/gpt-5",
	"input": "Plan a three day trip to Tokyo.",
	"reasoning": {
		"effort": "medium",
		"summary": "auto"
	}
}
  • effort: low, medium, or high. Lower effort produces faster responses with fewer reasoning tokens.
  • summary: auto, concise, or detailed. Controls whether the model returns a reasoning summary alongside the final answer.

Structured Outputs

Set text.format to enforce JSON-mode or a strict JSON Schema on the output.
{
	"model": "openai-responses/gpt-5",
	"input": "Extract entities from: The quick brown fox jumps over the lazy dog.",
	"text": {
		"format": {
			"type": "json_schema",
			"name": "Entities",
			"strict": true,
			"schema": {
				"type": "object",
				"properties": {
					"animals": { "type": "array", "items": { "type": "string" } }
				},
				"required": ["animals"]
			}
		}
	}
}
See the Structured Outputs guide for full examples.
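Building the text.format block programmatically can be sketched as follows. json_schema_format is a hypothetical helper, not part of any SDK; with strict mode the returned output_text should parse directly with json.loads.

```python
# Build the `text` payload for a strict JSON Schema output format,
# matching the request shape shown above.

def json_schema_format(name: str, schema: dict) -> dict:
    return {
        "format": {
            "type": "json_schema",
            "name": name,
            "strict": True,
            "schema": schema,
        }
    }

# Request usage:
#   {"model": ..., "input": ..., "text": json_schema_format("Entities", schema)}
# Parsing the result:
#   entities = json.loads(response.output_text)
```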

Response Format

A successful response follows the OpenAI Responses format:
{
	"id": "resp_01ABC123",
	"object": "response",
	"created_at": 1730000000,
	"model": "openai-responses/gpt-5",
	"status": "completed",
	"output": [
		{
			"id": "msg_01ABC123",
			"type": "message",
			"role": "assistant",
			"status": "completed",
			"content": [
				{
					"type": "output_text",
					"text": "Once upon a time, a unicorn..."
				}
			]
		}
	],
	"usage": {
		"input_tokens": 12,
		"input_tokens_details": { "cached_tokens": 0 },
		"output_tokens": 27,
		"output_tokens_details": { "reasoning_tokens": 0 },
		"total_tokens": 39,
		"cost": 0.000234
	}
}
The cost field inside usage is a Requesty extension and reports the USD cost of the request. It is returned by default on non-streaming responses, and on the final response.completed event when streaming. See Cost Tracking.

Error Handling

The API returns standard HTTP status codes:
  • 200 - Success
  • 400 - Bad Request (invalid parameters)
  • 401 - Unauthorized (invalid API key)
  • 403 - Forbidden (insufficient permissions)
  • 429 - Rate Limited
  • 500 - Internal Server Error
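A simple way to act on the status codes above (classify is a local helper, not part of any SDK; treating 429 and 500 as retryable is a common convention, not a documented guarantee):

```python
# Map the status codes listed above to a coarse client action.

def classify(status: int) -> str:
    if status == 200:
        return "ok"
    if status in (429, 500):
        return "retry"          # back off and retry
    if status in (400, 401, 403):
        return "fix_request"    # correct parameters, key, or permissions
    return "unknown"
```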

Key Differences from OpenAI Chat Completions

  • input instead of messages: Accepts a string or a list of typed items (messages, tool calls, tool results, reasoning).
  • instructions instead of system messages: System prompts are passed via the top-level instructions field.
  • Flat tool shape: Tools declare name, description, and parameters directly, without the nested function wrapper.
  • Content types are prefixed: input_text, input_image, input_file for user inputs; output_text and output_refusal for model outputs.
  • Event-typed streaming: Streaming uses named events (response.created, response.output_text.delta, response.completed) rather than choice deltas.
  • max_output_tokens instead of max_tokens: Caps the total of visible and reasoning tokens.
For seamless compatibility with the OpenAI Python and Node SDKs’ responses.create(...) interface, use this endpoint. For broader portability across providers, consider the Chat Completions endpoint instead.
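The message-to-input differences above can be sketched as a converter from a Chat Completions-style message list to Responses kwargs. to_responses_kwargs is a hypothetical helper covering plain text messages only (no tool calls or rich content).

```python
# Convert Chat Completions messages into the Responses shape:
# the first system/developer message becomes `instructions`,
# the rest become typed input items.

def to_responses_kwargs(messages: list) -> dict:
    instructions = None
    input_items = []
    for msg in messages:
        if msg["role"] in ("system", "developer") and instructions is None:
            instructions = msg["content"]
        else:
            input_items.append({"role": msg["role"], "content": msg["content"]})
    kwargs = {"input": input_items}
    if instructions is not None:
        kwargs["instructions"] = instructions
    return kwargs
```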

Headers

x-api-key
string

Your Requesty API key. Alternative to the standard Authorization: Bearer header.

Body

application/json
model
string
default: openai-responses/gpt-5
required

The model to use for the response. To route OpenAI models through their native Responses API, use the openai-responses/ prefix (e.g. openai-responses/gpt-5).

Example:

"openai-responses/gpt-5"

input
required

Text, image, or file inputs to the model. Either a plain string or an array of typed input items.

Example:

"Tell me a three sentence bedtime story about a unicorn."

instructions
string

Inserts a system (or developer) message as the first item in the model's context.

max_output_tokens
integer

Upper bound for the number of tokens that can be generated, including visible output tokens and reasoning tokens.

Required range: x >= 1
stream
boolean

If true, the response is streamed to the client as it is generated using server-sent events.

temperature
number

Sampling temperature between 0 and 2. Higher values produce more random output.

Required range: 0 <= x <= 2
top_p
number

Nucleus sampling: consider tokens with cumulative probability mass up to top_p.

Required range: 0 <= x <= 1
parallel_tool_calls
boolean

Whether to allow the model to run tool calls in parallel.

tool_choice

Controls which (if any) tool is called by the model.

Available options:
auto,
none,
required
tools
object[]

Tools the model may call.

reasoning
object

Reasoning configuration for reasoning-capable models.

text
object

Output text configuration, including structured output format.

include
string[]

Specify additional output data to include in the model response.

metadata
object

Set of key-value pairs that can be attached to the request.

store
boolean

Whether to store the generated model response for later retrieval via API.

truncation
string

The truncation strategy to use for the model response.

user
string

A unique identifier representing your end-user.

Response

Response

id
string
required

Unique identifier for this response.

object
enum<string>
required

Object type.

Available options:
response
created_at
integer
required

Unix timestamp (in seconds) of when the response was created.

model
string
required

Model ID used to generate the response.

status
enum<string>
required

Status of the response generation.

Available options:
completed,
failed,
in_progress,
incomplete
output
object[]
required

Output items from the model. Typically one or more message, function_call, or reasoning items.

incomplete_details
object
error
object
usage
object
Last modified on May 13, 2026