Create Transcription

Transcribe audio into text using OpenAI’s speech-to-text models through Requesty’s routing.

Base URL

https://router.requesty.ai/v1/audio/transcriptions

Authentication

Include your Requesty API key in the request headers:

Authorization: Bearer YOUR_REQUESTY_API_KEY

Example Request

The endpoint accepts multipart/form-data. Send the audio as the file field and the model identifier as the model field.

curl https://router.requesty.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_REQUESTY_API_KEY" \
  -F "model=openai/gpt-4o-transcribe" \
  -F "file=@./meeting.mp3"

Example response:

{
  "text": "Hello, this is a transcription of the audio.",
  "usage": {
    "type": "tokens",
    "input_tokens": 14,
    "output_tokens": 11,
    "total_tokens": 25,
    "input_token_details": {
      "audio_tokens": 14,
      "text_tokens": 0
    }
  }
}

OpenAI SDK

The endpoint is fully compatible with the OpenAI SDK. Just point the client at Requesty’s base URL:

from openai import OpenAI

client = OpenAI(
    base_url="https://router.requesty.ai/v1",
    api_key="YOUR_REQUESTY_API_KEY",
)

with open("meeting.mp3", "rb") as audio:
    transcript = client.audio.transcriptions.create(
        model="openai/gpt-4o-transcribe",
        file=audio,
    )

print(transcript.text)

The same request with the Node.js SDK:

import OpenAI from "openai";
import fs from "node:fs";

const client = new OpenAI({
  baseURL: "https://router.requesty.ai/v1",
  apiKey: process.env.REQUESTY_API_KEY,
});

const transcript = await client.audio.transcriptions.create({
  model: "openai/gpt-4o-transcribe",
  file: fs.createReadStream("meeting.mp3"),
});

console.log(transcript.text);

Supported Models

Browse the full catalog in the Transcription model library. All transcription models currently available are from OpenAI:

| Model | Best for | Billing |
| --- | --- | --- |
| `openai/gpt-4o-transcribe` | Highest accuracy, multilingual | Token-based |
| `openai/gpt-4o-mini-transcribe` | Fast and cost-efficient | Token-based |
| `openai/whisper-1` | Drop-in replacement for legacy Whisper | Duration-based (per second of audio) |

Date-pinned snapshots (for example openai/gpt-4o-mini-transcribe-2025-12-15) are also available when you need a stable model version.

Supported Audio Formats

The file field accepts the following formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. The maximum upload size per request is 32 MB. For longer recordings, split the audio into chunks and concatenate the resulting transcripts on your side.
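A client-side check before uploading can catch an unsupported format or an oversized file without spending a request. The following is a minimal sketch, not part of the Requesty API: `check_upload` and `join_transcripts` are hypothetical helper names, and it assumes that "32 MB" means 32 × 1024 × 1024 bytes and that the format can be inferred from the file extension.

```python
from pathlib import Path

# Formats and size limit as listed in this section.
SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
MAX_UPLOAD_BYTES = 32 * 1024 * 1024  # assumption: "32 MB" is interpreted as 32 MiB


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    # Infer the format from the extension (assumption: the extension is accurate).
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems


def join_transcripts(parts: list[str]) -> str:
    """Concatenate per-chunk transcripts in order, skipping empty chunks."""
    return " ".join(p.strip() for p in parts if p.strip())
```

Splitting the audio itself is left to whatever audio tool you already use; `join_transcripts` only covers the concatenation step once each chunk has been transcribed.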

Language Hint

Set language to the ISO 639-1 code of the spoken language to improve accuracy and latency. When omitted, the model auto-detects the language.

curl https://router.requesty.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_REQUESTY_API_KEY" \
  -F "model=openai/gpt-4o-transcribe" \
  -F "language=fr" \
  -F "file=@./conference.m4a"
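
With the Python SDK, the same hint is passed as the `language` keyword argument. The sketch below builds the keyword arguments and rejects values that do not look like a two-letter code. `transcription_options` is a hypothetical helper, and the check only verifies the shape of the code, not that it is an assigned ISO 639-1 code.

```python
import re
from typing import Optional

_TWO_LETTER = re.compile(r"[a-z]{2}")


def transcription_options(model: str, language: Optional[str] = None) -> dict:
    """Build keyword arguments for client.audio.transcriptions.create()."""
    options = {"model": model}
    if language is not None:
        code = language.strip().lower()
        # Shape check only: two ASCII letters, as ISO 639-1 codes are written.
        if not _TWO_LETTER.fullmatch(code):
            raise ValueError(f"expected a two-letter ISO 639-1 code, got {language!r}")
        options["language"] = code
    return options


# Usage with a client configured as in the OpenAI SDK section:
# with open("conference.m4a", "rb") as audio:
#     transcript = client.audio.transcriptions.create(
#         file=audio, **transcription_options("openai/gpt-4o-transcribe", "fr")
#     )
```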

Response Format

The response is always a JSON object with the transcribed text and a usage block. The usage block has two possible shapes depending on the model:

Token usage (`gpt-4o-transcribe`, `gpt-4o-mini-transcribe`)

{
  "text": "Hello, world.",
  "usage": {
    "type": "tokens",
    "input_tokens": 14,
    "output_tokens": 11,
    "total_tokens": 25,
    "input_token_details": {
      "audio_tokens": 14,
      "text_tokens": 0
    }
  }
}

Duration usage (`whisper-1`)

{
  "text": "Hello, world.",
  "usage": {
    "type": "duration",
    "seconds": 4.2
  }
}

Use the type discriminator to decide how to render or aggregate usage on your side.
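
As an illustration, the sketch below aggregates usage across several responses by branching on `type`. `summarize_usage` is a hypothetical helper; it assumes each response has already been parsed into a dictionary shaped like the two examples above.

```python
def summarize_usage(responses: list[dict]) -> dict:
    """Sum usage across transcription responses, split by billing type."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0, "seconds": 0.0}
    for response in responses:
        usage = response["usage"]
        if usage["type"] == "tokens":
            # Token-billed models (gpt-4o-transcribe, gpt-4o-mini-transcribe).
            for key in ("input_tokens", "output_tokens", "total_tokens"):
                totals[key] += usage[key]
        elif usage["type"] == "duration":
            # Duration-billed models (whisper-1).
            totals["seconds"] += usage["seconds"]
        else:
            raise ValueError(f"unknown usage type: {usage['type']!r}")
    return totals


token_response = {
    "text": "Hello, world.",
    "usage": {"type": "tokens", "input_tokens": 14, "output_tokens": 11, "total_tokens": 25},
}
duration_response = {
    "text": "Hello, world.",
    "usage": {"type": "duration", "seconds": 4.2},
}
print(summarize_usage([token_response, duration_response]))
```

Raising on an unknown `type` is a design choice in this sketch: it surfaces a new usage shape early instead of silently dropping it from the totals.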

Pricing

Transcription models are priced either per token of input audio (for gpt-4o-transcribe and gpt-4o-mini-transcribe) or per second of input audio (for whisper-1). The exact rate for each model is listed in the Transcription model library. Charges appear in your usage dashboard immediately after the request completes.
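
To estimate the cost of a single request from its usage block, something along these lines may help. `estimate_cost` is a hypothetical helper, and both rates are arguments you supply from the model library; no real prices are encoded here. Following the wording above, the token branch counts only input tokens, which is an assumption about how the bill is computed.

```python
def estimate_cost(usage: dict, per_token: float, per_second: float) -> float:
    """Estimate the cost of one request from its usage block.

    per_token and per_second are caller-supplied rates (look them up in the
    model library); this function does not know any real prices.
    """
    if usage["type"] == "tokens":
        # Assumption: only input tokens are billed, per the Pricing text.
        return usage["input_tokens"] * per_token
    if usage["type"] == "duration":
        return usage["seconds"] * per_second
    raise ValueError(f"unknown usage type: {usage['type']!r}")
```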

Error Handling

The API returns standard HTTP status codes:

200 Success
400 Bad Request (missing file or model, unsupported audio format)
401 Unauthorized (invalid API key)
404 Model not found or not approved for your organization
413 Payload Too Large (audio file exceeds 32 MB)
429 Rate limited
500 Internal Server Error
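
One way to act on these codes is to retry only the ones that are likely to be transient. The sketch below treats 429 and 500 as retryable and the rest as errors that need a change to the request; that split is an assumption, not documented Requesty behavior, and `should_retry` and `backoff_delay` are hypothetical helper names.

```python
import random

# Assumption: rate limits and server errors may succeed on a later attempt.
# 400, 401, 404, and 413 require fixing the request, so retrying will not help.
RETRYABLE_STATUSES = {429, 500}


def should_retry(status: int) -> bool:
    """Return True if a request with this status is worth sending again."""
    return status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter.

    Returns a random delay in seconds between 0 and min(cap, base * 2**attempt),
    where attempt starts at 0 for the first retry.
    """
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would sleep for `backoff_delay(attempt)` between attempts and give up after a fixed number of retries. The jitter spreads out retries from clients that were rate limited at the same moment.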

This endpoint is fully compatible with the OpenAI Audio Transcriptions API. You can use the OpenAI SDK’s client.audio.transcriptions.create() method directly.

To go the other direction and turn text into audio, use the Create Speech endpoint.

Authorizations

Authorization (string, header, required)

API key for authentication, sent as Bearer YOUR_REQUESTY_API_KEY.

Body

multipart/form-data

file (required)

The audio file to transcribe. Supported formats are flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm. Maximum upload size is 32 MB.

model (string, required)

The speech-to-text model to use, prefixed with the provider slug. Currently only OpenAI models are supported. Example: "openai/gpt-4o-transcribe"

language (string)

The language of the input audio in ISO 639-1 format (for example, en, fr, ja). Supplying the language improves accuracy and latency. Auto-detected when omitted.

Response

Transcription result

text (string, required)

The transcribed text. Example: "Hello, world."

usage (object, required)

Usage stats for the transcription. The shape depends on how the model is billed: token-based (gpt-4o-transcribe, gpt-4o-mini-transcribe) or duration-based (whisper-1). Both shapes are shown under Response Format.

