# Quickstart

> One API. 300+ models. OpenAI-compatible. Route your first request in under 2 minutes.

## Switch from OpenAI in 2 lines

If you're already using the OpenAI SDK, point it at Requesty and you're done. No SDK changes, no new client to learn.

<CodeGroup>
  ```python Python theme={"dark"}
  from openai import OpenAI

  client = OpenAI(
      api_key="REQUESTY_API_KEY",                       # was: OPENAI_API_KEY
      base_url="https://router.requesty.ai/v1",         # was: https://api.openai.com/v1
      default_headers={
          "HTTP-Referer": "https://yourapp.com",        # Optional - your site URL for analytics
          "X-Title": "My App",                          # Optional - your app name for analytics
      },
  )
  ```

  ```typescript TypeScript theme={"dark"}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: "REQUESTY_API_KEY",                         // was: OPENAI_API_KEY
    baseURL: "https://router.requesty.ai/v1",           // was: https://api.openai.com/v1
    defaultHeaders: {
      "HTTP-Referer": "https://yourapp.com",            // Optional - your site URL for analytics
      "X-Title": "My App",                              // Optional - your app name for analytics
    },
  });
  ```
</CodeGroup>

<Tip>
  Every SDK and framework that speaks the OpenAI API (LangChain, Vercel AI SDK, LlamaIndex, Haystack, Pydantic AI) works with Requesty out of the box. The Anthropic SDK works the same way against `/anthropic/v1/messages`.
</Tip>

## Three steps to your first request

<Steps>
  <Step title="Get your API key">
    Sign up at [app.requesty.ai](https://app.requesty.ai/sign-up) and create a key on the [API Keys page](https://app.requesty.ai/api-keys). New accounts include free credits to start routing immediately.

    Export it so the snippets below just work:

    ```bash theme={"dark"}
    export REQUESTY_API_KEY="sk-..."
    ```
  </Step>

  <Step title="Install the SDK">
    <Tabs>
      <Tab title="Python">
        ```bash theme={"dark"}
        pip install openai
        ```
      </Tab>

      <Tab title="TypeScript">
        ```bash theme={"dark"}
        npm install openai
        ```
      </Tab>

      <Tab title="cURL">
        No install needed. cURL ships with macOS, Windows 10 and later, and most Linux distributions.
      </Tab>
    </Tabs>
  </Step>

  <Step title="Make your first request">
    <CodeGroup>
      ```python Python theme={"dark"}
      import os
      from openai import OpenAI

      client = OpenAI(
          api_key=os.environ["REQUESTY_API_KEY"],
          base_url="https://router.requesty.ai/v1",
          default_headers={
              "HTTP-Referer": "https://yourapp.com",  # Optional - your site URL
              "X-Title": "My App",                    # Optional - your app name
          },
      )

      response = client.chat.completions.create(
          model="openai/gpt-4o",
          messages=[{"role": "user", "content": "Hello, who are you?"}],
      )

      print(response.choices[0].message.content)
      ```

      ```typescript TypeScript theme={"dark"}
      import OpenAI from "openai";

      const client = new OpenAI({
        apiKey: process.env.REQUESTY_API_KEY,
        baseURL: "https://router.requesty.ai/v1",
        defaultHeaders: {
          "HTTP-Referer": "https://yourapp.com",      // Optional - your site URL
          "X-Title": "My App",                        // Optional - your app name
        },
      });

      const response = await client.chat.completions.create({
        model: "openai/gpt-4o",
        messages: [{ role: "user", content: "Hello, who are you?" }],
      });

      console.log(response.choices[0].message.content);
      ```

      ```bash cURL theme={"dark"}
      curl https://router.requesty.ai/v1/chat/completions \
        -H "Authorization: Bearer $REQUESTY_API_KEY" \
        -H "HTTP-Referer: https://yourapp.com" \
        -H "X-Title: My App" \
        -H "Content-Type: application/json" \
        -d '{
          "model": "openai/gpt-4o",
          "messages": [{"role": "user", "content": "Hello, who are you?"}]
        }'
      ```
    </CodeGroup>

    <Accordion title="Expected response" icon="circle-check">
      Requesty returns an OpenAI-compatible response body plus a few extra response headers that show which provider served the request and whether it was served from cache.

      ```json Response body theme={"dark"}
      {
        "id": "chatcmpl-9pV...",
        "object": "chat.completion",
        "created": 1738956032,
        "model": "openai/gpt-4o",
        "choices": [
          {
            "index": 0,
            "message": {
              "role": "assistant",
              "content": "I'm an AI assistant made by OpenAI, served to you through Requesty."
            },
            "finish_reason": "stop"
          }
        ],
        "usage": { "prompt_tokens": 13, "completion_tokens": 17, "total_tokens": 30, "cost": 0.0000935 }
      }
      ```

      The `usage` object is returned by default on every non-streaming response. The `cost` field is Requesty's USD cost for the call; you don't need to pass any extra parameter to get it. For streaming, opt in with `stream_options: {"include_usage": true}` to receive a final usage chunk. See [Streaming](/features/streaming#token-usage-and-cost-while-streaming).
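
      As a minimal sketch of the streaming opt-in, the helper below joins the streamed text and keeps whatever `usage` object arrives. It assumes Requesty follows the OpenAI streaming convention, where the usage chunk carries an empty `choices` list. `collect_stream` and `stream_hello` are illustrative names, not part of any SDK.

      ```python
      import os
      from typing import Any, Iterable, Optional, Tuple


      def collect_stream(chunks: Iterable[Any]) -> Tuple[str, Optional[Any]]:
          """Join streamed text deltas and return them with the last usage object seen."""
          parts = []
          usage = None
          for chunk in chunks:
              # Assumption: the usage chunk has no choices, as in OpenAI's streaming format.
              if chunk.choices:
                  delta = chunk.choices[0].delta
                  if getattr(delta, "content", None):
                      parts.append(delta.content)
              if getattr(chunk, "usage", None) is not None:
                  usage = chunk.usage
          return "".join(parts), usage


      def stream_hello() -> Tuple[str, Optional[Any]]:
          """Send one streaming request through Requesty (needs network access and a valid key)."""
          from openai import OpenAI  # imported here so collect_stream works without the SDK

          client = OpenAI(
              api_key=os.environ["REQUESTY_API_KEY"],
              base_url="https://router.requesty.ai/v1",
          )
          stream = client.chat.completions.create(
              model="openai/gpt-4o",
              messages=[{"role": "user", "content": "Hello, who are you?"}],
              stream=True,
              stream_options={"include_usage": True},  # ask for the final usage chunk
          )
          return collect_stream(stream)
      ```

      Calling `text, usage = stream_hello()` returns the full reply and the usage object; `usage` stays `None` if no usage chunk arrives.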

      ```http Response headers theme={"dark"}
      x-requesty-provider: openai
      x-requesty-cache: MISS
      x-requesty-latency-ms: 412
      x-requesty-request-id: req_01HYZ...
      ```
    </Accordion>
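
    The `x-requesty-*` headers are part of the HTTP response rather than the JSON body, so they do not appear on the parsed completion. A minimal sketch of reading them through the OpenAI Python SDK's `with_raw_response` accessor follows; `requesty_headers` and `hello_with_headers` are illustrative helper names, not Requesty APIs.

    ```python
    import os
    from typing import Dict, Mapping


    def requesty_headers(headers: Mapping[str, str]) -> Dict[str, str]:
        """Return only the x-requesty-* headers, with lower-cased names."""
        return {
            name.lower(): value
            for name, value in headers.items()
            if name.lower().startswith("x-requesty-")
        }


    def hello_with_headers():
        """Make one request and return (reply text, Requesty headers). Needs network and a key."""
        from openai import OpenAI  # imported here so requesty_headers works without the SDK

        client = OpenAI(
            api_key=os.environ["REQUESTY_API_KEY"],
            base_url="https://router.requesty.ai/v1",
        )
        # with_raw_response exposes the HTTP response; parse() yields the usual completion.
        raw = client.chat.completions.with_raw_response.create(
            model="openai/gpt-4o",
            messages=[{"role": "user", "content": "Hello, who are you?"}],
        )
        completion = raw.parse()
        return completion.choices[0].message.content, requesty_headers(raw.headers)
    ```

    Logging the returned dictionary next to your own request IDs is one way to correlate application logs with the provider and cache status of each call.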

    <Info>
      The `HTTP-Referer` and `X-Title` headers are **optional** but recommended. Requesty uses them to tag your requests in analytics. `HTTP-Referer` identifies your site URL and `X-Title` gives your app a human-readable name. Both appear in your [analytics dashboards](/features/usage-analytics) so you can filter traffic by origin.
    </Info>
  </Step>

  <Step title="Make it production-ready (bonus)">
    Two upgrades turn this from a toy into something you can ship. Neither requires new infra.

    **Add metadata** so every request is attributable in [analytics](https://app.requesty.ai/analytics). Tag by feature, user, or trace ID to slice spend and latency the way you already think about your product.

    **Route to a policy** instead of a single model. Create a [Fallback Policy](https://app.requesty.ai/routing-policies) once, then reference it by name. If the primary model errors or times out, Requesty tries the next one. No retry logic in your app.

    <CodeGroup>
      ```python Python theme={"dark"}
      response = client.chat.completions.create(
          model="policy/sonnet-with-fallback",  # set up once in the dashboard
          messages=[{"role": "user", "content": "Hello, who are you?"}],
          extra_body={
              "requesty": {
                  "tags": ["quickstart", "chat"],
                  "user_id": "user_1234",
                  "trace_id": "session_abc123",
                  "extra": {
                      "feature": "onboarding",
                      "environment": "production",
                  },
              },
          },
      )
      ```

      ```typescript TypeScript theme={"dark"}
      const response = await client.chat.completions.create({
        model: "policy/sonnet-with-fallback", // set up once in the dashboard
        messages: [{ role: "user", content: "Hello, who are you?" }],
        // @ts-expect-error Requesty extends the OpenAI schema
        requesty: {
          tags: ["quickstart", "chat"],
          user_id: "user_1234",
          trace_id: "session_abc123",
          extra: {
            feature: "onboarding",
            environment: "production",
          },
        },
      });
      ```

      ```bash cURL theme={"dark"}
      curl https://router.requesty.ai/v1/chat/completions \
        -H "Authorization: Bearer $REQUESTY_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{
          "model": "policy/sonnet-with-fallback",
          "messages": [{"role": "user", "content": "Hello, who are you?"}],
          "requesty": {
            "tags": ["quickstart", "chat"],
            "user_id": "user_1234",
            "trace_id": "session_abc123",
            "extra": { "feature": "onboarding", "environment": "production" }
          }
        }'
      ```
    </CodeGroup>

    Learn more: [Request Metadata](/features/request-metadata) · [Fallback Policies](/features/fallback-policies) · [Load Balancing](/features/load-balancing-policies).
  </Step>
</Steps>

## Pick a model

Every model lives behind one endpoint. Swap `model` in the request to switch providers. No other code changes, no new SDK, no new auth.
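
To make the "only `model` changes" point concrete, here is a small sketch that builds two requests and compares them. `chat_request` is an illustrative helper, and `provider/model-name` is a placeholder for any ID copied from the model library, not a real model.

```python
from typing import Any, Dict

PROMPT = [{"role": "user", "content": "Hello, who are you?"}]


def chat_request(model: str) -> Dict[str, Any]:
    """Build the keyword arguments for client.chat.completions.create()."""
    return {"model": model, "messages": PROMPT}


# "openai/gpt-4o" is the ID used earlier in this guide; the second ID is a
# placeholder for any model ID copied from the model library.
first = chat_request("openai/gpt-4o")
second = chat_request("provider/model-name")

# Only the "model" value differs between the two requests.
changed = {key for key in first if first[key] != second[key]}
print(changed)  # {'model'}
```

With the client from the quickstart, either request is sent the same way: `client.chat.completions.create(**chat_request("openai/gpt-4o"))`.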

<CardGroup cols={3}>
  <Card title="Frontier" icon="sparkles" href="https://app.requesty.ai/model-library" arrow>
    Claude Opus, GPT-5, Gemini 2.5 Pro. Maximum capability for hard tasks.
  </Card>

  <Card title="Fast & cheap" icon="bolt" href="https://app.requesty.ai/model-library" arrow>
    Haiku, GPT-5 mini, Gemini Flash. Sub-second latency, pennies per million tokens.
  </Card>

  <Card title="Open source" icon="code" href="https://app.requesty.ai/model-library" arrow>
    Llama, Qwen, DeepSeek. Hosted or bring-your-own endpoints.
  </Card>
</CardGroup>

## What Requesty adds on top

<CardGroup cols={2}>
  <Card title="Fallback routing" icon="route" href="/features/fallback-policies">
    Auto-reroute failed requests to backup models. No more 5xx surprises.
  </Card>

  <Card title="Auto-caching" icon="database" href="/features/auto-caching">
    Cut costs by up to 80% on repeated prompts with zero configuration.
  </Card>

  <Card title="Usage analytics" icon="chart-mixed" href="/features/usage-analytics">
    Track spend, latency, and errors per key, user, model, or project.
  </Card>

  <Card title="Load balancing" icon="shuffle" href="/features/load-balancing-policies">
    Distribute traffic across providers by cost, latency, or custom weights.
  </Card>

  <Card title="Bring your own keys" icon="key" href="/features/bring-your-own-keys">
    Use your own provider accounts and keep existing pricing.
  </Card>

  <Card title="Guardrails & RBAC" icon="shield-check" href="/features/guardrails">
    Content filtering, approved-model lists, and role-based access.
  </Card>
</CardGroup>

## Use your favorite framework

<CardGroup cols={4}>
  <Card title="LangChain" icon="link" href="/frameworks/langchain" />

  <Card title="Vercel AI SDK" icon="triangle" href="/frameworks/vercel-ai-sdk" />

  <Card title="LlamaIndex" icon="layer-group" href="/frameworks/llamaindex-ts" />

  <Card title="Haystack" icon="stack" href="/frameworks/haystack" />

  <Card title="Pydantic AI" icon="python" href="/frameworks/pydantic-ai" />

  <Card title="Axios" icon="globe" href="/frameworks/axios" />

  <Card title="Requests" icon="python" href="/frameworks/requests" />

  <Card title="Anthropic SDK" icon="robot" href="/integrations/anthropic-agent-sdks" />
</CardGroup>

## Common questions

<AccordionGroup>
  <Accordion title="How is Requesty different from calling providers directly?" icon="circle-question">
    Direct provider calls give you one model, one auth method, one point of failure. Requesty gives you one endpoint across 300+ models, automatic fallback, shared caching, unified analytics, and one bill. For teams running production AI, that's the difference between a side project and an SLA.
  </Accordion>

  <Accordion title="Do I pay a markup on tokens?" icon="coins">
    No. You pay provider prices, and you can see exact per-request costs in the [analytics dashboard](https://app.requesty.ai/analytics). Use [Bring Your Own Keys](/features/bring-your-own-keys) to keep provider discounts and committed-use pricing.
  </Accordion>

  <Accordion title="Is my data used for training?" icon="shield">
    No. Requesty is a pass-through gateway. We don't train on your requests or responses. See [EU routing](/features/eu-routing) for data handling and residency options.
  </Accordion>

  <Accordion title="Which Anthropic SDK endpoint do I use?" icon="code">
    Point the Anthropic SDK at `https://router.requesty.ai/anthropic/v1/messages`. See the [Anthropic Agent SDKs guide](/integrations/anthropic-agent-sdks) for code samples.
  </Accordion>
</AccordionGroup>

## Next steps

<CardGroup cols={2}>
  <Card title="Inference APIs" icon="satellite-dish" href="/api-reference/inference-apis">
    Chat completions, embeddings, audio, images, and model listing endpoints.
  </Card>

  <Card title="Management APIs" icon="wrench" href="/api-reference/management-apis">
    Programmatically manage API keys, groups, members, and organization settings.
  </Card>

  <Card title="Use with Claude Code" icon="terminal" href="/integrations/claude-code">
    Point Claude Code at Requesty for unified billing and routing.
  </Card>

  <Card title="Configure routing" icon="sliders" href="/features/fallback-policies">
    Set up fallbacks, load balancing, and latency-aware routing.
  </Card>

  <Card title="Join the community" icon="discord" href="https://discord.com/invite/Td3rwAHgt4">
    Ask questions, share builds, and meet the team on Discord.
  </Card>
</CardGroup>
