Free Models

Requesty offers a selection of models that are free to use. They run through the same gateway as every other model, so you get the same OpenAI-compatible API, routing policies, logging, and analytics with no token charges.

These models are free for now. Pricing may change in the future, and any change will be announced in the changelog ahead of time.

Available free models

Model	Input ($ per 1M tokens)	Output ($ per 1M tokens)
`nvidia/nemotron-3-ultra-550b-a55b`	Free	Free
`nvidia/nemotron-3-super-120b-a12b`	Free	Free
`poolside/laguna-xs.2`	Free	Free
`poolside/laguna-m.1`	Free	Free
`google/gemma-4-31b-it`	Free	Free
`nvidia/nemotron-3.5-content-safety`	Free	Free

Daily request limits

Free model usage is limited per organization per day:

Organization	Requests per day
New organizations	50
Paying organizations	200

The limit applies across all free models combined and resets daily. Once you hit the limit, requests to free models return a rate limit error until the next reset. Paid models are not affected.
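Because the limit is shared by all free models, a client may want a rough local count of how many free requests it has left today. The following is a minimal sketch, not part of the Requesty API: the `FreeRequestBudget` class, the `"new"` and `"paying"` tier keys, and the UTC-midnight reset are all assumptions made for illustration. The gateway's own counter is authoritative and covers the whole organization, while this counter only sees requests made by the current process.

```python
from datetime import date, datetime, timezone
from typing import Callable, Optional

# Model IDs and limits copied from the tables above.
FREE_MODELS = frozenset({
    "nvidia/nemotron-3-ultra-550b-a55b",
    "nvidia/nemotron-3-super-120b-a12b",
    "poolside/laguna-xs.2",
    "poolside/laguna-m.1",
    "google/gemma-4-31b-it",
    "nvidia/nemotron-3.5-content-safety",
})
DAILY_LIMITS = {"new": 50, "paying": 200}  # hypothetical tier keys


def utc_today() -> date:
    # Assumption: the daily reset follows the UTC calendar day.
    return datetime.now(timezone.utc).date()


class FreeRequestBudget:
    """Best-effort local mirror of the combined daily free-model limit."""

    def __init__(self, limit: int, today: Callable[[], date] = utc_today) -> None:
        self.limit = limit
        self._today = today  # injectable clock, useful for testing
        self._day: Optional[date] = None
        self._used = 0

    def _roll_over(self) -> None:
        # Start a fresh count whenever the calendar day changes.
        day = self._today()
        if day != self._day:
            self._day = day
            self._used = 0

    def record(self, model: str) -> None:
        """Count a request, but only if it targets a free model."""
        self._roll_over()
        if model in FREE_MODELS:
            self._used += 1

    def remaining(self) -> int:
        """Free requests left today according to the local count."""
        self._roll_over()
        return max(self.limit - self._used, 0)
```

Calling `record(model)` after each request and checking `remaining()` before the next one lets an application switch to a paid model before the gateway starts returning rate limit errors.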

Usage

Call a free model exactly like any other model on the gateway:

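import openai

# Point the OpenAI SDK at the Requesty gateway.
client = openai.OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

# A free model is selected by its model ID, like any other model.
response = client.chat.completions.create(
    model="nvidia/nemotron-3-super-120b-a12b",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the assistant's reply text.
print(response.choices[0].message.content)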
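Since free-model requests return a rate limit error once the daily limit is reached, a caller may want to retry the same request against another model. The sketch below shows one client-side way to do that; it is an illustration, not the gateway's Fallback Policies feature. The `first_available` helper and the `RateLimited` exception are hypothetical names, with `RateLimited` standing in for the SDK's exception so the sketch runs without any dependencies.

```python
from typing import Callable, Optional, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the SDK's rate limit exception."""


def first_available(
    send: Callable[[str], T],
    models: Sequence[str],
    rate_limit_errors: Tuple[Type[BaseException], ...] = (RateLimited,),
) -> T:
    """Try each model ID in order, moving on only after a rate limit error."""
    if not models:
        raise ValueError("models must not be empty")
    last_error: Optional[BaseException] = None
    for model in models:
        try:
            return send(model)
        except rate_limit_errors as exc:
            # Remember the error and try the next model in the list.
            last_error = exc
    # Every model was rate limited: surface the last error to the caller.
    assert last_error is not None
    raise last_error
```

With the client from the example above, `send` could be `lambda m: client.chat.completions.create(model=m, messages=messages)` and `rate_limit_errors` could be `(openai.RateLimitError,)`. That assumes the gateway reports the limit as an HTTP 429 response, which the OpenAI Python SDK raises as `openai.RateLimitError`. Any other error, such as an authentication failure, propagates immediately instead of triggering a fallback.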

Resources

Model Library (Free Models) to browse all free models in the Requesty dashboard
Supported Models for the full model catalog
API Limits for gateway rate limiting in general
Fallback Policies to fall back to a paid model when the daily limit is reached

Last modified on June 15, 2026
