Requesty offers a selection of models that are free to use. They run through the same gateway as every other model, so you get the same OpenAI-compatible API, routing policies, logging, and analytics with no token charges.
These models are free for now. Pricing may change in the future, and any change will be announced in the changelog ahead of time.

Available free models

| Model | Input ($/M tokens) | Output ($/M tokens) |
| --- | --- | --- |
| nvidia/nemotron-3-ultra-550b-a55b | Free | Free |
| nvidia/nemotron-3-super-120b-a12b | Free | Free |
| poolside/laguna-xs.2 | Free | Free |
| poolside/laguna-m.1 | Free | Free |

Daily request limits

Free model usage is limited per organization per day:

| Organization | Requests per day |
| --- | --- |
| New organizations | 50 |
| Paying organizations | 200 |

The limit applies across all free models combined and resets daily. Once you hit the limit, requests to free models return a rate limit error until the next reset. Paid models are not affected.
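
To avoid discovering the limit only through failed requests, a client can keep its own count of free-model calls. The following is a minimal sketch, not part of the Requesty API: `FreeModelBudget` is a hypothetical helper, the two limits are copied from the table above, and the midnight-UTC reset is an assumption, since the page only says the limit "resets daily". The gateway remains the source of truth for the real count.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Optional

# Daily limits copied from the table above.
NEW_ORG_LIMIT = 50
PAYING_ORG_LIMIT = 200


@dataclass
class FreeModelBudget:
    """Hypothetical client-side counter shared by all free models."""

    daily_limit: int
    used: int = 0
    day: Optional[date] = None

    def try_consume(self, today: Optional[date] = None) -> bool:
        """Count one free-model request; return False if the local budget is spent."""
        # Assumption: the reset happens at midnight UTC.
        today = today or datetime.now(timezone.utc).date()
        if self.day != today:
            # First request of a new day: start counting from zero again.
            self.day = today
            self.used = 0
        if self.used >= self.daily_limit:
            return False
        self.used += 1
        return True

    @property
    def remaining(self) -> int:
        """Requests left for the day last seen by try_consume."""
        return max(self.daily_limit - self.used, 0)
```

Create one `FreeModelBudget(daily_limit=NEW_ORG_LIMIT)` per organization and call `try_consume()` before each free-model request. Because the limit is shared across all free models, a single counter is enough.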

Usage

Call a free model exactly like any other model on the gateway:
import openai

# Point the standard OpenAI client at the Requesty gateway.
client = openai.OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

# Select a free model by its ID from the table of available free models.
response = client.chat.completions.create(
    model="nvidia/nemotron-3-super-120b-a12b",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
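
Because free and paid models share the same client, a request that hits the daily free limit can be retried against another model. The sketch below is one way to do that, under stated assumptions: `create_with_fallback` and `is_rate_limited` are hypothetical helpers, and the rate limit error is assumed to arrive as HTTP 429, which the `openai` library exposes through the `status_code` attribute of its status errors.

```python
from typing import Any, Callable, Optional, Sequence


def is_rate_limited(exc: Exception) -> bool:
    """Assumption: the gateway signals the daily limit with HTTP 429."""
    # openai.APIStatusError subclasses carry the HTTP status as `status_code`.
    return getattr(exc, "status_code", None) == 429


def create_with_fallback(
    create: Callable[..., Any],
    models: Sequence[str],
    **kwargs: Any,
) -> Any:
    """Try each model in order, moving on only after a rate limit error."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return create(model=model, **kwargs)
        except Exception as exc:
            if not is_rate_limited(exc):
                raise  # Unrelated failures are not masked by the fallback.
            last_error = exc
    if last_error is None:
        raise ValueError("models must not be empty")
    # Every model in the list was rate limited.
    raise last_error
```

Pass `client.chat.completions.create` as `create`, a list with the free model ID first, and the usual `messages=` argument. Since the daily limit applies across all free models combined, the fallback entry should be a paid model ID rather than another free model.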

Last modified on June 10, 2026