⚡ Free Tool

Calculate real LLM API costs
before you ship

Enter your token usage and see the exact cost across every major AI provider. No more spreadsheets.

Token Inputs
? Select the AI model you're using (or planning to use). The comparison table below shows all models with your current token settings.
Choose a model to see its description and pricing details
Quick presets ? Common real-world scenarios. Click one to load its token counts — each preset shows total tokens and an approximate word count.
? Everything you send to the model: your system prompt, conversation history, and user message. Typically 3–6× cheaper than output tokens.
750 words · 1,000 tokens
? The model's response length. Output tokens cost more — the model generates them one at a time. Typical responses: 100–500 tokens.
375 words · 500 tokens
? How many API calls you expect per day. Used to project daily and monthly costs. E.g. 1 for personal use, 100–10,000 for a production app.
Monthly = daily cost × 30 days
Cost per request
Select a model to see cost
per request (1,000 in + 500 out tokens)
Per 1K input tokens
Per 1K output tokens
Daily (100 req)
Monthly estimate

All Models Comparison

Same token usage, every model, side by side
Model · Per Request · Daily (100 req) · Monthly · Context · Value Score · Relative cost · Try it
Loading models…

Understanding LLM API Costs

What is a token?

A token is roughly 4 characters or 0.75 words in English. "Hello world" is 2–3 tokens, depending on the tokenizer. Most models price input and output tokens separately — output tokens typically cost more since the model generates each one.
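The rules of thumb above (≈4 characters or ≈0.75 words per token) can be turned into a quick estimator. This is a rough sketch for ballpark costs only — a real tokenizer (such as OpenAI's tiktoken) gives exact counts, and ratios vary by language and model:

```python
def estimate_tokens_by_chars(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

def estimate_tokens_by_words(text: str) -> int:
    """Alternative estimate using the ~0.75 words per token rule of thumb."""
    words = len(text.split())
    return max(1, round(words / 0.75))

# 750 English words come out to about 1,000 tokens under this heuristic,
# matching the calculator's default input preset.
```

Either heuristic is close enough for budgeting; switch to the provider's tokenizer before relying on the numbers for billing.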

Why are input and output priced differently?

Reading (input) is computationally cheaper than writing (output). Models process your input in parallel, but generate output sequentially — token by token. That's why output tokens cost 3–6× more per token on most APIs.
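Because input and output are priced separately, a per-request cost is just two multiplications. A minimal sketch of the arithmetic the calculator performs, using the $3/M input and $15/M output rates quoted for Claude Sonnet 4.6 below (any per-million prices work the same way):

```python
PER_MILLION = 1_000_000

def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """USD cost of one API call; prices are USD per million tokens."""
    return (input_tokens / PER_MILLION * in_price
            + output_tokens / PER_MILLION * out_price)

def monthly_cost(per_request: float, requests_per_day: int) -> float:
    """Project to a month the same way the calculator does: daily cost x 30."""
    return per_request * requests_per_day * 30

# 1,000 input + 500 output tokens at $3/M in, $15/M out:
per_req = request_cost(1_000, 500, 3.00, 15.00)   # $0.003 in + $0.0075 out = $0.0105
per_month = monthly_cost(per_req, 100)            # $31.50 at 100 requests/day
```

Note that even at a 2:1 input-to-output token ratio, the output side dominates the bill here because of the 5× price gap.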

How accurate are these estimates?

Prices reflect official API pricing as of April 2026. LLM costs have dropped 80%+ since 2025 — models that cost $5–10/M tokens in 2025 now cost under $0.30/M. Prices vary by tier, volume, and region. Batch API discounts (50%) and prompt caching discounts (90%) are not included here.

Which model is best value in 2026?

Budget: Gemini 2.5 Flash-Lite ($0.10/M), DeepSeek V3.2 ($0.14/M), or Llama 3.3 70B ($0.10/M). Production: Claude Sonnet 4.6 ($3/$15) or GPT-4.1 ($2/$8). Reasoning at low cost: DeepSeek R1 ($0.55/M) beats o1/o3 by 80%+. New entrants DeepSeek and Grok Fast have reset the price floor.

Does context length affect cost?

Yes — every token in your context window counts as input. A 10,000-token conversation history adds 10K input tokens to every API call. Use context management techniques (summarization, truncation) to control costs at scale.
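To see how history compounds, consider the input side of a single call with and without a 10,000-token conversation history, at the $3/M input rate quoted above (a sketch; substitute your model's actual rate):

```python
def conversation_input_cost(history_tokens: int, new_message_tokens: int,
                            in_price_per_m: float) -> float:
    """Input cost of one call when the full history is resent every turn."""
    return (history_tokens + new_message_tokens) / 1_000_000 * in_price_per_m

# A 1,000-token message alone vs. the same message on top of 10K of history:
bare = conversation_input_cost(0, 1_000, 3.00)        # $0.003
with_history = conversation_input_cost(10_000, 1_000, 3.00)  # $0.033 — 11x more
```

And that multiplier applies to every subsequent call, which is why summarization and truncation pay off at scale.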

How do I reduce my API costs?

Use prompt caching (Anthropic discounts cached input tokens by 90%). Batch similar requests. Choose the smallest model that meets your quality bar. Truncate conversation history. Stream responses and stop generation early once you have what you need.
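The truncation tactic above can be sketched in a few lines: keep only the most recent messages that fit a token budget. This is a minimal illustration (the message format and the rough 4-characters-per-token estimate are assumptions), not any provider's API:

```python
def truncate_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages whose estimated token total fits the budget.

    `messages` is oldest-first; tokens are estimated with the ~4 chars/token
    rule of thumb. Older messages are dropped first.
    """
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):           # walk newest -> oldest
        tokens = max(1, len(msg) // 4)
        if total + tokens > max_tokens:
            break                            # budget exhausted; drop the rest
        kept.append(msg)
        total += tokens
    return list(reversed(kept))              # restore oldest-first order
```

Production systems often combine this with summarization, replacing the dropped prefix with a short model-written summary so long-running context isn't lost entirely.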