⚡ Free Tool

Calculate real LLM API costs
before you ship

Enter your token usage and see the exact cost across every major AI provider. No more spreadsheets.

Token Inputs
? Select the AI model you're using (or planning to use). The comparison table below shows all models with your current token settings.
Choose a model to see its description and pricing details
Quick presets ? Common real-world scenarios. Click one to load its token counts — each preset shows total tokens and an approximate word count.
? Everything you send to the model: your system prompt, conversation history, and user message. Typically 3–6× cheaper than output tokens.
750 words · 1,000 tokens
? The model's response length. Output tokens cost more — the model generates them one at a time. Typical responses: 100–500 tokens.
375 words · 500 tokens
? How many API calls you expect per day. Used to project daily and monthly costs. E.g. 1 for personal use, 100–10,000 for a production app.
Monthly = daily cost × 30 days
Cost per request
Select a model to see cost
per request (1,000 in + 500 out tokens)
Per 1K input tokens
Per 1K output tokens
Daily (100 req)
Monthly estimate

All Models Comparison

Same token usage, every model, side by side
Model · Per Request · Daily (100 req) · Monthly · Context · Value Score · Relative cost · Try it
Loading models…

Understanding LLM API Costs

What is a token?

A token is roughly 4 characters or 0.75 words in English. "Hello world" is 2–3 tokens, depending on the tokenizer. Most models price input and output tokens separately — output tokens typically cost more since the model generates each one.
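The rules of thumb above (≈4 characters or ≈0.75 words per token) can be turned into a quick estimator. This is a rough sketch for ballpark costs only — a real tokenizer (such as OpenAI's tiktoken) gives exact counts, and ratios vary by language and model:

```python
def estimate_tokens_by_chars(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

def estimate_tokens_by_words(text: str) -> int:
    """Alternative estimate using the ~0.75 words per token rule of thumb."""
    words = len(text.split())
    return max(1, round(words / 0.75))

# 750 English words come out to about 1,000 tokens under this heuristic,
# matching the calculator's default input preset.
```

Either heuristic is close enough for budgeting; switch to the provider's tokenizer before relying on the numbers for billing.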

Why are input and output priced differently?

Reading (input) is computationally cheaper than writing (output). Models process your input in parallel, but generate output sequentially — token by token. That's why output tokens cost 3–6× more per token on most APIs.
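Because input and output are priced separately, a per-request cost is just two multiplications. A minimal sketch of the arithmetic the calculator performs, using the $3/M input and $15/M output rates quoted for Claude Sonnet 4.6 below (any per-million prices work the same way):

```python
PER_MILLION = 1_000_000

def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """USD cost of one API call; prices are USD per million tokens."""
    return (input_tokens / PER_MILLION * in_price
            + output_tokens / PER_MILLION * out_price)

def monthly_cost(per_request: float, requests_per_day: int) -> float:
    """Project to a month the same way the calculator does: daily cost x 30."""
    return per_request * requests_per_day * 30

# 1,000 input + 500 output tokens at $3/M in, $15/M out:
per_req = request_cost(1_000, 500, 3.00, 15.00)   # $0.003 in + $0.0075 out = $0.0105
per_month = monthly_cost(per_req, 100)            # $31.50 at 100 requests/day
```

Note that even at a 2:1 input-to-output token ratio, the output side dominates the bill here because of the 5× price gap.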

How accurate are these estimates?

Prices reflect official API pricing as of April 2026. LLM costs have dropped 80%+ since 2025 — models that cost $5–10/M tokens in 2025 now cost under $0.30/M. Prices vary by tier, volume, and region. Batch API discounts (50%) and prompt caching discounts (90%) are not included here.

Which model is best value in 2026?

Budget: Gemini 2.5 Flash-Lite ($0.10/M), DeepSeek V3.2 ($0.14/M), or Llama 3.3 70B ($0.10/M). Production: Claude Sonnet 4.6 ($3/$15) or GPT-4.1 ($2/$8). Reasoning at low cost: DeepSeek R1 ($0.55/M) beats o1/o3 by 80%+. New entrants DeepSeek and Grok Fast have reset the price floor.

Does context length affect cost?

Yes — every token in your context window counts as input. A 10,000-token conversation history adds 10K input tokens to every API call. Use context management techniques (summarization, truncation) to control costs at scale.
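To see how history compounds, consider the input side of a single call with and without a 10,000-token conversation history, at the $3/M input rate quoted above (a sketch; substitute your model's actual rate):

```python
def conversation_input_cost(history_tokens: int, new_message_tokens: int,
                            in_price_per_m: float) -> float:
    """Input cost of one call when the full history is resent every turn."""
    return (history_tokens + new_message_tokens) / 1_000_000 * in_price_per_m

# A 1,000-token message alone vs. the same message on top of 10K of history:
bare = conversation_input_cost(0, 1_000, 3.00)        # $0.003
with_history = conversation_input_cost(10_000, 1_000, 3.00)  # $0.033 — 11x more
```

And that multiplier applies to every subsequent call, which is why summarization and truncation pay off at scale.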

How do I reduce my API costs?

Use prompt caching (Anthropic discounts cached input tokens by 90%). Batch similar requests. Choose the smallest model that meets your quality bar. Truncate conversation history. Stream responses and stop generation early once you have what you need.
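The truncation tactic above can be sketched in a few lines: keep only the most recent messages that fit a token budget. This is a minimal illustration (the message format and the rough 4-characters-per-token estimate are assumptions), not any provider's API:

```python
def truncate_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages whose estimated token total fits the budget.

    `messages` is oldest-first; tokens are estimated with the ~4 chars/token
    rule of thumb. Older messages are dropped first.
    """
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):           # walk newest -> oldest
        tokens = max(1, len(msg) // 4)
        if total + tokens > max_tokens:
            break                            # budget exhausted; drop the rest
        kept.append(msg)
        total += tokens
    return list(reversed(kept))              # restore oldest-first order
```

Production systems often combine this with summarization, replacing the dropped prefix with a short model-written summary so long-running context isn't lost entirely.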