LLM API Cost Calculator

Drag the sliders to size a request, then compare the cost across 18 models from Claude, GPT, Gemini, Grok, DeepSeek, and Mistral — per request and per month, with Batch API and prompt-caching discounts.

Your usage

Prompt caching and batch discounts are modeled as multipliers on the base price. Providers may not let you combine the two on a single request. All figures are estimates only.
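As a rough illustration of the multiplier approach, the sketch below computes a per-request and per-month cost. It is not the calculator's actual code: the `Price` fields, the function names, the `cached_fraction` parameter, and every rate shown are hypothetical placeholders, and real rates vary by provider and model.

```python
from dataclasses import dataclass


@dataclass
class Price:
    """Hypothetical prices in USD per million tokens (placeholders, not real rates)."""
    input_per_mtok: float
    output_per_mtok: float
    cache_read_multiplier: float = 0.1  # assumed: cached input billed at 10% of the input rate
    batch_multiplier: float = 0.5       # assumed: batch requests billed at 50%


def request_cost(price, input_tokens, output_tokens, cached_fraction=0.0, batch=False):
    """Cost of one request, with caching and batch applied as plain multipliers."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = (fresh + cached * price.cache_read_multiplier) * price.input_per_mtok / 1_000_000
    output_cost = output_tokens * price.output_per_mtok / 1_000_000
    total = input_cost + output_cost
    if batch:
        # Assumption: the batch discount stacks on top of caching.
        # A provider may not allow this combination.
        total *= price.batch_multiplier
    return total


def monthly_cost(price, input_tokens, output_tokens, requests_per_day, days=30, **options):
    """Scale the per-request cost to a month of steady traffic."""
    return request_cost(price, input_tokens, output_tokens, **options) * requests_per_day * days


example = Price(input_per_mtok=3.0, output_per_mtok=15.0)  # made-up rates
print(f"per request: ${request_cost(example, 10_000, 1_000):.4f}")
print(f"per month:   ${monthly_cost(example, 10_000, 1_000, requests_per_day=100):.2f}")
```

Keeping the discounts as multipliers makes it easy to toggle them independently, at the price of ignoring details such as cache-write surcharges.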

Cost comparison — 18 models


Estimated API cost per request and per month by model, filterable by provider and sortable by column.

One planned request vs. many follow-ups

Every follow-up re-sends the whole conversation as input. This shows what that costs — and how much prompt caching claws back.

Many follow-ups (no cache)
Many follow-ups (cached)
One planned request

Model: each follow-up turn re-sends the context plus all prior turns. The "one planned request" path sends the context once with everything asked up front, for the same total work. Caching prices the re-sent prefix at the model's cache-read rate.
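The three paths described above can be sketched as follows. This is a simplified reading of the model, not the page's own implementation: the function name, parameters, and prices are hypothetical, each turn is assumed to add one question and one answer of fixed size, and cache-write surcharges are ignored.

```python
def conversation_costs(context, question, answer, turns,
                       in_price, out_price, cache_read_price):
    """Return (no_cache, cached, planned) costs in USD.

    Token counts are per turn; prices are USD per million tokens.
    All names and numbers here are illustrative assumptions.
    """
    per_mtok = 1_000_000
    no_cache = 0.0
    cached = 0.0
    prior = 0  # tokens from earlier questions and answers

    for turn in range(turns):
        prefix = context + prior  # everything that gets re-sent this turn
        output_cost = answer * out_price / per_mtok

        # Without caching, the whole prefix is billed at the full input rate.
        no_cache += (prefix + question) * in_price / per_mtok + output_cost

        if turn == 0:
            # Nothing is cached yet on the first turn.
            cached += (prefix + question) * in_price / per_mtok + output_cost
        else:
            # The re-sent prefix is billed at the cache-read rate.
            cached += (prefix * cache_read_price + question * in_price) / per_mtok + output_cost

        prior += question + answer

    # One planned request: context sent once, all questions asked up front.
    planned = ((context + turns * question) * in_price
               + turns * answer * out_price) / per_mtok
    return no_cache, cached, planned


no_cache, cached, planned = conversation_costs(
    context=10_000, question=100, answer=400, turns=3,
    in_price=3.0, out_price=15.0, cache_read_price=0.3,  # made-up rates
)
print(f"no cache: ${no_cache:.4f}  cached: ${cached:.4f}  planned: ${planned:.4f}")
```

Under these assumptions the uncached input grows roughly quadratically with the number of turns, which is why caching recovers more as conversations get longer.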

Token counter (estimate)

Paste text for a quick token estimate, then push it into the calculator.


Rough heuristic (~4 chars/token). Real tokenization is model-specific — for exact Claude counts use the count_tokens API; OpenAI and Google have their own tokenizers.
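A minimal sketch of the character-based heuristic, assuming the estimate is the character count divided by four and rounded up. The function name and the rounding choice are assumptions; the counter on this page may round differently.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate from character count; not a real tokenizer."""
    if not text:
        return 0
    # Round up so any non-empty text counts as at least one token (an assumption).
    return math.ceil(len(text) / chars_per_token)


sample = "Paste text for a quick token estimate."
print(len(sample), "characters,", estimate_tokens(sample), "est. tokens")
```

The ratio depends on language and content: code and non-English text often tokenize less efficiently than plain English prose, so treat the result as a sizing aid only.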