
Token economics: why every word costs money

How AI tokens are priced, what affects your API costs, and the single change that reduces spend by 60% without changing your output quality.

You have a 100-page document. You want the model to help you with page 47. So you paste all 100 pages. The model reads all 100 pages. You pay for all 100 pages. Every call.

A 100-page document is roughly 25,000 tokens. At $3.00 per million input tokens, that is $0.075 per call. Run that 10,000 times — $750. For pages the model never needed.

Now do it right. Send only the relevant section. 2 pages. 500 tokens. Same rate. 10,000 times — $15. Same answer. 98% cheaper. The model did not need the other 98 pages.
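The arithmetic above is worth checking yourself. A minimal sketch, using the article's assumed rate of $3.00 per million input tokens (not a live price):

```python
def call_cost(tokens: int, rate_per_million: float = 3.00) -> float:
    """Input cost of one call, in dollars, at a per-million-token rate."""
    return tokens * rate_per_million / 1_000_000

full = call_cost(25_000)   # whole document: ~$0.075 per call
trimmed = call_cost(500)   # relevant section only: ~$0.0015 per call

print(f"10,000 calls, full doc: ${full * 10_000:,.2f}")    # $750.00
print(f"10,000 calls, trimmed:  ${trimmed * 10_000:,.2f}")  # $15.00
print(f"savings: {1 - trimmed / full:.0%}")                 # 98%
```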

What a token actually is

A token is the smallest chunk of text a language model processes. Not always a full word — sometimes part of one, sometimes punctuation on its own. In English, 100 tokens is roughly 75 words. Every model on sourc.dev is priced in tokens — both for what you send (input) and what the model generates (output).

The number that makes this real: a typical API call — a 500-word prompt, a 300-word response — costs roughly $0.003. One third of a cent. At scale, it is the number your budget lives or dies by.
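You can reproduce that number from the 100-tokens-per-75-words rule of thumb. A rough sketch, assuming both input and output are billed at $3.00 per million tokens (real output rates are usually higher, as the next section covers):

```python
def words_to_tokens(words: int) -> int:
    """Rough English estimate: 100 tokens is about 75 words."""
    return round(words * 100 / 75)

def estimate_cost(prompt_words: int, response_words: int,
                  in_rate: float = 3.00, out_rate: float = 3.00) -> float:
    """Dollar cost of one call; rates are per million tokens (assumed, not live prices)."""
    return (words_to_tokens(prompt_words) * in_rate
            + words_to_tokens(response_words) * out_rate) / 1_000_000

# The 'typical call' above: 500-word prompt, 300-word response
print(f"${estimate_cost(500, 300):.4f}")  # $0.0032
```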

The multiplier effect

Input and output have different prices. Output typically costs 3–5x more than input. This means a model that writes verbose responses is more expensive than one that writes concise ones — even if the quality is identical.

The practical implication: if you are asking the model to summarise something, instruct it to be concise. “Summarise in 3 sentences” costs less than “Summarise this” — because the output is shorter.
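Here is what the output premium does to that instruction. A sketch with illustrative rates of $3.00 in and $15.00 out per million tokens (the 5x end of the range above, not any specific provider's prices), and made-up response lengths:

```python
IN_RATE, OUT_RATE = 3.00, 15.00   # $/million tokens; illustrative 5x output premium

def cost(in_tokens: int, out_tokens: int) -> float:
    return (in_tokens * IN_RATE + out_tokens * OUT_RATE) / 1_000_000

prompt = 700                              # same prompt either way
verbose = cost(prompt, out_tokens=600)    # "Summarise this"
concise = cost(prompt, out_tokens=100)    # "Summarise in 3 sentences"

print(f"verbose: ${verbose:.4f}  concise: ${concise:.4f}")  # $0.0111 vs $0.0036
print(f"saved: {1 - concise / verbose:.0%}")                # 68%
```

The prompt is identical in both cases; the entire saving comes from the output side, where each token costs five times as much.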

The multilingual factor

Tokens are not equal across languages. English runs at roughly one token per word. Finnish, Turkish, and Arabic tokenise 40–60% less efficiently — the same meaning costs significantly more tokens. If you are building for multilingual users, this is not a footnote. It is a line in your cost model.
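Putting that line in your cost model can be as simple as a per-language multiplier. A sketch — the multipliers below are illustrative values derived from the 40–60% range above, not measured tokenizer statistics:

```python
# Illustrative tokens-per-word multipliers relative to English,
# based on the 40-60% efficiency gap described above (not measured data).
MULTIPLIER = {"english": 1.0, "finnish": 1.5, "turkish": 1.5, "arabic": 1.6}

def monthly_input_cost(words_per_month: int, language: str,
                       rate_per_million: float = 3.00) -> float:
    """Input spend for a month of traffic in one language."""
    tokens = words_per_month * (100 / 75) * MULTIPLIER[language]
    return tokens * rate_per_million / 1_000_000

for lang in MULTIPLIER:
    print(f"{lang:8s} ${monthly_input_cost(10_000_000, lang):.2f}")
```

Same traffic, same meaning: the Arabic line comes out 60% more expensive than the English one, purely from tokenisation.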

The one change that matters most

Send only what the model needs to answer the question. If you are asking about a function on line 240, send that function — not the whole file. If your system prompt repeats on every call, every unnecessary word in it costs you on every request, forever. Trim it once and save across a million calls.

Precise beats thorough. Every time.
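The system-prompt point is easy to quantify. A sketch with made-up token counts and the article's assumed $3.00-per-million input rate:

```python
def repeated_prompt_cost(tokens: int, calls: int, rate: float = 3.00) -> float:
    """Total cost of a system prompt that rides along on every call."""
    return tokens * calls * rate / 1_000_000

before = repeated_prompt_cost(1_200, calls=1_000_000)  # wordy system prompt
after = repeated_prompt_cost(400, calls=1_000_000)     # trimmed once
print(f"${before:,.2f} -> ${after:,.2f}, saved ${before - after:,.2f}")
# $3,600.00 -> $1,200.00, saved $2,400.00
```

One editing pass, no per-request work, and the saving compounds on every call for as long as the prompt is in production.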

Start here: open your logs and find your most frequent API call. How many tokens is the average request? That single number tells you where your costs are going.
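If your logs are structured, that average is a few lines of code. A sketch assuming a JSONL log where each line carries a `prompt_tokens` field — a hypothetical schema, so adapt the field name to whatever your logging actually records:

```python
import json

def average_prompt_tokens(log_path: str) -> float:
    """Mean input tokens per request from a JSONL log.

    Assumes each line is a JSON object with a 'prompt_tokens' field
    (hypothetical schema -- adjust to your own log format)."""
    total = count = 0
    with open(log_path) as f:
        for line in f:
            record = json.loads(line)
            total += record["prompt_tokens"]
            count += 1
    return total / count if count else 0.0
```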

This article is part of a growing knowledge track. More depth, examples, and detail added continuously.