Throughput

What is throughput?

Throughput is the number of tokens or requests a model API can process per unit of time. It is measured in tokens per second (TPS) for individual requests, or requests per minute (RPM) for aggregate capacity.
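
As a rough illustration of the two units, the sketch below computes TPS for a single request and RPM over a measurement window. The `RequestRecord` shape and both function names are hypothetical, chosen for this example only; they are not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One completed API call (hypothetical record shape, for illustration)."""
    output_tokens: int   # tokens generated in the response
    duration_s: float    # wall-clock seconds from send to final token


def tokens_per_second(record: RequestRecord) -> float:
    """Per-request throughput: generated tokens divided by elapsed seconds."""
    if record.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return record.output_tokens / record.duration_s


def requests_per_minute(completed: int, window_s: float) -> float:
    """Aggregate throughput: completed requests scaled to a 60-second window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return completed * 60.0 / window_s


if __name__ == "__main__":
    # Made-up measurements, not benchmark results.
    single = RequestRecord(output_tokens=500, duration_s=4.0)
    print(tokens_per_second(single))        # 500 tokens over 4 s
    print(requests_per_minute(30, 15.0))    # 30 requests completed in 15 s
```

The two figures answer different questions: TPS describes how fast one response streams back, while RPM describes how many requests the service completes across all callers in a given period.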

Different providers optimise for different throughput profiles. Groq’s custom hardware delivers extremely high TPS for individual requests. OpenAI and Anthropic optimise for high concurrent RPM across many users.

Why it matters

Throughput determines whether your application can scale. A model that is cheap but slow may cost more in practice than a faster, pricier alternative, because users wait longer for each response and your infrastructure sits idle while it waits. sourc.dev tracks speed (TPS) as a verified attribute where available.
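
The cost trade-off can be sketched with simple arithmetic. The example below assumes a deliberately simplified model in which each request pays the API price for its tokens plus a per-second cost for the infrastructure held open while the response is generated. The function name, the prices, the TPS figures, and the per-second infrastructure cost are all invented for illustration and do not describe any real provider.

```python
def effective_cost(tokens: int, price_per_million: float,
                   tps: float, infra_cost_per_s: float) -> float:
    """Estimated total cost of one request under a simplified model.

    Assumption: total = API token cost + (generation time * infra cost/second).
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    api_cost = tokens * price_per_million / 1_000_000
    wait_s = tokens / tps                # seconds spent generating the response
    return api_cost + wait_s * infra_cost_per_s


if __name__ == "__main__":
    TOKENS = 1_000
    INFRA = 0.0001  # hypothetical cost per second of a worker waiting

    # Hypothetical "cheap but slow" and "pricier but fast" models.
    slow = effective_cost(TOKENS, price_per_million=0.50, tps=20, infra_cost_per_s=INFRA)
    fast = effective_cost(TOKENS, price_per_million=2.00, tps=200, infra_cost_per_s=INFRA)

    print(f"slow model: {slow:.4f}")
    print(f"fast model: {fast:.4f}")
    print("fast model is cheaper overall" if fast < slow else "slow model is cheaper overall")
```

With these invented inputs the slower model costs more per request despite its lower token price, because the waiting time dominates. Whether that holds in practice depends entirely on your real prices, measured TPS, and what idle infrastructure actually costs you.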

Related terms