The battle of the cheapest LLM APIs in 2026. On a balanced workload they tie to the cent — so the real question is which way your tokens lean.
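On the standard 1,500-input / 500-output workload, DeepSeek V3.2 and Gemini 2.5 Flash-Lite are exactly tied at $0.00035 per call, or about $3.50 per 10,000 calls either way. DeepSeek's call costs $0.00021 for input plus $0.00014 for output; Gemini's costs $0.00015 plus $0.00020. That's a genuine coincidence of the math: DeepSeek charges more for input but less for output, Gemini the reverse, and on this particular ratio the differences cancel out.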
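So the winner is decided entirely by your input-to-output ratio. The break-even sits at 3 input tokens per output token, which is exactly the 1,500/500 split: above it Gemini 2.5 Flash-Lite is cheaper, below it DeepSeek V3.2 is.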
| Metric | DeepSeek V3.2 | Gemini 2.5 Flash-Lite |
|---|---|---|
| Vendor | DeepSeek | Google |
| Input price per M tokens | $0.14 | $0.10 |
| Output price per M tokens | $0.28 | $0.40 |
| Context window | 128K | 1M |
| Output/input price ratio | 2.0× | 4.0× |
| Cost per typical call (1,500 in / 500 out) | $0.00035 | $0.00035 |
| Cost per 10,000 calls | $3.50 | $3.50 |
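The tie and the break-even ratio can be checked with a few lines of Python. This is a minimal sketch that uses only the list prices from the table; the names `PRICES`, `cost_per_call`, and `breakeven_input_per_output` are illustrative and not part of either vendor's SDK.

```python
# List prices in USD per million tokens, copied from the table above.
PRICES = {
    "DeepSeek V3.2": {"input": 0.14, "output": 0.28},
    "Gemini 2.5 Flash-Lite": {"input": 0.10, "output": 0.40},
}


def cost_per_call(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call at the table's list prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def breakeven_input_per_output() -> float:
    """Input tokens per output token at which both models cost the same.

    Solving d_in*I + d_out*O == g_in*I + g_out*O for I/O gives
    (g_out - d_out) / (d_in - g_in).
    """
    d = PRICES["DeepSeek V3.2"]
    g = PRICES["Gemini 2.5 Flash-Lite"]
    return (g["output"] - d["output"]) / (d["input"] - g["input"])


if __name__ == "__main__":
    # Balanced, output-heavy, and input-heavy example workloads.
    for tokens_in, tokens_out in [(1500, 500), (500, 2000), (6000, 500)]:
        for model in PRICES:
            cost = cost_per_call(model, tokens_in, tokens_out)
            print(f"{tokens_in:>5} in / {tokens_out:>5} out  {model:<22} ${cost:.6f}")
    print(f"break-even input/output ratio: {breakeven_input_per_output():.2f}")
```

The two extra workloads are made-up examples: 500/2,000 stands in for output-heavy generation and 6,000/500 for input-heavy retrieval.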
Beyond the per-token math, two practical differences break the tie. First, context window: Gemini 2.5 Flash-Lite offers 1M tokens versus DeepSeek's 128K, so for large-document RAG or long agent histories Gemini is the safer fit (and the input-rate edge compounds there). Second, tokenizer efficiency on non-English text: both handle multilingual content better than the GPT-4 family, but if you serve Russian/CJK at scale, test the actual token counts — the cheaper sticker price can lose to a higher token multiplier.
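To see how a token multiplier can overturn the sticker price, here is a rough sketch. The multipliers in it are hypothetical placeholders, not measurements of either tokenizer; replace them with token counts you measure on your own text. The function name `effective_cost` is likewise illustrative.

```python
# List prices in USD per million tokens, from the comparison table.
LIST_PRICES = {
    "DeepSeek V3.2": {"input": 0.14, "output": 0.28},
    "Gemini 2.5 Flash-Lite": {"input": 0.10, "output": 0.40},
}

# HYPOTHETICAL multipliers for illustration only: tokens produced by each
# model's tokenizer divided by tokens from a reference tokenizer on the
# same text. Measure these on your own content before relying on them.
HYPOTHETICAL_MULTIPLIERS = {
    "DeepSeek V3.2": 1.0,
    "Gemini 2.5 Flash-Lite": 1.3,
}


def effective_cost(model: str, baseline_input: int, baseline_output: int,
                   multiplier: float) -> float:
    """USD cost of one call after scaling token counts by `multiplier`."""
    p = LIST_PRICES[model]
    tokens_in = baseline_input * multiplier
    tokens_out = baseline_output * multiplier
    return (tokens_in * p["input"] + tokens_out * p["output"]) / 1_000_000


if __name__ == "__main__":
    # An input-heavy call where Gemini wins on list price alone.
    for model, mult in HYPOTHETICAL_MULTIPLIERS.items():
        cost = effective_cost(model, 6000, 500, mult)
        print(f"{model:<22} multiplier {mult:.1f}  ${cost:.6f}")
```

With equal multipliers Gemini is cheaper on this 6,000/500 call; with the assumed 1.3× inflation it ends up costing more than DeepSeek, which is the effect the paragraph warns about.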
For output-bound jobs — generating articles, code, long structured responses — DeepSeek's market-low $0.28 output rate makes it hard to beat.
Both vendors support prompt caching (cached input bills at roughly 10% of the standard rate) and batch processing (about 50% off for 24-hour-tolerant jobs). At this price tier the absolute savings are tiny per call but still meaningful at scale — and on a batch job, DeepSeek's output rate effectively drops to $0.14 per million.
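The discount arithmetic can be sketched as follows. The 10% and 50% factors are the approximate figures quoted above, not exact vendor rates, and the code assumes the caching and batch discounts stack multiplicatively, which may not hold for either vendor. The function name `discounted_cost` and its parameters are illustrative.

```python
CACHED_INPUT_FACTOR = 0.10  # cached input billed at roughly 10% (approximate)
BATCH_FACTOR = 0.50         # batch jobs billed at about 50% (approximate)


def discounted_cost(input_tokens: int, output_tokens: int,
                    input_price: float, output_price: float,
                    cached_fraction: float = 0.0, batch: bool = False) -> float:
    """USD cost of one call; prices are USD per million tokens.

    cached_fraction is the share of input tokens served from the prompt cache.
    ASSUMPTION: the batch discount applies on top of the caching discount.
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    cost = (
        fresh * input_price
        + cached * input_price * CACHED_INPUT_FACTOR
        + output_tokens * output_price
    ) / 1_000_000
    if batch:
        cost *= BATCH_FACTOR
    return cost


if __name__ == "__main__":
    # One million DeepSeek output tokens on a batch job.
    print(f"${discounted_cost(0, 1_000_000, 0.14, 0.28, batch=True):.2f}")
    # The typical 1,500/500 DeepSeek call with the whole prompt cached.
    print(f"${discounted_cost(1500, 500, 0.14, 0.28, cached_fraction=1.0):.6f}")
```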
This is a tie that your workload breaks. Pick DeepSeek V3.2 for output-heavy generation, Gemini 2.5 Flash-Lite for input-heavy retrieval and long context. Either way you're at the absolute floor of LLM pricing in 2026 — drop your real prompt into the calculator to see which way yours leans.
On a balanced 1,500/500 call they tie at $0.00035. DeepSeek wins output-heavy work ($0.28 vs $0.40 output); Gemini wins input-heavy work ($0.10 vs $0.14 input).
For RAG, pick Gemini 2.5 Flash-Lite: RAG is input-heavy, so its $0.10 input rate wins, and it has a 1M context window versus DeepSeek's 128K.
For generation, pick DeepSeek V3.2: generation is output-heavy, and its $0.28 output rate is the lower of the two.