2025-01-01

Question of the Day

Question of the day · 2026-05-02

One question per day to look beyond the headlines.

Where do DeepSeek V4-Pro’s price cuts actually come from: cheaper input tokens, cache hits, or MoE routing efficiency?

Take-away: DeepSeek’s cuts come from pricing the ingestion path separately. Cheap prompt tokens plus a deeply discounted cache-hit tier monetize reuse; the savings are not attributed to MoE routing.

DeepSeek V4-Pro’s price cuts come primarily from cheaper input tokens and cache hits. DeepSeek has reduced API prices by up to 90%, concentrating the cuts on input prompts and cache hits, which lowers the cost per million tokens [2]. In addition, the input cache-hit tier has been cut to one-tenth of the list price, layered on top of a prior 75% discount, which underlines a strategy of lowering prices through cheaper input tokens and discounted cache hits [1], [3].
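To make the arithmetic concrete, here is a minimal sketch of how the two input-side levers could combine. It assumes one possible reading of the figures above: the 75% discount applies to all input tokens, and cache hits are then billed at one-tenth of that rate. The function name, the parameter names, and the list price of 1.0 are hypothetical placeholders, not DeepSeek's published rates.

```python
def blended_input_price(list_price, hit_rate,
                        prior_discount=0.75, cache_hit_factor=0.10):
    """Effective price per million input tokens for a given cache-hit rate.

    Assumption (not confirmed by the sources): the prior discount applies
    to every input token, and cache hits cost cache_hit_factor times that.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Price of a cache-miss input token after the prior discount.
    miss_price = list_price * (1.0 - prior_discount)
    # Cache hits are billed at a fraction of the discounted rate.
    hit_price = miss_price * cache_hit_factor
    # Weighted average over the share of tokens served from cache.
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price


if __name__ == "__main__":
    # Hypothetical list price of 1.0 per million input tokens.
    for rate in (0.0, 0.5, 0.8, 1.0):
        price = blended_input_price(1.0, rate)
        print(f"hit rate {rate:.0%}: {price:.4f} per million input tokens")
```

Under these assumptions, the effective input price falls as the cache-hit rate rises, which is why workloads that resend the same prompt prefix benefit most. None of the terms depends on model routing.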

Sources · 2026-05-03