DeepSeek · KV Cache
80M Tokens for 4 RMB: DeepSeek Disk Cache Rewrites LLM Inference Costs
DeepSeek's novel architecture enables disk-level caching, slashing API costs by 10x. This signals that LLM inference is shifting from raw compute to engineering.
May 4 · 2 min read