How Claude Code Prompt Caching Cuts AI Costs by Up to 90% and Boosts Efficiency
Prompt caching in Anthropic's Claude Code avoids reprocessing identical prompt prefixes by looking them up in a prefix-hash cache. This cuts input-token costs by up to 90%, reduces time to first token by as much as 79%, and improves throughput, and the model's output is the same as it would be with no cache at all.
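
To make the idea concrete, here is a minimal sketch of a prefix-hash lookup and the cost arithmetic behind the "up to 90%" figure. It is a toy model, not Anthropic's implementation: the `PrefixCache` class, the `input_cost` helper, and the `CACHE_READ_MULTIPLIER` value are all assumptions chosen for illustration, and the sketch ignores any extra charge for writing a prefix into the cache.

```python
import hashlib

# Assumed for illustration: cached prefix tokens are billed at 10% of the
# normal input rate, which is what a 90% saving on those tokens implies.
CACHE_READ_MULTIPLIER = 0.10


class PrefixCache:
    """Toy prefix-hash cache that remembers which prefixes were already processed."""

    def __init__(self):
        self._seen = set()

    def _key(self, prefix: str) -> str:
        # Identical prefixes hash to the same key; any change produces a new key.
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

    def lookup_or_store(self, prefix: str) -> bool:
        """Return True on a cache hit; on a miss, record the prefix and return False."""
        key = self._key(prefix)
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


def input_cost(prefix_tokens: int, suffix_tokens: int, hit: bool,
               price_per_token: float = 1.0) -> float:
    """Relative input cost of one request; only the prefix is discounted on a hit."""
    prefix_rate = CACHE_READ_MULTIPLIER if hit else 1.0
    return (prefix_tokens * prefix_rate + suffix_tokens) * price_per_token


if __name__ == "__main__":
    cache = PrefixCache()
    system_prompt = "You are a coding assistant. <long project context here>"
    for turn in range(3):
        hit = cache.lookup_or_store(system_prompt)
        cost = input_cost(prefix_tokens=10_000, suffix_tokens=200, hit=hit)
        print(f"turn {turn}: cache hit={hit}, relative input cost={cost:.0f}")
```

Under these assumptions the saving approaches 90% only when the cached prefix dominates the request. The uncached suffix is always billed at the full rate, which is why the headline figure is an upper bound.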
