The $20 Arbitrage: Claude Code + DeepSeek V4 Pro
By Brahim El mssilha (@siempay)
My Claude Pro subscription lapsed a few weeks ago. I haven't renewed it, not because I stopped using an agentic workflow, but because API-first is now cheaper, and I wanted to see whether the switch would actually stick.
It did.
When Claude Code's source landed in a public repo this March, DeepSeek shipped an Anthropic-compatible API endpoint within days. Set ANTHROPIC_BASE_URL to DeepSeek's endpoint, drop in your key, and the official Claude Code binary routes to V4 Pro. No forks, no wrapper scripts.
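Concretely, the swap is two environment variables. A minimal sketch, assuming the endpoint path below (confirm the exact URL in DeepSeek's API docs):

```shell
# Point the official Claude Code binary at DeepSeek's
# Anthropic-compatible endpoint. The URL is illustrative --
# check DeepSeek's docs for the exact path.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_API_KEY="sk-..."   # your DeepSeek key, not an Anthropic one

claude   # the stock binary now routes every request to V4 Pro
```

No forks, no wrapper scripts: the binary only cares that the endpoint speaks the Anthropic Messages API.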
Then V4 Pro's benchmarks dropped: 93.5% on LiveCodeBench — above Gemini 3.1 Pro (91.7%) and Claude Opus 4.6 (88.8%). So what does $20 actually buy you?
Three things that make the API workflow better
1. Prompt caching kills the iteration tax
Once your repo context is warm in a session, subsequent turns cost 1/10th of the input rate — fractions of a cent per million tokens. You can run grep, refactor, and test loops without watching a quota bar.
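To put numbers on that iteration tax: a sketch assuming the promo input rate ($0.435/1M) and the 1/10th cache-hit rate above, with an illustrative 100K-token warm context over 20 turns:

```shell
# Cost of a 20-turn loop over a 100K-token repo context,
# with and without prompt caching (rates per the post).
awk 'BEGIN {
  rate = 0.435 / 1e6          # $/input token, cache miss (promo)
  hit  = rate / 10            # $/input token, cache hit
  ctx  = 100e3                # warm repo context, tokens
  turns = 20
  cold = turns * ctx * rate                       # re-billed in full every turn
  warm = ctx * rate + (turns - 1) * ctx * hit     # paid once, then cached
  printf "cold: $%.2f  warm: $%.2f\n", cold, warm
}'
# → cold: $0.87  warm: $0.13
```

Pennies either way at promo rates, but the same ratio holds at regular pricing, where the cold path starts to sting.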
2. The agent can actually grind
Claude Code runs shell commands, executes tests, and manages state between turns. When something breaks, the difference is stark: failing a test five times on Claude Pro eats a meaningful chunk of your daily limit. Running 50 high-reasoning iterations on a race condition costs less than a cent with V4.
3. 1M context that actually retrieves
V4 Pro uses hybrid attention (Compressed Sparse Attention + Highly Compressed Attention) to reach 1M tokens at roughly 10% of the KV cache cost of V3.2. Needle-in-a-haystack retrieval on MRCR 1M benchmarks at 83.5%. In practice, you can dump a full iOS project into context and it won't confuse a protocol from one module with a similarly named one from another.
$20 — Pick your fighter
Input token budget — $20, spent wisely or not
DeepSeek V4 ███████████████████████ 46.0M ($0.435/1M)
Sonnet 4.6 ███░░░░░░░░░░░░░░░░░░░░ 6.7M ($3.00/1M)
Opus 4.6 ██░░░░░░░░░░░░░░░░░░░░░ 4.0M ($5.00/1M)
Claude Pro ░░░░░░░░░░░░░░░░░░░░░░░ "generating too fast, slow down"
Each █ ≈ 2M tokens
Assumptions: 50K tokens per full iOS project load, 5K per PR review, 10K per agent fix-and-retry loop.
| | Sonnet 4.6 | Opus 4.6 | DeepSeek V4 Pro |
|---|---|---|---|
| Price / 1M input | $3.00 | $5.00 | $0.435 |
| SWE-Bench Verified | 79.6% | 87.6% | 80.6% |
| GPQA Diamond | 74.1% | 94.2% | 90.1% |
| Reasoning depth | Adaptive | Extended | Non-think / Think High / Think Max |
| Full project load (50K tok) | 133 | 80 | 920 |
| PR review (5K tok) | 1,333 | 800 | 9,200 |
| Fix-retry loop (10K tok) | 667 | 400 | 4,600 |
| "What is this fn?" (2K tok) | 3,333 | 2,000 | 23,000 |
DeepSeek gives you roughly 7x more tasks per dollar than Sonnet 4.6 and 11x more than Opus 4.6.
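The task counts above are just budget divided by per-task cost. A sketch reproducing the "full project load" row from the table's rates and the 50K-token assumption:

```shell
# Tasks per $20 at 50K tokens/task, using the table's input rates.
awk 'BEGIN {
  budget = 20
  split("3.00 5.00 0.435", price)     # $/1M input: Sonnet, Opus, V4 Pro
  split("Sonnet Opus V4Pro", name)
  task = 50e3                          # tokens per full project load
  for (i = 1; i <= 3; i++)
    printf "%s: %.0f loads\n", name[i], budget / (price[i] / 1e6 * task)
}'
# → Sonnet: 133 loads / Opus: 80 loads / V4Pro: 920 loads
```

Scale `task` down to 5K or 2K and you recover the PR-review and "what is this fn?" rows the same way.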
Regular pricing: what changes after May 31
When the 75% promo ends, the math shifts but the story holds:
| Model | Tokens per $1 (input) | Tokens per $1 (output) |
|---|---|---|
| DeepSeek V4 Flash | ~7.1M | ~3.6M |
| DeepSeek V4 Pro | ~575K | ~287K |
| GPT-4o | ~400K | ~100K |
| Claude Sonnet 4 | ~333K | ~67K |
| Claude Opus 4 | ~67K | ~13K |
A typical Claude Code session (~100K tokens in, ~20K out):
| | Promo price | Regular price |
|---|---|---|
| V4 Pro | ~$0.06 | ~$0.24 |
| Sonnet 4 | — | ~$0.60 |
| V4 Flash | — | ~$0.02 |
At regular price, V4 Pro is still 2.5x cheaper per session than Sonnet, and Flash exists as a near-free fallback for simple tasks. The gap gets smaller, but it doesn't close.
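The session figures follow directly from the per-token rates. A sketch at regular pricing, assuming the $15/1M Sonnet output rate implied by the ~67K tokens-per-dollar figure above:

```shell
# Cost of a ~100K-in / ~20K-out session at regular rates.
awk 'BEGIN {
  tin = 100e3; tout = 20e3
  printf "V4 Pro:   $%.2f\n", (tin * 1.74 + tout * 3.48) / 1e6
  printf "Sonnet 4: $%.2f\n", (tin * 3.00 + tout * 15.00) / 1e6
}'
# → V4 Pro: $0.24 / Sonnet 4: $0.60
```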
⚡ Promo pricing: $0.435/1M input, $0.87/1M output — 75% discount active through May 31, 2026. Regular price: $1.74 / $3.48.
The subscription model made sense when the browser UI was the only option. It doesn't anymore. You get a stronger model with 5x the context, you stop worrying about cooldowns mid-refactor, and you bank the $240/year. For anyone spending real hours in an agentic workflow, the switch is overdue.