The $20 Arbitrage: Claude Code + DeepSeek V4 Pro

My Claude Pro subscription lapsed a few weeks ago. I haven't renewed it: not because I stopped using an agentic workflow, but because API-first is now cheaper, and I wanted to see if the switch would actually stick.

It did.

When Claude Code's source landed in a public repo this March, DeepSeek shipped an Anthropic-compatible API endpoint within days. Set ANTHROPIC_BASE_URL to DeepSeek's endpoint, drop in your key, and the official Claude Code binary routes to V4 Pro. No forks, no wrapper scripts.
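The whole setup is two environment variables. A minimal sketch, assuming DeepSeek keeps the Anthropic-compatible path at `/anthropic` (verify the URL against their current API docs):

```shell
# Route the official Claude Code binary to DeepSeek's
# Anthropic-compatible endpoint. The URL below is an assumption;
# check DeepSeek's docs for the current one.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="<your DeepSeek API key>"
# Then launch `claude` as usual; requests now hit V4 Pro.
```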

Then V4 Pro's benchmarks dropped: 93.5% on LiveCodeBench — above Gemini 3.1 Pro (91.7%) and Claude Opus 4.6 (88.8%). So what does $20 actually buy you?


Three things that make the API workflow better

1. Prompt caching kills the iteration tax

Once your repo context is warm in a session, subsequent turns cost 1/10th of the input rate — fractions of a cent per million tokens. You can run grep, refactor, and test loops without watching a quota bar.
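The effect compounds over a session. A rough sketch of the math, using the promo input rate quoted later in this article and the 1/10th cache-hit rate described above (turn sizes are illustrative assumptions):

```python
# Illustrative cost of an iterative agent session with prompt caching.
# INPUT_RATE is the promo price from this article ($0.435/1M input);
# cache hits are billed at 1/10th of that, per the text above.
INPUT_RATE = 0.435 / 1_000_000      # $ per input token
CACHE_HIT_RATE = INPUT_RATE / 10    # warm repo context

def turn_cost(context_tokens: int, new_tokens: int, warm: bool) -> float:
    """Cost of one turn: repo context (maybe cached) plus new prompt."""
    ctx_rate = CACHE_HIT_RATE if warm else INPUT_RATE
    return context_tokens * ctx_rate + new_tokens * INPUT_RATE

# Assumed shape: 50K-token repo context, 1K of new prompt per turn.
cold = turn_cost(50_000, 1_000, warm=False)              # first turn
warm = sum(turn_cost(50_000, 1_000, warm=True) for _ in range(19))
print(f"first turn: ${cold:.4f}, next 19 turns combined: ${warm:.4f}")
```

Twenty turns of grinding on a warm 50K context cost about seven cents total; the first, uncached turn is a third of that on its own.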

2. The agent can actually grind

Claude Code runs shell commands, executes tests, and manages state between turns. When something breaks, the difference is stark: failing a test five times on Claude Pro eats a meaningful chunk of your daily limit. Running 50 high-reasoning iterations on a race condition costs less than a cent with V4.

3. 1M context that actually retrieves

V4 Pro uses hybrid attention (Compressed Sparse Attention + Highly Compressed Attention) to reach 1M tokens at roughly 10% of the KV cache cost of V3.2. Needle-in-a-haystack retrieval on MRCR 1M benchmarks at 83.5%. In practice, you can dump a full iOS project into context and it won't confuse a protocol from one module with a similarly named one from another.
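To see why the 10% figure matters, a back-of-envelope estimate of the KV cache at 1M tokens. The layer and head dimensions below are placeholder assumptions for a large MoE model, not published V4 Pro specs; only the 10% ratio comes from the text:

```python
# Rough KV cache footprint at 1M tokens: dense attention vs. the
# ~10%-of-cost hybrid attention cited for V4 Pro. Layer/head sizes
# are assumed for illustration, NOT actual V4 Pro architecture.
def kv_cache_gib(tokens, layers, kv_heads, head_dim, bytes_per=2):
    # 2x for the separate K and V tensors; bf16 = 2 bytes per element
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per / 2**30

dense = kv_cache_gib(1_000_000, layers=60, kv_heads=8, head_dim=128)
hybrid = dense * 0.10  # the ~10% ratio from the text
print(f"dense: {dense:.0f} GiB, hybrid: {hybrid:.0f} GiB")
```

Under these assumptions a dense 1M-token cache is hundreds of GiB; cutting it by 10x is the difference between "multi-node deployment" and "fits next to the weights."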


$20 — Pick your fighter

Input token budget — $20, spent wisely or not

DeepSeek V4  ███████████████████████  46.0M  ($0.435/1M)
Sonnet 4.6   ███░░░░░░░░░░░░░░░░░░░░   6.7M  ($3.00/1M)
Opus 4.6     ██░░░░░░░░░░░░░░░░░░░░░   4.0M  ($5.00/1M)
Claude Pro   ░░░░░░░░░░░░░░░░░░░░░░░   "generating too fast, slow down"

                                       Each █ ≈ 2M tokens

Assumptions: 50K tokens per full iOS project load, 5K per PR review, 10K per agent fix-and-retry loop.

                              Sonnet 4.6   Opus 4.6   DeepSeek V4 Pro
Price / 1M input              $3.00        $5.00      $0.435
SWE-Bench Verified            79.6%        87.6%      80.6%
GPQA Diamond                  74.1%        94.2%      90.1%
Reasoning depth               Adaptive     Extended   Non-think / Think High / Think Max
Full project load (50K tok)   133          80         920
PR review (5K tok)            1,333        800        9,200
Fix-retry loop (10K tok)      667          400        4,600
"What is this fn?" (2K tok)   3,333        2,000      23,000

DeepSeek gives you roughly 7x more tasks per dollar than Sonnet 4.6 and 11x more than Opus 4.6.
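The table's task counts can be re-derived from first principles; the results land within rounding of the figures above (the article rounds the DeepSeek budget to 46.0M tokens):

```python
# Re-derive the tasks-per-$20 table from the per-token prices.
# Prices ($/1M input) are this article's numbers, DeepSeek at promo rate.
BUDGET = 20.0
price_per_1m = {"DeepSeek V4 Pro": 0.435, "Sonnet 4.6": 3.00, "Opus 4.6": 5.00}
task_tokens = {"project load": 50_000, "PR review": 5_000,
               "fix-retry loop": 10_000, "quick question": 2_000}

tasks = {model: {t: int(BUDGET / price * 1e6 / tok)
                 for t, tok in task_tokens.items()}
         for model, price in price_per_1m.items()}

for model, counts in tasks.items():
    print(model, counts)
```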


Regular pricing: what changes after May 31

When the 75% promo ends, the math shifts but the story holds:

Model                Tokens per $1 (input)   Tokens per $1 (output)
DeepSeek V4 Flash    ~7.1M                   ~3.6M
DeepSeek V4 Pro      ~575K                   ~287K
GPT-4o               ~400K                   ~100K
Claude Sonnet 4      ~333K                   ~67K
Claude Opus 4        ~67K                    ~13K

A typical Claude Code session (~100K tokens in, ~20K out):

            Promo price   Regular price
V4 Pro      ~$0.06        ~$0.24
Sonnet 4    n/a           ~$0.60
V4 Flash    n/a           ~$0.02

At regular price, V4 Pro is still roughly 2.5x cheaper per session than Sonnet, and Flash exists as a near-free fallback for simple tasks. The gap gets smaller, but it doesn't close.

Promo pricing: $0.435/1M input, $0.87/1M output — 75% discount active through May 31, 2026. Regular price: $1.74 / $3.48.
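The session figures follow directly from the rates. A quick check, using this article's prices (Sonnet's $15/1M output rate is inferred from the ~67K output tokens per dollar above):

```python
# Cost of the "typical session" (100K tokens in, 20K out) at each
# price point. Rates are $/1M tokens, taken from this article.
def session_cost(in_rate, out_rate, tokens_in=100_000, tokens_out=20_000):
    return (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000

promo = session_cost(0.435, 0.87)     # V4 Pro, 75% promo
regular = session_cost(1.74, 3.48)    # V4 Pro, list price
sonnet = session_cost(3.00, 15.00)    # Sonnet 4, list price
print(f"promo ${promo:.2f}, regular ${regular:.2f}, Sonnet ${sonnet:.2f}")
```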


The subscription model made sense when the browser UI was the only option. It doesn't anymore. You get a stronger model with 5x the context, you stop worrying about cooldowns mid-refactor, and you bank the $240/year. For anyone spending real hours in an agentic workflow, the switch is overdue.