The $20 Arbitrage: Claude Code + DeepSeek V4 Pro
By Brahim El mssilha (@siempay)
My Claude Pro subscription lapsed a few weeks ago. I haven't renewed it, not because I stopped using an agentic workflow, but because API-first is now cheaper, and I wanted to see whether the switch would actually stick.
It did.
When Claude Code's source landed in a public repo this March, DeepSeek shipped an Anthropic-compatible API endpoint within days. Set ANTHROPIC_BASE_URL to DeepSeek's endpoint, drop in your key, and the official Claude Code binary routes to V4 Pro. No forks, no wrapper scripts.
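Concretely, the swap is two environment variables. A minimal sketch, assuming the endpoint path below (confirm the exact URL in DeepSeek's API docs):

```shell
# Point the official Claude Code binary at DeepSeek's
# Anthropic-compatible endpoint. The URL is illustrative --
# check DeepSeek's docs for the exact path.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_API_KEY="sk-..."   # your DeepSeek key, not an Anthropic one

claude   # the stock binary now routes every request to V4 Pro
```

No forks, no wrapper scripts: the binary only cares that the endpoint speaks the Anthropic Messages API.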
Then V4 Pro's benchmarks dropped: 93.5% on LiveCodeBench — above Gemini 3.1 Pro (91.7%) and Claude Opus 4.6 (88.8%). So what does $20 actually buy you?
Three things that make the API workflow better
1. Prompt caching kills the iteration tax
Once your repo context is warm in a session, subsequent turns cost 1/10th of the input rate — fractions of a cent per million tokens. You can run grep, refactor, and test loops without watching a quota bar.
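To put numbers on that iteration tax: a sketch assuming the promo input rate ($0.435/1M) and the 1/10th cache-hit rate above, with an illustrative 100K-token warm context over 20 turns:

```shell
# Cost of a 20-turn loop over a 100K-token repo context,
# with and without prompt caching (rates per the post).
awk 'BEGIN {
  rate = 0.435 / 1e6          # $/input token, cache miss (promo)
  hit  = rate / 10            # $/input token, cache hit
  ctx  = 100e3                # warm repo context, tokens
  turns = 20
  cold = turns * ctx * rate                       # re-billed in full every turn
  warm = ctx * rate + (turns - 1) * ctx * hit     # paid once, then cached
  printf "cold: $%.2f  warm: $%.2f\n", cold, warm
}'
# → cold: $0.87  warm: $0.13
```

Pennies either way at promo rates, but the same ratio holds at regular pricing, where the cold path starts to sting.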
2. The agent can actually grind
Claude Code runs shell commands, executes tests, and manages state between turns. When something breaks, the difference is stark: failing a test five times on Claude Pro eats a meaningful chunk of your daily limit. Running 50 high-reasoning iterations on a race condition costs less than a cent with V4.
3. 1M context that actually retrieves
V4 Pro uses hybrid attention (Compressed Sparse Attention + Highly Compressed Attention) to reach 1M tokens at roughly 10% of the KV cache cost of V3.2. Needle-in-a-haystack retrieval on MRCR 1M benchmarks at 83.5%. In practice, you can dump a full iOS project into context and it won't confuse a protocol from one module with a similarly named one from another.
$20 — Pick your fighter
Input token budget — $20, spent wisely or not
DeepSeek V4 ███████████████████████ 46.0M ($0.435/1M)
Sonnet 4.6 ███░░░░░░░░░░░░░░░░░░░░ 6.7M ($3.00/1M)
Opus 4.6 ██░░░░░░░░░░░░░░░░░░░░░ 4.0M ($5.00/1M)
Claude Pro ░░░░░░░░░░░░░░░░░░░░░░░ "generating too fast, slow down"
Each █ ≈ 2M tokens
Assumptions: 50K tokens per full iOS project load, 5K per PR review, 10K per agent fix-and-retry loop.
| | Sonnet 4.6 | Opus 4.6 | DeepSeek V4 Pro |
|---|---|---|---|
| Price / 1M input | $3.00 | $5.00 | $0.435 |
| SWE-Bench Verified | 79.6% | 87.6% | 80.6% |
| GPQA Diamond | 74.1% | 94.2% | 90.1% |
| Reasoning depth | Adaptive | Extended | Non-think / Think High / Think Max |
| Full project load (50K tok) | 133 | 80 | 920 |
| PR review (5K tok) | 1,333 | 800 | 9,200 |
| Fix-retry loop (10K tok) | 667 | 400 | 4,600 |
| "What is this fn?" (2K tok) | 3,333 | 2,000 | 23,000 |
DeepSeek gives you roughly 7x more tasks per dollar than Sonnet 4.6 and 11x more than Opus 4.6.
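The task counts above are just budget divided by per-task cost. A sketch reproducing the "full project load" row from the table's rates and the 50K-token assumption:

```shell
# Tasks per $20 at 50K tokens/task, using the table's input rates.
awk 'BEGIN {
  budget = 20
  split("3.00 5.00 0.435", price)     # $/1M input: Sonnet, Opus, V4 Pro
  split("Sonnet Opus V4Pro", name)
  task = 50e3                          # tokens per full project load
  for (i = 1; i <= 3; i++)
    printf "%s: %.0f loads\n", name[i], budget / (price[i] / 1e6 * task)
}'
# → Sonnet: 133 loads / Opus: 80 loads / V4Pro: 920 loads
```

Scale `task` down to 5K or 2K and you recover the PR-review and "what is this fn?" rows the same way.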
Regular pricing: what changes after May 31
When the 75% promo ends, the math shifts but the story holds:
| Model | Tokens per $1 (input) | Tokens per $1 (output) |
|---|---|---|
| DeepSeek V4 Flash | ~7.1M | ~3.6M |
| DeepSeek V4 Pro | ~575K | ~287K |
| GPT-4o | ~400K | ~100K |
| Claude Sonnet 4 | ~333K | ~67K |
| Claude Opus 4 | ~67K | ~13K |
A typical Claude Code session (~100K tokens in, ~20K out):
| | Promo price | Regular price |
|---|---|---|
| V4 Pro | ~$0.06 | ~$0.24 |
| Sonnet 4 | — | ~$0.60 |
| V4 Flash | — | ~$0.02 |
At regular price, V4 Pro is still 2.5x cheaper per session than Sonnet, and Flash exists as a near-free fallback for simple tasks. The gap gets smaller, but it doesn't close.
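The session figures follow directly from the per-token rates. A sketch at regular pricing, assuming the $15/1M Sonnet output rate implied by the ~67K tokens-per-dollar figure above:

```shell
# Cost of a ~100K-in / ~20K-out session at regular rates.
awk 'BEGIN {
  tin = 100e3; tout = 20e3
  printf "V4 Pro:   $%.2f\n", (tin * 1.74 + tout * 3.48) / 1e6
  printf "Sonnet 4: $%.2f\n", (tin * 3.00 + tout * 15.00) / 1e6
}'
# → V4 Pro: $0.24 / Sonnet 4: $0.60
```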
⚡ Promo pricing: $0.435/1M input, $0.87/1M output — 75% discount active through May 31, 2026. Regular price: $1.74 / $3.48.
The subscription model made sense when the browser UI was the only option. It doesn't anymore. You get a stronger model with 5x the context, you stop worrying about cooldowns mid-refactor, and you bank the $240/year. For anyone spending real hours in an agentic workflow, the switch is overdue.