Claude Sonnet 4 and Opus 4: Is $15/Million Tokens Worth It?

I'm going to be honest: Claude is my favorite AI tool. Has been for a while now. On May 22, 2025, Anthropic released Claude 4 – both Sonnet 4 and Opus 4 – and after using them for a couple weeks, I'm even more convinced.

It's not that Claude is always faster or smarter than ChatGPT. It's that Claude seems to understand what I actually mean, not just what I typed. That sounds fuzzy, but when you're asking AI to help with real work, the difference matters.

The Quick Pricing Breakdown

Let's get the numbers out of the way:

Sonnet 4: $3 per million input tokens / $15 per million output tokens
Opus 4: $15 per million input tokens / $75 per million output tokens

For comparison, GPT-4o runs $2.50 input / $10 output per million tokens. So Sonnet 4 is a bit more expensive than GPT-4o (20% more on input, 50% more on output), and Opus 4 is significantly more expensive than both (6x GPT-4o's input price and 7.5x its output price).
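
To make those rates concrete, here's a minimal Python sketch of the per-request arithmetic. The prices are the ones quoted above. The function name `request_cost`, the dictionary keys (plain labels, not official API model IDs), and the example token counts are my own illustration.

```python
# Prices in USD per million tokens, as quoted above: (input, output).
# The keys are informal labels for this sketch, not official model IDs.
PRICES = {
    "sonnet-4": (3.00, 15.00),
    "opus-4": (15.00, 75.00),
    "gpt-4o": (2.50, 10.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million rates."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical request: 2,000 tokens in, 500 tokens out.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000, 500):.4f}")
```

At that request size, the cost works out to about 1.35 cents on Sonnet 4, 6.75 cents on Opus 4, and 1 cent on GPT-4o.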

If you're using the consumer apps instead of the API, it's simpler: both Claude Pro and ChatGPT Plus cost $20/month. Same price, different tools. Your choice.

Hand-drawn chart comparing prices: Sonnet 4 at $3/$15, Opus 4 at $15/$75, and GPT-4o at $2.50/$10 per million tokens

So What Changed with Claude 4?

Both models are what Anthropic calls "hybrid reasoning models" – they can give you a quick answer or take extra time to think through complex problems. We talked about extended thinking when Claude 3.7 Sonnet launched, and it's even better now. (If you want the full history, I wrote about Claude 3 when it first dropped.)

The headline improvements:

Better at coding – Sonnet 4 hit 72.7% on SWE-bench Verified, a test that measures how well AI can fix real bugs in real codebases. That's up from 62.3% with Claude 3.7 Sonnet.
Better at following instructions – When you tell it exactly what you want, it actually does that now. Less "creative interpretation" of your requests.
Extended thinking with tools – Claude can now search the web or use other tools while it's thinking through a problem. Think about that: it can pause mid-reasoning to go look something up.
Better memory – 200K token context window with improved ability to remember what you talked about earlier in long conversations.

Sonnet 4: The New Default

For most people, Sonnet 4 is the answer. It's what you get on the free tier, what powers Claude Pro, and what you should use for 90% of your work.

The benchmark numbers are impressive – that 72.7% SWE-bench score beats GPT-4o's 33.2% on the same test. But benchmarks don't tell the whole story.

What I actually notice day-to-day:

Writes more naturally – Less of that "AI voice" that makes everything sound like it was generated by a marketing department.
Gets the point faster – I don't have to over-explain what I want. Claude picks up on context.
Argues back appropriately – If I ask for something that doesn't quite make sense, it'll push back gently instead of just doing the dumb thing I asked for.

GitHub actually chose Sonnet 4 to power their new coding agent in GitHub Copilot. That's not nothing – they could have picked any model, and they picked this one.

Hand-drawn split diagram: left side shows Sonnet 4 for everyday tasks (emails, drafts, quick coding), right side shows Opus 4 for heavy lifting (complex debugging, long projects, multi-file changes)

Opus 4: When You Need the Heavy Machinery

Opus 4 is 5x the price of Sonnet 4. That's a lot. So when does it actually make sense?

Anthropic calls it "the world's best coding model." Cursor agrees, calling it "state-of-the-art for coding and a leap forward in complex codebase understanding."

The real difference is sustained performance on long, complex tasks. Opus 4 can:

Work continuously for hours on a coding task without losing context
Handle changes across multiple files while keeping track of how they connect
Debug problems that require understanding the whole system, not just one function

I think of it this way: Sonnet 4 is for tasks. Opus 4 is for projects.

If you're asking Claude to write an email, analyze a document, or explain something – Sonnet 4. If you're asking it to refactor a significant chunk of code, debug something gnarly, or work through a multi-step research project – that's Opus 4 territory.

Claude vs ChatGPT: My Honest Take

This isn't really a "which one is better" question anymore. Both are extremely capable. The question is which one fits how you work.

Zapier put it well: Claude is a better partner for creative work and coding. ChatGPT has more bells and whistles (image generation, custom GPTs, web browsing).

Here's my current split:

Task	My Pick	Why
Business writing	Claude	Better at tone and nuance
Code debugging	Claude	72.7% vs 33.2% on SWE-bench isn't close
Quick coding	Either	GPT-4o scores 90.2% on HumanEval (simple functions)
Image generation	ChatGPT	Claude doesn't do images
Web research	ChatGPT	Better built-in browsing
Long documents	Claude	200K vs 128K context window
Custom chatbots	ChatGPT	GPTs are still ahead here

When Is $75/Million Tokens Worth It?

Let's do some quick math. A million output tokens is roughly 750,000 words – that's about 10 novels' worth of text.

For most small businesses, you're not getting anywhere near that. A heavy user might generate 100,000 output tokens per month, which would be $7.50 at Opus 4 prices or $1.50 at Sonnet 4 prices.
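
Here's that arithmetic as a small Python sketch. It uses only the output prices quoted earlier and the rough 0.75 words-per-token ratio implied by "a million tokens is roughly 750,000 words." The function names are mine, and input tokens are left out to match the example above.

```python
# Output prices in USD per million tokens, as quoted earlier in the post.
OUTPUT_PRICE_PER_MILLION = {"sonnet-4": 15.00, "opus-4": 75.00}

# Rough rule of thumb from the text: 1M tokens is about 750,000 words.
WORDS_PER_TOKEN = 0.75


def monthly_output_cost(model: str, output_tokens: int) -> float:
    """USD cost of generating `output_tokens` in a month (output side only)."""
    return output_tokens * OUTPUT_PRICE_PER_MILLION[model] / 1_000_000


def approx_words(tokens: int) -> int:
    """Very rough word count for a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


heavy_user_tokens = 100_000
print(f"About {approx_words(heavy_user_tokens):,} words")
for model in OUTPUT_PRICE_PER_MILLION:
    print(f"{model}: ${monthly_output_cost(model, heavy_user_tokens):.2f}")
```

For the 100,000-token heavy user, that's about 75,000 words a month, at $1.50 on Sonnet 4 or $7.50 on Opus 4.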

The real cost difference matters for:

Developers building AI products – If you're processing thousands of requests, the price adds up
Heavy API users – If you're running Claude through automation workflows constantly
Long-running agent tasks – Opus 4's ability to work for hours means more tokens burned

If you're using Claude through the web or app with a $20/month Pro subscription? You're not paying per-token anyway. Use Opus when you need it, Sonnet when you don't. The subscription covers both.
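
For a rough sense of where per-token API pricing and the $20 subscription cross over, here's a back-of-the-envelope Python sketch. It counts output tokens only, which is a simplification I'm assuming for illustration (real API bills include input tokens too), and the function name is made up.

```python
# Output prices in USD per million tokens, as quoted earlier in the post.
OUTPUT_PRICE_PER_MILLION = {"sonnet-4": 15.00, "opus-4": 75.00}
SUBSCRIPTION_USD = 20.00  # Claude Pro monthly price from the post


def break_even_output_tokens(model: str, budget_usd: float = SUBSCRIPTION_USD) -> int:
    """Output tokens that `budget_usd` buys on the API (input tokens ignored)."""
    return round(budget_usd * 1_000_000 / OUTPUT_PRICE_PER_MILLION[model])


for model in OUTPUT_PRICE_PER_MILLION:
    print(f"{model}: {break_even_output_tokens(model):,} output tokens for $20")
```

By this rough measure, $20 buys about 1.33 million Sonnet 4 output tokens or about 267,000 Opus 4 output tokens on the API.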

Hand-drawn flowchart: 'Is this a quick task?' If yes, use Sonnet 4. If no, 'Is it complex coding or multi-hour work?' If yes, use Opus 4. If no, still use Sonnet 4.
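
The flowchart's logic is small enough to write as a Python function. This is just my rule of thumb in code form, with a hypothetical function name and two yes/no inputs matching the two questions in the chart.

```python
def pick_model(is_quick_task: bool, is_complex_coding_or_multi_hour: bool) -> str:
    """Mirror the flowchart: Sonnet 4 by default, Opus 4 only for heavy work."""
    if is_quick_task:
        return "Sonnet 4"
    if is_complex_coding_or_multi_hour:
        return "Opus 4"
    # Not quick, but not heavy coding or multi-hour work either.
    return "Sonnet 4"


print(pick_model(is_quick_task=True, is_complex_coding_or_multi_hour=False))
print(pick_model(is_quick_task=False, is_complex_coding_or_multi_hour=True))
print(pick_model(is_quick_task=False, is_complex_coding_or_multi_hour=False))
```

Note that only one path leads to Opus 4: a task that isn't quick and does involve complex coding or multi-hour work.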

What I Actually Use Day-to-Day

Here's my real workflow:

Morning emails and writing – Claude Sonnet 4 (regular mode)
Code debugging – Claude Sonnet 4 with extended thinking. For really gnarly stuff, Opus 4.
Research – ChatGPT's Deep Research for web stuff, Claude for analyzing documents I upload
Proposals and complex documents – Claude Sonnet 4 with extended thinking. (Artifacts is great for quick mockups.)
Quick questions – Whatever's already open
Image needs – ChatGPT (Claude still can't generate images)

Claude is my main driver. ChatGPT is the specialist I call in for specific jobs. That split works for me.

The Bottom Line

Is $15/million tokens worth it? Wrong question. Sonnet 4 is $15 per million output tokens, and that's the one you'll use 90% of the time. Opus 4's $75 per million is for when you're doing serious coding work that justifies the cost.

If you're on the fence between Claude Pro and ChatGPT Plus, both cost $20/month. Try both. See which one feels right for how you work.

For me, Claude just... gets it. When I explain what I want, even badly, it understands. That's worth the $20/month and then some.

Want Help Figuring Out Which AI Tools Make Sense?

Every business is different. Maybe ChatGPT's image generation matters more for your workflow. Maybe you need something that can handle massive documents. Maybe you don't need any AI subscription at all yet.

If you want an honest opinion – not a sales pitch – let's talk. I'll tell you what I actually think makes sense for your situation, even if the answer is "don't spend money on this yet."