AI Strategy

Match the Model Tier to the Job

Sonnet 5 made the headlines, but it is the mid tier. Opus 4.8 is the flagship. Here is how an owner picks the right AI model tier for each job.

By Leon Soliman · 2026-06-30 · 3 min read

The Headline Is Rarely the Strongest Model

When a lab announces a new model and the press runs with it, owners are tempted to assume the name in the headline is the most capable thing available. It usually is not. The recent attention on Anthropic went to Sonnet 5, but Sonnet 5 is the balanced mid tier of the Claude family. The flagship, the most capable tier, is Opus 4.8. Below Sonnet sits Haiku, the fast and low-cost tier, alongside Fable 5, a specialized member of the same family. OpenAI ships the same shape. GPT-5.6 arrives in three tiers: Sol, the flagship; Terra, the balanced tier, positioned at roughly half the cost of the prior generation at similar performance; and Luna, the fast and lowest-cost tier.

This is now the standard structure across the major labs, and it exists for a reason. A single model cannot be the strongest, the cheapest, and the fastest at once. So labs split the family into tiers and let buyers choose. The headline tends to land on whichever release is most newsworthy or most widely deployed, which is often the mid tier rather than the flagship. Reading the press will tell you what launched. It will not tell you which tier belongs in your business.

Reserve the Flagship, Run the Mid Tier, Scale the Fast Tier

The flagship tier earns its premium on genuinely hard problems: dense legal or financial reasoning, multi-step analysis where one wrong assumption invalidates the output, complex code, and work where a mistake is expensive to catch later. For that class of task, paying for Opus 4.8 or Sol is the cheap option, because the cost of a weak answer dwarfs the cost of the better model. This is where you do not economize.

The mid tier is where most of the day actually runs. Sonnet 5 or Terra will handle drafting, summarizing, customer replies, research synthesis, and the steady stream of routine knowledge work to a standard most teams will not be able to distinguish from the flagship. The fast tier then takes the high-volume, low-stakes load: the classification, tagging, and bulk processing that runs thousands of times an hour, where speed and cost per call matter more than the last few points of capability. Match the tier to the job and you spend money where it changes the outcome.

How an Owner Should Actually Choose

You do not need to track version numbers to make this decision well. You need a short rule. Ask what happens if the answer is wrong. If a wrong answer is expensive, slow to detect, or hard to reverse, route the task to the flagship. If a wrong answer is cheap to spot and fix, the mid tier is the right call and the savings are real. If the task runs at high volume and each call is low stakes, the fast tier is built for exactly that. The same logic holds whether you standardize on the Claude family or the GPT family, because both ship the same three roles.
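For teams that want to write the rule down, it can be sketched as a few lines of Python. This is an illustration only, not a vendor API: the `Task` fields, the `Tier` names, and the `choose_tier` function are all hypothetical, and the model names in the comments simply repeat the tiers described above.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FLAGSHIP = "flagship"  # e.g. Opus 4.8 or Sol
    MID = "mid"            # e.g. Sonnet 5 or Terra
    FAST = "fast"          # e.g. Haiku or Luna


@dataclass
class Task:
    """Hypothetical description of a workload, answered by the owner."""
    name: str
    wrong_answer_expensive: bool
    slow_to_detect: bool
    hard_to_reverse: bool
    high_volume: bool


def choose_tier(task: Task) -> Tier:
    # First question: what happens if the answer is wrong?
    if task.wrong_answer_expensive or task.slow_to_detect or task.hard_to_reverse:
        return Tier.FLAGSHIP
    # Low stakes from here on. High volume goes to the fast tier.
    if task.high_volume:
        return Tier.FAST
    # Everything else is routine work for the mid tier.
    return Tier.MID


merger_review = Task("merger contract review", True, True, True, False)
customer_reply = Task("customer reply draft", False, False, False, False)
ticket_tagging = Task("support ticket tagging", False, False, False, True)

for task in (merger_review, customer_reply, ticket_tagging):
    print(f"{task.name}: {choose_tier(task).value}")
```

The stakes question is checked first on purpose. A task that is both high volume and expensive to get wrong still goes to the flagship, because volume only decides the tier once the stakes are known to be low.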

The trap is buying by name. Standardizing everything on the flagship means overpaying across thousands of routine calls for capability that never gets used. Standardizing everything on the model in the headlines means quietly under-powering the handful of hard problems that justified bringing AI in at all. Neither is a strategy. The owners who get this right treat the tier as a deliberate choice per workload, the same way they would never put the senior partner on the photocopying or hand the merger to the intern.
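The overpayment side of the trap is easy to put in numbers. The sketch below is an assumption from start to finish: the prices are arbitrary cost units, not real vendor pricing, and the monthly workload is invented to show the shape of the arithmetic. Substitute your own call counts and current price sheet.

```python
# Hypothetical per-call prices in arbitrary cost units (not real vendor pricing).
PRICE_PER_CALL = {"flagship": 50, "mid": 10, "fast": 1}

# Hypothetical monthly workload: job class, calls per month, tier the job needs.
WORKLOAD = [
    ("hard analysis", 50, "flagship"),
    ("routine drafting", 2_000, "mid"),
    ("bulk tagging", 100_000, "fast"),
]


def cost_all_on(tier: str) -> int:
    """Cost of running every call on a single tier."""
    total_calls = sum(calls for _, calls, _ in WORKLOAD)
    return total_calls * PRICE_PER_CALL[tier]


def cost_matched() -> int:
    """Cost when each job class runs on the tier it needs."""
    return sum(calls * PRICE_PER_CALL[tier] for _, calls, tier in WORKLOAD)


all_flagship = cost_all_on("flagship")
matched = cost_matched()
print(f"Everything on the flagship: {all_flagship:,} units")
print(f"Tier matched to the job:    {matched:,} units")
```

This prices only one half of the trap. It does not capture the cost of a weak answer on the hard problems, which is why running everything on a cheaper tier is not the fix either.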

Frequently Asked Questions

Is the model in the news the most powerful one available?

Usually not. The headline tends to land on the most newsworthy or most widely deployed release, which is often the balanced mid tier. Sonnet 5 drew the attention, but Opus 4.8 is the more capable flagship. Check the tier, not the press coverage.

Do we always need the flagship model to be safe?

No. The flagship is more capable, but the mid tier is enough for the large majority of daily work and costs less. Reserve the flagship for genuinely hard problems where a wrong answer is expensive or hard to catch, and run routine work on the mid tier.

How do I decide which tier a given task needs?

Ask what a wrong answer costs. Expensive, slow to detect, or hard to reverse means flagship. Cheap to spot and fix means the mid tier. High volume and low stakes means the fast tier. That single question handles most decisions.

The model that makes the news is rarely the one your hardest problem needs, and almost never the one your routine work should pay for. Pick the tier on purpose and the spend follows the value.



Servola

If your teams are defaulting every task to one model, you are either overpaying for power you do not use or under-powering the work that matters most. Servola maps your workloads to the right model tier so each job runs on the right engine.


Servola is technology counsel for a small number of families and offices. When a decision cannot be delegated, we sit on your side of the table.

Servola Systems GmbH · Ludwigshafen, Germany
