
DeepSeek V4 Is Free to Own. Most Companies Will Still Rent It by the Token.

DeepSeek V4 is open-weight and frontier-class. So why do most companies still rent intelligence by the token? The real answer is governance, not price.

By Leon Soliman · 2026-06-20

Key takeaways

DeepSeek released V4 on April 24, 2026 as open-weight models under the MIT license, meaning the weights can be downloaded, run on your own hardware, and fine-tuned without a per-token contract.
DeepSeek's flagship V4-Pro is a 1.6 trillion parameter mixture-of-experts model that, on agentic benchmarks, reportedly scores alongside closed frontier systems such as GPT-5.5 and Claude Opus 4.7.
Despite credible open-weight options, a handful of closed vendors still take the large majority of enterprise API spend in 2026, with open-weight models a small minority, per industry estimates.
Self-hosting is not automatically cheaper: 2026 cost analyses put the true total cost of ownership at three to five times the raw GPU line once engineers, updates, and idle capacity are counted.
For owners, the decision is governance, not price. Data residency under GDPR can make a self-hosted or private endpoint the only compliant option, regardless of which is cheaper.

What did DeepSeek actually release with V4?

DeepSeek released V4 on April 24, 2026 as open-weight models under the MIT license, in two sizes named V4-Pro and V4-Flash. The open-weight part is what matters, more than any open-source branding: the weights are published on Hugging Face, so a company can run the model on its own machines, fine-tune it on its own data, and serve it to its own users without a per-token relationship with DeepSeek. The flagship V4-Pro is a 1.6 trillion parameter mixture-of-experts model with roughly 49 billion parameters active per token, and on agentic benchmarks it reportedly lands alongside closed frontier systems such as GPT-5.5 and Claude Opus 4.7. DeepSeek's own V3 and R1 releases first crossed that line, open weights competing with closed frontier systems, in late 2024 and early 2025. Frontier-class capability is no longer something only a few vendors can rent you.
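To get a feel for what running the model on your own machines implies, the parameter counts above can be turned into a rough weight-memory figure. This is a back-of-envelope sketch only: the bytes-per-parameter values are illustrative assumptions about numeric precision, not published V4 specifications, and the result ignores KV cache, activations, and serving overhead.

```python
# Back-of-envelope sizing from the parameter counts quoted in the article.
TOTAL_PARAMS = 1.6e12   # V4-Pro total parameters (from the article)
ACTIVE_PARAMS = 49e9    # roughly active per token (from the article)


def weight_memory_tb(total_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in terabytes (1 TB = 1e12 bytes)."""
    return total_params * bytes_per_param / 1e12


def active_fraction(active: float, total: float) -> float:
    """Share of parameters that participate in computing a single token."""
    return active / total


# Assumed precisions, for illustration only (not a statement about how V4 ships).
for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"{label}: {weight_memory_tb(TOTAL_PARAMS, bytes_per_param):.1f} TB of weights")

print(f"active share per token: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
```

In a mixture-of-experts model only a small share of parameters does work on any given token, but the full set of weights typically still has to be held in memory, so the hardware footprint tracks the total count rather than the active count.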

If the model is free to own, why does almost everyone rent it?

Because renting hides the real work, and the real work is expensive in ways a price-per-million-tokens table never shows. Industry estimates still put the great majority of 2026 enterprise API spend with a handful of closed vendors, and open-weight models at only a small minority, even though credible open weights now exist. The reason is not ignorance. Running a frontier model in-house means GPUs, a serving stack, model updates, security, and the people who keep all of it alive. Cost analyses from 2026 put the true total cost of ownership at three to five times the raw GPU line once you count engineering salaries, updates, and idle capacity. For a team spending a few thousand a month on an API, hiring an inference engineer to eliminate that bill costs far more than it saves. Renting is often the correct answer. What costs people money is renting without ever asking the question.
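The arithmetic behind that paragraph is simple enough to sketch. The three-to-five multiplier comes from the cost analyses cited above; the dollar inputs below are hypothetical placeholders chosen only to illustrate the shape of the comparison, and should be replaced with your own figures.

```python
def tco_range(raw_gpu_monthly: float, low: float = 3.0, high: float = 5.0) -> tuple[float, float]:
    """Scale the raw GPU line by the 3x to 5x multiplier cited in the article."""
    if raw_gpu_monthly < 0:
        raise ValueError("raw_gpu_monthly must be non-negative")
    return raw_gpu_monthly * low, raw_gpu_monthly * high


def monthly_saving(api_monthly: float, tco_monthly: float) -> float:
    """Positive means self-hosting is cheaper; negative means the API is cheaper."""
    return api_monthly - tco_monthly


# Hypothetical inputs: a small team's API bill and a quoted GPU line.
api_monthly = 4_000.0
raw_gpu_monthly = 10_000.0

low, high = tco_range(raw_gpu_monthly)
print(f"true TCO: ${low:,.0f} to ${high:,.0f} per month")
print(f"best-case saving vs API: ${monthly_saving(api_monthly, low):,.0f} per month")
```

With inputs like these, even the optimistic end of the range leaves self-hosting far more expensive than the API bill it would replace, which is the point of the paragraph above.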

So when is self-hosting genuinely the better call?

When the question stops being about price and starts being about control. The honest crossover where self-hosting beats API economics on cost alone tends to sit somewhere in the range of fifty thousand to two hundred thousand dollars of monthly API spend, depending on how heavily and how steadily you use the model, since idle GPUs cost the same as busy ones. The more durable reasons are not financial. If you operate under GDPR, a self-hosted or private endpoint may be the only configuration where your data never leaves a perimeter you control, regardless of what it costs that quarter. If the model is core to your product rather than a convenience, owning the weights means a vendor cannot deprecate, reprice, or refuse you at will. A family office or an owner-led firm wants the thing it will still control in five years, and that is rarely the cheapest line on the invoice today.
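Why the crossover depends so much on usage can be shown with a small sketch. A self-hosted cluster costs roughly the same whether it is busy or idle, so the effective price per token falls as utilization rises. Every number below (monthly TCO, cluster capacity, API price) is a hypothetical assumption for illustration, not a figure from the article or from any vendor's price list.

```python
def self_hosted_cost_per_mtok(monthly_tco: float, capacity_mtok: float, utilization: float) -> float:
    """Effective cost per million tokens served at a given utilization (0 < u <= 1)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return monthly_tco / (capacity_mtok * utilization)


def break_even_utilization(monthly_tco: float, capacity_mtok: float, api_price_per_mtok: float) -> float:
    """Utilization at which self-hosting matches the API price per million tokens."""
    return monthly_tco / (capacity_mtok * api_price_per_mtok)


# Hypothetical inputs.
MONTHLY_TCO = 150_000.0    # true total cost of ownership, dollars per month
CAPACITY_MTOK = 100_000.0  # millions of tokens the cluster could serve per month
API_PRICE = 3.0            # dollars per million tokens on a closed API

for u in (0.2, 0.5, 0.8):
    cost = self_hosted_cost_per_mtok(MONTHLY_TCO, CAPACITY_MTOK, u)
    print(f"utilization {u:.0%}: ${cost:.2f} per million tokens")

print(f"break-even utilization: {break_even_utilization(MONTHLY_TCO, CAPACITY_MTOK, API_PRICE):.0%}")
```

Under these assumptions the same cluster is much more expensive than the API when mostly idle and cheaper when kept busy, which is why a single crossover number is hard to state without knowing the workload.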

What should an owner do before the next AI invoice?

Separate the capability question from the ownership question, because they are not the same decision. First, decide what the model is to you: a utility you consume, or an asset you depend on. A utility you can almost always rent. An asset is worth owning. Second, demand a true total cost of ownership, not a token price; if your team only shows you the API line, they have not done the analysis. Third, treat data residency and vendor concentration as board-level risks, not engineering preferences, because that is where open weights like DeepSeek V4 change what is possible rather than just what is cheap. Servola advises on AI infrastructure and the build-versus-rent decision, with one accountable owner and no vendor agenda.
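The order of those questions can be written down as a small decision sketch. It is a simplification under stated assumptions: the field names and the returned labels are hypothetical, and a real decision would weigh these factors rather than short-circuit on the first one. What it does capture is the ordering argued above, where governance and dependency are checked before cost.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    # All fields are hypothetical inputs an owner would supply.
    data_must_stay_in_perimeter: bool  # e.g. a data residency constraint under GDPR
    model_is_core_asset: bool          # asset you depend on, versus utility you consume
    monthly_api_spend: float           # dollars per month
    self_host_tco: float               # true total cost of ownership, not the raw GPU line


def recommend(s: Situation) -> str:
    """Check governance first, then dependency, then cost."""
    if s.data_must_stay_in_perimeter:
        return "self-host or private endpoint"
    if s.model_is_core_asset:
        return "own the weights"
    if s.monthly_api_spend > s.self_host_tco:
        return "self-host on cost"
    return "rent"


# A utility workload with a modest API bill: renting wins.
print(recommend(Situation(False, False, 4_000.0, 30_000.0)))
# A residency constraint decides the matter before cost is even compared.
print(recommend(Situation(True, False, 4_000.0, 30_000.0)))
```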

Frequently asked questions

Is DeepSeek V4 really free to use commercially?

The weights are released under the MIT license, which permits commercial use, self-hosting, and fine-tuning. Free to own does not mean free to run; you still pay for the hardware, the engineering, and the operational overhead of serving it yourself.

Is self-hosting an open-weight model always cheaper than a closed API?

No. Cost analyses from 2026 put the true total cost of ownership at roughly three to five times the raw GPU spend once engineers, model updates, and idle capacity are included. Below the crossover, which tends to sit somewhere between fifty thousand and two hundred thousand dollars of monthly API spend, a closed API is usually the cheaper and simpler choice.

Why would a regulated or owner-led firm self-host at all?

Mostly for control rather than cost. Data residency under GDPR can make a self-hosted or private endpoint the only compliant option, and owning the weights removes the risk of a vendor repricing, deprecating, or restricting a model your business depends on.

DeepSeek V4 did not make frontier intelligence cheaper to rent. It made it possible to own. Most firms still paying by the token in 2027 will be the ones that never stopped to ask whether they needed to rent or to own.
