What did DeepSeek actually release with V4?

DeepSeek released V4 on April 24, 2026 as open-weight models under the MIT license, in two sizes named V4-Pro and V4-Flash. The open-weight part matters more than any open-source marketing: the weights are published on Hugging Face, so a company can run the model on its own machines, fine-tune it on its own data, and serve it to its own users without a per-token relationship with DeepSeek. The flagship V4-Pro is a 1.6-trillion-parameter mixture-of-experts model with roughly 49 billion parameters active per token, and on agentic benchmarks it reportedly lands alongside closed frontier systems such as GPT-5.5 and Claude Opus 4.7. This is the line that DeepSeek's own V3 and R1 releases crossed first in late 2024 and early 2025. Frontier-class capability is no longer something only a few vendors can rent you.

If the model is free to own, why does almost everyone rent it?

Because renting hides the real work, and the real work is expensive in ways a price-per-million-tokens table never shows. Industry estimates still put the great majority of 2026 enterprise API spend with a handful of closed vendors and only a small minority with open-weight models, even though credible open weights now exist. The reason is not ignorance. Running a frontier model in-house means GPUs, a serving stack, model updates, security, and the people who keep all of it alive. Cost analyses from 2026 put the true total cost of ownership at three to five times the raw hardware line once you count engineering salaries and idle capacity. For a team spending a few thousand dollars a month on an API, hiring an inference engineer to cut that bill costs far more than it saves. Renting is often the correct answer. What costs people money is renting without ever asking the question.
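
To make the three-to-five-times rule concrete, here is a minimal Python sketch. The function name and the dollar figures in the usage note are hypothetical, chosen only for illustration; the only numbers taken from the text are the 3x and 5x multipliers.

```python
def tco_range(hardware_monthly: float,
              low_multiplier: float = 3.0,
              high_multiplier: float = 5.0) -> tuple[float, float]:
    """Return a (low, high) estimate of true monthly cost of ownership.

    hardware_monthly is the raw hardware line (an assumed input).
    The multipliers stand in for engineering salaries and idle capacity,
    using the three-to-five-times range quoted above.
    """
    return (hardware_monthly * low_multiplier,
            hardware_monthly * high_multiplier)


def renting_is_cheaper(api_monthly: float, hardware_monthly: float) -> bool:
    """True if the API bill is below even the optimistic TCO estimate."""
    low, _high = tco_range(hardware_monthly)
    return api_monthly < low
```

With a hypothetical hardware line of 20,000 dollars a month, `tco_range(20_000)` gives a true cost somewhere between 60,000 and 100,000 dollars. A team with a 3,000 dollar API bill is nowhere near that, which is why renting is so often the right call.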

So when is self-hosting genuinely the better call?

When the question stops being about price and starts being about control. The honest crossover where self-hosting beats API economics on cost alone tends to sit somewhere in the range of fifty thousand to two hundred thousand dollars of monthly API spend, depending on how much you actually use the model. The more durable reasons are not financial. If you operate under GDPR, a self-hosted or private endpoint may be the only configuration where your data never leaves a perimeter you control, regardless of cost that quarter. If the model is core to your product rather than a convenience, owning the weights means a vendor cannot deprecate, reprice, or refuse you at will. A family office or an owner-led firm wants the thing it will still control in five years, and that is rarely the cheapest line on the invoice today.
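
Why the cost crossover depends on usage can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a pricing model: it treats self-hosting as a fixed monthly cost and the API as a flat price per million tokens, and every figure in the usage note is hypothetical.

```python
def break_even_million_tokens(self_host_monthly: float,
                              api_price_per_million: float) -> float:
    """Monthly volume (in millions of tokens) at which both options cost the same.

    Assumes self-hosting is a fixed monthly cost and the API charges a
    flat per-million-token price; real contracts are rarely this simple.
    """
    return self_host_monthly / api_price_per_million


def self_host_cost_per_million(self_host_monthly: float,
                               million_tokens_served: float) -> float:
    """Effective per-million-token cost of a fixed self-hosted deployment."""
    return self_host_monthly / million_tokens_served
```

For a hypothetical 100,000 dollar monthly self-hosting cost and a hypothetical API price of 5 dollars per million tokens, the break-even is 20,000 million tokens a month. Serve twice that volume and the effective self-hosted cost falls to 2.50 dollars per million; serve half of it and you are paying 10 dollars per million for the privilege of ownership.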

What should an owner do before the next AI invoice?

Separate the capability question from the ownership question, because they are not the same decision. First, decide what the model is to you: a utility you consume, or an asset you depend on. A utility you can almost always rent. An asset is worth owning. Second, demand a true total cost of ownership, not a token price; if your team only shows you the API line, they have not done the analysis. Third, treat data residency and vendor concentration as board-level risks, not engineering preferences, because that is where open weights like DeepSeek V4 change what is possible rather than just what is cheap. Servola advises on AI infrastructure and the build-versus-rent decision, with one accountable owner and no vendor agenda.