What Mistral open-sourced

On July 2, 2026, the French AI lab Mistral released Leanstral 1.5 under an Apache-2.0 license, with open weights on Hugging Face and a free API endpoint. According to Mistral's own announcement, it is a mixture-of-experts model with 119B total parameters, about 6B of them active, and a 256k-token context, specialized in Lean 4 formal theorem proving and autoformalization. In plain terms, Lean 4 proving means turning a claim about your software into a formal statement and then supplying a proof that a machine checks step by step, so correctness is demonstrated rather than assumed.
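To make "a statement a machine can check" concrete, here is a minimal Lean 4 sketch. The function `double` and the theorem name are hypothetical, chosen only for illustration, and nothing here comes from Mistral's announcement. It assumes a recent Lean 4 toolchain in which the `omega` tactic is available.

```lean
-- Hypothetical toy function standing in for "your software".
def double (n : Nat) : Nat := n + n

-- The claim: for every natural number n, double n equals 2 * n.
-- Lean accepts the file only if the proof below really establishes this.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double  -- replace `double n` with its definition `n + n`
  omega          -- discharge the remaining linear-arithmetic goal
```

Unlike a test, which checks a handful of inputs, the theorem covers every possible `n`. If the definition were changed so that the claim no longer held, the proof would fail to compile.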

The reported benchmark results are at or near saturation on the established suites, with headroom left on the hardest one: 100 percent on the miniF2F validation and test sets, 587 of 672 PutnamBench problems solved (about 87 percent), 87 percent on FATE-H and 34 percent on FATE-X. Test-time scaling is strong, rising from 44 problems solved at a small compute budget up to 587 at a large one. More telling for software owners, in testing the model uncovered 5 previously unknown bugs across 57 code repositories, including a critical overflow flaw in a zigzag-decoding library that conventional testing would typically miss.
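The announcement does not describe the zigzag flaw itself, so the following Python sketch is purely illustrative and is not the bug Leanstral found. It shows how a hypothetical overflow in a zigzag decoder can pass a typical test sweep and still fail at one boundary value. All function names are invented for the example, and 32-bit wraparound is simulated with a mask because Python integers do not overflow.

```python
MASK32 = 0xFFFFFFFF  # simulate 32-bit unsigned arithmetic


def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def zigzag_encode32(n: int) -> int:
    """Map a signed 32-bit integer to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 31)) & MASK32


def zigzag_decode32(z: int) -> int:
    """Reference decoder using the standard shift-and-xor form."""
    return to_signed32((z >> 1) ^ -(z & 1))


def naive_decode32(z: int) -> int:
    """Hypothetical buggy decoder: z + 1 wraps to 0 when z == 0xFFFFFFFF."""
    if z & 1:
        return -(((z + 1) & MASK32) >> 1)  # the addition overflows at the top value
    return z >> 1


# A typical test sweep over small values passes for the buggy decoder...
sweep_ok = all(naive_decode32(zigzag_encode32(n)) == n for n in range(-1000, 1001))

# ...but the single extreme input exposes the overflow.
INT32_MIN = -(1 << 31)
reference_result = zigzag_decode32(zigzag_encode32(INT32_MIN))
naive_result = naive_decode32(zigzag_encode32(INT32_MIN))
```

In this sketch the buggy decoder agrees with the reference on every value in the sweep and diverges only at the most negative 32-bit integer. A proof of the round-trip property over all inputs would be forced to confront that edge case, whereas a sampled test is unlikely to hit it.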

Writing code was the easy half

Getting an AI to write code is largely a solved problem. Proving that the code is correct is not, and that is the half that matters for anything you cannot afford to have wrong. As models generate more of your software, the volume of code outruns the human capacity to review it, so the verification layer quietly becomes both the bottleneck and the only real source of assurance.

This is why a proof engine like Leanstral matters more than another code generator. Provable correctness is moving from an academic luxury toward a realistic procurement bar for critical and regulated software, where a demonstrated proof carries weight that a passing test suite never could. The owner who can show that a component is proven correct, not merely tested, holds a stronger position with regulators, insurers and customers alike.

Owning the checker

The sovereignty point here is not another chatbot. It is the verifier. Because Leanstral ships as European open weights under Apache-2.0, you can run the checking layer in-house, inspect exactly what it does, and keep it, rather than calling a US API and trusting a black box you can neither see nor retain.

That keeps the trust layer of your software supply chain auditable and under your control. When machines write the code, the decisive question is who owns the machine that checks the machine. A proof engine you run yourself answers that question in your favour, and it does so without asking you to surrender your most sensitive code to an outside service.