Formal Learning Theory Kernel

Lean 4 formalization of the fundamental theorem of statistical learning, with paradigm separations and a measurability refinement.

CI status Docs build status Lean v4.29.0-rc6 Zero sorry declarations

21,522 lines · 354 theorems · 53 files · mathlib4 fde0cc5

A typed premise with 42 nodes scoped a human-guided, AI-driven proof search across three learning paradigms (PAC, Online, and Gold-style). Every theorem in the kernel is fully proved.

The infrastructure the types demanded produced mathematics the premise did not predict: a Borel-analytic separation correcting Krapp-Wirth 2024, a compression scheme via approximate minimax (multiplicative weights update), and a monadic structure on measurable batch learners closed under Boolean combination, majority vote, and piecewise interpolation.

Main results

Navigate

Related work

The only prior attempt at formalizing PAC learning theory with VC dimension is Google's formal-ml, incomplete in Lean 3 with one sorry. Zhang et al.'s lean-stat-learning-theory is complementary: it formalizes concentration inequalities, covering numbers, and Dudley's entropy integral in Lean 4. This kernel formalizes the characterization theorems, paradigm separations, and measurability theory that theirs does not.