Measurability Infrastructure for Learning Theory #
This file defines the MeasurableConceptClass typeclass, which bundles
the measure-theoretic regularity conditions needed for PAC learning theory.
Background #
The Fundamental Theorem of Statistical Learning (PAC ↔ finite VC dimension) requires measurability assumptions that are often left implicit in pen-and-paper proofs. Krapp & Wirth (2024, arXiv:2410.10243) systematically extract these conditions. This file formalizes them as Lean4 typeclass infrastructure.
The three bundled conditions are:
mem_measurable: every concept in C is a measurable functionall_measurable: all concepts X → Bool are measurable (for disagreement sets)wellBehaved: the uniform convergence bad event is NullMeasurableSet (theWellBehavedVCcondition from Symmetrization.lean)
Condition 3 is the non-trivial one. For countable concept classes, it holds automatically. For uncountable classes, the existential quantifier in the UC event {∃ h ∈ C, |TrueErr - EmpErr| ≥ ε} does not preserve MeasurableSet, and the NullMeasurableSet weakening is needed. This was discovered during the Lean4 formalization (Session 7) and is a genuine measure-theoretic subtlety absent from standard textbook presentations.
Relationship to ad hoc predicates #
This typeclass replaces explicit hypothesis threading in theorem signatures:
(hmeas_C : ∀ h ∈ C, Measurable h)→MeasurableConceptClass.mem_measurable(hc_meas : ∀ c : Concept X Bool, Measurable c)→MeasurableConceptClass.all_measurable(hWB : WellBehavedVC X C)→MeasurableConceptClass.wellBehaved
Combined with MeasurableBatchLearner (Learner/Core.lean), these two typeclasses
provide the complete regularity infrastructure for PAC learning proofs.
A concept class with the measure-theoretic regularity needed for PAC theory.
Bundles three conditions:
- Every concept in C is measurable
- All concepts are measurable (needed for disagreement set measurability)
- The UC bad event satisfies NullMeasurableSet (WellBehavedVC)
Condition 3 is the deep one: for uncountable C, the existential {∃ h ∈ C, |TrueErr - EmpErr| ≥ ε} is NOT MeasurableSet in general. WellBehavedVC asserts it is NullMeasurableSet, which suffices for integration (lintegral_indicator_one₀).
- mem_measurable (h : Concept X Bool) : h ∈ C → Measurable h
Every concept in C is measurable
- all_measurable (c : Concept X Bool) : Measurable c
All concepts X → Bool are measurable (for disagreement sets)
- wellBehaved : WellBehavedVC X C
Uniform convergence bad event is NullMeasurableSet
Instances
Bridge API: typeclass → explicit hypotheses #
These bridge lemmas allow incremental migration of existing theorems.
Each theorem currently takes explicit hmeas_C, hc_meas, hWB arguments.
With these bridges, callers can write:
MeasurableConceptClass.hmeas_C C
instead of threading the hypothesis manually.
Instances #
TODO: Add automatic instances for common cases:
- Finite concept classes (WellBehavedVC holds automatically)
- Concept classes over MeasurableSingletonClass spaces
- Countable concept classes (existential preserves measurability)
UniversallyMeasurableSpace: domain-level measurability #
When the domain X is "nice enough" (e.g., MeasurableSingletonClass, countable, or standard Borel), EVERY concept class over X automatically satisfies MeasurableConceptClass. This is a property of the space, not the class.
This typeclass captures: "X is regular enough that measurability of learning
events is never an issue." It resolves theorems like uc_does_not_imply_online
which quantify over ALL concept classes, not a specific one.
A measurable space where all Bool-valued functions are measurable and all concept classes are well-behaved (WellBehavedVC).
This is a domain-level property: it says the σ-algebra on X is rich enough that learning-theoretic measurability is automatic.
Examples:
- Any MeasurableSingletonClass space (discrete σ-algebra)
- Any countable space
- Standard Borel spaces (ℝⁿ with Borel σ-algebra)
The key consequence: for any C over X, the UC bad event {∃ h ∈ C, |TrueErr - EmpErr| ≥ ε} is NullMeasurableSet automatically.
- all_concepts_measurable (c : Concept X Bool) : Measurable c
All Bool-valued functions on X are measurable
- all_classes_wellBehaved (C : ConceptClass X Bool) : WellBehavedVC X C
All concept classes over X have well-behaved uniform convergence events
Instances
UniversallyMeasurableSpace implies MeasurableConceptClass for every C.
UniversallyMeasurableSpace bridge API #
Bridge Instances (L1 ↔ L5) #
Krapp-Wirth Ghost Gap Infrastructure #
Formalization of the ghost-gap machinery from Krapp & Wirth (2024, arXiv:2410.10243). Uses sSup over value sets (not ⨆) to avoid class-inference ambiguity. V-measurability is ONE-SIDED (not absolute) to match WellBehavedVC's event shape.
Equations
- absGhostGap h c m p = |oneSidedGhostGap h c m p|
Instances For
Equations
- ghostGapVals C c m p = {r : ℝ | ∃ h ∈ C, r = oneSidedGhostGap h c m p}
Instances For
Equations
- absGhostGapVals C c m p = {r : ℝ | ∃ h ∈ C, r = absGhostGap h c m p}
Instances For
Equations
- ghostGapSup C c m p = sSup (ghostGapVals C c m p)
Instances For
Equations
- absGhostGapSup C c m p = sSup (absGhostGapVals C c m p)
Instances For
Krapp-Wirth Measurability Conditions (Definition 3.2) #
V-measurability uses the ONE-SIDED ghost gap sup (not absolute value). This is needed for the implication KrappWirthWellBehaved → WellBehavedVC, because WellBehavedVC's event is one-sided.
The paper-faithful ABSOLUTE version is KrappWirthVAbs, kept separately.
V-measurability (one-sided): the ghost gap sup map is measurable.
Equations
- KrappWirthV X C = ∀ (c : Concept X Bool) (m : ℕ), Measurable (ghostGapSup C c m)
Instances For
V-measurability (absolute, paper-faithful): the abs ghost gap sup is measurable.
Equations
- KrappWirthVAbs X C = ∀ (c : Concept X Bool) (m : ℕ), Measurable (absGhostGapSup C c m)
Instances For
U-measurability: the UC gap map is measurable.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Krapp-Wirth well-behavedness: measurable hypotheses + V + U. Extends MeasurableHypotheses (L1). Strictly stronger than MeasurableConceptClass (our condition).
- V_measurable : KrappWirthV X C
- U_measurable : KrappWirthU X C
Instances
Finite-Grid Attainment #
EmpiricalError on m samples takes values in {0/m, 1/m, ..., m/m}. So the one-sided ghost gap takes values in a finite set (differences of grid values). Therefore sSup is attained, and {sSup ≥ ε} = {∃ h ∈ C, gap(h) ≥ ε}.
Equations
- empErrGrid m = if m = 0 then {0} else Finset.image (fun (k : ℕ) => ↑k / ↑m) (Finset.range (m + 1))
Instances For
Equations
- ghostGapGrid m = Finset.image (fun (ab : ℝ × ℝ) => ab.1 - ab.2) ((empErrGrid m).product (empErrGrid m))
Instances For
Implication Chain: KrappWirth → WellBehavedVC #
KrappWirthWellBehaved → WellBehavedVC. Map measurability → event NullMeasurability.
KrappWirthWellBehaved → MeasurableConceptClass.
Separation Interface (Open Questions) #
OPEN: Does finite VC + measurable hypotheses imply WellBehavedVC?
Equations
- WellBehavedVC_automatic = ∀ (X : Type) [inst : MeasurableSpace X] (C : ConceptClass X Bool), MeasurableHypotheses X C → VCDim X C < ⊤ → WellBehavedVC X C
Instances For
OPEN: Does WellBehavedVC (NullMeasurable events) separate from KrappWirthWellBehaved (measurable maps)?
Equations
- KrappWirth_separation = ∃ (X : Type) (x : MeasurableSpace X) (C : ConceptClass X Bool), MeasurableHypotheses X C ∧ WellBehavedVC X C ∧ ¬KrappWirthWellBehaved X C
Instances For
Measurable-Target Variants #
The Borel-analytic bridge theorem proves NullMeasurableSet for bad events only when the target concept c is measurable. These variants restrict the quantification to measurable targets.
WellBehavedVC restricted to measurable targets. This is the correct target for the Borel-analytic positive bridge: Borel parameterization ⇒ analytic bad event ⇒ NullMeasurableSet, but only when c is measurable (so the ghost-gap map is measurable).
Equations
- One or more equations did not get rendered due to their size.
Instances For
OPEN QUESTION (measurable-target version): Does WellBehavedVCMeasTarget separate from KrappWirthWellBehaved? The Borel-analytic bridge (BorelAnalyticBridge.lean) closes this.