Borel-Analytic Bridge for Statistical Learning Theory #
This file proves that NullMeasurableSet is the exactly right level of measurability for the Fundamental Theorem of Statistical Learning with Borel-parameterized concept classes over Polish spaces.
Main results #
paramWitnessSet_measurable: the witness graph {(θ, p) | gap(θ, p) ≥ ε/2} is Borelborel_param_badEvent_analytic: projection to sample space is analytic (Σ₁¹)analyticSet_nullMeasurableSet: analytic → NullMeasurableSet (DST bridge lemma)borel_param_wellBehavedVCMeasTarget: Borel parameterization → WellBehavedVCMeasTarget
The separation #
The counterexample (singleton class over analytic non-Borel A ⊆ ℝ) shows the bad event can be analytic but NOT Borel, hence WellBehavedVCMeasTarget holds but KrappWirthWellBehaved fails. See Theorem/BorelAnalyticSeparation.lean.
References #
- Suslin (1917): projections of Borel sets are analytic
- Lusin (1925): analytic sets are universally measurable
- Krapp & Wirth (2024, arXiv:2410.10243): MeasurableSet conditions for FTSL
- This kernel: NullMeasurableSet weakening discovered during Lean4 formalization
Core Definitions #
Ghost sample pairs: two independent samples of size m.
Equations
- GhostPairs X m = ((Fin m → X) × (Fin m → X))
Instances For
The witness set in parameter × sample space: {(θ, p) | EmpErr(h_θ, ghost, c) - EmpErr(h_θ, train, c) ≥ ε/2}. This is Borel when e and c are measurable (Theorem A).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The bad event in sample space: projection of the witness set. Existential over the parameter: {p | ∃ θ, gap(θ, p) ≥ ε/2}. This is analytic when the witness set is Borel (Theorem B).
Equations
- paramBadEvent e c m ε = Prod.snd '' paramWitnessSet e c m ε
Instances For
Patched evaluation: combine two concept families using a region selector. patchEval(θ₁, θ₂, ρ)(x) = e₁(θ₁)(x) if r(ρ)(x), else e₂(θ₂)(x). Used for the closure principle (Theorem F).
Instances For
Theorem A: Measurable witness graph #
The witness set {(θ, p) | ghost-gap ≥ ε/2} is MeasurableSet when the evaluation map e and target c are measurable. This is the Borel half of the Borel-analytic bridge.
Theorem B: Bad event is analytic (Suslin projection) #
The bad event (projection of witness set) is analytic. Projection of a Borel set from a StandardBorelSpace is analytic (Suslin). This is the key step: existential quantification over parameters produces an analytic (Σ₁¹) set, which may not be Borel.
Theorem C' (F4c): Sample-space specialization #
Equations
- GhostPairMeasure D m = (MeasureTheory.Measure.pi fun (x : Fin m) => D).prod (MeasureTheory.Measure.pi fun (x : Fin m) => D)
Instances For
Analytic subsets of the ghost sample space (Fin m → X) × (Fin m → X) are
NullMeasurableSet under the product probability measure. A specialisation of
analyticSet_nullMeasurableSet from PureMath/AnalyticMeasurability.lean to the type
the symmetrization argument actually consumes.
Theorem D: Positive bridge - bad event is NullMeasurableSet #
Positive bridge. If a concept class is parameterized by a Borel measurable map
Θ → Concept X from a standard Borel space Θ, then the symmetrization bad event is
analytic, hence NullMeasurableSet. The bad event is a Suslin projection of a Borel
witness set (the projection along Θ of {(θ, p) | gap(eval θ, p) ≥ ε / 2}), and
projections of Borel sets are analytic by definition. This is the entry point through
which Borel parameterization implies the regularity required by the fundamental
theorem.
Theorem E (F5): Class-level corollary #
Class-level corollary: every Borel-parameterized concept class with a measurable
evaluation map satisfies WellBehavedVCMeasTarget. Composes
borel_param_nullMeasurableSet_bad_event over all measurable targets c. The
measurable-target variant of WellBehavedVC is what the kernel actually proves; the
unrestricted variant remains open and is the subject of the Borel-analytic separation
witness in Theorem/BorelAnalyticSeparation.lean.
Theorem F (F6): Closure principle for patching #
Closure under patching: if e₁ : Θ₁ → Concept X, e₂ : Θ₂ → Concept X, and a
region selector r : Ρ → Concept X Bool are jointly measurable, then so is the
piecewise evaluation
(θ₁, θ₂, ρ, x) ↦ if r ρ x then e₁ θ₁ x else e₂ θ₂ x
on the combined parameter space. The basic compositional fact behind
patch_borel_param_wellBehavedVCMeasTarget.
The patched union of two Borel-parameterized classes (with a measurable region
selector) is itself Borel-parameterized over Θ₁ × Θ₂ × Ρ, and therefore satisfies
WellBehavedVCMeasTarget. Immediate from patchEval_measurable and
borel_param_wellBehavedVCMeasTarget. Surfacing this corollary at the class level makes
the closure of the measurable-target hypothesis under amalgamation explicit.