Symmetrization and Ghost Sample Infrastructure #

Reusable symmetrization/ghost sample machinery for uniform convergence bounds. This file provides the symmetrization argument (SSBD Chapter 4/6, Kakade-Tewari Lecture 19) that converts a one-sided uniform convergence question into a double-sample question, then bounds the double-sample event via exchangeability + growth function.

Main results #

hoeffding_one_sided : one-sided Hoeffding for bounded [0,1] losses
symmetrization_step : P[∃h: TrueErr-EmpErr ≥ ε] ≤ 2·P_{double}[∃h: EmpErr'-EmpErr ≥ ε/2]
double_sample_pattern_bound : double-sample bound via exchangeability + growth function
symmetrization_uc_bound : two-sided UC bound 4·GF(C,2m)·exp(-mε²/8)
growth_exp_le_delta : arithmetic: sample complexity makes the UC bound ≤ δ

Infrastructure #

DoubleSampleMeasure : D^m ⊗ D^m as the product of two independent pi measures
MergedSample : Fin (2*m) → X with the Fin.append isomorphism
SplitMeasure : uniform measure over (2m choose m) splits for exchangeability argument

Design notes #

All theorems use the STANDARD Approach A (exchangeability + permutation) for T3, NOT the relaxed iid Rademacher approach. This is the structurally correct argument that avoids introducing unnecessary independence assumptions.

Helper Definitions (DoubleSampleMeasure, ValidSplit, etc. in MathLib.Exchangeability) #

T1: One-sided Hoeffding Inequality #

source

theorem hoeffding_one_sided {X : Type u} [MeasurableSpace X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (h c : Concept X Bool) (m : ℕ) (hm : 0 < m) (t : ℝ) (ht : 0 < t) (_ht1 : t ≤ 1) (hmeas : MeasurableSet {x : X | h x ≠ c x}) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {xs : Fin m → X | EmpiricalError X Bool h (fun (i : Fin m) => (xs i, c (xs i))) (zeroOneLoss Bool) ≤ TrueErrorReal X h c D - t} ≤ ENNReal.ofReal (Real.exp (-2 * ↑m * t ^ 2))

One-sided Hoeffding: for iid Bernoulli(p) draws, the empirical average undershoots the mean by ≥ t with probability ≤ exp(-2mt²).

Proof strategy (3 steps):

MGF bound (Hoeffding's lemma): For X ∈ [0,1] with E[X] = p, E[exp(s(X-p))] ≤ exp(s²/8).
- Adapt from cosh_le_exp_sq_half infrastructure in Rademacher.lean.
- Key: convexity of exp on [0,1] gives E[exp(sX)] ≤ p·exp(s) + (1-p)·exp(0), then the s²/8 bound follows from ln(1 + x) ≤ x and Taylor expansion.
```
have mgf_bound : ∀ (s : ℝ),
  ∫ x, Real.exp (s * (indicator x - p)) ∂D ≤ Real.exp (s^2 / 8) := by ...
```
Product independence: E[exp(s·∑(X_i-p))] = ∏ E[exp(s(X_i-p))] ≤ exp(ms²/8).
- Uses MeasureTheory.Measure.pi independence structure.
- Needs: Measure.pi integral factorization for product of functions.
- MEASURABILITY: fun xs => Real.exp (s * ∑ i, f (xs i)) is measurable (composition of measurable functions).
```
have product_bound : ∀ (s : ℝ),
  ∫ xs, Real.exp (s * ∑ i, (indicator (xs i) - p)) ∂Measure.pi (fun _ => D)
  ≤ Real.exp (m * s^2 / 8) := by ...
```
Exponential Markov + optimize: P[∑(X_i-p) ≤ -mt] = P[exp(-s·∑(X_i-p)) ≥ exp(smt)] ≤ exp(-smt + ms²/8). Optimize over s: set s = 4t to get ≤ exp(-2mt²).
- Uses Markov's inequality in ENNReal form.
- CAST ISSUE: Markov gives ENNReal bound, need to convert exp(-2mt²) between ENNReal.ofReal and the measure value.
```
have markov_step : ∀ (s : ℝ) (hs : 0 < s),
  Measure.pi (fun _ => D) {xs | ∑ i, (indicator (xs i) - p) ≤ -(m : ℝ) * t}
  ≤ ENNReal.ofReal (Real.exp (-(s * m * t) + m * s^2 / 8)) := by ...
have optimize : Real.exp (-(4*t * m * t) + m * (4*t)^2 / 8)
  = Real.exp (-2 * m * t^2) := by ring_nf
```

CAST ISSUES to watch:

m : ℕ needs cast to ℝ in the exponent: (m : ℝ)
EmpiricalError returns ℝ, TrueErrorReal returns ℝ, good — no ENNReal gap
The measure value is ENNReal, the bound exp(-2mt²) is ℝ≥0∞ via ENNReal.ofReal

References: SSBD Lemma B.3, Hoeffding (1963)

T2: Symmetrization Step #

source

theorem symmetrization_step {X : Type u} [MeasurableSpace X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (C : ConceptClass X Bool) (c : Concept X Bool) (hmeas_C : ∀ h ∈ C, Measurable h) (hc_meas : Measurable c) (m : ℕ) (hm : 0 < m) (ε : ℝ) (hε : 0 < ε) (hm_large : 2 * Real.log 2 ≤ ↑m * ε ^ 2) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {xs : Fin m → X | ∃ h ∈ C, TrueErrorReal X h c D - EmpiricalError X Bool h (fun (i : Fin m) => (xs i, c (xs i))) (zeroOneLoss Bool) ≥ ε} ≤ 2 * ((MeasureTheory.Measure.pi fun (x : Fin m) => D).prod (MeasureTheory.Measure.pi fun (x : Fin m) => D)) {p : (Fin m → X) × (Fin m → X) | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) ≥ ε / 2}

Symmetrization: the probability of a large gap TrueErr-EmpErr is at most twice the probability of a large gap EmpErr'-EmpErr on the double sample.

Proof strategy (6 steps):

Witness selection: For S in the bad event, ∃h* ∈ C with TrueErr(h*) - EmpErr_S(h*) ≥ ε.

-- In the bad event set, extract h* by classical choice
have h_witness : ∀ xs ∈ bad_event, ∃ h* ∈ C,
  TrueErrorReal X h* c D - EmpiricalError X Bool h* (sample xs) (zeroOneLoss Bool) ≥ ε

Ghost sample mean: E_{S'}[EmpErr_{S'}(h*)] = TrueErr(h*) ≥ EmpErr_S(h*) + ε.
- Uses: MeasureTheory.integral_pi to compute E[EmpErr] over product measure.
- KEY LEMMA: For fixed h, E_{D^m}[EmpiricalError(h,S)] = TrueErrorReal(h,c,D). This is because EmpErr = (1/m)∑ indicator(x_i), and E[indicator(x_i)] = TrueErrorReal.
```
have expected_emp_err : ∀ h* : Concept X Bool,
  ∫ xs, EmpiricalError X Bool h* (sample xs) (zeroOneLoss Bool)
    ∂(Measure.pi (fun _ : Fin m => D))
  = TrueErrorReal X h* c D := by ...
```

Hoeffding on ghost sample: P_{S'}[EmpErr_{S'}(h*) < TrueErr(h*) - ε/2] ≤ exp(-mε²/2).

Apply hoeffding_one_sided with t = ε/2.
The hm_large hypothesis ensures exp(-mε²/2) < 1/2: 2·ln2 ≤ mε² ⟹ mε²/2 ≥ ln2 ⟹ exp(-mε²/2) ≤ 1/2.

have hoeffding_ghost : ∀ h* ∈ C,
  Measure.pi (fun _ : Fin m => D)
    {xs' | EmpiricalError X Bool h* (sample xs') (zeroOneLoss Bool)
      < TrueErrorReal X h* c D - ε/2}
  ≤ ENNReal.ofReal (Real.exp (-m * (ε/2)^2 * 2)) := by
    intro h* _; exact hoeffding_one_sided D h* c m hm (ε/2) (by linarith) (by ...) (by ...)

Complementary probability: P_{S'}[EmpErr_{S'}(h*) - EmpErr_S(h*) ≥ ε/2] ≥ 1/2.
- From step 2: TrueErr(h*) ≥ EmpErr_S(h*) + ε
- From step 3: P[EmpErr_{S'} ≥ TrueErr - ε/2] ≥ 1/2
- Chain: EmpErr_{S'} ≥ TrueErr - ε/2 ≥ EmpErr_S + ε - ε/2 = EmpErr_S + ε/2

Conditional to unconditional: The witness h* from step 1 also witnesses the double-sample event ∃h∈C: EmpErr'-EmpErr ≥ ε/2. So: P_{S'}[double event | S bad] ≥ 1/2.

have conditional_bound : ∀ xs ∈ bad_event,
  Measure.pi (fun _ : Fin m => D)
    {xs' | ∃ h ∈ C, EmpiricalError ... xs' - EmpiricalError ... xs ≥ ε/2}
  ≥ ENNReal.ofReal (1/2) := by ...

Fubini integration: By Measure.prod_apply and Fubini: P_{S,S'}[double event] = ∫S P{S'}[double event | S] ≥ (1/2) · P_S[bad event] ⟹ P_S[bad event] ≤ 2 · P_{S,S'}[double event].

-- Uses: MeasureTheory.Measure.prod_apply or lintegral_prod
-- MEASURABILITY: the double-sample event is measurable as a finite union
-- of sets of the form {(xs,xs') | EmpErr'(h) - EmpErr(h) ≥ ε/2} for h ∈ C.
-- Since C may be infinite, measurability requires care: the sup over h
-- must be shown to be measurable. For finite restriction patterns (≤ 2^m
-- on Fin m → Bool), this is a finite union.

MEASURABILITY CONCERNS:

{xs | ∃ h ∈ C, ...} is NOT obviously measurable for infinite C. Strategy: decompose via restriction patterns. On any fixed xs, the set of labelings {(h(xs 0), ..., h(xs(m-1))) | h ∈ C} has at most GF(C,m) ≤ 2^m elements. So the ∃h event is a finite union of measurable sets.
EmpiricalError is a finite sum of measurable functions, hence measurable.
The product σ-algebra on (Fin m → X) × (Fin m → X) is generated by cylinder sets, and our events are in this σ-algebra.

References: SSBD Lemma 4.5, Kakade-Tewari Lecture 19 Lemma 1

T3: Double Sample Pattern Bound (Approach A — Standard Exchangeability) #

source

theorem per_hypothesis_gap_bound {X : Type u} [MeasurableSpace X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (h c : Concept X Bool) (hmeas_h : Measurable h) (hc_meas : Measurable c) (m : ℕ) (hm : 0 < m) (ε : ℝ) (hε : 0 < ε) :

have μ := MeasureTheory.Measure.pi fun (x : Fin m) => D; (μ.prod μ) {p : (Fin m → X) × (Fin m → X) | EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) ≥ ε / 2} ≤ ENNReal.ofReal (Real.exp (-(↑m * ε ^ 2 / 8)))

Per-hypothesis Hoeffding on the double sample: for a FIXED hypothesis h, the probability that EmpErr(h,S') - EmpErr(h,S) ≥ ε/2 under D^m ⊗ D^m is at most exp(-mε²/8).

Proof: The gap = (1/m)∑ᵢ (Zᵢ' - Zᵢ) where Zᵢ = 1[h(xᵢ)≠c(xᵢ)], Zᵢ' = 1[h(x'ᵢ)≠c(x'ᵢ)] are iid Bernoulli(p) with p = TrueError(h,c,D). The differences Wᵢ = Zᵢ' - Zᵢ are independent, bounded in [-1,1], and centered (E[Wᵢ] = 0). By Hoeffding's inequality: P[(1/m)∑Wᵢ ≥ ε/2] ≤ exp(-mε²/8).

This uses the sub-Gaussian machinery from T1, extended to the product space.

Proof sketch:

Pair D^m ⊗ D^m ≅ (D⊗D)^m via the natural isomorphism (Fin m → X) × (Fin m → X) ≃ᵐ Fin m → X × X
Define g : X × X → ℝ, g(a,b) = 1[h(b)≠c(b)] - 1[h(a)≠c(a)] Then g ∈ [-1,1], E_{D⊗D}[g] = 0, so HasSubgaussianMGF g 1 (D⊗D)
The gap = (1/m)∑ᵢ g(xᵢ, x'ᵢ) where pairs are iIndepFun under (D⊗D)^m
By measure_sum_ge_le_of_iIndepFun: P[∑g ≥ mε/2] ≤ exp(-(mε/2)²/(2m)) = exp(-mε²/8)

Mathlib chain: iIndepFun_pi + HasSubgaussianMGF.of_map + measure_sum_ge_le_of_iIndepFun

source

theorem restriction_pattern_count {X : Type u} [MeasurableSpace X] [Infinite X] (C : ConceptClass X Bool) (c : Concept X Bool) (n : ℕ) (z : Fin n → X) :

{p : Fin n → Bool | ∃ h ∈ C, ∀ (i : Fin n), p i = decide (h (z i) ≠ c (z i))}.ncard ≤ GrowthFunction X C n

The number of distinct restriction patterns of C on any n points is at most GF(C,n). For z : Fin n → X, define patterns(z) = {p : Fin n → Bool | ∃ h ∈ C, ∀ i, p i = (h(z i) ≠ c(z i))}. Then patterns(z).ncard ≤ GrowthFunction X C n by definition of GrowthFunction.

source

theorem finite_exchangeability_bound {Ω : Type u_1} {G : Type u_2} [MeasurableSpace Ω] [Fintype G] [Nonempty G] {ν : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure ν] (T : G → Ω → Ω) (S : Set Ω) (hT : ∀ (g : G), MeasureTheory.MeasurePreserving (T g) ν ν) (hS0 : MeasureTheory.NullMeasurableSet S ν) (B : ENNReal) (hpointwise : ∀ (z : Ω), ∑ g : G, (T g ⁻¹' S).indicator 1 z ≤ B * ↑(Fintype.card G)) :

ν S ≤ B

Generic finite exchangeability bound. Given a measure-preserving family of transformations on a probability space, a NullMeasurableSet S, and a pointwise bound on the sum of preimage indicators, conclude ν(S) ≤ B.

source

def WellBehavedVC (X : Type u) [MeasurableSpace X] (C : ConceptClass X Bool) :

Prop

A concept class is well-behaved if the ghost gap event is null-measurable. This is the minimal regularity assumption for the symmetrization proof.

Equations

One or more equations did not get rendered due to their size.

Instances For

source

theorem exchangeability_chain_bound {X : Type u} [MeasurableSpace X] [Infinite X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (C : ConceptClass X Bool) (c : Concept X Bool) (_hmeas_C : ∀ h ∈ C, Measurable h) (_hc_meas : Measurable c) (m : ℕ) (hm : 0 < m) (ε : ℝ) (hε : 0 < ε) (_hε2 : ε ≤ 2) (hC : Set.Nonempty C) (hE_nullmeas : MeasureTheory.NullMeasurableSet {p : (Fin m → X) × (Fin m → X) | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) ≥ ε / 2} ((MeasureTheory.Measure.pi fun (x : Fin m) => D).prod (MeasureTheory.Measure.pi fun (x : Fin m) => D))) :

have μ := MeasureTheory.Measure.pi fun (x : Fin m) => D; (μ.prod μ) {p : (Fin m → X) × (Fin m → X) | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) ≥ ε / 2} ≤ ENNReal.ofReal (↑(GrowthFunction X C (2 * m)) * Real.exp (-(↑m * ε ^ 2 / 8)))

source

theorem double_sample_pattern_bound {X : Type u} [MeasurableSpace X] [Infinite X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (C : ConceptClass X Bool) (c : Concept X Bool) (hmeas_C : ∀ h ∈ C, Measurable h) (hc_meas : Measurable c) (m : ℕ) (hm : 0 < m) (ε : ℝ) (hε : 0 < ε) (hE_nullmeas : MeasureTheory.NullMeasurableSet {p : (Fin m → X) × (Fin m → X) | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) ≥ ε / 2} ((MeasureTheory.Measure.pi fun (x : Fin m) => D).prod (MeasureTheory.Measure.pi fun (x : Fin m) => D))) :

((MeasureTheory.Measure.pi fun (x : Fin m) => D).prod (MeasureTheory.Measure.pi fun (x : Fin m) => D)) {p : (Fin m → X) × (Fin m → X) | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) ≥ ε / 2} ≤ ENNReal.ofReal (↑(GrowthFunction X C (2 * m)) * Real.exp (-(↑m * ε ^ 2 / 8)))

On the double sample, the probability that any hypothesis has EmpErr' - EmpErr ≥ ε/2 is bounded by GF(C,2m) · exp(-mε²/8).

Proof strategy (Approach A — standard exchangeability, 5 steps):

EXCHANGEABILITY: Under D^m ⊗ D^m, the 2m draws z₁,...,z_{2m} are iid from D. The joint distribution is invariant under permutations of {1,...,2m}.

Key lemma: P_{D^m⊗D^m}[event(S,S')] = E_z[P_{split}[event | z]] where z = merged sample and the split is uniformly random among all C(2m,m) ways to partition z into two groups of m.

-- Measure.pi permutation invariance
have pi_perm_invariant : ∀ (σ : Equiv.Perm (Fin (2*m))),
  (Measure.pi (fun _ : Fin (2*m) => D)).map (fun z i => z (σ i))
  = Measure.pi (fun _ : Fin (2*m) => D) := by ...
-- Consequence: the event probability equals the split-averaged probability
have exchangeability :
  DoubleSampleMeasure D m {p | ∃ h ∈ C, gap(p) ≥ ε/2}
  = ∫ z, SplitMeasure m {vs | ∃ h ∈ C, gap(split z vs) ≥ ε/2}
    ∂(Measure.pi (fun _ : Fin (2*m) => D)) := by ...

CONDITIONING: For fixed merged sample z of 2m points:
- C restricts to at most GF(C,2m) distinct labeling patterns on z (deterministic).
- For each pattern p, define: diff(p, split) = EmpErr_{S'}(p) - EmpErr_S(p) = (1/m) ∑{i∈S'} a_i - (1/m) ∑{i∈S} a_i where a_i = 1[pattern(z_i) ≠ c(z_i)] ∈ {0,1}.
```
-- Number of distinct patterns
have num_patterns : ∀ (z : MergedSample X m),
  Set.ncard {p : Fin (2*m) → Bool | ∃ h ∈ C, ∀ i, p i = (h (z i) ≠ c (z i))}
  ≤ GrowthFunction X C (2*m) := by ...
```
PER-PATTERN HOEFFDING ON SPLITS: For fixed z and fixed pattern p: Under uniformly random split (S,S') of z into two groups of m: diff(p, split) = (1/m) ∑{i∈S'} a_i - (1/m) ∑{i∈S} a_i
This is a function of the random partition. By Hoeffding's inequality for sampling without replacement (Serfling 1974): P_split[diff ≥ ε/2] ≤ exp(-mε²/8)
Alternative derivation: Hoeffding without replacement from Hoeffding with replacement (iid signs) via coupling. The without-replacement bound is actually TIGHTER (variance reduction), but the with-replacement bound suffices.
```
-- Per-pattern concentration
have per_pattern_bound : ∀ (z : MergedSample X m) (a : Fin (2*m) → ℝ)
  (ha : ∀ i, a i ∈ Set.Icc 0 1),
  SplitMeasure m {vs | (1/m) * ∑ i ∈ second_group vs, a i
    - (1/m) * ∑ i ∈ first_group vs, a i ≥ ε/2}
  ≤ ENNReal.ofReal (Real.exp (-(m : ℝ) * (ε/2)^2 / 2)) := by ...
-- Note: m*(ε/2)^2/2 = mε²/8
```

UNION BOUND: P_split[∃ pattern: diff ≥ ε/2 | z] ≤ (number of patterns) · max_pattern P_split[diff ≥ ε/2] ≤ GF(C,2m) · exp(-mε²/8)

have union_bound : ∀ (z : MergedSample X m),
  SplitMeasure m {vs | ∃ h ∈ C, gap(split z vs, h) ≥ ε/2}
  ≤ ENNReal.ofReal (GrowthFunction X C (2*m) * Real.exp (-(m : ℝ) * ε^2 / 8))
  := by ...

INTEGRATE: P_{D^m⊗D^m}[event] = E_z[P_split[event|z]] (by step 1) ≤ E_z[GF(C,2m) · exp(-mε²/8)] (by step 4, pointwise) = GF(C,2m) · exp(-mε²/8) (bound is independent of z)
```
-- The bound is a constant, so integrating gives the same constant
-- (using IsProbabilityMeasure for the 2m-fold product)
```

Infrastructure needed:

Fin.sumFinEquiv : Fin m ⊕ Fin n ≃ Fin (m + n) (available in Mathlib)
mergeSamples / splitMergedSample (defined above)
SplitMeasure and ValidSplit (defined above)
Measure.pi permutation invariance (to be proved or imported)
Hoeffding for sampling without replacement
GrowthFunction on 2m points + sauer_shelah_exp_bound from Rademacher.lean

MEASURABILITY CONCERNS:

The merged sample z ↦ P_split[event|z] must be measurable as a function of z. Since the event is a finite union over patterns, and each pattern's indicator is a measurable function of z (finite evaluation), this follows.
GrowthFunction X C (2*m) is a natural number (deterministic), no measurability issue.

References: SSBD Theorem 6.7, Hoeffding (1963), Serfling (1974)

source

theorem hoeffding_one_sided_upper {X : Type u} [MeasurableSpace X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (h c : Concept X Bool) (m : ℕ) (hm : 0 < m) (t : ℝ) (ht : 0 < t) (_ht1 : t ≤ 1) (hmeas : MeasurableSet {x : X | h x ≠ c x}) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {xs : Fin m → X | EmpiricalError X Bool h (fun (i : Fin m) => (xs i, c (xs i))) (zeroOneLoss Bool) ≥ TrueErrorReal X h c D + t} ≤ ENNReal.ofReal (Real.exp (-2 * ↑m * t ^ 2))

Upper-tail Hoeffding: for iid Bernoulli(p) draws, the empirical average overshoots the mean by ≥ t with probability ≤ exp(-2mt²).

This is the mirror of hoeffding_one_sided (which bounds the lower tail). The proof uses the same sub-Gaussian machinery with Z_i = indicator(x_i) - p (instead of p - indicator(x_i)).

source

theorem symmetrization_step_lower {X : Type u} [MeasurableSpace X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (C : ConceptClass X Bool) (c : Concept X Bool) (hmeas_C : ∀ h ∈ C, Measurable h) (hc_meas : Measurable c) (m : ℕ) (hm : 0 < m) (ε : ℝ) (hε : 0 < ε) (hm_large : 2 * Real.log 2 ≤ ↑m * ε ^ 2) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {xs : Fin m → X | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (xs i, c (xs i))) (zeroOneLoss Bool) - TrueErrorReal X h c D ≥ ε} ≤ 2 * ((MeasureTheory.Measure.pi fun (x : Fin m) => D).prod (MeasureTheory.Measure.pi fun (x : Fin m) => D)) {p : (Fin m → X) × (Fin m → X) | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) ≥ ε / 2}

Symmetrization step for the lower tail: P[∃h: EmpErr-TrueErr ≥ ε] ≤ 2·P_{double}[∃h: EmpErr_S-EmpErr_{S'} ≥ ε/2].

Mirror of symmetrization_step for the opposite direction. Uses hoeffding_one_sided_upper instead of hoeffding_one_sided.

T4: Symmetrization Uniform Convergence Bound (two-sided) #

source

theorem symmetrization_uc_bound {X : Type u} [MeasurableSpace X] [Infinite X] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (C : ConceptClass X Bool) (c : Concept X Bool) (hmeas_C : ∀ h ∈ C, Measurable h) (hc_meas : Measurable c) (m : ℕ) (hm : 0 < m) (ε : ℝ) (hε : 0 < ε) (hm_large : 2 * Real.log 2 ≤ ↑m * ε ^ 2) (hE_nullmeas : MeasureTheory.NullMeasurableSet {p : (Fin m → X) × (Fin m → X) | ∃ h ∈ C, EmpiricalError X Bool h (fun (i : Fin m) => (p.2 i, c (p.2 i))) (zeroOneLoss Bool) - EmpiricalError X Bool h (fun (i : Fin m) => (p.1 i, c (p.1 i))) (zeroOneLoss Bool) ≥ ε / 2} ((MeasureTheory.Measure.pi fun (x : Fin m) => D).prod (MeasureTheory.Measure.pi fun (x : Fin m) => D))) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {xs : Fin m → X | ∃ h ∈ C, |TrueErrorReal X h c D - EmpiricalError X Bool h (fun (i : Fin m) => (xs i, c (xs i))) (zeroOneLoss Bool)| ≥ ε} ≤ ENNReal.ofReal (4 * ↑(GrowthFunction X C (2 * m)) * Real.exp (-(↑m * ε ^ 2 / 8)))

The symmetrization uniform convergence bound: two-sided version. P[∃h∈C: |TrueErr-EmpErr| ≥ ε] ≤ 4·GF(C,2m)·exp(-mε²/8).

Proof strategy (4 steps):

Decompose absolute value: |TrueErr - EmpErr| ≥ ε ↔ (TrueErr - EmpErr ≥ ε) ∨ (EmpErr - TrueErr ≥ ε)

have abs_decomp : ∀ (a b : ℝ),
  |a - b| ≥ ε ↔ a - b ≥ ε ∨ b - a ≥ ε := by
  intro a b; constructor
  · intro h; by_cases h' : a - b ≥ ε
    · exact Or.inl h'
    · exact Or.inr (by linarith [abs_sub_comm a b, le_abs_self (a - b)])
  · intro h; cases h with
    | inl h => exact le_trans (le_of_eq (abs_of_nonneg (by linarith))) (by linarith)
    | inr h => exact le_trans (le_of_eq (abs_of_nonpos (by linarith) ▸ ...)) ...

Upper tail: P[∃h∈C: TrueErr-EmpErr ≥ ε] ≤ 2·GF(C,2m)·exp(-mε²/8)
- Direct application of symmetrization_step + double_sample_pattern_bound.
Lower tail: P[∃h∈C: EmpErr-TrueErr ≥ ε] ≤ 2·GF(C,2m)·exp(-mε²/8)
- Apply the symmetric argument: swap roles of S and S' in the double sample.
- Equivalently, apply symmetrization_step to the event EmpErr-TrueErr ≥ ε and bound the double-sample event {EmpErr_S - EmpErr_{S'} ≥ ε/2}.
- The bound is symmetric because D^m ⊗ D^m is symmetric under swapping factors.
```
have swap_symmetry :
  DoubleSampleMeasure D m {p | ∃ h ∈ C, EmpErr(S) - EmpErr(S') ≥ ε/2}
  = DoubleSampleMeasure D m {p | ∃ h ∈ C, EmpErr(S') - EmpErr(S) ≥ ε/2} :=
  Measure.prod_swap ...
```

Union bound: P[|gap| ≥ ε] ≤ P[gap ≥ ε] + P[gap ≤ -ε] ≤ 2·GF·exp(...) + 2·GF·exp(...) = 4·GF(C,2m)·exp(-mε²/8)

-- Uses: MeasureTheory.measure_union_le for the union of two events
-- CAST: 2 * X + 2 * X = 4 * X in ENNReal (need ENNReal.add_mul or similar)

References: SSBD Theorem 6.7, Kakade-Tewari Lecture 19

T5: Arithmetic — Growth Function × Exponential ≤ δ #

source

theorem growth_exp_le_delta {X : Type u} [MeasurableSpace X] (C : ConceptClass X Bool) (v : ℕ) (hv : 0 < v) (m : ℕ) (hm : 0 < m) (ε δ : ℝ) (hε : 0 < ε) (hδ : 0 < δ) (hδ1 : δ < 1) (hv_bound : ∀ (n : ℕ), v ≤ n → GrowthFunction X C n ≤ ∑ i ∈ Finset.range (v + 1), n.choose i) (hm_bound : (16 * Real.exp 1 * (↑v + 1) / ε ^ 2) ^ (v + 1) / δ ≤ ↑m) :

4 * ↑(GrowthFunction X C (2 * m)) * Real.exp (-(↑m * ε ^ 2 / 8)) ≤ δ ∧ 2 * Real.log 2 ≤ ↑m * ε ^ 2

Sorry-free UC proof: composing symmetrization + arithmetic #

These theorems close the sorry in uc_bad_event_le_delta (Generalization.lean) by composing symmetrization_uc_bound with growth_exp_le_delta. They live here because Symmetrization.lean has access to both components, whereas Generalization.lean cannot import Symmetrization.lean (circular).

source