FLT_Proofs.Theorem.PACBayes

PAC-Bayes Bounds #

McAllester's PAC-Bayes bound for finite hypothesis classes. The Gibbs learner draws h ~ Q (posterior) and classifies with h. The bound relates the Gibbs learner's true error to its empirical error plus a complexity term involving KL(Q‖P).

Main results #

- `gibbsError`, `gibbsEmpError`: the true and empirical error of the Gibbs learner under a posterior `Q`.
- `pac_bayes_per_hypothesis`: Hoeffding's inequality with a prior-weighted tail for a single hypothesis.
- `pac_bayes_all_hypotheses`: the simultaneous bound over all hypotheses, via a union bound.
- `pac_bayes_finite`: McAllester's PAC-Bayes bound for every posterior `Q`.

References #

- McAllester, COLT 1999.

noncomputable def gibbsError {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] (Q : FinitePMF H) (hs : H → Concept X Bool) (c : Concept X Bool) (D : MeasureTheory.Measure X) : ℝ

The Gibbs error: expected true error under posterior Q. E_{h~Q}[D{x | h(x) ≠ c(x)}].

noncomputable def gibbsEmpError {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] (Q : FinitePMF H) (hs : H → Concept X Bool) (c : Concept X Bool) {m : ℕ} (S : Fin m → X) : ℝ

    Empirical Gibbs error: expected empirical error under Q. E_{h~Q}[EmpErr(h, S)].

theorem pac_bayes_per_hypothesis {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (c : Concept X Bool) (hc_meas : Measurable c) (hs : H → Concept X Bool) (hhs_meas : ∀ (h : H), Measurable (hs h)) (P : FinitePMF H) (hP_pos : ∀ (h : H), 0 < P.prob h) (m : ℕ) (hm : 0 < m) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ ≤ 1) (h₀ : H) (hbound_le_one : √(Real.log (1 / (P.prob h₀ * δ)) / (2 * m)) ≤ 1) :
(MeasureTheory.Measure.pi fun (x : Fin m) => D) {S : Fin m → X | TrueErrorReal X (hs h₀) c D > EmpiricalError X Bool (hs h₀) (fun (i : Fin m) => (S i, c (S i))) (zeroOneLoss Bool) + √(Real.log (1 / (P.prob h₀ * δ)) / (2 * m))} ≤ ENNReal.ofReal (P.prob h₀ * δ)

      Per-hypothesis Hoeffding with prior-weighted tail. For each h with prior weight P(h), the probability that TrueErr(h) exceeds EmpErr(h,S) + √(log(1/(P(h)·δ))/(2m)) is at most P(h)·δ.

      This is Hoeffding's inequality with t = √(log(1/(P(h)·δ))/(2m)).

      We require the bound parameter t ≤ 1 for Hoeffding's applicability.
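The arithmetic behind this choice of t is short; as a sketch, assuming the standard one-sided Hoeffding form for m i.i.d. samples of a [0,1]-valued loss:

```latex
% One-sided Hoeffding: Pr_S[ TrueErr(h₀) > EmpErr(h₀,S) + t ] ≤ exp(−2 m t²).
% Plugging in t = √(log(1/(P(h₀)·δ)) / (2m)) makes the exponent collapse:
\Pr_{S \sim D^m}\big[\mathrm{TrueErr}(h_0) > \mathrm{EmpErr}(h_0, S) + t\big]
  \;\le\; e^{-2mt^2}
  \;=\; \exp\!\Big(-\log\tfrac{1}{P(h_0)\,\delta}\Big)
  \;=\; P(h_0)\,\delta.
```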

theorem pac_bayes_all_hypotheses {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] [Nonempty H] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (c : Concept X Bool) (hc_meas : Measurable c) (hs : H → Concept X Bool) (hhs_meas : ∀ (h : H), Measurable (hs h)) (P : FinitePMF H) (hP_pos : ∀ (h : H), 0 < P.prob h) (m : ℕ) (hm : 0 < m) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ ≤ 1) :
(MeasureTheory.Measure.pi fun (x : Fin m) => D) {S : Fin m → X | ∀ (h : H), TrueErrorReal X (hs h) c D ≤ EmpiricalError X Bool (hs h) (fun (i : Fin m) => (S i, c (S i))) (zeroOneLoss Bool) + √(Real.log (1 / (P.prob h * δ)) / (2 * m))} ≥ ENNReal.ofReal (1 - δ)

      Simultaneous PAC-Bayes bound: with probability ≥ 1-δ, ALL hypotheses h simultaneously satisfy TrueErr(h) ≤ EmpErr(h,S) + √(log(1/(P(h)·δ))/(2m)).
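The union bound that aggregates the per-hypothesis tails is a one-line computation, since the prior weights sum to 1:

```latex
\Pr_{S \sim D^m}\big[\exists\, h \in H,\ \mathrm{TrueErr}(h) > \mathrm{EmpErr}(h, S) + t_h\big]
  \;\le\; \sum_{h \in H} P(h)\,\delta
  \;=\; \delta \sum_{h \in H} P(h)
  \;=\; \delta,
\qquad t_h = \sqrt{\tfrac{\log(1/(P(h)\,\delta))}{2m}}.
```

Complementing gives the probability-at-least-1−δ statement above.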

theorem pac_bayes_finite {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] [Nonempty H] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (c : Concept X Bool) (hc_meas : Measurable c) (hs : H → Concept X Bool) (hhs_meas : ∀ (h : H), Measurable (hs h)) (P : FinitePMF H) (hP_pos : ∀ (h : H), 0 < P.prob h) (m : ℕ) (hm : 0 < m) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ ≤ 1) :
(MeasureTheory.Measure.pi fun (x : Fin m) => D) {S : Fin m → X | ∀ (Q : FinitePMF H), gibbsError Q hs c D ≤ gibbsEmpError Q hs c S + √((crossEntropyFinitePMF Q P + Real.log (1 / δ)) / (2 * m))} ≥ ENNReal.ofReal (1 - δ)

      McAllester's PAC-Bayes bound (finite hypothesis class, union-bound version).

      With probability ≥ 1-δ over the sample S of size m, for ALL posteriors Q:

      E_{h~Q}[TrueErr(h)] ≤ E_{h~Q}[EmpErr(h,S)] + √((crossEntropyFinitePMF Q P + log(1/δ)) / (2m))

      where crossEntropyFinitePMF Q P = ∑_h Q(h)·log(1/P(h)) = KL(Q‖P) + H(Q).
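The passage from the simultaneous per-hypothesis bound to this Gibbs form is Jensen's inequality (concavity of √), followed by the cross-entropy decomposition; a sketch:

```latex
% Jensen on the concave function √ turns the expected radius into a single radius:
\mathbb{E}_{h\sim Q}\sqrt{\tfrac{\log(1/(P(h)\,\delta))}{2m}}
  \;\le\; \sqrt{\tfrac{\mathbb{E}_{h\sim Q}\log(1/(P(h)\,\delta))}{2m}}
  \;=\; \sqrt{\tfrac{\sum_h Q(h)\log(1/P(h)) + \log(1/\delta)}{2m}}.

% The cross-entropy splits into relative entropy plus Shannon entropy:
\sum_h Q(h)\log\tfrac{1}{P(h)}
  \;=\; \sum_h Q(h)\log\tfrac{Q(h)}{P(h)} + \sum_h Q(h)\log\tfrac{1}{Q(h)}
  \;=\; \mathrm{KL}(Q\|P) + H(Q).
```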

      This is the union-bound version. The tight change-of-measure version replaces crossEntropyFinitePMF with klDivFinitePMF and adds log(m).

      TODO: Prove the tight version via change of measure (Catoni 2007): E_Q[TrueErr] ≤ E_Q[EmpErr] + √((KL(Q‖P) + log(2√m/δ))/(2m))
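For context on that TODO: the change-of-measure step in the tight version is usually the Donsker–Varadhan variational formula, stated here only as background (it is not part of this file):

```latex
% Donsker–Varadhan: for any measurable f : H → ℝ,
\mathbb{E}_{h\sim Q}[f(h)]
  \;\le\; \mathrm{KL}(Q\|P) + \log \mathbb{E}_{h\sim P}\big[e^{f(h)}\big],
```

typically applied with f(h) proportional to m·(TrueErr(h) − EmpErr(h,S))², which is where the extra log(√m) term in Catoni-style bounds comes from.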

      Reference: McAllester, COLT 1999.