PAC-Bayes Bounds #

McAllester's PAC-Bayes bound for finite hypothesis classes. The Gibbs learner draws h ~ Q (posterior) and classifies with h. The bound relates the Gibbs learner's true error to its empirical error plus a complexity term involving KL(Q‖P).

Main results #

pac_bayes_per_hypothesis: per-hypothesis Hoeffding with prior-weighted tail
pac_bayes_all_hypotheses: simultaneous bound for all h via union bound
pac_bayes_finite: the PAC-Bayes bound (Jensen over Q)

References #

McAllester, "PAC-Bayesian Model Averaging", COLT 1999
McAllester, "Simplified PAC-Bayesian Margin Bounds", COLT 2003

source

noncomputable def gibbsError {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] (Q : FinitePMF H) (hs : H → Concept X Bool) (c : Concept X Bool) (D : MeasureTheory.Measure X) :

ℝ

The Gibbs error: expected true error under posterior Q. E_{h~Q}[D{x | h(x) ≠ c(x)}].

Equations

gibbsError Q hs c D = expectFinitePMF Q fun (h : H) => TrueErrorReal X (hs h) c D

Instances For

source

noncomputable def gibbsEmpError {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] (Q : FinitePMF H) (hs : H → Concept X Bool) (c : Concept X Bool) {m : ℕ} (S : Fin m → X) :

ℝ

Empirical Gibbs error: expected empirical error under Q. E_{h~Q}[EmpErr(h, S)].

Equations

gibbsEmpError Q hs c S = expectFinitePMF Q fun (h : H) => EmpiricalError X Bool (hs h) (fun (i : Fin m) => (S i, c (S i))) (zeroOneLoss Bool)

Instances For

source

theorem pac_bayes_per_hypothesis {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (c : Concept X Bool) (hc_meas : Measurable c) (hs : H → Concept X Bool) (hhs_meas : ∀ (h : H), Measurable (hs h)) (P : FinitePMF H) (hP_pos : ∀ (h : H), 0 < P.prob h) (m : ℕ) (hm : 0 < m) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ ≤ 1) (h₀ : H) (hbound_le_one : √(Real.log (1 / (P.prob h₀ * δ)) / (2 * ↑m)) ≤ 1) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {S : Fin m → X | TrueErrorReal X (hs h₀) c D > EmpiricalError X Bool (hs h₀) (fun (i : Fin m) => (S i, c (S i))) (zeroOneLoss Bool) + √(Real.log (1 / (P.prob h₀ * δ)) / (2 * ↑m))} ≤ ENNReal.ofReal (P.prob h₀ * δ)

Per-hypothesis Hoeffding with prior-weighted tail. For each h with prior weight P(h), the probability that TrueErr(h) exceeds EmpErr(h,S) + √(log(1/(P(h)·δ))/(2m)) is at most P(h)·δ.

This is Hoeffding's inequality with t = √(log(1/(P(h)·δ))/(2m)).

We require the bound parameter t ≤ 1 for Hoeffding's applicability.

source

theorem pac_bayes_all_hypotheses {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] [Nonempty H] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (c : Concept X Bool) (hc_meas : Measurable c) (hs : H → Concept X Bool) (hhs_meas : ∀ (h : H), Measurable (hs h)) (P : FinitePMF H) (hP_pos : ∀ (h : H), 0 < P.prob h) (m : ℕ) (hm : 0 < m) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ ≤ 1) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {S : Fin m → X | ∀ (h : H), TrueErrorReal X (hs h) c D ≤ EmpiricalError X Bool (hs h) (fun (i : Fin m) => (S i, c (S i))) (zeroOneLoss Bool) + √(Real.log (1 / (P.prob h * δ)) / (2 * ↑m))} ≥ ENNReal.ofReal (1 - δ)

Simultaneous PAC-Bayes bound: with probability ≥ 1-δ, ALL hypotheses h simultaneously satisfy TrueErr(h) ≤ EmpErr(h,S) + √(log(1/(P(h)·δ))/(2m)).

source

theorem pac_bayes_finite {X : Type u} [MeasurableSpace X] {H : Type u_1} [Fintype H] [Nonempty H] (D : MeasureTheory.Measure X) [MeasureTheory.IsProbabilityMeasure D] (c : Concept X Bool) (hc_meas : Measurable c) (hs : H → Concept X Bool) (hhs_meas : ∀ (h : H), Measurable (hs h)) (P : FinitePMF H) (hP_pos : ∀ (h : H), 0 < P.prob h) (m : ℕ) (hm : 0 < m) (δ : ℝ) (hδ : 0 < δ) (hδ1 : δ ≤ 1) :

(MeasureTheory.Measure.pi fun (x : Fin m) => D) {S : Fin m → X | ∀ (Q : FinitePMF H), gibbsError Q hs c D ≤ gibbsEmpError Q hs c S + √((crossEntropyFinitePMF Q P + Real.log (1 / δ)) / (2 * ↑m))} ≥ ENNReal.ofReal (1 - δ)

McAllester's PAC-Bayes bound (finite hypothesis class, union-bound version).

With probability ≥ 1-δ over the sample S of size m, for ALL posteriors Q:

E_{h~Q}[TrueErr(h)] ≤ E_{h~Q}[EmpErr(h,S)] + √((crossEntropyFinitePMF Q P + log(1/δ)) / (2m))

where crossEntropyFinitePMF Q P = ∑_h Q(h)·log(1/P(h)) = KL(Q‖P) + H(Q).

This is the union-bound version. The tight change-of-measure version replaces crossEntropyFinitePMF with klDivFinitePMF and adds log(m).

TODO: Prove the tight version via change of measure (Catoni 2007): E_Q[TrueErr] ≤ E_Q[EmpErr] + √((KL(Q‖P) + log(2√m/δ))/(2m))

Reference: McAllester, COLT 1999.

Documentation

FLT_Proofs.Theorem.PACBayes

PAC-Bayes Bounds #

Main results #

References #