PAC Learning Criteria #
This file defines three learning criteria: PAC (Probably Approximately Correct), agnostic PAC, and exact (Angluin) learning.
PAC quantifier structure (standard, realizable) #
∃ L mf, ∀ ε δ > 0, ∀ D (a probability measure on X), ∀ c ∈ C,
  let m := mf ε δ;
  D^m { xs : Fin m → X | err_D(L(labeled xs)) ≤ ε } ≥ 1 - δ
where D^m = Measure.pi (fun _ : Fin m => D) is the i.i.d. product measure and the labeled sample is constructed as fun i => (xs i, c (xs i)).
Key design decision #
The sample distribution is the CONCRETE product measure D^m, NOT an existentially quantified Dm. The existential ∃Dm formulation is strictly weaker than standard PAC: it allows Dm to depend on the target concept c, making PACLearnable trivially true for every C when X is finite (pair a memorizing learner with a point mass on a sample that enumerates X).
Instead, Dm = Measure.pi (fun _ : Fin m => D), which:
- Does NOT depend on c (D is quantified before c)
- Correctly captures independent sampling
- Uses Mathlib's Measure.pi, which provides IsProbabilityMeasure instance when each factor is a probability measure
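A minimal sketch of this construction, assuming only Mathlib (the name `sampleMeasure` is illustrative, not the file's actual API):

```lean
import Mathlib
open MeasureTheory

/-- The i.i.d. sample measure on `Fin m → X`, built with `Measure.pi`. -/
noncomputable def sampleMeasure {X : Type*} [MeasurableSpace X]
    (D : Measure X) (m : ℕ) : Measure (Fin m → X) :=
  Measure.pi fun _ : Fin m => D

-- Mathlib's instance: a finite product of probability measures is again
-- a probability measure.
example {X : Type*} [MeasurableSpace X] (D : Measure X)
    [IsProbabilityMeasure D] (m : ℕ) :
    IsProbabilityMeasure (sampleMeasure D m) := by
  unfold sampleMeasure
  infer_instance
```

Because D is quantified before c, `sampleMeasure D m` cannot be tuned to the target concept, which is exactly what rules out the trivial memorizer.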
Measurability note #
The set { xs | D { x | L.learn (labeled xs) x ≠ c x } ≤ ε } is assigned a value by Measure.pi via its outer measure, so the definition is well-formed even without a measurability proof. Full measurability of this set requires the error function xs ↦ D{x | L(S(xs)) x ≠ c x} to be measurable, a nontrivial technical condition that the concrete sample-complexity proofs (Hoeffding, Sauer-Shelah) will establish.
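The shape of that side condition can be sketched as follows, with `errFn` standing in for xs ↦ D{x | L(S(xs)) x ≠ c x} (a hypothetical name, not part of the file):

```lean
import Mathlib
open MeasureTheory Set

-- Once the error map is measurable, the "good sample" event is a
-- measurable set: it is the preimage of the interval Iic ε.
example {X : Type*} [MeasurableSpace X] {m : ℕ}
    (errFn : (Fin m → X) → ENNReal) (hmeas : Measurable errFn)
    (ε : ENNReal) :
    MeasurableSet { xs : Fin m → X | errFn xs ≤ ε } :=
  hmeas measurableSet_Iic
```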
PAC (Probably Approximately Correct) learning. The central definition of computational learning theory.
Sample space: Fin m → X with i.i.d. product measure D^m. Labels: derived deterministically from target concept c (realizable case). Error: D-probability of disagreement between learner output and c.
Equations
- One or more equations did not get rendered due to their size.
Instances For
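Since the equation was too large to render, a hedged reconstruction from the quantifier structure stated above may help. The learner is modeled here as a bare function from labeled samples to hypotheses; all names and the exact encoding (ℝ-valued error via `toReal`, `List.ofFn` for the labeled sample) are illustrative, not the file's actual definition:

```lean
import Mathlib
open MeasureTheory

/-- Sketch of realizable PAC learnability, following the quantifier
structure ∃ L mf, ∀ ε δ > 0, ∀ D, ∀ c ∈ C, ... -/
def PACLearnable' (X : Type*) [MeasurableSpace X]
    (C : Set (X → Bool)) : Prop :=
  ∃ (A : List (X × Bool) → X → Bool) (mf : ℝ → ℝ → ℕ),
    ∀ ε δ : ℝ, 0 < ε → 0 < δ →
      ∀ (D : Measure X), IsProbabilityMeasure D →
        ∀ c ∈ C,
          let m := mf ε δ
          -- D^m-probability that the learned hypothesis has error ≤ ε
          Measure.pi (fun _ : Fin m => D)
              { xs | (D { x |
                  A (List.ofFn fun i => (xs i, c (xs i))) x ≠ c x }).toReal
                ≤ ε }
            ≥ ENNReal.ofReal (1 - δ)
```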
Agnostic PAC learning: no realizability assumption. The learner competes against the best hypothesis in H.
Sample space: Fin m → X × Bool with i.i.d. product measure D^m. Here D is a distribution over X × Bool (the joint distribution over instances and labels, possibly noisy). No deterministic labeling assumption.
The learner's error is compared to the BEST hypothesis in H: err(L(S)) ≤ min_{h ∈ H} err(h) + ε.
Equations
- One or more equations did not get rendered due to their size.
Instances For
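Again the rendered equation is missing, so here is a hedged sketch of the agnostic variant under the same illustrative encoding (bare-function learner, ℝ-valued error, infimum in place of min to cover infinite H); none of these names are the file's actual API:

```lean
import Mathlib
open MeasureTheory

/-- Sketch of agnostic PAC learnability: D is a joint distribution on
X × Bool and the learner competes with the best hypothesis in H. -/
def AgnosticPACLearnable' (X : Type*) [MeasurableSpace X]
    (H : Set (X → Bool)) : Prop :=
  ∃ (A : List (X × Bool) → X → Bool) (mf : ℝ → ℝ → ℕ),
    ∀ ε δ : ℝ, 0 < ε → 0 < δ →
      ∀ (D : Measure (X × Bool)), IsProbabilityMeasure D →
        -- error of a hypothesis under the joint distribution D
        let err : (X → Bool) → ℝ := fun h => (D { p | h p.1 ≠ p.2 }).toReal
        let m := mf ε δ
        Measure.pi (fun _ : Fin m => D)
            { S : Fin m → X × Bool |
                err (A (List.ofFn S)) ≤ (⨅ h ∈ H, err h) + ε }
          ≥ ENNReal.ofReal (1 - δ)
```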
Exact learning (Angluin model): learn using membership and equivalence queries. This is a genuinely different paradigm from PAC: deterministic and query-based, with no distributional assumptions.
Equations
- ExactLearnable X C = ∃ (L : ActiveLearner X Bool), ∀ c ∈ C, ∀ (mq : MembershipOracle X Bool), mq.target = c → L.learnMQ mq = c
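For orientation, the interfaces this equation refers to might be stubbed as below; the field names shown are guesses from the equation (`target`, `learnMQ`), and the file's actual structures may carry more fields (e.g. an equivalence-query oracle):

```lean
/-- Hypothetical stub: a membership oracle answering "what is c(x)?". -/
structure MembershipOracle (X Y : Type*) where
  target : X → Y

/-- Hypothetical stub: a learner that queries the oracle and must output
the target function exactly. -/
structure ActiveLearner (X Y : Type*) where
  learnMQ : MembershipOracle X Y → (X → Y)
```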