3 PAC Learning and the Fundamental Theorem
This is the central chapter of the book. Every paradigm that follows (online learning, Gold-style identification, universal learning) will be understood partly by how it resembles and partly by how it differs from what is established here. The chapter builds toward a single destination: the Fundamental Theorem of Statistical Learning, which characterizes PAC learnability through a web of nine equivalent conditions. The path there is the content. Each lemma is a foothold; the VC characterization is the summit; the full equivalence web is the view from the top.
The chapter is organized as an ascent.
The PAC framework (Section 3.1): the definition and its moving parts. This section is kept brief; the definition is not the centerpiece.
The VC characterization proof (Section 3.2): the full proof that \(\mathrm{VCdim}(\mathcal{H}) < \infty \) if and only if \(\mathcal{H}\) is PAC learnable, built in four stages through Sauer–Shelah, uniform convergence, ERM analysis, and the converse.
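As a concrete warm-up for the shattering arguments of that section, here is a minimal brute-force check (our own illustration, not part of the chapter's proofs) that the class of one-dimensional thresholds \(h_t(x) = \mathbb{1}[x \ge t]\) shatters a single point but no pair, so its VC dimension is 1:

```python
def threshold_labelings(points):
    """All dichotomies realizable on `points` by thresholds h_t(x) = 1[x >= t]."""
    labelings = set()
    # Distinct behaviors arise only at the points themselves and above the largest.
    candidates = sorted(points) + [max(points) + 1.0]
    for t in candidates:
        labelings.add(tuple(1 if x >= t else 0 for x in points))
    return labelings

def is_shattered(points):
    """True iff thresholds realize all 2^n labelings of the given points."""
    return len(threshold_labelings(points)) == 2 ** len(points)

# One point is shattered, so VCdim >= 1 ...
print(is_shattered([0.0]))        # True
# ... but no pair is: with x1 < x2, the labeling (1, 0) is impossible.
print(is_shattered([0.0, 1.0]))   # False
```

The same exhaustive style extends to any finite class on a finite sample, which is exactly the counting that the Sauer–Shelah lemma controls.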
The Fundamental Theorem (Section 3.3): all nine equivalent conditions, the equivalence web, and the directions we proved versus those we cited.
The agnostic setting (Section 3.4): why dropping the realizability assumption changes sample complexity from \(\Theta (d/\varepsilon )\) to \(\Theta (d/\varepsilon ^2)\), and why this gap is not an artifact.
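To make the size of that gap concrete, a back-of-the-envelope comparison (our own illustration; constants and logarithmic factors suppressed) with \(d = 10\) and accuracy \(\varepsilon = 0.01\):

```latex
\underbrace{m = \Theta\!\left(\frac{d}{\varepsilon}\right)}_{\text{realizable}}
  \;\approx\; \frac{10}{0.01} = 10^{3} \text{ samples},
\qquad
\underbrace{m = \Theta\!\left(\frac{d}{\varepsilon^{2}}\right)}_{\text{agnostic}}
  \;\approx\; \frac{10}{(0.01)^{2}} = 10^{5} \text{ samples}.
```

Halving the target error doubles the realizable sample size but quadruples the agnostic one.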
Lower bounds and No Free Lunch (Section 3.5): the matching lower bound \(\Omega (d/\varepsilon )\) and the full NFL proof.
Computational interlude (Section 3.6): the information–computation gap.