Formal Learning Theory Kernel: Blueprint v1: The Proof Architecture

3.3.1 The Proof Architecture

We have already proved (in Section 3.2):

\[ \ref{F:finvc} \Rightarrow \ref{F:uc} \Rightarrow \ref{F:erm} \Rightarrow \ref{F:pac} \qquad \text{and}\qquad \neg \ref{F:finvc} \Rightarrow \neg \ref{F:pac}. \]

This establishes \(\ref{F:finvc} \Leftrightarrow \ref{F:uc} \Leftrightarrow \ref{F:erm} \Leftrightarrow \ref{F:pac}\).

The remaining directions:

[leftmargin=*,itemsep=4pt]
\(\ref{F:finvc} \Leftrightarrow \ref{F:fingrowth}\): By the Sauer–Shelah lemma (Lemma 3.10), \(\mathrm{VCdim}(\mathcal{H}) = d\) implies \(\Pi _\mathcal {H}(m) \leq (em/d)^d {\lt} 2^m\) for \(m\) large enough. Conversely, if \(\mathrm{VCdim}(\mathcal{H}) = \infty \), then \(\Pi _\mathcal {H}(m) = 2^m\) for all \(m\) (since \(\mathcal{H}\) shatters a set of every finite size). This is immediate from the definition of VC dimension.
\(\ref{F:pac} \Leftrightarrow \ref{F:agpac}\): Agnostic PAC learnability trivially implies realizable PAC learnability (set \(c^* \in \mathcal{H}\), so the best hypothesis has zero risk). The converse uses the uniform convergence property: if \(\mathcal{H}\) has finite VC dimension, then uniform convergence holds, and Proposition 3.14 shows ERM succeeds in the agnostic setting as well (with the \(\varepsilon ^2\) sample complexity discussed in Section 3.4).
\(\ref{F:finvc} \Leftrightarrow \ref{F:compression}\): This is the deepest equivalence. We sketch the argument below.
\(\ref{F:online} \Rightarrow \ref{F:finvc}\): If \(\mathrm{Ldim}(\mathcal{H}) {\lt} \infty \), then \(\mathrm{VCdim}(\mathcal{H}) \leq \mathrm{Ldim}(\mathcal{H}) {\lt} \infty \), since any shattered set gives a complete binary mistake tree of the same depth (textbook Chapter 10, Combinatorial Dimensions).