7.3 The Borel-analytic separation theorem

Proposition 7.11 says that every reasonable concept class satisfies the kernel’s measurability hypothesis via a classical descriptive-set-theory route. What the bridge does not rule out is that the underlying refinement might be cosmetic: perhaps every class that admits any kind of measurability certificate also happens to admit a Borel one, and the WellBehavedVCMeasTarget $\subsetneq $ KrappWirthWellBehaved inclusion is vacuous in content. This section rules that out with a witness.

Krapp and Wirth 2024. Shortly before this formalization effort began, Lothar Sebastian Krapp and Laura Wirth posted Measurability in the Fundamental Theorem of Statistical Learning (arXiv:2410.10243, October 2024). The paper performs exactly the audit this chapter is built around: it scrutinizes the standard proofs of the fundamental theorem of statistical learning in the agnostic setting, identifies the measurability assumptions that are tacitly imposed, and extracts a self-contained proof that makes those assumptions precise. The paper’s measurability hypothesis on the ghost-gap bad event is that it be Borel in the product Borel structure. The refinement in this chapter observes that the Borel hypothesis is strictly stronger than the proof needs, and this section constructs a concept class that realizes the strict gap. The Krapp-Wirth paper is correct: everything stated in it under the Borel hypothesis also holds under the weaker null-measurability hypothesis used here. What the present chapter adds is that the Borel hypothesis is a genuine commitment, not a free choice, and the commitment can be relaxed.

The witness is a family of very thin concept classes on the real line. For each set $A \subseteq \mathbb {R}$, let $\mathbf{1}_{\{ a\} }$ denote the indicator function of the singleton $\{ a\} $, regarded as an element of $\{ 0,1\} ^\mathbb {R}$.

Definition 7.12 Singleton class over a set

Lean: singletonClassOn

For $A \subseteq \mathbb {R}$, define the singleton class over $A$ as

\[ C_A \; =\; \{ \, \mathbf{0}\, \} \, \cup \, \bigl\{ \, \mathbf{1}_{\{ a\} }\, :\, a \in A\, \bigr\} , \]

where $\mathbf{0}$ is the everywhere-false concept ($\mathbf{0}(x) = 0$ for every $x \in \mathbb {R}$).

A singleton class looks unremarkable from the outside. It has VC dimension at most $1$, which makes it PAC learnable under any reasonable measurability hypothesis (one can learn the identity of the non-trivial hypothesis from a single labeled example). Its individual hypotheses are all individually measurable in the strongest possible sense: the zero concept is constant, and each $\mathbf{1}_{\{ a\} }$ is the characteristic function of a Borel singleton.

Lemma 7.13 Every hypothesis in $C_A$ is measurable

Lean: singletonClassOn_measurable

For every $A \subseteq \mathbb {R}$, every hypothesis $h \in C_A$ is Borel measurable.

The surprise is that individual measurability of every hypothesis does not lift to measurability of the ghost-gap bad event for the class as a whole. The bad event is an indexed union over $C_A$, and the index set $A$ can be arbitrarily irregular. The kernel formalizes the reduction from the bad event to a planar witness.

Definition 7.14 Planar witness event

Lean: planarWitnessEvent

For $A \subseteq \mathbb {R}$, the planar witness event is the subset of $\mathbb {R}^2$ given by

\[ W_A \; =\; \bigl\{ \, (x, y) \in \mathbb {R}^2 \; :\; y \in A\ \text{and}\ x \neq y\, \bigr\} . \]

\begin{tikzpicture} [scale=1.0,
axis/.style={-{Stealth[length=2.5mm]}, thick, black},
diag/.style={dashed, thick, notegray},
fiber/.style={thick, thmred},
setA/.style={thick, defblue, fill=defblue!10},
lbl/.style={font=\small}
]
% Axes
\draw[axis] (-0.3, 0) -- (5.2, 0) node[lbl, right] {$x$};
\draw[axis] (0, -0.3) -- (0, 4.5) node[lbl, above] {$y$};

% Diagonal y = x
\draw[diag] (-0.2, -0.2) -- (4.6, 4.6) node[lbl, right, pos=0.9] {$y = x$};

% A as a shaded horizontal strip at y = a1, a2, a3
\foreach \ay in {1.2, 2.3, 3.6} {
\draw[fiber] (-0.05, \ay) -- (5.05, \ay);
}
\node[lbl, text=thmred, anchor=west] at (5.1, 1.2) {$y = a_1$};
\node[lbl, text=thmred, anchor=west] at (5.1, 2.3) {$y = a_2$};
\node[lbl, text=thmred, anchor=west] at (5.1, 3.6) {$y = a_3$};

% Mark the points removed from each fiber: x = y = a_i
\foreach \ay in {1.2, 2.3, 3.6} {
\fill[white, draw=thmred, thick] (\ay, \ay) circle (2.5pt);
}

% Planar witness label
\node[lbl, defblue, anchor=south west] at (0.2, 3.9)
{$W_A = \{(x,y)\,:\,y \in A,\ x \neq y\}$};
\node[lbl, notegray, anchor=south west] at (0.2, 3.5)
{(three sample fibers shown)};
\end{tikzpicture} — Figure 7.1 The planar witness $W_A$ depicted for three values $a_1, a_2, a_3 \in A$. Each fiber $\{ y = a_i\} $ contributes the horizontal line at height $a_i$, punctured at the diagonal point $(a_i, a_i)$. The full set $W_A$ is the union of these punctured horizontal lines over all $y \in A$. When $A$ is analytic, $W_A$ is analytic. When $A$ is not Borel, $W_A$ is not Borel either.

Lemma 7.15 Bad event reduces to planar-witness preimage

Lean: singleton_badEvent_eq_preimage_planar

For every $A \subseteq \mathbb {R}$, the one-step ghost-gap bad event $\mathrm{Bad}(C_A, 1, 1/2)$ on $(\mathrm{Fin}\, 1 \to \mathbb {R}) \times (\mathrm{Fin}\, 1 \to \mathbb {R})$ equals the preimage of the planar witness $W_A$ under the sample-pair projection $(s, s') \mapsto (s(0), s'(0))$.

The reduction replaces a class-indexed existential quantifier with a fixed planar set, trading one measurability problem for another. The remaining problem is to understand the measurability of the planar set.

Theorem 7.16 $W_A$ is analytic for analytic $A$

Lean: planarWitnessEvent_analytic

If $A \subseteq \mathbb {R}$ is analytic then $W_A \subseteq \mathbb {R}^2$ is analytic.

Proof ▼

The set $W_A$ is the intersection of the Borel set $\{ (x, y) : x \neq y\} $ with the cylinder $\mathbb {R}\times A$. Analytic sets are closed under intersection with Borel sets, so $W_A$ is analytic.

Theorem 7.17 $W_A$ is not Borel for some analytic non-Borel $A$

Lean: planarWitnessEvent_not_measurable

There exists an analytic set $A \subseteq \mathbb {R}$ such that $W_A$ is analytic but not Borel. The classical construction takes any complete analytic subset of $\mathbb {R}$, for example the set of codes of ill-founded trees on $\mathbb {N}$ under a standard encoding; this set is analytic and not Borel. The projection $A \mapsto W_A$ preserves the non-Borel property in the following sense: if $W_A$ were Borel, then the fiber of $W_A$ over any $x_0 \notin A$ would be Borel, but that fiber is exactly $A$, a contradiction.

With $C_A$ defined, the planar reduction established, and the planar witness classified by descriptive set theory, the main separation theorem assembles in a single line of logic.

Borel-analytic separation (Corollary 7.18).

Statement. There exists a concept class $C \subseteq \{ 0,1\} ^\mathbb {R}$ such that $C$ carries WellBehavedVCMeasTarget but $C$ does not carry KrappWirthWellBehaved.

Witness. Let $A \subseteq \mathbb {R}$ be any analytic non-Borel set, for example the set of codes of ill-founded trees as in Theorem 7.17. Take $C = C_A$, the singleton class over $A$.

Why the witness works. Every hypothesis in $C_A$ is individually Borel measurable (Lemma 7.13), so the class has the individual regularity one might expect. The ghost-gap bad event coincides, by Lemma 7.15, with the preimage of $W_A$ under the sample-pair projection. The planar witness $W_A$ is analytic by Theorem 7.16 and is not Borel by Theorem 7.17. The sample-pair projection is a homeomorphism onto its image, so the bad event is itself analytic and not Borel. Analytic sets are null-measurable by Corollary 7.10, giving WellBehavedVCMeasTarget. Borel sets are not: the bad event fails KrappWirthWellBehaved by the non-Borel witness.

What the witness exploits. Individual measurability of the hypotheses in a class does not lift to measurability of the class-indexed bad event, because the bad event is an uncountable union whose indexing set can be as wild as the analytic hierarchy permits. The Borel $\sigma $-algebra is stable under countable unions but not under analytic-indexed unions; the completion under any Borel measure is exactly the extra layer needed to accommodate such unions. The singleton class is the minimal construction that exposes this gap: it has VC dimension $1$, a single non-trivial hypothesis per index, and a planar bad event whose irregularity is inherited entirely from $A$.

[ , [

Corollary 7.18 Borel-analytic separation

Lean: singleton_badEvent_not_measurable

There exists a concept class $C_A \subseteq \{ 0,1\} ^\mathbb {R}$ such that $C_A$ satisfies WellBehavedVCMeasTarget but fails KrappWirthWellBehaved. Consequently the fundamental theorem of statistical learning (Theorem 3.7) holds for $C_A$ under the kernel’s NullMeasurableSet refinement and does not hold for $C_A$ under the Krapp-Wirth Borel hypothesis.

Remark 7.19 What the refinement changes and what it does not

The refinement changes the shape of the measurability hypothesis used in the fundamental theorem from a set-theoretic constraint on the class (Borel bad event) to a measure-theoretic constraint (null-measurable bad event). For every class that admits a Borel parameterization, both hypotheses hold and the two versions of the theorem are interchangeable. For classes outside Borel parameterization, only the null-measurability version applies. The Krapp-Wirth paper and this kernel therefore prove the same theorem on the same universe of practically interesting classes; the difference is at the edge of the universe, where concept classes over analytic non-Borel sets live. That edge is sparse in practice but non-empty, and Corollary 7.18 exhibits it in full.

The chain of five theorems formalized in FLT_Proofs.Theorem.BorelAnalyticSeparation thus completes the chapter. Every step is sorry-free, cross-linked with FLT_Proofs.Complexity.BorelAnalyticBridge on the analytic measurability side, and cited from FLT_Proofs.Complexity.Measurability where the WellBehavedVCMeasTarget typeclass lives.