2.1 Atomic vocabulary

Definition 2.1 Domain, Label, Concept

Lean: Concept

A domain \(X\) is a set whose elements are called instances. A label set \(Y\) is a set of possible outputs; in binary classification, \(Y = \{ 0, 1\} \). A concept is a function \(c : X \to Y\).

Equivalently, when \(Y = \{ 0,1\} \), a concept \(c\) can be identified with the subset \(\{ x \in X : c(x) = 1\} \subseteq X\). Both perspectives are used throughout the literature. We adopt the function view as primary and the set view when it simplifies combinatorial arguments (as in shattering, Chapter 3).

Definition 2.2 Hypothesis, Target Concept, Proper vs. Improper

A hypothesis \(h : X \to Y\) is a candidate prediction rule that a learning algorithm might output. The target concept \(c^* \in \mathcal{C}\) is the specific concept that generated the training data. Learning is proper if the algorithm’s output is constrained to lie in \(\mathcal{C}\), and improper if it may use a larger hypothesis space \(\mathcal{H} \supseteq \mathcal{C}\).