Documentation

FLT_Proofs.Basic

Basic Types for Formal Learning Theory #

Foundational vocabulary: domains, labels, concepts, concept classes, hypothesis spaces, inductive bias, version spaces, and loss functions. These are the base types that all learners, criteria, and complexity measures are defined in terms of.

Alternative definitions for ConceptClass and InductiveBias are provided as commented-out variants for different proof contexts (decidable, RE, measurable, multiclass, Bayesian).

Atomic Types #

def Concept (X : Type u) (Y : Type v) :
Type (max u v)

A concept is a function from the domain to the label space. This is the atomic unit that concept classes collect and learners try to approximate.

Equations
Instances For
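
As a concrete illustration, assuming `Concept X Y` unfolds to the function type `X → Y` (consistent with the universe signature above), a simple threshold concept on ℕ can be written directly; the name `thresholdConcept` is illustrative, not part of the library:

```lean
-- Sketch, assuming `Concept X Y` is definitionally `X → Y`.
-- A threshold concept: inputs strictly below `t` are labeled `true`.
def thresholdConcept (t : ℕ) : Concept ℕ Bool :=
  fun x => decide (x < t)

#eval thresholdConcept 5 3  -- true
#eval thresholdConcept 5 7  -- false
```
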
    @[reducible, inline]
    abbrev ConceptClass (X : Type u) (Y : Type v) :
    Type (max v u)

    A concept class is a set of concepts. Used by every paradigm, complexity measure, and criterion.

    Primary definition: Set of functions. Used for PAC/agnostic PAC where concept classes are sets over which VC dimension, Rademacher complexity, covering numbers, etc. are measured. Alternative definitions below for contexts requiring decidability, enumerability, or measurability.

    Equations
    Instances For
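
Under the "set of functions" reading above, a concept class is just a `Set` of concepts. A hedged sketch, assuming `ConceptClass X Y := Set (Concept X Y)` with `Concept X Y := X → Y` and Mathlib's set-builder notation in scope (`thresholdClass` is an illustrative name):

```lean
-- The class of all threshold concepts over ℕ: one concept per threshold `t`.
def thresholdClass : ConceptClass ℕ Bool :=
  { c | ∃ t : ℕ, ∀ x, c x = decide (x < t) }
```

This is the kind of class over which VC dimension and related complexity measures would be computed.
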

      Every concept in C is a measurable function. Krapp-Wirth precondition: Γ(h) ∈ Σ_Z for all h ∈ H.

      Instances
        @[reducible, inline]
        abbrev HypothesisSpace (X : Type u) (Y : Type v) :
        Type (max v u)

        A hypothesis space is a set of candidate concepts that the learner searches over. When H = C (the realizable case), every concept in the target class is available. When H ⊂ C or H ⊃ C, we are in the improper/agnostic regime.

        Structurally identical to ConceptClass but semantically distinct: ConceptClass is the ground truth collection; HypothesisSpace is what the learner has access to.

        Equations
        Instances For
          @[reducible, inline]
          abbrev Hypothesis (X : Type u) (Y : Type v) :
          Type (max u v)

A single hypothesis - an element of the hypothesis space. Just a Concept by another name, but the semantic distinction matters: a hypothesis is the learner's GUESS; a concept is ground truth.

          Equations
          Instances For
            def TargetConcept (X : Type u) (Y : Type v) (C : ConceptClass X Y) :
            Type (max 0 u v)

            The target concept is the specific concept the learner is trying to identify/approximate. It's an element of the concept class.

            Equations
            Instances For
              def IsProper (X : Type u) (Y : Type v) (C : ConceptClass X Y) (H : HypothesisSpace X Y) :
              Prop

              The proper learning flag: whether the learner's hypothesis space equals the concept class. Proper: H = C. Improper: H ⊃ C (learner can output hypotheses outside C). This distinction matters for computational hardness results (proper_improper_separation).

              Equations
              Instances For
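
A minimal sketch of the likely definition, assuming concept classes and hypothesis spaces are plain `Set`s of functions as described above (`IsProper'` is a hypothetical standalone name, not the library's):

```lean
-- Proper learning: the hypothesis space coincides with the concept class.
def IsProper' {X Y : Type} (C H : Set (X → Y)) : Prop :=
  H = C

-- Any class is trivially proper with respect to itself.
example {X Y : Type} (C : Set (X → Y)) : IsProper' C C := rfl
```
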

                Structured Types #

                structure InductiveBias (X : Type u) (Y : Type v) :
                Type (max u v)

                Inductive bias: a preference ordering or scoring over hypotheses. This is NOT just a set; it bundles a hypothesis space with a way to RANK hypotheses. In Bayesian learning: the prior distribution. In SRM: the complexity hierarchy. In NFL theorem: the object whose absence makes learning impossible.

                • hypotheses : HypothesisSpace X Y

                  The hypothesis space the bias operates over

                • preference : Concept X Y → ℝ

                  Preference score: lower is more preferred. Could be complexity, prior probability, etc.

                • not_preferred (h : Concept X Y) : h ∉ self.hypotheses → ∀ h' ∈ self.hypotheses, self.preference h' ≤ self.preference h

                  Hypotheses outside the space are not preferred over those inside: everything outside H is penalized at least as much as anything inside H. This works for both MDL (description length) and Bayesian (-log prior) readings.

                Instances For
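
A degenerate "no preference" instance can illustrate how the fields fit together. This is a hedged sketch assuming the field signatures shown on this page: `hypotheses : HypothesisSpace X Y`, a real-valued `preference` score, and `not_preferred` stating that anything outside `hypotheses` scores at least as high as everything inside it. With `hypotheses := Set.univ`, the premise `h ∉ Set.univ` is contradictory, so the obligation holds vacuously:

```lean
-- Hypothetical instance: every hypothesis admitted, all scored equally.
def noBias (X Y : Type) : InductiveBias X Y where
  hypotheses := Set.univ
  preference := fun _ => 0
  -- `h ∉ Set.univ` is impossible, so the conclusion follows by contradiction.
  not_preferred := fun h hh => absurd (Set.mem_univ h) hh
```

A genuine MDL or Bayesian bias would replace `preference` with a description length or a `-log prior`, respectively.
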
                  structure VersionSpace (X : Type u) (Y : Type v) :
                  Type (max u v)

                  Version space: the set of hypotheses consistent with all data seen so far. Central to Gold-style learning; the learner narrows the version space as data arrives, and convergence = version space shrinks to the target.

                  Defined using HypothesisSpace and a consistency predicate over data.

                  • hypotheses : HypothesisSpace X Y

                    The full hypothesis space

                  • data : List (X × Y)

                    Data seen so far: list of (input, label) pairs

                  • consistent : Set (Concept X Y)

                    The consistent subset - hypotheses that agree with all data

                  • consistent_sub : self.consistent ⊆ self.hypotheses

                    Consistency condition: every consistent hypothesis is in H and agrees with data

                  • consistent_agrees (h : Concept X Y) : h ∈ self.consistent → ∀ p ∈ self.data, h p.1 = p.2
                  Instances For
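
A sketch of the initial version space, before any data arrives, assuming the field signatures shown above and Mathlib's `subset_rfl` (`initialVS` is an illustrative name). With `data := []` the agreement condition is vacuous, so every hypothesis in `H` is consistent:

```lean
def initialVS {X Y : Type} (H : Set (X → Y)) : VersionSpace X Y where
  hypotheses := H
  data := []
  consistent := H
  consistent_sub := subset_rfl
  -- No pair `p` can be a member of `[]`, so the goal is discharged vacuously.
  consistent_agrees := fun _h _mem _p hp => nomatch hp
```

Learning then proceeds by appending observations to `data` and shrinking `consistent` accordingly.
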

                    Realizability Flags #

                    def Realizable (X : Type u) (Y : Type v) (C : ConceptClass X Y) (H : HypothesisSpace X Y) :
                    Prop

                    Realizability assumption: the target concept is in the hypothesis space. This is a Prop because it's a condition on the learning setup, not a type.

                    Equations
                    Instances For
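
One plausible formalization of the flag described above, assuming `Set`-valued classes: whichever target is drawn from `C`, the learner's space `H` contains it (`Realizable'` is a hypothetical name, not the library's):

```lean
def Realizable' {X Y : Type} (C H : Set (X → Y)) : Prop :=
  C ⊆ H
```
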
                      def Agnostic (X : Type u) (Y : Type v) (_C : ConceptClass X Y) (_H : HypothesisSpace X Y) :
                      Prop

                      Agnostic setting: no assumption that target ∈ H. The learner competes against the best hypothesis in H.

                      Equations
                      Instances For

                        Loss Functions #

                        Loss functions bridge the domain/label types to ℝ for measuring error. They are universal across paradigms (PAC uses expected loss, online uses cumulative loss).

                        def LossFunction (Y : Type v) :
                        Type v

                        A loss function measures the discrepancy between a prediction and true label.

                        Equations
                        Instances For
                          noncomputable def zeroOneLoss (Y : Type v) [DecidableEq Y] :
                          LossFunction Y

                          The 0-1 loss for classification.

                          Equations
                          Instances For
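
A sketch consistent with the description above, assuming `LossFunction Y` unfolds to `Y → Y → ℝ` with the prediction first and the true label second (`zeroOneLoss'` is an illustrative standalone name):

```lean
-- 0-1 loss: no penalty on a correct prediction, unit penalty otherwise.
noncomputable def zeroOneLoss' {Y : Type} [DecidableEq Y] : Y → Y → ℝ :=
  fun yPred yTrue => if yPred = yTrue then 0 else 1
```

The `DecidableEq` instance is what makes the equality test in the `if` well-defined.
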
                            noncomputable def squaredLoss :
                            LossFunction ℝ

                            Squared loss for regression (Y = ℝ).

                            Equations
                            Instances For
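
Under the same assumed `LossFunction` shape (`Y → Y → ℝ`), the regression loss on `Y = ℝ` can be sketched as (`squaredLoss'` is an illustrative name):

```lean
-- Squared loss: penalty grows quadratically in the prediction error.
noncomputable def squaredLoss' : ℝ → ℝ → ℝ :=
  fun yPred yTrue => (yPred - yTrue) ^ 2
```
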