Documentation

FLT_Proofs.Computation

Computation and Formal Language Theory #

Computability-theoretic substrate for Gold-style learning theory. Contains:

Alphabets and Formal Languages #

class Alphabet (Sym : Type u_1) :
Type u_1

An alphabet for formal language theory. Finite type with decidable equality.

Instances
    @[reducible, inline]
    abbrev Word (Sym : Type u_1) :
    Type u_1

    A word over an alphabet.

    Equations
    Instances For
      def FormalLanguage (Sym : Type u_1) :
      Type u_1

      A formal language: a set of words.

      Equations
      Instances For

        Automata #

        These are wrappers providing the learning-theoretic interface. Mathlib has DFA/NFA; we bridge to those where possible.

        structure DFA' (Sym : Type u_1) (Q : Type u_2) [Fintype Q] [DecidableEq Q] :
        Type (max u_1 u_2)

        Deterministic finite automaton. Bridge to Mathlib's DFA.

        • step : QSymQ

          Transition function

        • start : Q

          Start state

        • accept : Set Q

          Accept states

        Instances For
          def DFA'.accepts {Sym : Type u_1} {Q : Type u_2} [Fintype Q] [DecidableEq Q] (M : DFA' Sym Q) :

          The language accepted by a DFA.

          Equations
          Instances For
            structure NFA' (Sym : Type u_1) (Q : Type u_2) [Fintype Q] :
            Type (max u_1 u_2)

            Nondeterministic finite automaton.

            • step : QSymSet Q
            • start : Set Q
            • accept : Set Q
            Instances For
              def IsRegularLang (Sym : Type u_1) (L : FormalLanguage Sym) :

              A regular language is a language accepted by some DFA.

              Equations
              Instances For
                structure CFG (Sym : Type u_1) :
                Type (max 1 u_1)

                Context-free grammar.

                • Variable : Type

                  Nonterminal symbols

                • productions : self.VariableSet (List (self.Variable Sym))

                  Production rules: Variable → list of (Variable ⊕ Terminal)

                • start : self.Variable

                  Start symbol

                Instances For
                  def CFG.DerivesStep {Sym : Type u_1} (G : CFG Sym) (sf1 sf2 : List (G.Variable Sym)) :

                  One-step CFG derivation: replace one nonterminal by a production.

                  Equations
                  Instances For
                    def CFG.Derives {Sym : Type u_1} (G : CFG Sym) :
                    List (G.Variable Sym)List (G.Variable Sym)Prop

                    Multi-step CFG derivation: reflexive-transitive closure of one-step.

                    Equations
                    Instances For
                      def CFG.Language {Sym : Type u_1} (G : CFG Sym) :

                      The language generated by a CFG: all terminal strings derivable from start.

                      Equations
                      Instances For
                        def IsContextFree (Sym : Type u_1) (L : FormalLanguage Sym) :

                        A context-free language is a language generated by some CFG.

                        Equations
                        Instances For
                          structure PDA (Sym : Type u_1) (Q : Type u_2) (Γ : Type u_3) :
                          Type (max (max u_1 u_2) u_3)

                          Pushdown automaton - extends DFA with a stack.

                          • step : QOption SymΓSet (Q × List Γ)
                          • start : Q
                          • startStack : Γ
                          • accept : Set Q
                          Instances For

                            Computability Infrastructure #

                            Critical for Gold-style learning: EX-learnability connects to the arithmetic hierarchy.

                            structure TuringMachine :

                            A Turing machine (abstract). We use this as an opaque type representing effective computability. Bridges to Mathlib's computability library.

                            • index :

                              Opaque representation; we work with the computability predicate, not the machine.

                            Instances For
                              def GodelNumbering (α : Type u_1) :
                              Type u_1

                              Gödel numbering: an effective encoding of objects as natural numbers.

                              Equations
                              Instances For
                                def PartialComputable (α : Type u_1) (β : Type u_2) [Encodable α] [Encodable β] :
                                Type (max 0 u_1 u_2)

                                Partial computable function. Bridges to Mathlib's Computable / Partrec. A function f : α →. β is partial computable iff there exists a Nat.Partrec function g on ℕ that commutes with the encodings: (f a).map encode = g (encode a). Note: uses Encodable (not Primcodable); the Partrec witness operates on codes.

                                Equations
                                Instances For
                                  def RESet (α : Type u_1) [Encodable α] :
                                  Type u_1

                                  Recursively enumerable set.

                                  Equations
                                  Instances For
                                    def LimitingRecursive (α : Type u_1) (β : Type u_2) [Encodable α] [Encodable β] :
                                    Type (max 0 u_1 u_2)

                                    Limiting recursion: functions that converge in the limit. f is limiting-recursive if there exists a total computable g such that f(x) = lim_{s→∞} g(x, s). This is the computational substrate of EX-learning.

                                    Equations
                                    Instances For
                                      def Delta02Class (α : Type u_1) [Encodable α] :
                                      Type u_1

                                      Δ₀₂ class: sets whose characteristic function is limiting-recursive. EX-learnable concept classes have Δ₀₂ membership.

                                      Equations
                                      Instances For

                                        Information-Theoretic Primitives #

                                        noncomputable def KolmogorovComplexity (α : Type u_1) [Encodable α] (x : α) :

                                        Kolmogorov complexity: the length of the shortest program that produces x. Defined via Nat.Partrec.Code (Mathlib's effective program codes). K(x) = inf { encode(p) : p.eval 0 = some (encode x) }. If no program produces x, sInf ∅ = 0 by convention (vacuous).

                                        Equations
                                        Instances For
                                          def DescriptionLength (α : Type u_1) :
                                          Type u_1

                                          Description length: length of encoding under a specific description language. Unlike Kolmogorov complexity, this is relative to a chosen coding scheme.

                                          Equations
                                          Instances For
                                            structure MDLPrinciple (X : Type u) (Y : Type v) [Inhabited (Concept X Y)] :
                                            Type (max u v)

                                            Minimum Description Length principle: choose the hypothesis that minimizes description_length(H) + description_length(data | H). Analogous to Kolmogorov complexity. Analogous to Bayesian MAP.

                                            • hypothesisLength : Concept X Y

                                              Description language for hypotheses

                                            • dataLength : Concept X YList (X × Y)

                                              Description language for data given hypothesis

                                            • select (data : List (X × Y)) (H : Set (Concept X Y)) : Concept X Y

                                              MDL selection: minimize total description length. Uses Classical.epsilon to select an h ∈ H minimizing hypothesisLength h + dataLength h data. Inhabited provides the Nonempty witness for epsilon.

                                            Instances For
                                              structure MMLPrinciple (X : Type u) (Y : Type v) [Inhabited (Concept X Y)] :
                                              Type (max u v)

                                              Minimum Message Length: Bayesian analog of MDL. MML seeks the hypothesis that minimizes expected message length under a prior and likelihood model.

                                              • hypothesisLength : Concept X Y
                                              • dataFit : Concept X YList (X × Y)
                                              • select (data : List (X × Y)) (H : Set (Concept X Y)) : Concept X Y

                                                MML selection: minimize total message length (hypothesisLength + dataFit). Inhabited provides the Nonempty witness for epsilon.

                                              Instances For
                                                noncomputable def AlgorithmicProbability (α : Type u_1) [Encodable α] (x : α) :

                                                Algorithmic probability: probability of x under the universal distribution. P(x) = Σ_{p : Code, p(0)=encode(x)} 2^{-|p|}. Related to Kolmogorov complexity by the coding theorem: -log P(x) ≈ K(x). Uses Classical.propDecidable for the halting condition (undecidable in general).

                                                Equations
                                                Instances For
                                                  structure SRM (X : Type u) (Y : Type v) [DecidableEq Y] [Inhabited (Concept X Y)] :
                                                  Type (max u v)

                                                  Structural Risk Minimization: choose hypothesis from the complexity level that minimizes bound on generalization error. Connects VC dimension to inductive bias.

                                                  • levels : Set (Concept X Y)

                                                    Nested hypothesis classes of increasing complexity

                                                  • mono (n : ) : self.levels n self.levels (n + 1)

                                                    Monotonicity: larger index = richer class

                                                  • empiricalRisk : Concept X YList (X × Y)

                                                    Empirical risk: fraction of training errors

                                                  • penalty :

                                                    Complexity penalty: regularizer per level

                                                  • select (data : List (X × Y)) : × Concept X Y

                                                    SRM selection: minimize empiricalRisk(h, data) + penalty(n) over (n, h ∈ levels n). Uses Classical.epsilon for the argmin; Inhabited provides the witness.

                                                  Instances For

                                                    Execution Traces (for process-level concepts) #

                                                    def ExecutionTrace (X : Type u) (Y : Type v) :
                                                    Type (max v u)

                                                    An execution trace records the sequence of hypotheses a learner produces.

                                                    Equations
                                                    Instances For
                                                      def PartialTrace (X : Type u) (Y : Type v) :
                                                      Type (max v u)

                                                      A partial trace that may not be defined at all time steps.

                                                      Equations
                                                      Instances For