Computation and Formal Language Theory #
Computability-theoretic substrate for Gold-style learning theory. Contains:
- Automata (DFA, NFA, PDA, Turing machines) with Mathlib bridges
- Formal languages (regular, context-free, recursively enumerable)
- Computability classes (partial computable, RE sets, Δ₀₂, limiting recursion)
- Information-theoretic primitives (Kolmogorov complexity, MDL, MML, SRM)
- Execution traces for program synthesis
Alphabets and Formal Languages #
Automata #
These are wrappers providing the learning-theoretic interface. Mathlib has DFA/NFA; we bridge to those where possible.
The language accepted by a DFA.
A regular language is a language accepted by some DFA.
Equations
- IsRegularLang Sym L = ∃ (Q : Type) (x : Fintype Q) (x_1 : DecidableEq Q) (M : DFA' Sym Q), M.accepts = L
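As a quick illustration of the definition, the full language (accepting every string) is regular, witnessed by a one-state DFA whose sole state accepts. A minimal sketch, assuming `DFA'` mirrors Mathlib's `DFA` with `step`, `start`, `accept` fields and that `accepts` unfolds by `simp`:

```lean
-- Sketch: the full language over any alphabet is regular, via the
-- one-state DFA that accepts everything. The `DFA'` field names
-- (`step`, `start`, `accept`) are assumed to mirror Mathlib's `DFA`.
example (Sym : Type) : IsRegularLang Sym (fun _ => True) :=
  ⟨Unit, inferInstance, inferInstance,
   { step := fun _ _ => (), start := (), accept := fun _ => True },
   by ext w; simp [DFA'.accepts]⟩
```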
One-step CFG derivation: replace one nonterminal by a production.
Multi-step CFG derivation: reflexive-transitive closure of one-step.
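The closure construction described above can be sketched with Mathlib's `Relation.ReflTransGen`; the surrounding names here are illustrative, only the closure operator is the point:

```lean
-- Sketch: multi-step derivation as the reflexive-transitive closure of
-- the one-step derivation relation, via Mathlib's `Relation.ReflTransGen`.
def MultiStepSketch {Sym : Type} (stepRel : List Sym → List Sym → Prop) :
    List Sym → List Sym → Prop :=
  Relation.ReflTransGen stepRel
```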
A context-free language is a language generated by some CFG.
Equations
- IsContextFree Sym L = ∃ (G : CFG Sym), G.Language = L
Computability Infrastructure #
Critical for Gold-style learning: EX-learnability connects to the arithmetic hierarchy.
A Turing machine (abstract). We use this as an opaque type representing effective computability. Bridges to Mathlib's computability library.
- index : ℕ
Opaque representation; we work with the computability predicate, not the machine.
Gödel numbering: an effective encoding of objects as natural numbers.
Equations
- GodelNumbering α = (α ≃ ℕ)
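Since `GodelNumbering α` unfolds to the equivalence type `α ≃ ℕ`, the identity equivalence is itself a Gödel numbering of ℕ:

```lean
-- ℕ numbers itself: the identity equivalence is a Gödel numbering.
example : GodelNumbering ℕ := Equiv.refl ℕ
```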
Partial computable function. Bridges to Mathlib's Computable / Partrec.
A function f : α →. β is partial computable iff there exists a Nat.Partrec function g
on ℕ that commutes with the encodings: (f a).map encode = g (encode a).
Note: uses Encodable (not Primcodable); the Partrec witness operates on codes.
Equations
- PartialComputable α β = { f : α →. β // ∃ (g : ℕ →. ℕ), Nat.Partrec g ∧ ∀ (a : α), Part.map Encodable.encode (f a) = g (Encodable.encode a) }
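For intuition, the identity on ℕ (viewed as a total partial function) satisfies the definition, with the identity on codes as the `Nat.Partrec` witness. A sketch, assuming `encode` on ℕ is the identity and that the Mathlib lemma names below (`Nat.Partrec.of_primrec`, `Nat.Primrec.id`) are available:

```lean
-- Sketch: the identity on ℕ is partial computable. The witness g is the
-- identity as a partial function; `Nat.Partrec.of_primrec` lifts the
-- primitive-recursive identity (lemma names assumed from Mathlib).
example : PartialComputable ℕ ℕ :=
  ⟨fun n => Part.some n,
   fun n => Part.some n,
   Nat.Partrec.of_primrec Nat.Primrec.id,
   fun a => by simp⟩
```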
Limiting recursion: functions that converge in the limit. f is limiting-recursive if there exists a total computable g such that f(x) = lim_{s→∞} g(x, s). This is the computational substrate of EX-learning.
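The condition in the docstring can be written out concretely for ℕ-valued functions; the names in this sketch are illustrative, not the library's:

```lean
-- Sketch: f is limiting-recursive if a total computable approximation
-- g x s stabilizes to f x as the stage s grows.
def LimitingRecursiveSketch (f : ℕ → ℕ) : Prop :=
  ∃ g : ℕ → ℕ → ℕ,
    (Computable fun p : ℕ × ℕ => g p.1 p.2) ∧
    ∀ x, ∃ s₀, ∀ s, s₀ ≤ s → g x s = f x
```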
Δ₀₂ class: sets whose characteristic function is limiting-recursive. EX-learnable concept classes have Δ₀₂ membership.
Information-Theoretic Primitives #
Kolmogorov complexity: the length of the shortest program that produces x. Defined via Nat.Partrec.Code (Mathlib's effective program codes). K(x) = inf { encode(p) : p.eval 0 = some (encode x) }. If no program produces x, sInf ∅ = 0 by convention (vacuous).
Equations
- KolmogorovComplexity α x = sInf {x_1 : ℕ | ∃ (p : Nat.Partrec.Code) (_ : p.eval 0 = Part.some (Encodable.encode x)), Encodable.encode p = x_1}
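One immediate consequence of the `sInf` definition: any particular program producing x is an upper bound on K(x). A sketch via Mathlib's `Nat.sInf_le` (the binder order on `KolmogorovComplexity` is assumed from the equation above):

```lean
-- Sketch: K(x) ≤ encode p for any program p whose output is encode x.
example {α : Type} [Encodable α] (x : α) (p : Nat.Partrec.Code)
    (hp : p.eval 0 = Part.some (Encodable.encode x)) :
    KolmogorovComplexity α x ≤ Encodable.encode p :=
  Nat.sInf_le ⟨p, hp, rfl⟩
```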
Description length: length of encoding under a specific description language. Unlike Kolmogorov complexity, this is relative to a chosen coding scheme.
Equations
- DescriptionLength α = (α → ℕ)
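Since `DescriptionLength α` is just `α → ℕ`, any concrete length function qualifies; for example, binary length on ℕ:

```lean
-- A concrete description language on ℕ: the binary length of n
-- (`Nat.size` in Mathlib), plus one bit of framing.
example : DescriptionLength ℕ := fun n => Nat.size n + 1
```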
Minimum Description Length principle: choose the hypothesis that minimizes description_length(H) + description_length(data | H). This is a coding-scheme-relative analogue of Kolmogorov-complexity minimization, and corresponds to Bayesian MAP estimation with description lengths playing the role of negative log-probabilities.
Description language for hypotheses
Description language for data given hypothesis
MDL selection: minimize total description length. Uses Classical.epsilon to select an h ∈ H minimizing hypothesisLength h + dataLength h data. Inhabited provides the Nonempty witness for epsilon.
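The selection rule described above, written out as a standalone sketch (the field names `hypothesisLength`/`dataLength` are taken from the docstrings; `Classical.epsilon` requires the `Nonempty` instance that `Inhabited` supplies):

```lean
-- Sketch of MDL selection: pick, via Classical.epsilon, a hypothesis
-- minimizing hypothesis length plus data-given-hypothesis length.
noncomputable def mdlSelectSketch {H D : Type} [Inhabited H]
    (hypothesisLength : H → ℕ) (dataLength : H → D → ℕ) (data : D) : H :=
  Classical.epsilon fun h =>
    ∀ h', hypothesisLength h + dataLength h data ≤
          hypothesisLength h' + dataLength h' data
```

The same `Classical.epsilon` pattern underlies the MML and SRM selectors below.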
Minimum Message Length: Bayesian analog of MDL. MML seeks the hypothesis that minimizes expected message length under a prior and likelihood model.
MML selection: minimize total message length (hypothesisLength + dataFit). Inhabited provides the Nonempty witness for epsilon.
Algorithmic probability: probability of x under the universal distribution. P(x) = Σ_{p : Code, p(0)=encode(x)} 2^{-encode(p)}, with the code number encode(p) standing in for the program length |p|. Related to Kolmogorov complexity by the coding theorem: -log P(x) ≈ K(x). Uses Classical.propDecidable for the halting condition (undecidable in general).
Equations
- AlgorithmicProbability α x = ∑' (p : Nat.Partrec.Code), if p.eval 0 = Part.some (Encodable.encode x) then 2 ^ (-↑(Encodable.encode p)) else 0
Structural Risk Minimization: choose a hypothesis from the complexity level that minimizes a bound on generalization error. Connects VC dimension to inductive bias.
Nested hypothesis classes of increasing complexity
Monotonicity: larger index = richer class
Empirical risk: fraction of training errors
Complexity penalty: regularizer per level
SRM selection: minimize empiricalRisk(h, data) + penalty(n) over (n, h ∈ levels n). Uses Classical.epsilon for the argmin; Inhabited provides the witness.
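The SRM rule can be sketched in the same style as MDL selection, minimizing over pairs of a level index and a hypothesis in that level. Field names and the ℚ-valued risk are assumptions based on the docstrings above:

```lean
-- Sketch of SRM selection: choose (n, h) with h ∈ levels n minimizing
-- empirical risk plus the per-level complexity penalty.
noncomputable def srmSelectSketch {H D : Type} [Inhabited H]
    (levels : ℕ → Set H) (empiricalRisk : H → D → ℚ) (penalty : ℕ → ℚ)
    (data : D) : ℕ × H :=
  Classical.epsilon fun p =>
    p.2 ∈ levels p.1 ∧
    ∀ q : ℕ × H, q.2 ∈ levels q.1 →
      empiricalRisk p.2 data + penalty p.1 ≤
      empiricalRisk q.2 data + penalty q.1
```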
Execution Traces (for process-level concepts) #
An execution trace records the sequence of hypotheses a learner produces.
Equations
- ExecutionTrace X Y = (ℕ → Concept X Y)
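A trivial example: the constant trace of a learner that never changes its conjecture.

```lean
-- The constant trace: the same concept is output at every step.
example {X Y : Type} (c : Concept X Y) : ExecutionTrace X Y := fun _ => c
```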
A partial trace that may not be defined at all time steps.
Equations
- PartialTrace X Y = (ℕ → Option (Concept X Y))