MWU update step and one-step potential bound #

Given a column c that the row player hits (M r c = true), the MWU update multiplies weights c by 1 - η and leaves all other weights unchanged. The one-step potential bound Φ' ≤ Φ · (1 - η · v) follows from the row player's best-response property composed with the hypothesis that every PMF-valued column mixture admits a pure response with payoff at least v.

source

def ProbabilityTheory.mwuUpdateWeights {C : Type u_1} [Fintype C] {R : Type u_2} (M : R → C → Bool) (η : ℝ) (hη1 : η < 1) (cfg : MWUConfig C) (r : R) :

MWUConfig C

One step of the multiplicative-weights update: columns c with M r c = true have their weights scaled by 1 - η; all other weights are unchanged.

Equations

ProbabilityTheory.mwuUpdateWeights M η hη1 cfg r = { weights := fun (c : C) => cfg.weights c * if M r c = true then 1 - η else 1, weights_pos := ⋯ }

Instances For

source

theorem ProbabilityTheory.best_response_payoff_weights {R : Type u_1} {C : Type u_2} [Fintype R] [Fintype C] [Nonempty C] (M : R → C → Bool) (v : ℝ) (hrow : ∀ (q : FintypePMF C), ∃ (r : R), v ≤ ∑ c : C, q.prob c * if M r c = true then 1 else 0) (cfg : MWUConfig C) :

v * cfg.potential ≤ ∑ c : C, cfg.weights c * if M ⋯.choose c = true then 1 else 0

Best-response payoff in terms of weights: a pure response against the normalized PMF gives expected payoff at least v · Φ when weighed by the un-normalized weight vector.

source

theorem ProbabilityTheory.potential_one_step_bound {R : Type u_1} {C : Type u_2} [Fintype R] [Fintype C] [Nonempty C] (M : R → C → Bool) (η : ℝ) (hη : 0 ≤ η) (hη1 : η < 1) (v : ℝ) (hrow : ∀ (q : FintypePMF C), ∃ (r : R), v ≤ ∑ c : C, q.prob c * if M r c = true then 1 else 0) (cfg : MWUConfig C) :

(mwuUpdateWeights M η hη1 cfg ⋯.choose).potential ≤ cfg.potential * (1 - η * v)

One-step MWU potential bound: after one update, the potential is multiplied by (1 - η · v) or less.

Documentation

ZPM.Probability.Decision.Minimax.MWU.Update

MWU update step and one-step potential bound #