Randomized program

From Glossary

Jump to: navigation, search

A randomized policy is a distribution over the policy set, LaTeX: X, that describes a policy as a random variable. This is not restricted to stochastic programs. The randomized form is called a randomized program, and it is the following semi-infinite program in response space:

LaTeX: 
\max \sum_{x} w(x) f(x) : w \in W, \sum_x w(x)g(x) \le 0, \sum_x w(x)h(x) = 0,

where LaTeX: \textstyle W = \left \{w: X \to [0, 1]: w(x) > 0 for finitely many LaTeX: x \in X, and LaTeX: \textstyle \sum_x w(x) = 1.

In this form, the randomized program is an ordinary linear program if LaTeX: X is finite. More generally, the definition of LaTeX: W renders the summations well defined. One could interpret LaTeX: w as a randomized policy: use LaTeX: x with probability LaTeX: w(x). A pure strategy is when LaTeX: w(x^*) = 1 for some LaTeX: x^* (so LaTeX: w(x)=0 for LaTeX: \textstyle x \ne x^*); otherwise, LaTeX: w is called a mixed strategy.

One key fact is that the solutions to the original mathematical program (in standard form) correspond to pure strategies in this randomized form. Further, a key to the Lagrangian Strong Duality Theorem is that every mixed strategy is dominated by a pure strategy. Moreover, this underlies the Generalized Lagrange Multiplier method, and there is no loss in optimality to restrict mixed strategies to satisfy the Haar condition: LaTeX: \textstyle \left | \{ x: w(x) > 0 \} \right | \le m+1.


Views
Personal tools