# Randomized program

A randomized policy is a distribution over the policy set, $LaTeX: X,$ that describes a policy as a random variable. This is not restricted to stochastic programs. The randomized form is called a randomized program, and it is the following semi-infinite program in response space:

$LaTeX: \max \sum_{x} w(x) f(x) : w \in W, \sum_x w(x)g(x) \le 0, \sum_x w(x)h(x) = 0,$

where $LaTeX: \textstyle W = \{w: X \to [0, 1]: w(x) > 0$ for finitely many $LaTeX: x \in X$ and $LaTeX: \textstyle \sum_x w(x) = 1\}.$

In this form, the randomized program is an ordinary linear program if $LaTeX: X$ is finite. More generally, the definition of $LaTeX: W$ renders the summations well defined. One could interpret $LaTeX: w$ as a randomized policy: use $LaTeX: x$ with probability $LaTeX: w(x).$ The strategy is pure if $LaTeX: w(x^*) = 1$ for some $LaTeX: x^*$ (so $LaTeX: w(x)=0$ for $LaTeX: \textstyle x \ne x^*$); otherwise, $LaTeX: w$ is called a mixed strategy.

One key fact is that the solutions to the original mathematical program (in standard form) correspond to pure strategies in this randomized form. Further, a key to the Lagrangian Strong Duality Theorem is that every mixed strategy is dominated by a pure strategy. Moreover, this underlies the Generalized Lagrange Multiplier method, and there is no loss in optimality in restricting mixed strategies to those satisfying the Haar condition: $LaTeX: \textstyle \left | \{ x: w(x) > 0 \} \right | \le m+1.$
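
For a finite policy set without binding constraints, the dominance of any mixed strategy by a pure strategy can be checked directly. The following sketch uses hypothetical data for $LaTeX: X$, $LaTeX: f$, and $LaTeX: w$:

```python
# Hypothetical data: a finite policy set X with objective values f(x).
X = [0, 1, 2, 3]
f = {0: 1.0, 1: 4.0, 2: 2.5, 3: 3.0}

# A mixed strategy w (weights sum to 1).
w = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}

mixed_value = sum(w[x] * f[x] for x in X)   # sum_x w(x) f(x)
best_pure = max(f[x] for x in X)            # value of the pure strategy w(x*) = 1

assert abs(sum(w.values()) - 1.0) < 1e-12   # w is a distribution over X
assert mixed_value <= best_pure             # the mixed strategy is dominated
```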

# Range constraint

A constraint of the form $LaTeX: \textstyle a \le g(x) \le b.$

# Rank-one correction

This is an update to an approximation of a matrix, notably to the Hessian inverse, by adding a matrix of rank 1 to it. This generally takes the form: $LaTeX: \textstyle H' = H + vv^T,$ where $LaTeX: v$ is a nonzero column vector (so $LaTeX: vv^T$ is a rank 1 matrix).

# Rank-two correction

This is an update to an approximation of a matrix, notably to the Hessian inverse, by adding a matrix of rank 2 to it. This generally takes the form of two rank-one updates: $LaTeX: \textstyle H' = H + vv^T + uu^T,$ where $LaTeX: v$ and $LaTeX: u$ are linearly independent column vectors (so $LaTeX: \textstyle [vv^T + uu^T]$ is a rank 2 matrix).

See convergence.
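
The rank claims in the two entries above can be verified numerically. The vectors below are arbitrary, and the rank routine is a small Gaussian-elimination sketch, not a library call:

```python
def rank(M, tol=1e-9):
    """Rank of a small dense matrix via Gaussian elimination."""
    M = [row[:] for row in M]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if abs(M[i][col]) > tol), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and abs(M[i][col]) > tol:
                factor = M[i][col] / M[r][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def outer(a, b):
    return [[ai * bj for bj in b] for ai in a]

v, u = [1.0, 2.0, 3.0], [0.0, 1.0, -1.0]   # linearly independent vectors
vvT = outer(v, v)
assert rank(vvT) == 1                       # rank-one correction term
rank2 = [[vvT[i][j] + outer(u, u)[i][j] for j in range(3)] for i in range(3)]
assert rank(rank2) == 2                     # rank-two correction term
```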

# Rates of substitution

Arises when equations are split into dependent (or basic) variables and independent (or nonbasic) variables. In linear programming, we rewrite $LaTeX: Ax=b$ as $LaTeX: Bu + Nv = b,$ where $LaTeX: u$ is the vector of basic variables and $LaTeX: v$ is the vector of nonbasics. Then, the original equations are equivalent to $LaTeX: \textstyle u = b' + Mv,$ where $LaTeX: \textstyle b' = B^{-1} b$ and $LaTeX: \textstyle M = -B^{-1}N.$ This implies that the rate of substitution between $LaTeX: u_i$ and $LaTeX: v_j$ is $LaTeX: M_{(i, j)}$ because it describes the marginal rate at which $LaTeX: u_i$ must change in response to a change in $LaTeX: v_j$ to maintain the equations with all other $LaTeX: v$'s held fixed.
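
A small worked instance (with hypothetical data) illustrates the computation of $LaTeX: b'$ and the rates of substitution $LaTeX: M$:

```python
# Hypothetical data: Ax = b with A = [B | N], basic variables u, nonbasics v.
B = [[2.0, 1.0], [1.0, 3.0]]
N = [[1.0, 0.0], [0.0, 1.0]]
b = [5.0, 6.0]

# Inverse of the 2x2 basis matrix B.
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
Binv = [[ B[1][1] / det, -B[0][1] / det],
        [-B[1][0] / det,  B[0][0] / det]]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

bprime = matvec(Binv, b)                                  # b' = B^{-1} b
M = [[-sum(Binv[i][k] * N[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]                                   # M = -B^{-1} N

# Check: for any v, u = b' + Mv satisfies Bu + Nv = b.
v = [1.0, 1.0]
u = [bprime[i] + sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]
lhs = [matvec(B, u)[i] + matvec(N, v)[i] for i in range(2)]
assert all(abs(lhs[i] - b[i]) < 1e-9 for i in range(2))
```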

# Ray

A translated half-line: $LaTeX: \textstyle \left \{x + t\, h : t \ge 0 \right \},$ where $LaTeX: h$ is a recession direction. We call $LaTeX: x$ the root, and we say the ray is rooted at $LaTeX: x.$ (Also see extreme ray.)

# Recession cone

(also called characteristic cone) of a given set, $LaTeX: S.$ The set of recession directions for

$LaTeX: \textstyle S: \left \{h: \left \{x+th: t \ge 0 \right \} \subseteq S \mbox{ for some } x \in S \right \}.$

For example, let $LaTeX: rc(S)$ denote the recession cone of $LaTeX: S.$ Then, $LaTeX: \textstyle rc(S)=S$ if $LaTeX: \textstyle S = \mathbb{R}^n$ or if $LaTeX: S=\left \{0 \right \}.$ If

$LaTeX: \textstyle S = \left \{x: Ax \le b \right \},$ then $LaTeX: \textstyle rc(S) = \left \{h: Ah \le 0 \right \}.$
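
The polyhedral case can be checked numerically; the data below are hypothetical:

```python
# S = {x : Ax <= b}; h is a recession direction exactly when Ah <= 0.
A = [[1.0, -1.0], [-1.0, 0.0]]
b = [2.0, 0.0]

def in_S(x):
    return all(sum(A[i][j] * x[j] for j in range(2)) <= b[i] + 1e-9
               for i in range(2))

x = [1.0, 0.0]   # a feasible point
h = [1.0, 1.0]   # Ah = [0, -1] <= 0, so h is in rc(S)

assert in_S(x)
assert all(sum(A[i][j] * h[j] for j in range(2)) <= 0 for i in range(2))
# The entire ray rooted at x stays in S, even for very large t.
assert all(in_S([x[j] + t * h[j] for j in range(2)]) for t in [0.0, 1.0, 10.0, 1e6])
```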

# Recession direction

At $LaTeX: \textstyle x \in S,$ this is a nonzero vector, $LaTeX: h,$ for which the associated ray is contained in $LaTeX: S.$ If $LaTeX: h$ is a recession direction for some $LaTeX: \textstyle x \in S,$ it is called a recession direction for $LaTeX: S.$

# Reduce

Same as presolve.

# Reduced cost

In linear programming, it is the sum of the direct cost $LaTeX: (c_j)$ of a nonbasic variable $LaTeX: (j)$ and the indirect costs from induced changes in the levels of the basic variables (to satisfy the equations). For a dual price vector, $LaTeX: p,$ the reduced cost vector is $LaTeX: c - pA.$
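
A small sketch with hypothetical LP data, computing the reduced cost vector from the dual prices $LaTeX: p = c_B B^{-1}$ of a basis:

```python
# Hypothetical LP data: max cx, Ax = b (slacks in columns 2 and 3).
c = [3.0, 5.0, 0.0, 0.0]
A = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 2.0, 0.0, 1.0]]

# Basis = columns 0 and 1, so B = [[1,0],[0,2]] and B^{-1} = [[1,0],[0,0.5]].
Binv = [[1.0, 0.0], [0.0, 0.5]]
cB = [c[0], c[1]]
p = [sum(cB[k] * Binv[k][i] for k in range(2)) for i in range(2)]  # p = c_B B^{-1}

# Reduced costs d = c - pA.
d = [c[j] - sum(p[i] * A[i][j] for i in range(2)) for j in range(4)]
assert d[0] == 0.0 and d[1] == 0.0   # basic variables have zero reduced cost
# d = [0.0, 0.0, -3.0, -2.5]: the slacks' reduced costs are the negated dual prices
```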

# Reduced gradient method

This applies to the case of linear constraints: $LaTeX: \textstyle \max \left \{f(x): Ax = b, x \ge 0 \right \},$ where $LaTeX: \textstyle \mbox{rank}(A) = m$ and $LaTeX: f \in C^2.$ The variables are partitioned into $LaTeX: x = (v,w),$ with the corresponding partition of $LaTeX: \textstyle A = \begin{bmatrix} B & C \end{bmatrix},$ such that the mathematical program is equivalent to:

$LaTeX: \max \left \{ f(v, w): Bv + Cw = b, (v, w) \ge 0 \right \}.$

The method assumes $LaTeX: B$ is nonsingular (i.e., only variables with linearly independent columns in $LaTeX: A$ can be grouped), and the nondegeneracy assumption: $LaTeX: v > 0.$ Now $LaTeX: dw$ is the independent direction of change, and $LaTeX: \textstyle dv = -B^{-1}C \, dw,$ thus keeping $LaTeX: \textstyle x + dx = (v+dv, w + dw)$ on the hyperplanes -- i.e., $LaTeX: \textstyle A[x + dx] = Ax + A[dx] = b + 0 = b.$

The method considers a direction for the independent variables, which then determines the direction for the dependent variables. In particular, $LaTeX: dw$ is chosen by evaluating the gradient of the objective as a function of $LaTeX: w$ alone: $LaTeX: \textstyle f(B^{-1}(b-Cw), w).$ This gradient (with respect to $LaTeX: w$) is called the reduced gradient:

$LaTeX: r = \nabla_w f(x) - \nabla_v f(x) \, B^{-1} C$

at $LaTeX: x = (v, w).$ Then, setting $LaTeX: dw = r$ completes the first part of the iteration. The second part is to select a step size, which can be blocked by the non-negativity of $LaTeX: v.$ The resulting change can alter the working set, and with it the partition into dependent and independent variables.

Also see the generalized reduced gradient method.
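
The reduced gradient computation can be sketched on a small hypothetical problem with one equation ($LaTeX: m = 1$):

```python
# Hypothetical problem: max f(x) s.t. x1 + x2 + x3 = 3, with basic variable
# v = x1 and nonbasics w = (x2, x3), so B = [1] and C = [1, 1].
def grad_f(x):
    # f(x) = -(x1 - 1)^2 - (x2 - 2)^2 - x3^2
    return [-2.0 * (x[0] - 1.0), -2.0 * (x[1] - 2.0), -2.0 * x[2]]

x = [1.0, 1.0, 1.0]                 # feasible: 1 + 1 + 1 = 3
g = grad_f(x)
gv, gw = g[0], g[1:]                # basic / nonbasic parts of the gradient
Binv, C = 1.0, [1.0, 1.0]

# Reduced gradient: r = grad_w f - grad_v f * B^{-1} C
r = [gw[j] - gv * Binv * C[j] for j in range(2)]

dw = r                                            # independent direction
dv = -Binv * sum(C[j] * dw[j] for j in range(2))  # dv = -B^{-1} C dw
dx = [dv] + dw

assert abs(sum(dx)) < 1e-12                       # A dx = 0: stays feasible
assert sum(g[i] * dx[i] for i in range(3)) > 0    # an ascent direction
```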

# Redundant constraint

A constraint whose removal does not change the feasible region. Suppose $LaTeX: \textstyle g_i(x) \le 0$ is redundant and $LaTeX: S_i$ denotes the feasibility region without this constraint (so $LaTeX: x \in S_i$ implies $LaTeX: \textstyle g_i(x) \le 0$). Then, the constraint is strongly redundant if $LaTeX: x \in S_i$ implies $LaTeX: g_i(x) < 0;$ it is weakly redundant if it is redundant and $LaTeX: g_{i}(x) = 0$ for some $LaTeX: x \in S_i.$
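
A one-dimensional sketch (with hypothetical constraints) distinguishes strong from weak redundancy:

```python
# S = [0, 1], described by x >= 0, g1(x) = x - 1 <= 0, plus two redundant
# constraints: g2(x) = x - 2 <= 0 and a duplicate g3(x) = x - 1 <= 0.
g2 = lambda x: x - 2.0   # dropping g2 leaves S unchanged
g3 = lambda x: x - 1.0   # dropping g3 also leaves S unchanged (duplicate of g1)

samples = [i / 100.0 for i in range(101)]        # grid over S_i = [0, 1]
assert all(g2(x) < 0 for x in samples)           # g2 < 0 on S_i: strongly redundant
assert any(abs(g3(x)) < 1e-12 for x in samples)  # g3(1) = 0: weakly redundant
```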

# Refinery problems

There are a myriad of these, including fluid blending problems, particularly the pooling problem, to determine a least costly operation to satisfy demands for products.

# Reformulation

Obtaining a new formulation of a problem that is in some sense better, but equivalent to a given formulation. For example, consider the packing constraint: at most 1 element can be selected from $LaTeX: \textstyle \left \{1, \dots, n \right \}.$ Letting $LaTeX: x_j$ be the associated binary variable, the following formulations have the same feasibility region:

1. $LaTeX: x_i + x_j \le 1 \mbox{ for all } 1 \le i < j \le n;$
2. $LaTeX: x_1 + x_2 + \dots + x_n \le 1.$

Formulation 2 is better because its LP relaxation is sharper: the relaxed region for 2 is a proper subset of the relaxed region for 1.
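
The sharpness claim can be checked numerically; the fractional point below satisfies every pairwise constraint of formulation 1 but violates the single sum constraint of formulation 2 (a sketch with $LaTeX: n = 3$):

```python
import itertools

def feasible_1(x):
    # formulation 1: x_i + x_j <= 1 for all pairs i < j
    return all(x[i] + x[j] <= 1.0
               for i, j in itertools.combinations(range(len(x)), 2))

def feasible_2(x):
    # formulation 2: x_1 + ... + x_n <= 1
    return sum(x) <= 1.0

x = [0.5, 0.5, 0.5]          # a fractional point of the LP relaxation
assert feasible_1(x)         # satisfies every pairwise constraint
assert not feasible_2(x)     # violates the sum constraint
```

Conversely, with $LaTeX: x \ge 0,$ any point of region 2 satisfies $LaTeX: \textstyle x_i + x_j \le \sum_k x_k \le 1,$ so region 2 is a proper subset of region 1.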

# Reformulation-linearization technique

(RLT). A general methodology for recasting difficult optimization problems into higher-dimensional spaces in order to obtain tight bounds. As suggested by the name, it consists of two basic steps: reformulation and linearization. Given a mixed 0-1 linear program in $LaTeX: n$ binary variables, the reformulation step creates additional redundant nonlinear constraints by multiplying the constraints by product factors of the binary variables $LaTeX: x$ and their complements $LaTeX: (1-x),$ and then enforces the idempotent identity $LaTeX: x^2=x.$ The linearization step substitutes a continuous variable for each distinct product of variables. A hierarchy of relaxations results, depending on the form of the product factors employed. An explicit algebraic characterization of the convex hull is available at the highest level, level $LaTeX: n.$ The method also applies to mixed 0-1 polynomial programs and to continuous, nonconvex programs. Furthermore, it can be applied to general discrete optimization problems by using product factors of Lagrange interpolating polynomials.

# Regular constraint

A global constraint that is defined on a fixed-length sequence of finite-domain variables and states that the values taken by this sequence of variables belong to a given regular language. It allows one to express relations among the variables of a sequence, such as bounding the maximum length of a sub-sequence of identical values.

Formally, a deterministic finite automaton (DFA) is described by a 5-tuple $LaTeX: M = (Q, \Sigma, \delta, q_0, F)$ where $LaTeX: Q$ is a finite set of states, $LaTeX: \Sigma$ is an alphabet, $LaTeX: \delta: Q \times \Sigma \rightarrow Q$ is a transition function, $LaTeX: q_0 \in Q$ is the initial state, and $LaTeX: {F} \subseteq Q$ is the set of final (or accepting) states. Given an input string, the automaton starts in the initial state $LaTeX: q_{0}$ and processes the string one symbol at a time, applying the transition function $LaTeX: \delta$ at each step to update the current state. If the last state reached belongs to the set of final states $LaTeX: F,$ then the string is accepted. The set of all strings accepted by $LaTeX: M$ is the language $LaTeX: {L}(M)$.

Now, let $LaTeX: M = (Q, \Sigma, \delta, q_0, F)$ be a DFA and let $LaTeX: X = \{x_1, x_2, \dots, x_n\}$ be a set of variables with domain $LaTeX: D(x_{i}) \subseteq \Sigma$ for $LaTeX: 1\leq i \leq n$. Then $LaTeX: regular(X,M) = \{ (d_1, \dots, d_n) | \forall i, d_i \in D(x_i), d_1d_2 \dots d_n \in L(M) \}$.
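
A small DFA sketch (hypothetical automaton and domains): over the alphabet $LaTeX: \{a, b\},$ the automaton below accepts the language $LaTeX: a^*b^*,$ and the regular constraint keeps exactly the domain tuples forming accepted strings:

```python
import itertools

# DFA M = (Q, Sigma, delta, q0, F) accepting a*b*.
delta = {("q0", "a"): "q0", ("q0", "b"): "q1", ("q1", "b"): "q1"}
q_init, F = "q0", {"q0", "q1"}

def accepts(s):
    q = q_init
    for symbol in s:
        if (q, symbol) not in delta:
            return False          # no transition defined: reject
        q = delta[(q, symbol)]
    return q in F

# regular(X, M): tuples (d1, ..., dn) from the domains that M accepts.
domains = [{"a", "b"}] * 3        # D(x_i) = {a, b}, n = 3
solutions = {d for d in itertools.product(*domains) if accepts(d)}

assert ("a", "a", "b") in solutions
assert ("b", "a", "b") not in solutions
assert len(solutions) == 4        # aaa, aab, abb, bbb
```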

# Regular point

This pertains to a mathematical program whose functions are in $LaTeX: C^1,$ and the issue is whether the Lagrange Multiplier Rule is valid. A regular point is a point that satisfies some constraint qualification, but some authors are more specific and require the Lagrange constraint qualification:

Let $LaTeX: G(x)$ denote the matrix whose rows are the gradients of all active constraints. Then, $LaTeX: x$ is a regular point if $LaTeX: G(x)$ has full row rank.

This gives additional properties (e.g., see the tangent plane).

In this context, a regular point is also called Lagrange regular. The mathematical program is [Lagrange] regular if every feasible point is [Lagrange] regular.

# Reified constraint

If a constraint $LaTeX: C$ is reified by a Boolean variable $LaTeX: r,$ then $LaTeX: r$ is true if and only if $LaTeX: C$ holds, i.e. $LaTeX: C \leftrightarrow r$. As an example, consider $LaTeX: x = y \leftrightarrow b,$ where $LaTeX: x = y$ is a constraint reified by $LaTeX: b$.

Reification is the primary mechanism for composing a higher-level constraint from a logical combination of other constraints: for example, constraint $LaTeX: A$ could be $LaTeX: (w = 1 \vee z = 17)$ for variables $LaTeX: w,z$.

# Relative interior

The interior of a set when considered in the smallest subspace containing it. In particular, the polyhedron, $LaTeX: \left \{x: Ax \ge b \right \},$ can be defined as the intersection of the inequalities that are forced, say $LaTeX: \textstyle \left \{x: Qx = q \right \},$ and the others, say $LaTeX: \textstyle \left \{x: Px \ge p \right \}$ (so the rows partition as $LaTeX: \textstyle A = \begin{bmatrix} P \\ Q \end{bmatrix}$ and $LaTeX: \textstyle b = \begin{bmatrix} p \\ q \end{bmatrix}$ ). Then, the relative interior of the original polyhedron is $LaTeX: \textstyle \left \{x: Qx = q \mbox{ and } Px > p \right \}.$

# Relaxation

$LaTeX: P'$ is a relaxation of $LaTeX: P$ if: (1) the feasible region of $LaTeX: P'$ contains the feasible region of $LaTeX: P,$ and (2) the objective value in $LaTeX: P',$ say $LaTeX: F(x),$ is no worse than that of $LaTeX: P,$ say $LaTeX: f(x),$ for all $LaTeX: x$ in the domain of $LaTeX: P$ (for maximization, this means $LaTeX: \textstyle F(x) \ge f(x)$ for all $LaTeX: \textstyle x \in X$ ).

Some of the most common are as follows:
1. dropping integer restrictions (viz., the LP relaxation of a (linear) MIP);
2. dropping some constraints (viz., an active set method);
3. aggregating constraints (viz., using a surrogate constraint);
4. using duality (viz., GLM).

One relaxation is sharper than another if its objective value is never further from that of the original. One way this can occur is for the objectives to be the same and the feasible region to be smaller. In particular, one LP relaxation of a MIP is sharper than another if its feasible region is contained in the other's.

# Reoptimization

A term sometimes used to mean a mathematical program is solved after some parameter values have been changed. The idea is to use the original solution, and perhaps some information generated by the algorithm, to solve the related problem more efficiently than starting from scratch. An example is using an advanced basis to begin the simplex method.

# Requirements space

Originally coined in linear programming, this represents the set of non-negative linear combinations of the columns of the matrix $LaTeX: A$: $LaTeX: \textstyle \left \{Ax: x \ge 0 \right \}.$ For a linear program in standard form, the requirements space is the set of feasible right-hand sides, and it is a convex, polyhedral cone.

# Residuals

The difference, $LaTeX: h(x)-b,$ when seeking a solution to the system of equations, $LaTeX: h(x)=b.$ In particular, if we seek to solve $LaTeX: Ax=b,$ and $LaTeX: x$ is some candidate solution, reached by an algorithm, the residual is $LaTeX: \textstyle r = A x - b.$
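
A quick sketch with a hypothetical $LaTeX: 2 \times 2$ system: the residual vanishes at the exact solution and is nonzero at a nearby candidate:

```python
# Hypothetical system Ax = b.
A = [[1.0, 2.0], [3.0, 4.0]]
b = [5.0, 6.0]

def residual(x):
    # r = Ax - b
    return [sum(A[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]

r_exact = residual([-4.0, 4.5])   # the exact solution: zero residual
r_near = residual([-4.0, 4.4])    # a nearby candidate: nonzero residual

assert all(abs(ri) < 1e-12 for ri in r_exact)
assert max(abs(ri) for ri in r_near) > 0.1
```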

# Response space

The set of [sub-]range values for the functions of a mathematical program in standard form:

$LaTeX: \left \{(b, c, z): g(x) \le b, h(x) = c \mbox{ and } f(x) \ge z \mbox{ for some } x \in X \right \}.$

Also see the supplement on response space.

# Restricted basis entry rule

This is a restriction on which variables can enter the basis in a simplex method. A common rule arises in separable programming, which uses specially ordered sets: a group of non-negative variables must sum to 1 such that at most two variables are positive, and if two are positive, they must be adjacent. For example, suppose the variables are $LaTeX: (x_1,x_2,x_3).$ Then, it is feasible to have $LaTeX: (.5,.5,0)$ and $LaTeX: (0,.2,.8),$ but it is not feasible to have $LaTeX: (.5,0,.5)$ or $LaTeX: (.2,.2,.6).$ In this case the rule is not to permit a variable to enter the basis unless it can do so without violating the adjacency requirement. For example, if $LaTeX: x_1$ is currently basic, $LaTeX: x_3$ would not be considered for entry.
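
The adjacency requirement in the examples above can be expressed as a feasibility check (a sketch):

```python
def sos2_feasible(x, tol=1e-9):
    """At most two positive variables, adjacent if two, summing to 1."""
    pos = [i for i, xi in enumerate(x) if xi > tol]
    return (len(pos) <= 2
            and (len(pos) < 2 or pos[1] - pos[0] == 1)
            and abs(sum(x) - 1.0) < tol)

# The four examples from the text:
assert sos2_feasible([0.5, 0.5, 0.0])       # adjacent pair: feasible
assert sos2_feasible([0.0, 0.2, 0.8])       # adjacent pair: feasible
assert not sos2_feasible([0.5, 0.0, 0.5])   # non-adjacent pair
assert not sos2_feasible([0.2, 0.2, 0.6])   # three positive variables
```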

Another restricted entry rule pertains to the delta form of separable programming (plus other applications): Do not admit a variable into the basis unless its predecessor variables are at their upper bound. This means there is an ordered set of bounded variables, say $LaTeX: (x_1,x_2,\dots,x_n)$ such that $LaTeX: \textstyle 0 \le x \le (U_1,\dots,U_n).$ Then, $LaTeX: x_k$ is not considered for basis entry unless $LaTeX: x_j=U_j$ for all $LaTeX: j < k.$

# Reverse convex constraint

$LaTeX: \textstyle g(x) \ge 0,$ where $LaTeX: g$ is a convex function.

# Reverse convex program

Minimizing a concave function, usually on a polyhedron. If the feasibility region is convex (as in the case of a polyhedron), non-empty and compact, it has an extreme point that is optimal.

# Right-hand side

(RHS). When considering constraints of the form $LaTeX: \textstyle g(x) \le b$ and $LaTeX: \textstyle h(x) = c,$ the vector $LaTeX: (b,c)$ is called the right-hand side.

# Rim data

In a linear program, the data values are $LaTeX: (A, b, c)$ plus (possibly) bounds, say $LaTeX: \textstyle L \le x \le U.$ The $LaTeX: A$ matrix pertains to row-column information, whereas all bounds $LaTeX: (L, U),$ right-hand sides $LaTeX: (b),$ and cost coefficients $LaTeX: (c)$ pertain to a column or a row. The latter are called rim data elements because of their position in the following schematic:

$LaTeX: \begin{array}{l|c|l} & column name & \\ \hline \hline \mbox{obj name} & c & \dots\mbox{Min or Max} \\ \hline \hline \mbox{row name} & A & = b \\ \hline \hline \mbox{Lower bound} & L & \\ \mbox{Upper bound} & U & \\ \hline \hline \end{array}$

# Robust optimization

A term given to an approach to deal with uncertainty, similar to the recourse model of stochastic programming, except that feasibility for all possible realizations (called scenarios) is replaced by a penalty function in the objective. As such, the approach integrates goal programming with a scenario-based description of problem data. To illustrate, consider the LP:

$LaTeX: \min cx + dy: Ax=b, Bx + Cy = e, x, y \ge 0,$

where $LaTeX: d, B, C$ and $LaTeX: e$ are random variables with possible realizations $LaTeX: \textstyle \left \{(d(s), B(s), C(s), e(s)): s \in \left \{1, \dots, N \right \} \right \}$ ($LaTeX: N =$ number of scenarios). The robust optimization model for this LP is:

$LaTeX: \min f(x, y(1), \dots, y(N)) + wP(z(1), \dots, z(N)): Ax=b, x \ge 0,$

$LaTeX: B(s)x + C(s)y(s) + z(s) = e(s), \mbox{ and } y(s) \ge 0, \mbox{ for all } s = 1, \dots, N,$

where $LaTeX: f$ is a function that measures the cost of the policy, $LaTeX: P$ is a penalty function, and $LaTeX: w > 0$ (a parameter to trade off the cost of infeasibility). One example of $LaTeX: f$ is the expected value: $LaTeX: \textstyle f(x, y) = cx + \sum_{s} d(s) y(s) p(s),$ where $LaTeX: p(s) =$ probability of scenario $LaTeX: s.$ In a worst-case model, $LaTeX: \textstyle f(x,y) = \max_{s} d(s) y(s).$ The penalty function is defined to be zero if $LaTeX: (x, y)$ is feasible (for all scenarios) -- i.e., $LaTeX: P(0)=0.$ In addition, $LaTeX: P$ satisfies a form of monotonicity: worse violations incur greater penalty. This often has the form $LaTeX: \textstyle P(z) = U(z^+) + V(-z^-)$ -- i.e., the "up" and "down" penalties, where $LaTeX: U$ and $LaTeX: V$ are strictly increasing functions.

The above makes robust optimization similar (at least in the model) to a goal program. More recently, the robust optimization community defines it differently: it optimizes for the worst-case scenario. Let the uncertain MP be given by

$LaTeX: \min f(x; s): x \in X(s),$

where $LaTeX: s$ ranges over some set of scenarios, $LaTeX: S$ (like parameter values). The robust optimization model (according to this more recent definition) is:

$LaTeX: \min_x \left \{\max_{s \in S} f(x; s) \right \}: x \in X(t) \mbox{ for all } t \in S.$

The policy $LaTeX: (x)$ is required to be feasible no matter what parameter value (scenario) occurs; hence, it is required to be in the intersection of all possible $LaTeX: X(s).$ The inner maximization yields the worst possible objective value among all scenarios. There are variations, such as "adjustability" (i.e., recourse).
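
The worst-case definition can be sketched by enumeration over a finite candidate set and scenario set (hypothetical data):

```python
# Hypothetical data: finite candidates for x and scenarios s.
candidates = [0.0, 1.0, 2.0, 3.0]
scenarios = [1.0, 2.0, 3.0]

def f(x, s):
    return (x - s) ** 2           # objective under scenario s

def feasible(x, s):
    return x <= 4.0 - s           # X(s) = {x : x <= 4 - s}

# x must lie in the intersection of all X(s).
robust = [x for x in candidates if all(feasible(x, s) for s in scenarios)]

# Minimize the inner maximization (worst case over scenarios).
best = min(robust, key=lambda x: max(f(x, s) for s in scenarios))
assert best == 1.0                # worst case for x = 1 is max(0, 1, 4) = 4
```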

# Rosen decomposition

Essentially the dual of Dantzig-Wolfe decomposition, this applies to a linear program with linking activities.

# Rosenbrock function

The original form has 2 variables:

$LaTeX: f(x_1, x_2) = 100(x_2 - x_{1}^{2})^2 + (1 - x_1)^2$

The significance of this function is that it has "banana-shaped" contours, making it difficult for nonlinear programming algorithms, particularly Cauchy's steepest descent, to solve. The (global) minimum is at $LaTeX: \textstyle x^* = (1, 1)$ with $LaTeX: \textstyle f(x^*) = 0.$ There are variations that extend Rosenbrock's function, many by Schittkowski, such as the following on $LaTeX: \mathbb{R}^n$ (for $LaTeX: n$ even):

$LaTeX: f(x) = \sum_{1 \le j \le n/2} 100(x_{2j} - x_{2j-1}^2)^2 + (1 - x_{2j-1})^2.$
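
A direct implementation of the extended form (for even $LaTeX: n$); the original two-variable function is the case $LaTeX: n = 2$:

```python
def rosenbrock(x):
    """Extended Rosenbrock function; x has even length n."""
    n = len(x)
    # pairs (x[2j], x[2j+1]) correspond to (x_{2j-1}, x_{2j}) in 1-based notation
    return sum(100.0 * (x[2*j + 1] - x[2*j]**2)**2 + (1.0 - x[2*j])**2
               for j in range(n // 2))

# The global minimum is at x* = (1, ..., 1) with f(x*) = 0.
assert rosenbrock([1.0, 1.0]) == 0.0
assert rosenbrock([1.0, 1.0, 1.0, 1.0]) == 0.0
assert rosenbrock([-1.2, 1.0]) > 0.0   # the classic starting point
```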

# Routing problems

Finding a path or cycle in a network. An easy routing problem is the shortest path; a hard one is the TSP. One prevalent class, with many variations, is vehicle routing.