
Randomized program

A randomized policy is a distribution over the policy set, LaTeX: X, that describes a policy as a random variable. This is not restricted to stochastic programs. The randomized form is called a randomized program, and it is the following semi-infinite program in response space:

\max \sum_{x} w(x) f(x) : w \in W, \sum_x w(x)g(x) \le 0, \sum_x w(x)h(x) = 0,

where LaTeX: \textstyle W = \left \{ w: X \to [0, 1] : w(x) > 0 \mbox{ for finitely many } x \in X, \mbox{ and } \sum_x w(x) = 1 \right \}.

In this form, the randomized program is an ordinary linear program if LaTeX: X is finite. More generally, the definition of LaTeX: W renders the summations well defined. One could interpret LaTeX: w as a randomized policy: use LaTeX: x with probability LaTeX: w(x). A pure strategy is when LaTeX: w(x^*) = 1 for some LaTeX: x^* (so LaTeX: w(x)=0 for LaTeX: \textstyle x \ne x^*); otherwise, LaTeX: w is called a mixed strategy.

One key fact is that the solutions to the original mathematical program (in standard form) correspond to pure strategies in this randomized form. Further, a key to the Lagrangian Strong Duality Theorem is that every mixed strategy is dominated by a pure strategy. Moreover, this underlies the Generalized Lagrange Multiplier method, and there is no loss in optimality to restrict mixed strategies to satisfy the Haar condition: LaTeX: \textstyle \left | \{ x: w(x) > 0 \} \right | \le m+1.

Range constraint

A constraint of the form LaTeX: \textstyle a \le g(x) \le b.

Range of compatibility

See Compatibility theory.

Rank-one correction

This is an update to an approximation of a matrix, notably to the Hessian inverse, by adding a matrix of rank 1 to it. This generally takes the form: LaTeX: \textstyle H' = H + vv^T, where LaTeX: v is a nonzero column vector (so LaTeX: vv^T is a rank 1 matrix).

Rank-two correction

This is an update to an approximation of a matrix, notably to the Hessian inverse, by adding a matrix of rank 2 to it. This generally takes the form of two rank-one updates: LaTeX: \textstyle H' = H + vv^T + uu^T, where LaTeX: v and LaTeX: u are linearly independent column vectors (so LaTeX: \textstyle [vv^T + uu^T] is a rank 2 matrix).

Rate of convergence

See convergence.

Rates of substitution

Arises when equations are split into dependent (or basic) variables and independent (or nonbasic) variables. In linear programming, we rewrite LaTeX: Ax=b as LaTeX: Bu + Nv = b, where LaTeX: u is the vector of basic variables and LaTeX: v is the vector of nonbasics. Then, the original equations are equivalent to LaTeX: \textstyle u = b' + Mv, where LaTeX: \textstyle b' = B^{-1} b and LaTeX: \textstyle M = -B^{-1}N. This implies that the rate of substitution between LaTeX: u_i and LaTeX: v_j is LaTeX: M_{(i, j)} because it describes the marginal rate at which LaTeX: u_i must change in response to a change in LaTeX: v_j to maintain the equations with all other LaTeX: v's held fixed.
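As a small numeric sketch (all numbers illustrative, not from the original), the rates of substitution LaTeX: \textstyle M = -B^{-1}N can be computed and verified directly:

```python
import numpy as np

# Split Ax = b into basic (u) and nonbasic (v) parts: A = [B | N].
A = np.array([[2.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([8.0, 9.0])

B = A[:, :2]   # columns of the basic variables u = (x1, x2)
N = A[:, 2:]   # columns of the nonbasic variables v = (x3, x4)

b_prime = np.linalg.solve(B, b)   # b' = B^{-1} b
M = -np.linalg.solve(B, N)        # M  = -B^{-1} N  (rates of substitution)

# Check: u = b' + M v reproduces a solution of Ax = b for any v.
v = np.array([1.0, 2.0])
u = b_prime + M @ v
x = np.concatenate([u, v])
print(np.allclose(A @ x, b))      # True
```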


Ray

A translated half-line: LaTeX: \textstyle \left \{x + t\, h : t \ge 0 \right \}, where LaTeX: h is a recession direction. We call LaTeX: x the root, and we say the ray is rooted at LaTeX: x. (Also see extreme ray.)

Recession cone

(also called characteristic cone) of a given set, LaTeX: S. The set of recession directions for

LaTeX: \textstyle S: \left \{h: \left \{x+th: t \ge 0 \right \} \subseteq S \mbox{ for some } x \in S \right \}.

For example, let LaTeX: rc(S) denote the recession cone of LaTeX: S. Then, LaTeX: \textstyle rc(S)=S if LaTeX: \textstyle S = \mathbb{R}^n or if LaTeX: S=\left \{0 \right \}. If

LaTeX: \textstyle S = \left \{x: Ax \le b \right \}, rc(S) = \left \{h: Ah \le 0 \right \}.
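A quick numeric check (illustrative data) that a direction LaTeX: h with LaTeX: Ah \le 0 keeps LaTeX: x + th inside LaTeX: \textstyle S = \left \{x: Ax \le b \right \} for all LaTeX: t \ge 0:

```python
import numpy as np

# S = {x : Ax <= b}, with rc(S) = {h : Ah <= 0}.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.array([4.0, 3.0])          # i.e., x1 <= 4, x2 <= 3

h = np.array([-1.0, -0.5])        # candidate recession direction
in_cone = bool(np.all(A @ h <= 0))

# If h is in the recession cone, x + t*h stays feasible for all t >= 0.
x = np.array([0.0, 0.0])          # a feasible point (A @ x <= b)
stays = all(np.all(A @ (x + t * h) <= b) for t in [0.0, 1.0, 5.0, 100.0])
print(in_cone, stays)             # True True
```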

Recession direction

At LaTeX: \textstyle x \in S, this is a nonzero vector, LaTeX: h, for which the associated ray is contained in LaTeX: S. If LaTeX: h is a recession direction for some LaTeX: \textstyle x \in S, it is called a recession direction for LaTeX: S.

Recourse model

See stochastic program.


Same as presolve.

Reduced cost

In linear programming, it is the sum of the direct cost LaTeX: (c_j) of a nonbasic variable LaTeX: (j) and the indirect costs from induced changes in the levels of the basic variables (to satisfy the equations). For a dual price vector, LaTeX: p, the reduced cost vector is LaTeX: c - pA.
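A minimal sketch (illustrative data; the basis choice is arbitrary) of computing the reduced cost vector LaTeX: c - pA with LaTeX: \textstyle p = c_B B^{-1}:

```python
import numpy as np

# Reduced costs for an LP min{cx : Ax = b, x >= 0} with dual prices p.
# Basic variables get reduced cost 0; nonbasics show the net
# (direct + induced) cost of a unit increase.
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
c = np.array([4.0, 3.0, 0.0, 0.0])

basis = [0, 1]                    # suppose x1, x2 are basic
B = A[:, basis]
p = c[basis] @ np.linalg.inv(B)   # dual prices: p = c_B B^{-1}
d = c - p @ A                     # reduced cost vector
print(np.round(d, 6))             # first two entries are 0
```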

Reduced gradient method

This applies to the case of linear constraints: LaTeX: \textstyle \max \left \{f(x): Ax = b, x \ge 0 \right \}, where LaTeX: \textstyle \mbox{rank}(A) = m and LaTeX: f \in C^2. The variables are partitioned into LaTeX: x = (v,w), with the corresponding partition of LaTeX: \textstyle A = \begin{bmatrix} B & C \end{bmatrix}, such that the mathematical program is equivalent to:

\max \left \{ f(v, w): Bv + Cw = b, (v, w) \ge 0 \right \}.

The method assumes LaTeX: B is nonsingular (i.e., only variables with linearly independent columns in LaTeX: A can be grouped), and the nondegeneracy assumption: LaTeX: v > 0. Now LaTeX: dw is the independent direction of change, and LaTeX: \textstyle dv = -B^{-1} C \, dw, thus keeping LaTeX: \textstyle x + dx = (v+dv, w + dw) on the hyperplanes -- i.e., LaTeX: \textstyle A[x + dx] = Ax + A[dx] = b + 0 = b.

The method considers a direction for the independent variables, which then determines the direction for the dependent variables. In particular, LaTeX: dw is chosen by evaluating the gradient of the objective: LaTeX: \textstyle f(B^{-1}(b - Cw), w). This gradient (with respect to LaTeX: w) is called the reduced gradient:

r = \nabla_w f(x) - \nabla_v f(x) \, B^{-1} C

at LaTeX: x = (v, w). Then, LaTeX: dw = r completes the first part of the iteration. The second part is to select a step size, which can be blocked by the non-negativity of LaTeX: v. The resulting change can cause the working set to change such that the partition changes.

Also see the generalized reduced gradient method.
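The reduced gradient computation can be sketched numerically; the objective and all data below are illustrative choices, not from the original:

```python
import numpy as np

# Linear constraints Bv + Cw = b; the reduced gradient with respect to
# the independent variables w is r = grad_w f - grad_v f @ B^{-1} C.
B = np.array([[2.0, 0.0],
              [0.0, 1.0]])
C = np.array([[1.0],
              [1.0]])
b = np.array([4.0, 3.0])

def grads(v, w):
    # gradients of the illustrative objective
    # f(v, w) = -(v1-1)^2 - (v2-1)^2 - (w1-1)^2
    gv = np.array([-2*(v[0] - 1), -2*(v[1] - 1)])
    gw = np.array([-2*(w[0] - 1)])
    return gv, gw

w = np.array([0.5])
v = np.linalg.solve(B, b - C @ w)     # dependent variables from Bv + Cw = b
gv, gw = grads(v, w)
r = gw - gv @ np.linalg.solve(B, C)   # reduced gradient

# Moving (dv, dw) = (-B^{-1} C dw, dw) keeps Bv + Cw = b:
dw = r
dv = -np.linalg.solve(B, C) @ dw
print(np.allclose(B @ (v + dv) + C @ (w + dw), b))   # True
```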

Redundant constraint

A constraint whose removal does not change the feasible region. Suppose LaTeX: \textstyle g_i(x) \le 0 is redundant and LaTeX: S_i denotes the feasibility region without this constraint (so LaTeX: x \in S_i implies LaTeX: \textstyle g_i(x) \le 0). Then, the constraint is strongly redundant if LaTeX: x \in S_i implies LaTeX: g_i(x) < 0; it is weakly redundant if it is redundant and LaTeX: g_{i}(x) = 0 for some LaTeX: x \in S_i.

Refinery problems

There are a myriad of these, including fluid blending problems, particularly the pooling problem, to determine a least costly operation to satisfy demands for products.


Reformulation

Obtaining a new formulation of a problem that is in some sense better, but equivalent to a given formulation. For example, consider the packing constraint: at most 1 element can be selected from LaTeX: \textstyle \left \{1, \dots, n \right \}. Letting LaTeX: x_j be the associated binary variable, the following formulations have the same feasibility region:

  1. LaTeX: x_i + x_j \le 1 \mbox{ for all } i < j, \; j=2, \dots, n;
  2. LaTeX: x_1 + x_2 + \dots + x_n \le 1.

Formulation 2 is better because its LP relaxation is sharper: the relaxed region for 2 is a proper subset of the relaxed region for 1.
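The sharpness claim can be checked numerically: the fractional point LaTeX: \textstyle x_j = 1/(n-1) (with LaTeX: n = 4 here) satisfies every pairwise constraint of formulation 1 but violates the sum constraint of formulation 2:

```python
from itertools import combinations

n = 4
x = [1.0 / (n - 1)] * n   # fractional point in the LP relaxation

# Formulation 1: x_i + x_j <= 1 for all pairs i < j
feas1 = all(x[i] + x[j] <= 1 for i, j in combinations(range(n), 2))
# Formulation 2: x_1 + ... + x_n <= 1
feas2 = sum(x) <= 1

print(feas1, feas2)   # True False: formulation 2's relaxation is sharper
```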

Reformulation-linearization technique

(RLT). This is a general methodology for recasting difficult optimization problems into higher-dimensional spaces for the purpose of obtaining tight bounds. As suggested by the name, it consists of the two basic steps of reformulation and linearization. Given a mixed 0-1 linear program in LaTeX: n binary variables, the reformulation step creates additional redundant nonlinear constraints by multiplying the constraints by product factors of the binary variables LaTeX: x and their complements LaTeX: (1-x), and subsequently enforces the idempotent identity LaTeX: x^2=x. The linearization step then substitutes a continuous variable for each distinct product of variables. A hierarchy of relaxations results, depending on the form of the product factors employed. An explicit algebraic characterization of the convex hull is available at the highest level, level LaTeX: n. The method is also applicable to mixed 0-1 polynomial programs and to continuous, nonconvex programs. Furthermore, it can be applied to general discrete optimization problems by using product factors of Lagrange interpolating polynomials.
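As a small worked example (the constraint and product factor are chosen for illustration, not taken from the original), consider one RLT step applied to LaTeX: \textstyle x_1 + x_2 \le 1 with LaTeX: x_1, x_2 \in \{0,1\}:

```latex
% Reformulation: multiply the constraint by the product factor x_1
% and enforce the idempotent identity x_1^2 = x_1:
x_1 (x_1 + x_2) \le x_1
\;\Rightarrow\; x_1^2 + x_1 x_2 \le x_1
\;\Rightarrow\; x_1 x_2 \le 0.
% Linearization: substitute the continuous variable y_{12} = x_1 x_2:
y_{12} \le 0,
% which, together with y_{12} \ge 0 (from the bound-factor product
% x_1 x_2 \ge 0), forces y_{12} = 0 in the relaxation.
```

This is the level-1 construction; higher levels use products of more factors at once.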

Regular constraint

A global constraint that is defined on a fixed-length sequence of finite-domain variables and requires that the sequence of values taken by these variables belongs to a given regular language. It allows us to express relations among the variables of a sequence, such as limiting the length of a sub-sequence of identical values.

Formally, a deterministic finite automaton (DFA) is described by a 5-tuple LaTeX: M = (Q, \Sigma, \delta, q_0, F), where LaTeX: Q is a finite set of states, LaTeX: \Sigma is an alphabet, LaTeX: \delta: Q \times \Sigma \rightarrow Q is a transition function, LaTeX: q_0 \in Q is the initial state, and LaTeX: {F} \subseteq Q is the set of final (or accepting) states. Given an input string, the automaton starts in the initial state LaTeX: q_{0} and processes the string one symbol at a time, applying the transition function LaTeX: \delta at each step to update the current state. If the last state reached belongs to the set of final states LaTeX: F, then the string is accepted. The set of all strings accepted by LaTeX: M is the language LaTeX: {L}(M).

Now, let LaTeX: M = (Q, \Sigma, \delta, q_0, F) be a DFA and let LaTeX: X = \{x_1, x_2, \dots, x_n\} be a set of variables with domain LaTeX: D(x_{i}) \subseteq \Sigma for LaTeX: 1\leq i \leq n. Then LaTeX: regular(X,M) = \{ (d_1, \dots, d_n) \,|\, \forall i, \, d_i \in D(x_i), \, d_1d_2 \dots d_n \in L(M) \}.
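A minimal sketch of LaTeX: regular(X, M) by enumeration; the automaton below (accepting strings over LaTeX: \{a, b\} with no two consecutive LaTeX: b's) is an illustrative choice:

```python
from itertools import product

# A tiny DFA M = (Q, Sigma, delta, q0, F); a missing transition rejects.
Q = {0, 1}
Sigma = {'a', 'b'}
delta = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 0}   # (1, 'b') undefined
q0, F = 0, {0, 1}

def accepts(word):
    q = q0
    for sym in word:
        if (q, sym) not in delta:
            return False
        q = delta[(q, sym)]
    return q in F

# regular(X, M): tuples (d1, ..., dn), di in D(xi), with d1...dn in L(M)
domains = [{'a', 'b'}] * 3
regular = {t for t in product(*domains) if accepts(t)}

print(('b', 'b', 'a') in regular)   # False: consecutive b's rejected
print(('b', 'a', 'b') in regular)   # True
```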

Regular point

This pertains to a mathematical program whose functions are in LaTeX: C^1, and the issue is whether the Lagrange Multiplier Rule is valid. A regular point is a point that satisfies some constraint qualification, but some authors are more specific and require the Lagrange constraint qualification:

Let LaTeX: G(x) denote the matrix whose rows are the gradients of all active constraints. Then, LaTeX: x is a regular point if LaTeX: G(x) has full row rank.

This gives additional properties (e.g., see the tangent plane).

In this context, a regular point is also called Lagrange regular. The mathematical program is [Lagrange] regular if every feasible point is [Lagrange] regular.
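The full-row-rank test can be sketched as follows (illustrative gradients; the helper name `is_regular` is ours):

```python
import numpy as np

# Lagrange regularity at a point: stack the gradients of all active
# constraints into G(x) and test for full row rank.
def is_regular(G):
    G = np.atleast_2d(G)
    return np.linalg.matrix_rank(G) == G.shape[0]

# Two active constraints in R^3 with independent gradients:
G_ok = np.array([[1.0, 0.0, 2.0],
                 [0.0, 1.0, 1.0]])
# A degenerate case: one gradient is a multiple of the other:
G_bad = np.array([[1.0, 0.0, 2.0],
                  [2.0, 0.0, 4.0]])

print(is_regular(G_ok), is_regular(G_bad))   # True False
```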

Reified constraint

If a constraint LaTeX: C is reified by a Boolean variable LaTeX: r, then LaTeX: r is true if and only if LaTeX: C holds, i.e., LaTeX: C \leftrightarrow r. As an example, consider LaTeX: x = y \leftrightarrow b, where the constraint LaTeX: x = y is reified by LaTeX: b.

Reification is the primary mechanism for composing a higher-level constraint from a logical combination of other constraints: for example, a constraint LaTeX: A could be LaTeX: (w = 1 \vee z = 17) for variables LaTeX: w,z.

Relative interior

The interior of a set when it is considered within its affine hull (the smallest affine subspace containing it). In particular, the polyhedron, LaTeX: \left \{x: Ax \ge b \right \}, can be defined as the intersection of the inequalities that are forced, say LaTeX: \textstyle \left \{x: Qx = q \right \}, and the others, say LaTeX: \textstyle \left \{x: Px \ge p \right \} (so LaTeX: \textstyle A = \begin{bmatrix} P \\ Q \end{bmatrix} and LaTeX: \textstyle b = \begin{bmatrix} p \\ q \end{bmatrix}). Then, the relative interior of the original polyhedron is LaTeX: \textstyle \left \{x: Qx = q \mbox{ and } Px > p \right \}.


Relaxation

LaTeX: P' is a relaxation of LaTeX: P if: (1) the feasible region of LaTeX: P' contains the feasible region of LaTeX: P, and (2) the objective value in LaTeX: P', say LaTeX: F(x), is no worse than that of LaTeX: P, say LaTeX: f(x), for all LaTeX: x in the domain of LaTeX: P (for maximization, this means LaTeX: \textstyle F(x) \ge f(x) for all LaTeX: \textstyle x \in X).

    Some of the most common are as follows:
  1. dropping integer restrictions (viz., the LP relaxation of a (linear) MIP);
  2. dropping some constraints (viz., an active set method);
  3. aggregating constraints (viz., using a surrogate constraint);
  4. using duality (viz., GLM).

One relaxation is sharper than another if its objective value is never further from the original. One way this can occur is for the objectives to be the same and the feasible region to be smaller. In particular, one LP relaxation of a MIP is sharper than another if its feasible region is contained in the other's.


Reoptimize

A term sometimes used to mean that a mathematical program is solved again after some parameter values have been changed. The idea is to use the original solution, and perhaps some information generated by the algorithm, to solve the related problem more efficiently than starting from scratch. An example is using an advanced basis to begin the simplex method.

Reproduction operation

See genetic algorithm.

Requirements space

Originally coined in linear programming, this represents the set of non-negative linear combinations of the columns of the matrix LaTeX: A: LaTeX: \textstyle \left \{Ax: x \ge 0 \right \}. For a linear program in standard form, the requirements space is the set of feasible right-hand sides, and it is a convex, polyhedral cone.


Residual

The difference, LaTeX: h(x)-b, when seeking a solution to the system of equations, LaTeX: h(x)=b. In particular, if we seek to solve LaTeX: Ax=b and LaTeX: x is some candidate solution reached by an algorithm, the residual is LaTeX: \textstyle r = Ax - b.

Response space

The set of [sub-]range values for the functions of a mathematical program in standard form:

\left \{(b, c, z): g(x) \le b, h(x) = c \mbox{ and } f(x) \ge z \mbox{ for some } x \in X \right \}.

Also see the supplement on response space.

Restricted basis entry rule

This is a restriction on which variables can enter the basis in a simplex method. A common rule arises in separable programming, which uses specially ordered sets: a group of non-negative variables must sum to 1 such that at most two variables are positive, and if two are positive, they must be adjacent. For example, suppose the variables are LaTeX: (x_1,x_2,x_3). Then, it is feasible to have LaTeX: (.5,.5,0) and LaTeX: (0,.2,.8), but it is not feasible to have LaTeX: (.5,0,.5) or LaTeX: (.2,.2,.6). In this case the rule is not to permit a variable to enter the basis unless it can do so without violating the adjacency requirement. For example, if LaTeX: x_1 is currently basic, LaTeX: x_3 would not be considered for entry.
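The adjacency condition behind this rule can be sketched as a feasibility check (the helper name `sos2_feasible` is ours; the test cases are the ones from the text):

```python
def sos2_feasible(x, tol=1e-9):
    """Specially-ordered-set condition: entries are nonnegative, sum to 1,
    at most two are positive, and any two positives are adjacent."""
    if any(v < -tol for v in x) or abs(sum(x) - 1.0) > tol:
        return False
    pos = [i for i, v in enumerate(x) if v > tol]
    if len(pos) > 2:
        return False
    return len(pos) < 2 or pos[1] - pos[0] == 1

print(sos2_feasible([0.5, 0.5, 0.0]))  # True
print(sos2_feasible([0.0, 0.2, 0.8]))  # True
print(sos2_feasible([0.5, 0.0, 0.5]))  # False (positives not adjacent)
print(sos2_feasible([0.2, 0.2, 0.6]))  # False (three positives)
```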

Another restricted entry rule pertains to the delta form of separable programming (plus other applications): Do not admit a variable into the basis unless its predecessor variables are at their upper bound. This means there is an ordered set of bounded variables, say LaTeX: (x_1,x_2,\dots,x_n) such that LaTeX: \textstyle 0 \le x \le (U_1,\dots,U_n). Then, LaTeX: x_k is not considered for basis entry unless LaTeX: x_j=U_j for all LaTeX: j < k.

Reverse convex constraint

LaTeX: \textstyle g(x) \ge 0, where LaTeX: g is a convex function.

Reverse convex program

Minimizing a concave function, usually on a polyhedron. If the feasibility region is convex (as in the case of a polyhedron), non-empty and compact, it has an extreme point that is optimal.

Right-hand side

(RHS). When considering constraints of the form LaTeX: \textstyle g(x) \le b and LaTeX: \textstyle h(x) = c, the vector LaTeX: (b,c) is called the right-hand side.

Rim data

In a linear program, the data values are LaTeX: (A, b, c) plus (possibly) bounds, say LaTeX: \textstyle L \le x \le U. The LaTeX: A matrix pertains to row-column information, whereas all bounds LaTeX: (L, U), right-hand sides LaTeX: (b), and cost coefficients LaTeX: (c) pertain to a column or a row. The latter are called rim data elements because of their position in the following schematic:

\begin{array}{lcl}
 & \mbox{column name} & \\
\mbox{obj name} & c & \dots \mbox{Min or Max} \\
\mbox{row name} & A & = b \\
\mbox{Lower bound} & L & \\
\mbox{Upper bound} & U &
\end{array}

Robust optimization

A term given to an approach to deal with uncertainty, similar to the recourse model of stochastic programming, except that feasibility for all possible realizations (called scenarios) is replaced by a penalty function in the objective. As such, the approach integrates goal programming with a scenario-based description of problem data. To illustrate, consider the LP:

\min cx + dy: Ax = b, Bx + Cy = e, x, y \ge 0,

where LaTeX: d, B, C and LaTeX: e are random variables with possible realizations LaTeX: \textstyle \left \{(d(s), B(s), C(s), e(s)): s \in \left \{1, \dots, N \right \} \right \} (LaTeX: N = number of scenarios). The robust optimization model for this LP is:

\min f(x, y(1), \dots, y(N)) + wP(z(1), \dots, z(N)): Ax=b, x \ge 0,

B(s)x + C(s)y(s) + z(s) = e(s), \mbox{ and } y(s) \ge 0, \mbox{ for all } s = 1, \dots, N,

where LaTeX: f is a function that measures the cost of the policy, LaTeX: P is a penalty function, and LaTeX: w > 0 is a parameter to trade off the cost of infeasibility. One example of LaTeX: f is the expected value: LaTeX: \textstyle f(x, y) = cx + \sum_{s} d(s) y(s) p(s), where LaTeX: p(s) = probability of scenario LaTeX: s. In a worst-case model, LaTeX: \textstyle f(x,y) = \max_{s} d(s) y(s). The penalty function is defined to be zero if LaTeX: (x, y) is feasible (for all scenarios) -- i.e., LaTeX: P(0)=0. In addition, LaTeX: P satisfies a form of monotonicity: worse violations incur greater penalty. This often has the form LaTeX: \textstyle P(z) = U(z^+) + V(-z^-) -- i.e., the "up" and "down" penalties, where LaTeX: U and LaTeX: V are strictly increasing functions.

The above makes robust optimization similar (at least in the model) to a goal program. More recently, the robust optimization community has defined it differently: it optimizes for the worst-case scenario. Let the uncertain mathematical program be given by

\min f(x; s): x \in X(s),

where LaTeX: S is some set of scenarios (like parameter values). The robust optimization model (according to this more recent definition) is:

\min_x \left \{\max_{s \in S} f(x; s) \right \}: x \in X(t) \mbox{ for all } t \in S.

The policy LaTeX: (x) is required to be feasible no matter what parameter value (scenario) occurs; hence, it is required to be in the intersection of all possible LaTeX: X(s). The inner maximization yields the worst possible objective value among all scenarios. There are variations, such as "adjustability" (i.e., recourse).
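A minimal sketch of this min-max model over a finite scenario set, by enumeration (all data illustrative):

```python
# Worst-case robust optimization: pick x feasible for every scenario,
# minimizing the worst objective f(x; s) = s * x over scenarios s.
scenarios = [1.0, 2.0, 3.0]

# Feasible sets X(s); here each is the same grid {0.1, ..., 2.0},
# so the intersection equals that grid.
X = {s: [k / 10 for k in range(1, 21)] for s in scenarios}
common = set(X[scenarios[0]])
for s in scenarios[1:]:
    common &= set(X[s])

def worst(x):
    return max(s * x for s in scenarios)   # inner maximization over s

x_star = min(common, key=worst)            # outer minimization over x
print(x_star, worst(x_star))
```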

Rosen decomposition

Essentially the dual of Dantzig-Wolfe decomposition, this applies to a linear program with linking activities.


Rosenbrock function

The original form has 2 variables:

f(x_1, x_2) = 100(x_2 - x_{1}^{2})^2 + (1 - x_1)^2

The significance of this function is that it has "banana-shaped" contours, making it difficult for nonlinear programming algorithms, particularly Cauchy's steepest descent, to solve. The (global) minimum is at LaTeX: \textstyle x^* = (1, 1) with LaTeX: \textstyle f(x^*) = 0. There are variations that extend Rosenbrock's function, many by Schittkowski, such as the following on LaTeX: \mathbb{R}^n:

f(x) = \sum_{1 \le j \le n/2} 100(x_{2j} - x_{2j-1}^2)^2 + (1 - x_{2j-1})^2.
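The extended form above can be coded directly (a sketch; the function name is ours):

```python
def rosenbrock(x):
    """Extended Rosenbrock function on R^n (n even), as in the sum above."""
    n = len(x)
    return sum(100 * (x[2*j + 1] - x[2*j]**2)**2 + (1 - x[2*j])**2
               for j in range(n // 2))

# The global minimum is at (1, ..., 1) with value 0:
print(rosenbrock([1.0, 1.0]))            # 0.0
print(rosenbrock([1.0, 1.0, 1.0, 1.0]))  # 0.0
print(rosenbrock([-1.2, 1.0]))           # the classic starting point
```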

Routing problems

Finding a path or cycle in a network. An easy routing problem is the shortest path; a hard one is the TSP. One prevalent class, with many variations, is vehicle routing.
