# CPLEX

A collection of mathematical programming software solvers.

# Capacity expansion

This is an augmentation to any mathematical program that has constraints representing a capacity limit. Suppose $LaTeX: g(x) \le b$ represents the limitation of using more than $LaTeX: b$ units of capacity, where $LaTeX: g(x)$ is the actual amount used for policy $LaTeX: x$. Then, this becomes a capacity expansion model by replacing the constraint with

$LaTeX: g(x) - vC \le b \;\mbox{ and }\; 0 \le v \le 1.$
Here $LaTeX: C$ is the maximum amount of new capacity, with $LaTeX: vC$ the portion brought online by choice of $LaTeX: v$. This is also reflected in the objective function with an added cost term, say $LaTeX: F(v)$, such that $LaTeX: F(0)=0$ and $LaTeX: F$ increasing on $LaTeX: [0, 1]$. If continuous use of the capacity is not possible (i.e., one builds a new plant or not), the model further requires $LaTeX: v$ to be binary-valued (0 or 1), and $LaTeX: F(1)$ is called the fixed charge.

# Capital budgeting problem

In its elementary form there is a fixed amount of capital, say $LaTeX: C$, that can be allocated to any of $LaTeX: n$ investments. Each investment has a minimum level, say $LaTeX: L$, and a maximum level, say $LaTeX: U$. The expected return on investment is a function, $LaTeX: v_j(x_j)$, where $LaTeX: x_j$ is the level of the $LaTeX: j$-th investment opportunity ($LaTeX: L_j \le x_j \le U_j$). Risk is measured by a standard deviation from the expected return, say $LaTeX: s_j(x_j)$. The problem is to maximize total expected return, subject to a budget constraint: $LaTeX: \sum_j x_j \le C$, and a risk constraint: $LaTeX: \sum_j v_j(x_j) + a_j s_j(x_j)} \le b$, where $LaTeX: a_j$ and $LaTeX: b$ are parameters. The returns on the investments could be correlated. Then, if $LaTeX: Q$ is the variance-covariance matrix, the risk constraint is quadratic: $LaTeX: v^Tx + x^T Q x \le b$. (Also see the portfolio selection problem.)

# Caratheodory conditions

For the classical Lagrange form, $LaTeX: \min \{f(x): x \in \mathbb{R}^n, \; h(x) = 0 \}$, where $LaTeX: f$ and $LaTeX: h$ are smooth, the following conditions are necessary for a feasible $LaTeX: x$ to be optimal: there exists $LaTeX: (y_0, y) \in \mathbb{R}^{m+1}\backslash \emptyset$, called multipliers, such that

$LaTeX:

y_0 \nabla f(x) - y^T \nabla h(x) = 0.

$

This reduces to the Lagrange Multiplier Rule when $LaTeX: y_0$ is not zero (divide by $LaTeX: y_0$), which must be the case if $LaTeX: \nabla h(x)$ has full row rank.

# Caterer problem

A caterer has booked his services for the next $LaTeX: T$ days. He requires $LaTeX: r_t$ fresh napkins on the t-th day, $LaTeX: t=1,\ldots,T$. He sends his soiled napkins to the laundry, which has 3 speeds of service: $LaTeX: s=1, 2$, or $LaTeX: 3$ days. The faster the service, the higher the cost, $LaTeX: c_s$, of laundering a napkin. He can also purchase new napkins at a cost, $LaTeX: c_0$. With an initial stock of $LaTeX: N$ napkins, the caterer wishes to minimize his total cost. (This can be formulated as a transportation problem.)

# Cauchy-Schwarz inequality

This is inequality

$LaTeX: |x^T y| \le \|x\| \|y\|,$

where $LaTeX: \|*\|$ is the Euclidean norm. (Note: for $LaTeX: \|x\| = \|y\| = 1$, $LaTeX: x^T y$ is the cosine of the angle between vectors $LaTeX: x$ and $LaTeX: y$.) .

# Ceiling

The integer round-up of a value, $LaTeX: x$:

$LaTeX: \mbox{Ceiling}(x) = \min \{ z : z \in \mathbb{Z}, \; z \ge x \}.

$

Examples: $LaTeX: \mbox{Ceiling}(5)=5$, $LaTeX: \mbox{Ceiling}(5.001)=6$, $LaTeX: \mbox{Ceiling}(-1.2) = -1$. Also, see floor.

# Central path

A differentiable curve where each point is the analytic center of a polyhedron associated with a linear program. Specifically, suppose we have the primal-dual pair:

$LaTeX: P: \; \max \left\{c^Tx : x \ge 0, \; Ax \le b \right\}.$
and
$LaTeX: \; D: \; \min \left\{ y^Tb : y \ge 0, \; A^Ty \ge c \right\}.$

Let $LaTeX: s = b - Ax$ be the primal slack. Then, the primal central path is:

$LaTeX: x^*(\mu) = \arg\max \left\{c^Tx + \mu \left(\sum_j \ln(x_j) + \sum_i \ln(s_i)\right) : x > 0, \; Ax + s = b, \; s > 0 \right\},$
for $LaTeX: u > 0$, where $LaTeX: \mu$ is the penalty parameter. Let $LaTeX: d = -(c - A^y)$ be the dual surplus. Then, the dual central path is:

$LaTeX: y^*(u) = \arg\min \left\{ y^Tb - \mu \left(\sum_j \ln(d_j) + \sum_i \ln(y_i)\right): y > 0, \; A^Ty - d = c, \; d > 0 \right\},$
for $LaTeX: u > 0$. A key fact is that if the LP optimality region is nonempty and bounded, then the central path converges to its analytic center as $LaTeX: \mu \downarrow 0$.

# Certificate

A set of sufficient conditions for some property to hold.

A certificate of optimality is a set of conditions that certify some feasible solution, $LaTeX: x$, is optimal. One example comes from duality: let $LaTeX: y$ be feasible in any dual with objective value $LaTeX: F(y)$. Then, $LaTeX: F(y)=f(x)$ is a certificate of optimality. In particular, if we have a linear program:

$LaTeX: \max \{ cx : Ax = b, x \ge 0,$

a certificate that a feasible $LaTeX: x$ is optimal is the set of dual conditions:

$LaTeX: A^Ty \ge c \;\;\mbox{ and }\;\; c^Tx = y^T b.$

(Note the general use of duality does not require any convexity assumption, and the dual can be weak.)

A certificate of unboundedness is a collection of conditions that certify a mathematical program is unbounded. An example from duality is that a primal feasible solution is found and the dual is determined to be infeasible. In particular, if we have a linear program:

$LaTeX: \max \{cx : Ax = b, \; x \ge 0 \},$

a certificate that this is unbounded is the existence of a feasible x and the determination that $LaTeX: A^Ty \ge c$ implies a contradiction.

A certificate of infeasibility is a set of conditions that certify a mathematical program is infeasible. One class comes from duality: a dual sequence is found whose objective diverges. In particular, if we have a linear program:

$LaTeX: \max \{ c^T x : Ax = b, \; x \ge 0\},$

then a certificate that this is infeasible is the existence of a sequence $LaTeX: y^k$ such that $LaTeX: A^T y^k \ge c$ for all $LaTeX: k$ and $LaTeX: y^k \rightarrow \infty$ as $LaTeX: y \rightarrow \infty$.

# Character of solution

The idea is to define character by solution status. In linear [integer] programming, columns might be partitioned into

• basic vs. nonbasic at Lower bound vs. nonbasic at Upper bound
• positive in some optimal solution vs. zero in every optimal solution

Rows might be partitioned into

• inactive vs. active
• non-binding vs. binding
• positive dual price in some optimal solution vs. zero dual price in every optimal solution

Only some rows and columns might be partitioned, and others are free to change. The character is the property we want to fix by the partition. For example, if the LP has activities for operating a plant, we might want to specify the plant operation as a character of the solution, never mind specific levels (or maybe some minimal throughput is specified).

Then, one conducts sensitivity analysis by asking for preservation of the character. Classical LP sensitivity analysis does this by preserving the optimality of the basis; interior point analysis seeks to preserve the optimality of the partition. Both are special cases, and one might want to focus on only a portion of the LP. Here are some properties of interest:

1. If the character holds at [c', b'] and at [c", b"], it holds for all [c, b] in their line segment.
2. The stability region is a convex set.

This notion of character extends naturally to combinatorial optimization, such as preserving a key portion of a TSP tour or job-order in a schedule.

# Characteristic cone

The Characteristic cone of a polyhedron, $LaTeX: P$ is

$LaTeX: \{y: x + y \in P \;\mbox{ for all }\; x \in P\}.$

If $LaTeX: P = \{x : Ax \le b\}$, then its characteristic cone is $LaTeX: \{y: Ay \le 0\}$, which is the same as the recession cone.

# Chemical equilibrium problem

The problem is

$LaTeX: \min \left\{ c^Tx + \sum_j \left(x_j \ln(x_j) \right) : x > 0, \; \sum\limits_j x_j = M, \; Ax=b \right\},$
where the objective is the Gibbs free energy function with $LaTeX: x_j$ being the number of moles of species $LaTeX: j$, $LaTeX: b_i$ being the atomic weight of atom of type $LaTeX: i$, and $LaTeX: A_{i,j}$ being the number of atoms of type $LaTeX: i$ in one mole of species $LaTeX: j$. The equation, $LaTeX: Ax=b$, is mass balance. (Its significance is that it was an early application to show how an LP approach can apply to a nonlinear mathematical program.)

# Chinese postman problem

A mailman starts from the post office, delivers the mail on his route, covering each street at least once, then returns to the post office. The problem is to find a traversal such that the total distance travelled is minimized. (Some streets are one-way, and the postman is driving.) Abstractly, the problem is to find a least-cost cycle in a graph that uses each edge and arc (forward direction only) at least once. This differs from the Traveling Salesman Problem in that the postman can traverse an edge more than once. A related problem is to determine whether the postman can cover his route within a prescribed distance (while starting and ending at his post office). If so, the cycle is produced. This is computationally equivalent to the minimization problem (in a complexity sense), and is sometimes taken as the problem definition.

# Chinese remainder theorem

Let $LaTeX: m_1, m_2, \ldots, m_k$ be relatively prime positive integers, and let $LaTeX: b_1, b_2, \ldots, b_k$ be any integers. Then, there exists $LaTeX: x$ such that $LaTeX: x = b_i(\hspace*{-8pt}\mod x_i)$ for each $LaTeX: i$. The value of $LaTeX: x$ is uniquely determined by $LaTeX: m_1, m_2, \ldots, m_k$ and $LaTeX: b_1, b_2, \ldots, b_k$.

# Cholesky factorization

Given an $LaTeX: m \times m$ symmetric matrix $LaTeX: A$, a lower triangular matrix, $LaTeX: L$, is obtained, called the Cholesky factor, such that $LaTeX: A = LL^T$. This is particularly useful for solving linear systems, $LaTeX: Ax=b$, by using forward substitution, $LaTeX: Ly=b$, then backward substitution, $LaTeX: L^T x = y$. The original algorithm assumes $LaTeX: A$ is positive definite, but it can apply more generally. This is also called Cholesky decomposition.

# Chvatal cut

A cutting plane of the form,

$LaTeX: \sum_j F(A_j) x_j \le F(b)$,

where $LaTeX: F$ is a Chvatal function.

# Chvatal function

This class is defined recursively:

• $LaTeX: F$ is a Chvatal function (of rank 0) if it is a linear function.
• If $LaTeX: F$ and $LaTeX: G$ are Chvatal functions, then $LaTeX: aF+bG$ is a Chvatal function for any nonnegative $LaTeX: a$ and $LaTeX: b$, and its rank is $LaTeX: \max \{\mbox{rank}(F), \mbox{rank}(G)\}$. (Some require $LaTeX: a$ and $LaTeX: b$ to be rational.)
• If $LaTeX: F$ is a Chvatal function of rank $LaTeX: r$, then $LaTeX: | F |$ is a Chvatal function, and its rank is $LaTeX: r+1$ if $LaTeX: | F |$ is not equal to $LaTeX: F$.

This arises in integer linear programming in several ways, see Gomory function.

# Closed form solution

This is where one could express a solution (e.g., optimal) by an equation of the

form:

$LaTeX: x^* = F(p),$

where $LaTeX: F$ is some function of the parameters, $LaTeX: p$. This is typically meant to suggest that no algorithm is needed, but one must be careful since $LaTeX: F$ is not limited in any way, such as being "easy" to evaluate. For example, under mild assumptions, the coordinates of a global maximum of a function, $LaTeX: f$, on the domain $LaTeX: X$, is given by the following integral:

$LaTeX: \displaystyle x^*_j = \lim_{w \rightarrow \infty} \frac{\int_{X} x_j e^{wf(x)} dx} {\int_{X} e^{w f(x)} dx}.$

(The idea is to weight "mass" at the global maximum, and it is valid if the set of global maxima is convex.) This is not too useful, and the integral would typically have to be evaluated with a numerical method - i.e., an algorithm.

# Closed function

In convex analysis, this is where its epigraph is a closed set. (If f is concave, it is its hypograph that must be closed.)

# Closed map

Let $LaTeX: A: X \rightarrow 2^X$ be a point-to-set map and suppose $LaTeX: x^k \rightarrow x$ and $LaTeX: y^k \rightarrow y$ are such that $LaTeX: y^k \in A(x^k)$ for all $LaTeX: k$. Then, $LaTeX: A$ is closed at $LaTeX: x$ if $LaTeX: y \in A(x)$. The map $LaTeX: A$ is closed on a subset of $LaTeX: X$, say $LaTeX: S$, if $LaTeX: A$ is closed at each $LaTeX: x$ in $LaTeX: S$.

# Closed set

A set is closed if it contains all its limit points, i.e. if $LaTeX: x^k \rightarrow x$ and each $LaTeX: x^k$ is in $LaTeX: X$, then $LaTeX: x$ is in $LaTeX: X$. By the logic of the definition, an empty set is closed (it is also open). A finite number of intersections of closed sets is a closed set, so the feasibility region of a mathematical program in standard form is closed if each constraint function is continuous. ($LaTeX: g$ can be only upper semi-continuous for the constraint, $LaTeX: g(x) \le 0$.) An example of a set that is not closed is: $LaTeX: {x: x < 0}$. The point 0 can be approached by a sequence, say $LaTeX: x^k = 1/k$, but 0 is not in the set. The closure of a set is the union of the set and all of its limit points.

# Closure condition

The conditinos that the closure of a nonempty strict interior be equal to the level set: $LaTeX: \mbox{cl}( \{x \in X: g(x) < 0 \} = \{x \in X: g(x) \le 0\}$. Here is an example where the closure condition fails:

$LaTeX: g(x) = \left\{ \begin{array}{cl}

 -1-x, & x < -1 \\ 0, & -1 \le x < 0 \\ -1 + (x-1)^2, & 0 \le x < 2 \\ x-2, & 2 \le x. \end{array} \right.

$

Note that $LaTeX: g$ is continuous and quasiconvex. Its level set is $LaTeX: \{ x: g(x) \le 0 \} = [-1, 2]$, but the strict interior, is $LaTeX: (0, 2)$, so its closure is only $LaTeX: [0, 2]$. We lose the flat portion, $LaTeX: [-1, 0)$.

This is important in the use of interior point methods and in stability. Here are some functions that satisfy the closure condition:

When equality constraints are present, there are two forms of extension of the closure condition: to consider the relative strict interior, and to consider "feasible sequences". The first generally assumes $LaTeX: h$ is affine, so the closure condition becomes:

$LaTeX: \mbox{cl}( \{ x: Ax=b, \; g(x) < 0 \}) = \{ x: Ax=b, g(x) \le 0\}.$
The second defines a family of maps:

$LaTeX: I_i - = \{ x \in X: g(x) < 0, \; h_k(x) = 0 \; \mbox{ for } k \neq i, \; h_i(x) < 0 \}$

and

$LaTeX: I_i+ = \{ x \in X : g(x) < 0, \; h_k(x) = 0 \; \mbox{ for } k \neq i, \; h_i(x) > 0 \}.$

Then, the closure condition is that $LaTeX: I_i-$ and $LaTeX: I_i+$ are not empty, and

$LaTeX: \mbox{cl}(I_i-) = \{ x \in X : g(x) \le 0, \; h_k(x) = 0 \; \mbox{ for } k \neq i, \; h_i(x) \le 0 \}$

and

$LaTeX: \mbox{cl}(I_i+) = \{ x \in X : g(x) \le 0, \; h_k(x) = 0 \; \mbox{ for } k \neq i, \; h_i(x) \ge 0 \}.$

# Coercive function

The function $LaTeX: f:\mathbb{R}^n \rightarrow \mathbb{R}^n$ is coercive with respect to $LaTeX: X \subseteq \mathbb{R}^n$ if there exists a vector $LaTeX: x^0 \in X$ such that for any sequence $LaTeX: x^k \in X$ having the property that $LaTeX: \|x^k \| \rightarrow \infty$, we also have

$LaTeX: \lim_{k \rightarrow \infty} | (f(x^k) - f(x^0))^T (x^k - x^0) | = \infty,$
where any norm can be used. This arises in variational inequalities (and complementarity problems).

Some people use a different definition, where $LaTeX: f : \mathbb{R}^n \rightarrow \mathbb{R}$ (i.e., a scalar, real-valued function):

$LaTeX: |f(x)| / \| x \| \rightarrow \infty, \;\mbox{ as }\; \|x\| \rightarrow \infty$
Note that the two definitions differ, even for $LaTeX: n=1$. For example, $LaTeX: f(x)=|x|$ is coercive under the first definition and is not under the second. For the bilinear function, $LaTeX: f(x,y)=x^TAy$, for some square matrix $LaTeX: A$, $LaTeX: f$ is coercive if there exists $LaTeX: a > 0$ such that
$LaTeX: y^TAy \ge a||y||^2.$

# Column generation

This pertains to solving a linear program whose columns are generated during pricing. Typically, the number of columns is astronomically large, possibly infinite. An example is when solving the randomized program, as with the Generalized Lagrange Multiplier method. In that case, column generation consists of maximizing the Lagrangian. A similar view applies to Dantzig-Wolfe decomposition. From the dual view, this is a cutting plane method since generating a column in the primal corresponds to generating a constraint in the dual.

The concept applies to any mathematical program, and the randomized program model hightlights the duality and how a completely general mathematical program can be considered with (generalized) linear programming.

# Combinatorial program

Let $LaTeX: N = \{1, ..., n\}$ and consider a finite collection of subsets, say $LaTeX: {S_1, S_2, ..., S_m}$. For each subset there is an objective function value, $LaTeX: f(S_k)$, and the problem is to optimize $LaTeX: f(S_k)$. Typically, the feasible subsets are represented by inclusion or exclusion of members such that they satisfy certain conditions. This then becomes a special class of integer programs whose decision variables are binary valued: $LaTeX: x_{i,k}=1$ if the i-th element is in $LaTeX: S_k$; otherwise, $LaTeX: x_{i,k}=0$. (IP formulations are not always easy, and often there is more than one formulation, some better than others.)

Also see integer program.

# Compact formulation

In integer programming, this refers to having a polynomial number of constraints. For example, look at the travelling salesman formulations. The linear form has an exponential number of "subtour elimination consgtraints," so it is not compact. The

# Compact set

This has a general meaning in mathematical analysis. For mathematical programming, where we work in finite-dimensional Euclidean vector spaces, it means the set is closed and bounded. This is often an assumption about the feasibility region in order to ensure the existence of an optimum value for a continuous objective function – see Weierstrass' theorem. Any finite set is compact. An open interval $LaTeX: (a,b)$ is not compact because its endpoints, $LaTeX: a$ and $LaTeX: b$ are limit points but are excluded from the set.

# Compatibility theory

The idea that a solution's character does not change for a particular perturbation. In linear programming the character could be an optimal basis, and the theory is concerned with whether a particular basis remains optimal when the data is changed in a prescribed direction. A Fundamental Theorem of Basis Compatibility is the following:

$LaTeX: h$ is an admissible direction for perturbing $LaTeX: (b, c)$ if, and only if, it is compatible with some equilibrium basis.

The range of compatiblity of a basis, $LaTeX: B$, for a direction, $LaTeX: h$, is the greatest step for which $LaTeX: B$ remains optimal:

$LaTeX: \sup \{t : B \;\mbox{ is optimal for the LP defined by }\; r + th\}.$

The basis spectrum is the greatest range:

$LaTeX: \sup\{ \mbox{range}(B; h) : B \;\mbox{ is optimal for the LP defined by }\; r\}.$

# Complementarity condition

Satisfying complementary slackness.

# Complementarity problem

Let $LaTeX: F: \mathbb{R}^n \rightarrow \mathbb{R}^n$. The complementarity problem (CP) is to find $LaTeX: z \ge 0$ such that $LaTeX: F(z) \ge 0$ and $LaTeX: F(z)z' = 0$. It is complementary because every solution has the property that either $LaTeX: z_j=0$ or $LaTeX: F_j(z)=0$ (or both) for each j. The linear complementarity problem (LCP) is when $LaTeX: F(z)=Mz+q$.

The problem generalizes to allow bounds so that $LaTeX: L \le z \le U$. Then, $LaTeX: F(z)$ is required to satisfy:

$LaTeX: F_j(z) \ge 0$ if $LaTeX: z_j = L_j$

$LaTeX: F_j(z) \le 0$ if $LaTeX: z_j = U_j$

$LaTeX: F_j(z) = 0$ if $LaTeX: L_j < z_j < U_j$.

# Complementary slackness

The condition that two non-negative vectors are orthogonal. It arises in the Kuhn-Tucker conditions, where the Lagrange multiplier, say $LaTeX: y$, must be orthogonal to the (inequality) constraint functional value: $LaTeX: y^T g(x)=0$. This means either $LaTeX: y_i=0$ or $LaTeX: g_i(x)=0$ for each $LaTeX: i$ -- that is, if a constraint is not active, its Lagrange multiplier must be zero.

# Complementary variables

If $LaTeX: u$ and $LaTeX: v$ are the two vectors that satisfy complementary slackness, each pair, $LaTeX: u_i$ and $LaTeX: v_i$ are called complementary variables. In linear programming, primal levels and dual prices are complementary variables.

# Complexity

A measure of computer time or space to solve a problem by an algorithm as a function of the problem's dimensions. Suppose $LaTeX: T(n)$ is the time it takes to solve an instance of the problem with dimension $LaTeX: n$. Then, the algorithm has (worst-case) time complexity $LaTeX: K(n)$, if the greatest time it could take to solve an instance of the problem is $LaTeX: O(K(n))$. When $LaTeX: K(n)$ is a polynomial, we say the algorithm has polynomial time complexity. The Klee-Minty polytope shows that the elementary simplex method does not have polynomial time complexity.

The average time complexity is the average (rather than worst) time of an algorithm (or class of algorithms) over some class of problems, for which some probability distribution is assumed. Absent a modifier, the complexity of an algorithm is taken to mean its worst-case time complexity.

Whereas complexity refers to the performance of an algorithm, see the notion of NP-completeness for the related meaning of problem complexity. The standard notation to describe complexity in terms of problem dimensions, say $LaTeX: n$, is $LaTeX: O(K(n))$. This big-O notation means the following: a function, $LaTeX: f:Z^+ \rightarrow R$, is $LaTeX: O(K(n))$ if there exists a constant, $LaTeX: c$, and $LaTeX: N$ in $LaTeX: Z^+$, such that $LaTeX: f(n) \le cK(n)$ for all $LaTeX: n \ge N$. For example, if an algorithm requires $LaTeX: 5n^3 + 2n + 10$ fundamental operations on a problem of size $LaTeX: n$, its time complexity is $LaTeX: O(n^3)$. See the supplement on complexity for more information.

# Complicating variables

Those variables which, when fixed at particular values, render the remaining mathematical program relatively easy to solve. One example is the use of binary variables for fixed charges in an otherwise linear program. More generally, see Benders' decomposition.

# Component

In a graph (or network), this is a maximal connected subgraph. (The number of components is 1 if the graph is connected.)

# Composite concave program

A global optimization problem whose objective is a composite concave function.

# Composite function

A function of a function, of the form $LaTeX: g(f(x))$.

# Concave function

The real valued function $LaTeX: f$ is concave if its domain, say $LaTeX: X$, is a convex set, and for $LaTeX: x, y \in X$ and $LaTeX: 0 \le a \le 1$:

$LaTeX: f(ax + (1-a)y) \ge af(x) + (1-a)f(y).$
Equivalently, $LaTeX: f$ is concave if its hypograph is a convex set or if its negative is a convex function.

# Condition number

This is $LaTeX: \|A\| \|A^{-1}\|$ when $LaTeX: A$ is nonsingular and $LaTeX: \|\cdot\|$ is some matrix norm. This arises in convergence analysis, where $LaTeX: A$ is the hessian. Whenever $LaTeX: A$ is symmetric and positive definite (as the hessian of a strictly convex function), and the matrix norm is that induced by the Euclidean norm (i.e., $LaTeX: \|A\| := \max \{\|Ay\|: \|y\|=1\}$), the condition number is the ratio of the extreme values of the eigenvalues. It is often used to measure the convergence rate of an ascent algorithm, due to Kantorovich's inequality. (Unfortunately, it is erroneously used to measure the goodness of an algorithm – see Myths and Counterexamples in Mathematical Programming.)

# Cone

A set, $LaTeX: C$, with the property that if $LaTeX: x \in C$, then $LaTeX: a x \in C$, for all positive real $LaTeX: a$. A convex cone is a cone that is also a convex set. Equivalently, $LaTeX: C$ is a convex cone if $LaTeX: C = C + C$. (An example of a cone that is not convex is the union of the axes.) A polyhedral cone is a cone that is also a polyhedron; equivalently, $LaTeX: C$ is a polyhedral cone if there exists a matrix $LaTeX: A$ such that $LaTeX: C = \{ x : Ax \le 0\}$. An example of a cone that is not polyhedral is $LaTeX: \{(x,y,z): x^2+y^2-z^2=0\}$.

A quadratic cone is of the form $LaTeX: \{x: x^T Q x \le 0\}$, where $LaTeX: Q$ is any (real) matrix. If $LaTeX: Q$ is negative semi-definite, the cone is all of $LaTeX: \mathbb{R}^n$ space. If $LaTeX: Q$ is positive definite, the cone is just the origin, $LaTeX: \{0\}$. So, quadratic cones usually arise when $LaTeX: Q$ is indefinite. Example: $LaTeX: \{x : x \ge 0, \; 2x_1x_2 \ge \|(x_3,...,x_n)||^2\}$. See each of the following special cones:

# Cone of optimality

Given an LP in standard form, let $LaTeX: B$ be an optimal basis (for which the dual is feasible). Then, the cone of optimality of $LaTeX: B$ is $LaTeX: {b : B^{-1}b& \ge 0}$. It is so called because $LaTeX: B$ remains an optimal basis for any $LaTeX: b$ in this cone.

# Conic program

The standard form is

$LaTeX: \min \{ c^T x : Ax=b, \; x \in K\}$,

where $LaTeX: K$ is a cone (not necessarily convex). This is more general than it looks. Suppose $LaTeX: S$ is any non-empty set, and we have the mathematical program:

$LaTeX: \min \{ c^T x : x \in S\}$.

Then, defining $LaTeX: K = \{t(1,x): x \in S, \; t \ge 0\}$, we have that the above problem is equivalent to the conic program:

$LaTeX: \min \{ c^Tx : (x_0, x) \in K, \; x_0 = 1 \}$.

This problem class also includes the semi-definite program, as $LaTeX: x$ could be a matrix.

# Conjugate directions

Direction vectors chosen in an algorithm that are conjugate with previous directions, with respect to the hessian of the objective. For the unconstrained quadratic, $LaTeX: x^T Qx + c^Tx$ (where $LaTeX: Q$ is symmetric), direction vector $LaTeX: d$ is conjugate if $LaTeX: d^T Q d^k = 0$ for all previous direction vectors, $LaTeX: d^k$.

# Conjugate duality

See the supplement on particular duals.

# Conjugate function

The convex conjugate of $LaTeX: f$ on $LaTeX: X$, denoted $LaTeX: f^*$ on $LaTeX: X^*$, is the greatest convex approximation from below:

$LaTeX: f^*(x^*) = \sup \{ x^T x^* - f(x): x \in X\}$,

and $LaTeX: X^* = \{x^*: f^*(x^*) < \infty\}$, i.e. $LaTeX: X^*$ is the effective domain of $LaTeX: f^*$). The concave conjugate of $LaTeX: f$ on $LaTeX: X$, denoted $LaTeX: f^{\wedge}$ on $LaTeX: X^{\wedge}$, is the least concave approximation from above:

$LaTeX: f^{\wedge}(x^*) = \inf\{ x^T x* - f(x): x \in X\}$,

and $LaTeX: X^{\wedge} = \{x^*: f^{\wedge}(x^*) > -\infty}$. This is a foundation for Lagrangian duality, viewed in response space.

An ascent method for unconstrained optimization such that successive directions are conjugate with respect to the hessian for a quadratic objective function.

# Conjugate vectors

With respect to a symmetric matrix, $LaTeX: A$, $LaTeX: u$ and $LaTeX: v$ are conjugate if $LaTeX: u^TAv = 0$. (In particular, if $LaTeX: A=I$, $LaTeX: u$ and $LaTeX: v$ are conjugate if they are orthogonal.)

# Conjunctive normal form

A logical expression that is written as a conjunction,

$LaTeX: C_1 \wedge C_2 \wedge \ldots \wedge C_m$,

where $LaTeX: \wedge$ is the conjunction (logical and). Each $LaTeX: C_i$ is called a clause and has the form

$LaTeX: L_{i_1} \vee L_{i_2} \vee \ldots \vee L_{i_n}$,

where $LaTeX: \vee$ is the disjunction (logical inclusive or) and each $LaTeX: L_i$ is a literal -- a proposition or its negation.

# Connected network

In the undirected case, a graph is connected if, for any pair of nodes, there is a path that contains both of them. In a directed graph, or network, the path may be required to be directed (i.e., follow the orientations of the arcs), in which case the network is strongly connected; or, the path may be allowed to ignore the arc orientations, in which case the network is weakly connected.

# Consistency

In general, a logical theory is consistent, if it contains no contradictions.

In constraint programming, a (partial) variable assignment is called consistent with respect to a set of constraints if none of the constraints is violated.

More generally, in CP, different definitions of consistency are used as a basis for inference. See:

# Consistent

Pertains to a system of equations and inequalities, for which there exists a feasible solution.

# Constraint

Usually this is a relation of the form of an inequality, g(x) <= 0, or equation, h(x)=0. More generally, it can be any restriction the decision variables must satisfy. For example, some regard "x must be integer-valued" as a constraint, while others would say that this is simply the domain of the decision variables in the mathematical program. There are other relations, such as the logical form of a precedence constraint: IF x=0 THEN y=0.

In constraint programming, a constraint is an arbitrary relation imposed on a set of variables $LaTeX: X = \{x_1, \dots, x_n\}$ where each variable must be assigned on value from its domain. A constraint holds if all variables in $LaTeX: X$ are assigned values from their domain and the relation is satisfied.

# Constraint graph

Constraint graphs are used to conceptualize the set of constraints and their scopes in a CSP.

A primal constraint graph represents variables by nodes and associates an arc with any two nodes in the scope of in the same constraint.

A dual constraint graph represents each constraint with a node and associates a labeled arc with any two nodes whose scopes share variables. The arcs are labeled by the shared variables.

# Constraint optimization problem

The general aim for constraint optimization problem is to find an optimal solution to a set of constraints subject to some objective function obj. For example, consider a CSP $LaTeX: P = \{ C, x_1 \in D_1, \dots, x_n \in D_n\}$ together with a function $LaTeX: obj : Sol \to \mathbb{R}$. $LaTeX: obj$ is a mapping from solutions to $LaTeX: P$ (i.e., all elements the set $LaTeX: Sol$) to the set of real numbers, $LaTeX: \mathbb{R}$.

A solution, $LaTeX: d$, to a constraint optimization problem is a complete assignment of variables that satisfies all constraints $LaTeX: C \in P$ and for which the value $LaTeX: obj(d)$ is optimal (either minimal or maximal, depending on the sense of the optimization).

# Constraint programming

Constraint Programming (CP) is a paradigm to solve satisfaction or optimization problems by a two-step approach: propagation and tree search. Given a CSP the decision variablesâ€™ domains are tightened by first propagating the constraints. If after this step all domains contain only one value then a solution has been found. If at least one domain is empty, the CSP has no solutions. Otherwise, the second step, tree search, is initiated: by specifying a variable ordering heuristic and a value ordering heuristic, a search tree is constructed which is then traversed according to one of various search strategies. If the problem is a satisfaction problem, the first solution found is returned or it is proved that no solution exists. For an optimization problem the whole tree is traversed in a branch-and-bound search and the best solution returned.

# Constraint propagation

Constraint propagation is a general term for inference techniques used in constraint programming. The basic technique consists in explicitly forbidding values or combinations of values for some variables of a problem because a given subset of its constraints cannot be satisfied otherwise. Constraint propagation appears under different names depending on both historical period and author including constraint relaxation, constraint inference, local consistency enforcing, and rule iteration.

Typically, the propagation process is repeated until a "fixed point" is reached where no further values can be removed from the domain of any variable.

# Constraint propagator

A constraint propagator is an algorithm (often associated with a global constraint) which implements a constraint propagation algorithm. Typically, the algorithm prunes the domains of the variables involved in the constraint, removing values for which it can be shown that they participate in no solutions given the current search node.

Constraint propagator is also now as filtering algorithm.

# Constraint qualification

Conditions on the constraint functions ($LaTeX: g$ and $LaTeX: h$) that are sufficient to make the Lagrange Multiplier Rule valid. Here is an example to illustrate what can go wrong:

$LaTeX: \max x : x^2 \le 0$.

Since $LaTeX: x=0$ is the only feasible solution, it is optimal. The Lagrange Multiplier Rule requires that there exist $LaTeX: u$ for which $LaTeX: f' - u^T g' = 0$, but $LaTeX: f' = 1$ and $LaTeX: g'= 0$, so no such $LaTeX: u$ exists. Slater used this example in illustrating his interiority condition. The classical qualification, given by Lagrange's multiplier theorem without inequality constraints, is that $LaTeX: \nabla h(x)$ have full row rank, which stems from the Implicit Function Theorem. Another constraint qualification is that all constraint functions be affine (even with redundant constraints). Each constraint qualification gives a sufficient condition for the Lagrange Multiplier Theorem to be valid. A constraint qualification is necessary if it must hold in order to guarantee that the Lagrange Multiplier Rule is valid for all smooth $LaTeX: f$ having optimal value at $LaTeX: x$.

# Constraint satisfaction problem

A classical Constraint Satisfaction Problem (CSP) is defined by a finite set of variables $LaTeX: X = \{x_1, \dots, x_n\}$ where each variable $LaTeX: x_i$ is associated with a finite domain of integers $LaTeX: D_i= \{1, \dots, N_i\}$. Furthermore, a finite set of constraints $LaTeX: \mathcal{C} = \{C_1, \dots, C_m \}$ is imposed on the variables, where a constraint $LaTeX: C_i$ is an arbitrary relation on a subset of variables $LaTeX: X$. A solution to the CSP is a value assignment to each variable such that the value lies within the respective domain and all constraints are satisfied.

There exist different variants and sub-classes of constraint satisfaction problem:

# Constraint solver

A constraint solver typically consists of two parts:

A solution is found as soon as every variable is assigned a value from its domain and all constraints are satisfied.

A constraint solver can typically search for one solution (returning the first solution it finds, if one exists) or for all solutions.

# Continuous program

A mathematical program with continuous variables. Moreover, the term is commonly used to further imply that the objective and constraint funtions are continuous, an assumption that allows certain analytical results. For example, Weierstrass first proved that continuous functions attain both a maximum and a minimum over a compact set. If the problem contains both continuous and integer variables, then the problem is said to be a mixed integer program.

# Contour

Also called an isoquant of a function, $LaTeX: f$. The projection of its graph into its domain: $LaTeX: \{x \in X : f(x) = c\}$, where $LaTeX: c$ is the contour value.

# Contraction map

Also called a contractor. A function that brings points closer together, forming the foundation for convergence by successive approximation. Mathematically, we have $LaTeX: f: X \rightarrow X$, where $LaTeX: X$ is in a normed space, with norm of $LaTeX: x$ denoted by $LaTeX: \|x\|$. $LaTeX: f$ is a contractor if there exists a constant $LaTeX: K$ in $LaTeX: [0,1)$ such that $LaTeX: \|f(x)-f(y)\| \le K \|x-y\|$ for all $LaTeX: x$ and $LaTeX: y$ in $LaTeX: X$. See the supplement on fixed points.

# Convergence

Specifically, convergence of an algorithm. The algorithm is represented as the point-to-set map, $LaTeX: x' \in A(x)$, where there is some selection function to choose $LaTeX: x'$ if $LaTeX: A(x)$ has more than one member. Convergence means that the sequence, $LaTeX: x^k$, has a limit point, say $LaTeX: x$, such that $LaTeX: x$ satisfies certain conditions. Ideally, the conditions are that $LaTeX: x$ is an optimal solution, at least locally, but this is often not the definition used in a convergence theorem. (For example, in unconstrained optimization, most algorithms converge to a stationary point, which need not be even a local optimum.)

Let $LaTeX: \textstyle s^k \rightarrow 0$, where $LaTeX: s^k$ is a scalar series pertaining to the sequence $LaTeX: x^k$. Typically, $LaTeX: s^k = \|x^k - x\|$, which is sometimes called policy convergence (to a solution, $LaTeX: x$). We could also have $LaTeX: s^k = f(x^k) - f^*$, where $LaTeX: f$ is the objective function, in which case the concern is with value convergence to an optimal value, $LaTeX: f^*$.

Some related terms and concepts:

• Dual convergence. Dual values, notably Lagrange multipliers, converge to a dual solution.
• Geometric convergence. Same as linear convergence, but usually used when the sequence is exactly a geometric progression: $LaTeX: s^k = r^k(s^0)$ - $LaTeX: r^k$ is $LaTeX: r$ to the power of $LaTeX: k$
• Global convergence. Usually means the same as globally convergent, but some mean that the algorithm convergences to a global solution.
• Globally convergent. Convergence to a solution from any starting point.
• Linear convergence. Order is 1 and rate is less than 1 (see below).
• Local convergence. Some mean locally convergent, but some mean that the limit point is a local optimum (or just satisfies necessary conditions, see Myths and Counter Examples in Mathematical Programming to avoid misconception).
• Locally convergent. There exists an open neighborhood of 0 such that {s^k}-->0 from any s^0 in the neighborhood.
• Order of convergence. The order is $LaTeX: \sup\{ p : \lim_{k \rightarrow \infty} \|s^{k+1}\| / \|s^k\|^p < \infty \}$. The order being 1 is linear convergence and the order being 2 is quadratic convergence. Most (non-finite) algorithms to solve mathematical programs are between linear and quadratic.

Let $LaTeX: v^k = \|s^k\|$ and suppose $LaTeX: v^0 = 1$ (for notational convenience). The term order derives from the approximate equation, $LaTeX: v^{k+1} := r(v^k)^p$, where $LaTeX: r$ is the rate. If this equation were exact, we would have $LaTeX: v^k = r^k$ if $LaTeX: p=1$, and $LaTeX: v^k = r^{(p^k-1)/(p-1)}$ if $LaTeX: p > 1$, for all $LaTeX: k$. If $LaTeX: r=0.1$ and $LaTeX: p=1$, we gain one decimal place each time: $LaTeX: v^1 = 0.1$, $LaTeX: v^2 =0.01$, $LaTeX: v^3 = 0.001$, etc. If $LaTeX: p=2$, the number of accurate decimal places approximately doubles each iteration: $LaTeX: v^1 = 0.1$, $LaTeX: v^2 = 0.0001$, $LaTeX: v^3 =0.0000001$, etc. Unfortunately, the underlying equation is not exact – see Myths and Counter Examples in Mathematical Programming to avoid misconception. Some authors call this q-order (or Q-order) convergence to distinguish it from variations of the definition of order. Each definition is designed to capture the notion of how many digits of accuracy are added as the sequence approaches its limit.

• Rate of convergence. This is generally used to mean the limiting ratio: $LaTeX: \lim_{k \rightarrow \infty} \|s^{k+1}\| / \|s^k\|^p$, given the order is $LaTeX: p$.
• Sublinear convergence. Order is 1 and rate is 1 (slower than all linear convergence), e.g., $LaTeX: s^k = 1/k$.
• Superlinear convergence. Order is 1 and rate is 0 (faster than all linear convergence), e.g., $LaTeX: s^k = (1/k)^k$.
• Stochastic convergence. This applies when the successor point is a random variable, as in simulated annealing. Here are the most common types of convergence for a sequence of random variables, $LaTeX: X_n \rightarrow X$.
• with probability $LaTeX: 1$ or almost everywhere (abbr., a.e.). $LaTeX: P\{\lim_{n \rightarrow \infty} X_n = X\} = 1$.
• in probability. $LaTeX: P\{\|X_n - X\| > e\} \rightarrow 0$ for all $LaTeX: e > 0$.
• in distribution. The sequence of cumulative distribution functions (cdf), converges point-wise: $LaTeX: F_n(x) \rightarrow F(x)$ for all $LaTeX: x$ at which $LaTeX: F$ is continuous, where $LaTeX: F_n$ is the cdf of $LaTeX: X_n$ and $LaTeX: F$ is the cdf of $LaTeX: X$.
• in p-th mean. $LaTeX: E\{\|X_n - X\|^p\} \rightarrow 0$. For $LaTeX: p=2$, this is called convergence in quadratic mean or in mean square.

# Convex combination

An affine combination of vectors whose coefficients are non-negative, i.e., $LaTeX: \textstyle\sum_k a_k x^k$, where $LaTeX: a_k \ge 0$ and $LaTeX: \textstyle\sum_k a_k = 1$.

# Convex cost flow problem

The min-cost network flow problem with a nonlinear convex cost function.

# Convex function

Let $LaTeX: f:X\rightarrow R$. Then $LaTeX: f$ is convex if $LaTeX: X$ is a convex set, and for $LaTeX: x, y \in X$ and $LaTeX: 0 \le a \le 1$:

$LaTeX: f(ax + (1-a)y) \le af(x) + (1-a)f(y)$.

Equivalently, its epigraph is convex.

# Convex hull

(of a set). The intersection of all convex supersets (which can be limited to halfspaces). Equivalently, the set of all convex combinations of points in the set (which can be limited to convex combinations of at most n+1 points, in n dimensions, which is known as Carathéodory's Theorem).

# Convex program

The mathematical program $LaTeX: \min \{f(x) : x \in X \}$ is convex if the objective function $LaTeX: f$ is convex, and the feasible region $LaTeX: X$ is a convex set. In particular, the mathematical program $LaTeX: \min \{f(x) : g(x) \le 0, \; h(x) = 0 \}$ is convex if $LaTeX: g$ is convex, and $LaTeX: h$ is affine. See the supplement, Convex Cones, Sets, and Functions.

Here are some key properties of a convex program:

# Convex set

If any two points are in the set, so is their line segment, i.e.

$LaTeX: x, y \in X \Rightarrow (1 - a) x + a y \in X$,

for any $LaTeX: 0 \le a \le 1$. See Myths and Counter Examples in Mathematical Programming.

# Convex simplex method

This extends the general strategy of the simplex method in linear programming to maximize a concave objective over linear constraints of the form $LaTeX: Ax=b$ and $LaTeX: x \ge 0$. (A form of local convergence applies when the objective is not concave, but is smooth.) A tableau is maintained, but nonbasic variables need not be at zero level. The partition is used to compute a generalized reduced cost of the form $LaTeX: d(x) = \nabla f(x) - y^T A$. The $LaTeX: y$ vector is determined by the basic portion: $LaTeX: 0 = \nabla_B f(x) - y^TB$ (so $LaTeX: y = (\nabla_B f(x))B^{-1}$).

As in the simplex method, pricing is used to determine whether there is an improving nonbasic vector. If not, the algorithm terminates (satisfying the Kuhn-Tucker conditions); otherwise, a line search is used to determine the new point, changing only the basic variables to compensate for the change in the one nonbasic level. If this causes a basic variable to reach zero, the basis is changed with the pivot operation.

# Convexity cut

A class of cutting planes derived by considering a convex set in a polyhedron, P. The form of the cut is

$LaTeX: \sum_i \frac{y_i}{t_i} \ge 1$,

and it is derived as follows.

Let $LaTeX: x_0$ be an extreme point of a given polyhedron, which contains a given set, $LaTeX: S$. Suppose $LaTeX: C$ is a convex set in $LaTeX: \mathbb{R}^n$ whose interior contains $LaTeX: x_0$ but does not contain any point of $LaTeX: S$. Let $LaTeX: v_1, v_2, \ldots, v_n$ be linearly independent vectors in $LaTeX: \mathbb{R}^n$, and let $LaTeX: t_i$ be such that $LaTeX: t_i > 0$ and $LaTeX: [x_0, x_0 + t_i v_i] \in C$ for all $LaTeX: i=1, 2, \ldots, n$ (e.g., the edges emanating from $LaTeX: x_0$ in $LaTeX: V$). Define $LaTeX: V = [v_1,..., v_n]$ and $LaTeX: M = [\mbox{diag}(t_i)V]^{-1}$ (i.e., $LaTeX: M(i,.) = V^{-1} / t_i)$. Then, the cutting plane $LaTeX: \{x: M(x-x_0) \ge 1\}$ excludes $LaTeX: x_0$ without excluding any other $LaTeX: x$ in $LaTeX: S$ for which $LaTeX: V^{-1}(x-x_0) \ge 0$. The cut, $LaTeX: M(x-x_0)\ge 1$, is equivalent to the above form, $LaTeX: y \, \mbox{diag}(1/t_i) \ge 1$, where $LaTeX: y = x_0 - Vw$ for some $LaTeX: w \ge 0$.

One special case is the intersection cut for a 0-1 integer program:

• $LaTeX: S = \{(x, s): x \in \{0, 1\}^n, \; s \in \mathbb{Z}^m, \; Ax + s = b \}$;
• $LaTeX: P = {(x, s): x \in [0, 1]^n, \; s \in \mathbb{Z}^m, \; Ax + s = b\} \cap \{x: Cx \le c\}$, where $LaTeX: \{Cx \le c\}$ are the cuts already added;
• $LaTeX: x_0$ is an extreme point of $LaTeX: P$, obtained by solving the LP relaxation with a simplex method, so $LaTeX: x_0 = (u, w)$ is a basic optimum.
• $LaTeX: V = B^{-1}N$, where $LaTeX: B$ is the basis that transforms the original equations to $LaTeX: u + B^{-1}Nw = x_0$ $LaTeX: ( =B^{-1}b )$;
• $LaTeX: C$ is a sphere, $LaTeX: \{y: \|y\|^2 \le 1/4\}$.

# Corner polyhedron problem

Gomory's relaxed IP that drops the non-negativity constraints of basic variables. Suppose $LaTeX: x = (u,0)$ is a basic solution to the LP, $LaTeX: \mbox{opt}\{c^Tx : Ax=b, \; x \ge 0\}$, where $LaTeX: u$ is the vector of basic variables. Then, the corner polyhedron problem is

$LaTeX: \mbox{opt} \{d^Tv: u \in \mathbb{Z}^m, \; v \in \mathbb{Z}^{n-m}+, \; Bu + Nv =b\}$,

(dropping $LaTeX: u \ge 0$), where $LaTeX: d$ is the reduced cost of $LaTeX: v$ in the LP solution. (Note: $LaTeX: c^Tx = d^Tv$ for all $LaTeX: x=(u,v)$ such that $LaTeX: Ax=b$.)

See elliptope.

# Cost of captial

Used in computing as an objective function, it is the interest rate. The units are currency per time period (e.g., dollars per year). One must be careful in computing this. For example, the marginal cost of capital can be greater than the marginal interest rate because the latter applies to the entire borrowing, not just the last dollar borrowed. (This is the "classical" definition, which does not consider advances in option theory.)

# Covering problem

The idea is to select enough members in each of a specified collection of sets; that defines covering the sets. Subject to this, there is a cost for the elements selected, and the objective is to minimize total cost. The IP form of the usual set covering problem is

$LaTeX: \min \{c^Tx: Ax \ge 1, \; x \in \{0,1\}^n\}$,

where $LaTeX: x_j=1$ if element $LaTeX: j$ is selected; else, $LaTeX: x_j=0$. The matrix $LaTeX: A$ has 0's and 1's, where the $LaTeX: i$-th row corresponds to the $LaTeX: i$-th set to be covered: $LaTeX: A_{i, j}=1$ means element $LaTeX: j$ is in set $LaTeX: i$; else, $LaTeX: A_{i, j}=0$. The constraint $LaTeX: Ax \ge 1$ means that at least one element of each set must be selected. This could be extended to require $LaTeX: b_i$ elements of set $LaTeX: i$ be selected by writing $LaTeX: Ax \ge b$. Also see vertex cover.

# Cramer rule

To solve $LaTeX: Ax=b$, where $LaTeX: A$ is nonsingular, the j-th coordinate is given by:

$LaTeX: x_j = \frac{ \mbox{det}(A^j)}{\mbox{det}(A)}$,

where $LaTeX: det(\cdot)$ is the determinant, and $LaTeX: A^j$ is the matrix obtained if column $LaTeX: j$ of $LaTeX: A$ is replaced by $LaTeX: b$:

$LaTeX: A^j = [A(., 1) A(., 2) \ldots A(., j-1) b A(., j+1), \ldots A(., n)]$.

# Crash

This is a heuristic to generate an initial point for an algorithm. It is from the early linear programming computer systems that used a variety of heuristics to generate an initial basic solution.

# Criss-cross method

Criss-cross method. A method in linear programming that chooses a pivot by possibly crossing from primal to dual, and vica versa. The least-index criss-cross method is a finite pivot method that chooses a pivot based on the least index of a column or row for which there is improvement in either the primal or dual infeasibility. It terminates at a basis when one of the following termination conditions is reached:

1. Both the associated primal and dual solutions are feasible (implies optimality because complementary slackness holds by construction).
2. There is evidence of primal and/or dual infeasibility (like the tests in the simplex method).
(Note: any basis is used to start and there is only one phase.)

# Critical path

A longest path in a network, where length units are time durations. The nodes represent tasks, and the arcs represent precedence constraints. The path is critical because the associated tasks determine the total completion time of the project. Moreover, at least one of their duration times must decrease in order to decrease the total completion time.

# Critical point

A critical point of a smooth function is an element of the domain for which the first partial derivatives are zero or infinite.

# Cumulative constraint

is a global constraint that models scheduling problems as follows. Given are a collection of tasks $LaTeX: T=t_1 \dots t_n$, such that each task $LaTeX: t_i$ is associated with four variables:

• release time, $LaTeX: r_i$, is the earliest time at which it can begin executing,
• deadline (or due date), $LaTeX: d_i$, is the time by which it must complete
• processing time, $LaTeX: p_i$, is the length of time it takes to complete
• capacity requirement, $LaTeX: c_i$, is the capacity of the resource that it uses while it executes.

In addition, we are given the capacity variable $LaTeX: C$ of the resource.

The $LaTeX: cumulative(\{s_1 \dots s_n\},\{p_1 \dots p_n\}, \{c_1 \dots c_n\}, C)$ constraint assigns a starting time $LaTeX: s_i$ for each task $LaTeX: t_i$ such that:

• $LaTeX: r_i\leq s_i \leq d_i - p_i$: the task starts at or after its release date and end at or before its deadline, and
• $LaTeX: \forall u \sum_{i|s_i\leq u\leq s_i+p_i} c_i\leq C$: at any time unit $LaTeX: u$, the capacity of the resource is not exceeded.

# Cut search

An algorithm strategy for global optimization (notably for integer programming) that consists of two alternating phases: search and cut. The search phase finds linearly independent vectors emanating from a root point, $LaTeX: x_0$, to setup a probe for the cut phase. Usually, $LaTeX: x_0$ is an extreme point, so the search is an edge probe phase that extends edges of the cone rooted at $LaTeX: x_0$ until it intersects the candidate solution points. Then, a cutting plane is added (usually a convexity cut) to exclude the root point. (A new root point is obtained by solving the new approximating problem on the smaller polyhedron.)

# Cutset

In a weakly connected network, this is a set of arcs whose removal decomposes the network into exactly two components. Separating node $LaTeX: s$ from node $LaTeX: t$, the value of the cutset is the sum of the capacities of its arcs from the component containing $LaTeX: s$ to the one containing $LaTeX: t$.

# Cutting plane

A hyperplane whose halfspace cuts off a particular point, such as a non-integer extreme point solution of a linear program relaxation of an integer program. The cut is such that it does not eliminate any solution to the original problem. For example, suppose we want to solve

$LaTeX: \max \{x: (x,y) \in \{0,1\}^2, \; 2x + 2y \le 1 \}$.

The only feasible point is $LaTeX: (0, 0)$. The LP relaxation is

$LaTeX: \max \{x: (x,y) \in [0,1]^2, \; 2x + 2y \le 1\}$,

and an optimal solution is at $LaTeX: (1/2, 0)$. One cutting plane is $LaTeX: \{(x,y): x + y \le 1/3\}$. A deeper one is $LaTeX: \{(x,y): x + y \le 0\}$. (When adding a cutting plane to the LP, a new relaxation is obtained, and the solution is closer to the IP solution.)

# Cutting stock problem

Determine a way to cut standard sheets of material into various shapes (like clothes parts) to minimize waste. This is a (linear) integer programming model: patterns are specified, and $LaTeX: A_{i, j, k}$ is the amount of $LaTeX: i$-th stock (e.g., sheet or roll of material) used to generate the $LaTeX: j$-th output by the $LaTeX: k$-th pattern. Then, let $LaTeX: x_k$ be level of $LaTeX: k$-th pattern used and $LaTeX: y = Ax$. Thus, $LaTeX: \textstyle s_i = \sum_j y_{i,j}$ is the amount of the $LaTeX: i$-th stock used, which is limited by its availability: $LaTeX: s \le S$; and $LaTeX: \textstyle v_j = \sum_i y_{i, j}$ is the amount of $LaTeX: j$-th output generated, which is required to be in some range, say $LaTeX: L \le v \le U$ (allowing some demand overruns or underruns). Some models seek to minimize the total waste: $LaTeX: \textstyle \sum_i S_i - s_i$. Other models consider cost too. The most common problems are 2-dimensional (odd shapes from sheets of material); the 1-dimensional case is called the trim problem. In the latter case, the stock index $LaTeX: i$ is not needed. For example, consider a stock of rolls of paper with a given width, which must be slit into rolls of various widths. Then, $LaTeX: A_{j, k}$ is how much of a stock roll is used by the $LaTeX: k$-th pattern to generate a roll of the $LaTeX: j$-th width. Moreover, $LaTeX: \textstyle y_j = \sum_k A_{j, k} x_k$ is the amount of a stock roll used by pattern $LaTeX: j$ to generate a roll of width $LaTeX: j$.

# Cyclic descent

An algorithm that seeks to optimize a multivariate function by optimizing each coordinate with a univariate optimization technique (keeping the other coordinates fixed). This is repeated until a fixed point is reached. In general, it is possible for such a fixed point not to be an optimum (even locally) because a simultaneous change in variables could result in an improvement. An example is given by:

$LaTeX: f(x, y) = (y - x^2)(y - 2x^2) \; \mbox{ at } \; (x, y) = (0,0)$.

$LaTeX: f(0, y)$ has a minimum at $LaTeX: y=0$, and $LaTeX: f(x, 0)$ has a minimum at $LaTeX: x=0$. However, $LaTeX: f$ decreases with simultaneous change, $LaTeX: y=(3/2)x^2$. Along this parabola, $LaTeX: f(x,y) = -(x^4)/4$, which is negative for $LaTeX: x$ nonzero. Thus, $LaTeX: 0$ is not a local minimum of $LaTeX: f$. For further discussion about this example, see Myths and Counter Examples.

# Cycling

In linear programming this means revisiting a basis, mainly referring to a simplex method, so that the algorithm would cycle through the same sequence of bases, not converging to an optimal one. See the cycling supplement