All Pages

From Glossary

Jump to: navigation, search


A solver using a simplified problem specification for Linear, Integer, Nonlinear and Dynamic optimization. See [1]


A modeling language companion to LINDO.


A Linear Programming Language

LU decomposition

A factorization of a matrix LaTeX: A into a lower triangular matrix LaTeX: (L) and an upper triangular matrix LaTeX: (U) so that LaTeX: A=LU. The advantage is to solve the system LaTeX: Ax=b, we first apply forward substitution to solve LaTeX: Ly=b, then backward substitution to solve LaTeX: Ux=y.

Label correcting algorithm

Arises in labeling algorithms for shortest path problem. Each iteration a label is set to an estimate of the shortest path from a given node. All labels become exact values at termination.

Label setting algorithm

Arises in labeling algorithms for shortest path problem. Each iteration a label becomes the actual shortest path from some node. (Termination occurs when the destination node(s) are permanently lablelled.)

Labeling algorithm

A class of algorithms for path-related problems on graphs and networks. To illustrate, consider a network LaTeX: [V,A] with arc values LaTeX: c.

  • The Ford-Fulkerson max flow labeling algorithm (LaTeX: c is capacity).
    • Input is source LaTeX: (s), destination LaTeX: (t), and capacities LaTeX: (c). Output is the max flow from LaTeX: s to LaTeX: t.
    • Initialize all flows to be zero and assign the label LaTeX: (-,\infty) to the source. All other nodes are unlabeled, and all nodes (including the source) are unscanned.
    • Labeling process: Select any labeled, unscanned node, say LaTeX: k with label LaTeX: (a,b). If node LaTeX: j is an unlabeled neighbor and LaTeX: x_{kj} < c_{kj}, assign the label LaTeX: (k+,b^*) to node LaTeX: j, where LaTeX: b^* = \min \{b, c_{kj}-x_{kj}\}. If node LaTeX: j is an unlabeled neighbor and LaTeX: x_{jk} > 0, assign the label LaTeX: (k-,b^*) to node LaTeX: j, where LaTeX: b* = \min\{b, x_{jk}\}. (In either case, define node LaTeX: j as labeled, but unscanned, while node LaTeX: k is now labeled and scanned.) Repeat this until either the sink is labeled and unscanned, or until no more labels can be assigned. In the latter case terminate; in the former case, go to the flow change step.
    • Flow change: Let the sink LaTeX: (t) have label LaTeX: (a,b). If LaTeX: a is LaTeX: k+, replace LaTeX: x_{kt} by LaTeX: x_{kt}+b; if it is LaTeX: (k-,b), replace LaTeX: x_{kt} by LaTeX: x_[kt}-b. In either case, apply the flow change (LaTeX: b or LaTeX: -b) along the path back to the source. The result is a new flow that is LaTeX: b more than it was. Erase labels, except on the source, and go to the labeling process.
  • Dijkstra's shortest path algorithm (LaTeX: c is distance).
    • Input is source LaTeX: (s), destinations LaTeX: (T \neq \emptyset), and distances LaTeX: (c). Output is a shortest path from LaTeX: s to each node in LaTeX: T.
    • Initialize labels: LaTeX: L_v = \infty for LaTeX: v \neq s and LaTeX: L_s=0. Set LaTeX: S=\emptyset.
    • Label: select LaTeX: u \in \arg\min {L_v: v \in V\backslash S\} (terminate if LaTeX: S=V). Then, update LaTeX: S := S \cup \{u\}, and for LaTeX: v \in V\backslash S: if LaTeX: L_u + c_{uv} < L_v, set LaTeX: L_v := L_u + c_{uv}.
    • Terminate when LaTeX: T is a subset of LaTeX: S (i.e., all destination nodes are labeled with their shortest distance from node LaTeX: s).
  • A generic shortest path labeling algorithm from all nodes to destination node LaTeX: t (LaTeX: c is distance).
    • Input is destination LaTeX: (t) and distances LaTeX: (c). Output is a shortest path from each node to LaTeX: t.
    • Initialize labels, LaTeX: L_v is the estimate of a shortest path length from node LaTeX: v to node LaTeX: t, with LaTeX: L_t = 0.
    • Label: Find an arc, say LaTeX: (i, j), that violates the distance equations, say LaTeX: L_j > L_i + c_{ij}. If none, terminate; otherwise, update the label: LaTeX: L_j = L_i + c_{ij}.
    • The labeling step is repeated until there are no violations. At termination, LaTeX: L_j is the length of a shortest path from node LaTeX: j to node LaTeX: t.

Another type of labeling algorithm occurs in simplicial subdivision, used to compute a fixed point that represents an economic equilibrium.

Lagrange conditions

The optimality equation (plus feasibility conditions) in Lagrange's multiplier theorem.

Lagrange multiplier

See Lagrange multiplier rule.

Lagrange multiplier rule

From the extension of Lagrange's multiplier theorem. Suppose

LaTeX: x^* \in \arg\max \{f(x): g(x) \le 0, \, h(x) = 0\},

where LaTeX: f, LaTeX: g, and LaTeX: h are smooth. Then, there exist multipliers LaTeX: (u, v) for which the following conditions hold:

  • LaTeX: \nabla_x [f(x^*) - u^T g(x^*) - v^T h(x^*)] = 0;
  • LaTeX: u >= 0;
  • LaTeX: u^T g(x*) = 0.

Since LaTeX: g(x^*) \le 0, the last condition, given LaTeX: u \ge 0, is equivalent to complementary slackness. These are considered first-order optimality conditions, though the Lagrange Multiplier Rule is not always valid -- see constraint qualifications.

For extensions see the Generalized Lagrange multiplier method.

Lagrange multiplier theorem

Let LaTeX: f and LaTeX: h be smooth functions and suppose LaTeX: \nabla h(x^*) has full row rank. Then, LaTeX: x^* \in \arg\max \{f(x): h(x) = 0\} only if there exists LaTeX: v \in \mathbb{R}^m such that:

LaTeX: \nabla f(x*) - v^T \nabla h(x*) = 0.

The LaTeX: i-th component of LaTeX: v, LaTeX: v_i, is called the Lagrange multiplier associated with the LaTeX: i-th constraint, LaTeX: h_i(x)=0. Extensions to remove the rank condition and/or allow inequality constraints were by Carathéodory, John, Karush, and Kuhn & Tucker. Also see the Lagrange Multiplier Rule.


For the mathematical program

LaTeX: \min \{ f(x) : g(x) \ge 0, \, h(x) = 0, \, x \in X \},
the Lagrangian is the function:

LaTeX: L(x, u, v) = f(x) - u^T g(x) - v^T h(x).

for LaTeX: x \in X and LaTeX: u \ge 0. Note that the Lagrange Multiplier Rule can be written as the first-order conditions for LaTeX: (x^*, u^*,v^*) to be a saddle point of LaTeX: L. In Lagrange's multiplier theorem (where LaTeX: X=\mathbb{R}^n and LaTeX: g is vacuous), this is simply that LaTeX: \nabla L(x^*,v^*)=0, which could be any type of stationary point.

Lagrangian duality

See it in the list of particular duals.

Lagrangian relaxation

The removal of constraints and putting them into the objective as in the Lagrangian function. Specifically, suppose we have

LaTeX: \max \{ f(x): x \in X, \, g(x) \le 0, \, h(x) = 0\}.

The Lagrangian relaxation is

LaTeX: \max \{f(x) - u^T g(x) - v^T h(x): x \in X\},

where LaTeX: u \ge 0 (LaTeX: v is unrestricted).

This is a relaxation of the original mathematical program because the constraints have been removed (expanding the feasible region), and for LaTeX: x feasible (i.e., satisfying these constraints), the relaxed objective satisfies:

LaTeX: f(x) - u^T g(x) - v^T h(x) \ge f(x),

because LaTeX: v^T h(x)=0 and LaTeX: u^Tg(x) \le 0.

The objective is the usual Lagrangian function when LaTeX: X is simply LaTeX: \mathbb{R}^n. It provides a foundation for Lagrangian duality and the Generalized Lagrange Multiplier Method.

Note that LaTeX: X could retain some constraints, such as in the separable form LaTeX: X = X_1 \times \ldots \times X_n. Now suppose LaTeX: f(x) = f_1(x_1) + \ldots + f_n(x_n), LaTeX: g(x) = g_1(x_1) + \ldots + g_n(x_n), and LaTeX: h(x) = h_1(x_1) + \ldots + h_n(x_n). Then, the Lagrangian relaxation decomposes into LaTeX: n independent mathematical programs for any multiplier value:

LaTeX: \max \{f(x) - u^T g(x) - v^T h(x): x \in X\} 
= \sum_{j=1}^n \max {f_j(x_j) - u_j g_j(x_j) - v_j h_j(x_j): 
x_j \in X_j \}.

Lagrangian saddlepoint equivalence

The tuple LaTeX: (x^*, u^*, v^*) in LaTeX: X \times \mathbb{R}^m \times \mathbb{R}^M, with LaTeX: u^* \ge 0, is a saddlepoint of of the Lagrangian LaTeX: L if, and only if, the strong duality properties hold:

  1. LaTeX: x^* \in \arg\max \{L(x, u^* v^*): x \in X\}
  2. LaTeX: g(x^*) \le 0 and LaTeX: h(x^*) = 0 (LaTeX: x^* is feasible)
  3. LaTeX: (u^*)^T g(x^*) = 0 (LaTeX: x^* and LaTeX: u^* satisfy complementary slackness)

See the supplement on Lagrangian Saddle Point Equivalence for further information.


A set LaTeX: S that contains LaTeX: x \vee y and LaTeX: x \wedge y for all LaTeX: x and LaTeX: y in LaTeX: S.

Lattice program

Minimizing a subadditive function on a lattice.

Lattice search

This finds the maximum of a unimodal function on a finite set, LaTeX: \{x_1, x_2, \ldots, x_m\}, by evaluating the function at points placed by the modified fibonacci sequence: LaTeX: K_{n+2} = K_{n+1} + K_n + 1. This modification comes from the improvement obtained by dropping the inferior end point after an evaluation: if the set is in LaTeX: [a, b] and LaTeX: f(x) < f(y), the new interval is LaTeX: [x+1, b] dropping the point LaTeX: x from the remaining search set.

The method proceeds the same as fibonacci search, except the placements follow the modified fibonacci sequence, which is equivalent to LaTeX: K_n = F_{n+1} - 1, reflecting the improvement that accumulates as LaTeX: n increases, for the extra point dropped after each evaluation. The placement ratio, LaTeX: K_{n-1} / K_n, also approaches the golden mean, so lattice search also approaches the golden section search.

Some refer to fibonacci search when they mean lattice search. The following table shows the difference. For example, with 20 evaluations, fibonacci search can guarantee finding the maximum on a set of 10,946 points, while lattice search can find it on a set of 17,710 points. Golden section is included for comparison as well. With 20 evaluations it can find the maximum on a set of only 9,349 points.

           n        5     10     15     20
           F_n      8     89    987  10946
           K_n     12    143   1596  17710
        golden      6     76    843   9349 

Lattice search minimizes the maximum number of evaluations needed to find the maximum on a given set of points. If the number in that set is LaTeX: K_N, the number of evaluations is LaTeX: N. If the number in the set is not exactly some modified fibonacci number, one can fill in dummy points at the end of the set. For example, if LaTeX: K_{N-1} < m < K_N, let the set be LaTeX: {x_1, \ldots, x_m, y_1, \ldots, y_k\}, where LaTeX: k = K_N- m. Then, the (minimum) number of evaluations to guarantee finding the maximum of any unimodal function is LaTeX: N.

Level set

For the function LaTeX: f:X \rightarrow R, the LaTeX: c-level set is LaTeX: \{x: x \in X, \, f(x) \le c\}. Sometimes, the 0-level set is called simply the level set of f.

Levenberg Marquardt algorithm

This is designed for solving nonlinear programs using the equation:

(\nabla^2 f(x) + p I)d = -\nabla f(x),

where LaTeX: \nabla^2 f(x) is the Hessian. For unconstrained optimization, the solution LaTeX: d serves as a direction vector for the iteration. This is also used in a trust region method. The parameter LaTeX: p is set to give a balance between Newton's method LaTeX: (p=0) and Cauchy's steepest descent LaTeX: (p >> 0). A low value of LaTeX: p helps get through difficult landscape curvature, and a high value yields some descent.

Lexicographic Ordering constraint

is a global constraint that allows the comparison of sequences of variables. Given two sequences LaTeX: x and LaTeX: y of LaTeX: n variables, LaTeX: \langle x_1 \dots x_n \rangle and LaTeX: \langle y_1 \dots y_n \rangle , let LaTeX: x \preceq_{lex} y denote the lexicographic ordering constraint. The constraint holds iff LaTeX: n = 0 or LaTeX: x_1 < y_1 or LaTeX: x_1 = y_1 and LaTeX: \langle x_2 \dots x_n \rangle \preceq_{lex} \langle y_2 \dots y_n \rangle.

Lexicographic order

A nonzero vector is lexicographically positive if its first non-zero coordinate is positive. The vector LaTeX: x is lexicographically greater than the vector LaTeX: y if LaTeX: x-y is lexicographically positive, and this defines the lexicographic order in LaTeX: \mathbb{R}^n. This is a total ordering in that every two vectors are either equal, or one is lexicographically greater than the other.

This was first used in mathematical programming to resolve cycling in the simplex method. It also provides a way to obtain solutions for multiple objectives with the property that LaTeX: x is a Pareto maximum if LaTeX: f(x) is lexicographically greater than or equal to LaTeX: f(y) for all feasible LaTeX: y.


In integer programming, it is creating a valid inequality for the original set, given one for a lower dimensional set, by augmenting a binary variable that is absent from the original inequality. That is, suppose LaTeX: S is a subset of LaTeX: \{0,1\}^n and LaTeX: a^T x \le b is a valid inequality for LaTeX: S\cap \{x \in \{0,1\}^n: x_1=0\} (where it does not matter what LaTeX: a_1 is since LaTeX: x_1=0). Then, under certain conditions, the inequality is valid for LaTeX: S. The "conditions" are problem dependent, and LaTeX: a_1 is set as large as possible.

For example, suppose LaTeX: x_2 + x_3 + x_4 \le 2 is a valid inequality for LaTeX: S. A lifting is the strengthening, LaTeX: 2x_1 + x_2 + x_3 + x_4 \le 2, if it can be ascertainted that this is a valid inequality for LaTeX: S. The coeffiecient of LaTeX: x_1 (LaTeX: 2 in the example) is determined by considering how large LaTeX: a_1 can be with LaTeX: x_1=1. (In this case, the lifted inequality is valid if LaTeX: x_2 + x_3 + x_4 \le 2 is valid for LaTeX: \{x \in S: x_1=0\} and if LaTeX: x_1=1 implies LaTeX: x_2 = x_3 = x_4 = 0.) The motivation for doing this arises in a branch and cut algorithm strategy. At a node in the search tree we might have a valid inequality, LaTeX: a^Tx \le b, when the branch had LaTeX: x_1=0, and we want to make it valid even if LaTeX: x_1=1. The inequality is valid for all

a_1 \le b - \max \{a_2 x_2 + \ldots + a_n x_n : x \mbox{ feasible and } x_1 = 1\}.

Line search

Optimizing the objective along a line: LaTeX: \mbox{opt} \{ f(x + \alpha \, d): s \ge 0\}, where LaTeX: x is a given point and LaTeX: d is a given direction vector (LaTeX: \alpha is a scalar). A coarse tolerance could be used, so "Opt" is not entirely accurate. (In fact, there are reasons to believe not optimizing is best.)

Here are some particular search methods in this glossary:

Line segment

The line segment with end points LaTeX: x and LaTeX: y in LaTeX: \mathbb{R}^n is

LaTeX: \{\alpah x + (1-\alpha)y: \alpha \in [0, 1]\},

which is denoted LaTeX: [x, y] and is the closed line segment. The open line segment is

LaTeX: \{\alpah x + (1-\alpha)y: \alpha \in (0, 1)\},

Linear combination

The vector LaTeX: y is a linear combination of vectors LaTeX: x_1, x_2, \ldots, x_n if LaTeX: \textstyle y = \sum_{k=1}^n {a_k x_k}, where LaTeX: a_k is called the coefficient of LaTeX: x_k.

Linear convergence

See convergence.

Linear program

An optimization problem in which the objective and constraints are linear. Forms include LaTeX: \min \{c^Tx: Ax = b,\, x >= 0\} and LaTeX: \max \{c^Tx : Ax \le b\}. The standard form assumes LaTeX: A has full row rank. Computer systems ensure this by having a logical variable LaTeX: (y) augmented, so the form appears, for example, as LaTeX: \min \{c^Tx: Ax + y = b,\, L \le (x, y) \le U\} (also allowing general bounds on the variables). The original variables LaTeX: (x) are called structural. Note that each logical variable can be a slack, surplus, or artificial variable, depending on the form of the original constraint. This computer form also represents a range constraint with simple bounds on the logical variable. Some bounds can be infinite (i.e., absent), and a free variable (logical or structural) is when both of its bounds are infinite.

Linearity interval

When a univariate function is piece-wise linear, it has the form LaTeX: a_i x + b_i for LaTeX: x in the interval LaTeX: [c_i, c_{i+1}], where LaTeX: a_i is not equal to LaTeX: a_{i+1}. (The phrase usually means the function is continuous.) This arises in linear programming when considering the optimal value as a function of varying right-hand sides (or cost coefficients) in fixed proportions: LaTeX: b+td (or LaTeX: c+td), where LaTeX: d is an admissible change vector and LaTeX: t is the (scalar) variable. Then, for the range of LaTeX: t where the LP has a solution, the optimal value function has this piece-wise linear (continuous) form, and the intervals that comprise the domain are called linearity intervals.


The strategy of linearization is to reformulate a nonlinear program as a linear progarm through the introduction of auxiliary variables and constraints that are used to circumvent the nonlinearities. See standard linearization, Glover's linearization, and Reformulation-Linearization-Technique

Lipschitz continuous

The function LaTeX: f: X \rightarrow \mathbb{R} is Lipschitz continuous if there exists a value LaTeX: K, called the Lipschitz constant, such that LaTeX: |f(x) - f(y)| \le K \|x-y\| for all LaTeX: x and LaTeX: y in LaTeX: X. This relation is called the Lipschitz condition. It is stronger than continuity because it limits the slope to be within LaTeX: [-K, K]. The generalized Lipschitz condition is that there exists a monotonically increasing function, LaTeX: M : \mathbb{R} \rightarrow \mathbb{R} with the property that LaTeX: m(z) \rightarrow 0 as LaTeX: z \rightarrow 0 such that there exists LaTeX: K for which LaTeX: |f(x)-f(y)| \le K m(\|x-y\|) for all LaTeX: x and LaTeX: y in LaTeX: X.

Local convergence

See convergence.

Local optimum

A feasible point LaTeX: x^* that is an optimal solution to the mathematical program whose feasible region is the intersection of the original region and some neighborhood of LaTeX: x^*.

Locally convex function

There exists a subset of the domain, such as the neighborhood of a point, such that the function is convex on that subset.

Location problem

See facility location problem.

Lockbox problem

A firm wants a check that it receives to clear as quickly as possible (so it can use the money). To reduce the number of days to clear a check, the firm opens offices, called lockboxes, in different cities to handle the checks. There is a cost to open a lockbox, and there are average annual losses calculated from interest rates and the number of days to clear a check sent from source LaTeX: i to lockbox city LaTeX: j.

Logical variable

See linear program.

Lot size problem

This is one of the oldest mixed-integer programs in operations research, first presented by Wagner and Whitin in 1958. The problem is to minimize cost while satisfying product demands over (discrete) time. Let LaTeX: y_t be the number of units produced in period LaTeX: t, for LaTeX: t=1,\ldots,T (LaTeX: T is called the planning horizon), and let LaTeX: x_t = 1 if a setup occurs in period LaTeX: t; LaTeX: 0 otherwise. Let LaTeX: D_t be the demand in period LaTeX: t, and let the demand from period LaTeX: i to period LaTeX: j, inclusive, be LaTeX: d_{ij} = \sum_{i \le t \le j} D_t. Then, a MILP formulation is:

LaTeX: \min \left\{ c^T x + d^T y: x \in \{0,1\}^n, \, y \ge 0, 
\sum_{t\le i} y_t \ge d_{li} \mbox{ for } i=1, \ldots,n-1, \,
\sum_{t\le n} y_t = d_{1n}, \,
d_{in} x_i - y_i \ge 0 \mbox{ for } i=1, \ldots,n \right\}.

(This is preferred over the MILP that defines inventory balance equations. Even though they are equivalent, the LP relaxation of the above is sharper.)

Lower semi-continuity

Suppose LaTeX: x^k \rightarrow x. Of a function, LaTeX: f is lower semi-continuous if

LaTeX: \lim_{k \rightarrow \infty} \inf_{j \ge k} f(x^j) \ge f(x).

Of a point-to-set map, LaTeX: A, let LaTeX: N_{\varepsilon}[S] be a neighborhood of the set LaTeX: S: for each LaTeX: \varepsilon &gt; 0, there exists LaTeX: K such that for all LaTeX: k \ge K, LaTeX: A(x) is a subset of LaTeX: N_{\varepsilon}[A(x^k)]. A pathology to show that the feasibility map may not be lower semi-continuous is:

LaTeX: A(b) = L_b(g) = \{x \in [-1, 1]: g(x) \le b\},

where LaTeX: g(x) = x if LaTeX: x is in LaTeX: [-1, 0], and LaTeX: g(x) = 0 if LaTeX: x \in [0,1].

Lower triangular matrix

A square matrix, LaTeX: A, such that LaTeX: A_{ij}=0 for LaTeX: i \le j.

Personal tools