# LINDO

A solver using a simplified problem specification for Linear, Integer, Nonlinear and Dynamic optimization. See [1]

# LINGO

A modeling language companion to LINDO.

# LPL

A Linear Programming Language

# LU decomposition

A factorization of a matrix $LaTeX: A$ into a lower triangular matrix $LaTeX: (L)$ and an upper triangular matrix $LaTeX: (U)$ so that $LaTeX: A=LU$. The advantage is that to solve the system $LaTeX: Ax=b$, we first apply forward substitution to solve $LaTeX: Ly=b$, then backward substitution to solve $LaTeX: Ux=y$.
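The two substitution passes can be sketched as follows; a minimal Python illustration, assuming the factors $LaTeX: L$ and $LaTeX: U$ are already computed (the function name and the small example are hypothetical):

```python
def lu_solve(L, U, b):
    """Solve A x = b given A = L U (L lower, U upper triangular)."""
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    # Backward substitution: U x = y
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# A = [[2, 1], [4, 3]] factors as L = [[1, 0], [2, 1]], U = [[2, 1], [0, 1]]
print(lu_solve([[1, 0], [2, 1]], [[2, 1], [0, 1]], [3, 7]))   # [1.0, 1.0]
```

Each pass costs $LaTeX: O(n^2)$, so once the factorization is available, new right-hand sides are cheap to solve.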

# Label correcting algorithm

Arises in labeling algorithms for the shortest path problem. At each iteration, a label is set to an estimate of the shortest path length from a given node. All labels become exact values at termination.

# Label setting algorithm

Arises in labeling algorithms for the shortest path problem. At each iteration, a label becomes the actual shortest path length from some node. (Termination occurs when the destination node(s) are permanently labeled.)

# Labeling algorithm

A class of algorithms for path-related problems on graphs and networks. To illustrate, consider a network $LaTeX: [V,A]$ with arc values $LaTeX: c$.

• The Ford-Fulkerson max flow labeling algorithm ($LaTeX: c$ is capacity).
• Input is source $LaTeX: (s)$, destination $LaTeX: (t)$, and capacities $LaTeX: (c)$. Output is the max flow from $LaTeX: s$ to $LaTeX: t$.
• Initialize all flows to be zero and assign the label $LaTeX: (-,\infty)$ to the source. All other nodes are unlabeled, and all nodes (including the source) are unscanned.
• Labeling process: Select any labeled, unscanned node, say $LaTeX: k$ with label $LaTeX: (a,b)$. If node $LaTeX: j$ is an unlabeled neighbor and $LaTeX: x_{kj} < c_{kj}$, assign the label $LaTeX: (k+,b^*)$ to node $LaTeX: j$, where $LaTeX: b^* = \min \{b, c_{kj}-x_{kj}\}$. If node $LaTeX: j$ is an unlabeled neighbor and $LaTeX: x_{jk} > 0$, assign the label $LaTeX: (k-,b^*)$ to node $LaTeX: j$, where $LaTeX: b^* = \min\{b, x_{jk}\}$. (In either case, define node $LaTeX: j$ as labeled, but unscanned, while node $LaTeX: k$ is now labeled and scanned.) Repeat this until either the sink is labeled and unscanned, or until no more labels can be assigned. In the latter case terminate; in the former case, go to the flow change step.
• Flow change: Let the sink $LaTeX: (t)$ have label $LaTeX: (a,b)$. If $LaTeX: a$ is $LaTeX: k+$, replace $LaTeX: x_{kt}$ by $LaTeX: x_{kt}+b$; if $LaTeX: a$ is $LaTeX: k-$, replace $LaTeX: x_{tk}$ by $LaTeX: x_{tk}-b$. In either case, apply the flow change ($LaTeX: b$ or $LaTeX: -b$) along the path back to the source. The result is a new flow that is $LaTeX: b$ more than it was. Erase labels, except on the source, and go to the labeling process.
• Dijkstra's shortest path algorithm ($LaTeX: c$ is distance).
• Input is source $LaTeX: (s)$, destinations $LaTeX: (T \neq \emptyset)$, and distances $LaTeX: (c)$. Output is a shortest path from $LaTeX: s$ to each node in $LaTeX: T$.
• Initialize labels: $LaTeX: L_v = \infty$ for $LaTeX: v \neq s$ and $LaTeX: L_s=0$. Set $LaTeX: S=\emptyset$.
• Label: select $LaTeX: u \in \arg\min \{L_v: v \in V\backslash S\}$ (terminate if $LaTeX: S=V$). Then, update $LaTeX: S := S \cup \{u\}$, and for $LaTeX: v \in V\backslash S$: if $LaTeX: L_u + c_{uv} < L_v$, set $LaTeX: L_v := L_u + c_{uv}.$
• Terminate when $LaTeX: T$ is a subset of $LaTeX: S$ (i.e., all destination nodes are labeled with their shortest distance from node $LaTeX: s$).
• A generic shortest path labeling algorithm from all nodes to destination node $LaTeX: t$ ($LaTeX: c$ is distance).
• Input is destination $LaTeX: (t)$ and distances $LaTeX: (c)$. Output is a shortest path from each node to $LaTeX: t$.
• Initialize labels: $LaTeX: L_v$ is an estimate of a shortest path length from node $LaTeX: v$ to node $LaTeX: t$, with $LaTeX: L_t = 0.$
• Label: Find an arc, say $LaTeX: (i, j)$, that violates the distance equations, say $LaTeX: L_j > L_i + c_{ij}$. If none, terminate; otherwise, update the label: $LaTeX: L_j = L_i + c_{ij}$.
• The labeling step is repeated until there are no violations. At termination, $LaTeX: L_j$ is the length of a shortest path from node $LaTeX: j$ to node $LaTeX: t$.
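As one concrete instance of a label-setting method, Dijkstra's algorithm above can be sketched as follows (a minimal Python sketch; the four-node network `g` is hypothetical data, and a heap selects the minimum-label node):

```python
import heapq

def dijkstra(graph, s):
    """Label-setting shortest paths from source s.
    graph: dict mapping node -> list of (neighbor, nonnegative distance)."""
    L = {v: float('inf') for v in graph}   # labels (distance estimates)
    L[s] = 0
    S = set()                              # permanently labeled nodes
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in S:
            continue
        S.add(u)                           # L[u] is now exact
        for v, c in graph[u]:
            if L[u] + c < L[v]:            # the update rule from the text
                L[v] = L[u] + c
                heapq.heappush(heap, (L[v], v))
    return L

g = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2), ('t', 6)],
     'b': [('t', 3)], 't': []}
print(dijkstra(g, 's'))   # {'s': 0, 'a': 1, 'b': 3, 't': 6}
```

The generic algorithm differs only in that it repairs any violated distance equation rather than permanently labeling a minimum-label node each pass.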

Another type of labeling algorithm occurs in simplicial subdivision, used to compute a fixed point that represents an economic equilibrium.

# Lagrange conditions

The optimality equation (plus feasibility conditions) in Lagrange's multiplier theorem.

# Lagrange multiplier rule

From the extension of Lagrange's multiplier theorem. Suppose

$LaTeX: x^* \in \arg\max \{f(x): g(x) \le 0, \, h(x) = 0\},$

where $LaTeX: f$, $LaTeX: g$, and $LaTeX: h$ are smooth. Then, there exist multipliers $LaTeX: (u, v)$ for which the following conditions hold:

• $LaTeX: \nabla_x [f(x^*) - u^T g(x^*) - v^T h(x^*)] = 0$;
• $LaTeX: u \ge 0$;
• $LaTeX: u^T g(x^*) = 0$.

Since $LaTeX: g(x^*) \le 0,$ the last condition, given $LaTeX: u \ge 0$, is equivalent to complementary slackness. These are considered first-order optimality conditions, though the Lagrange Multiplier Rule is not always valid -- see constraint qualifications.

For extensions see the Generalized Lagrange multiplier method.

# Lagrange multiplier theorem

Let $LaTeX: f$ and $LaTeX: h$ be smooth functions and suppose $LaTeX: \nabla h(x^*)$ has full row rank. Then, $LaTeX: x^* \in \arg\max \{f(x): h(x) = 0\}$ only if there exists $LaTeX: v \in \mathbb{R}^m$ such that:

$LaTeX: \nabla f(x^*) - v^T \nabla h(x^*) = 0.$

The $LaTeX: i$-th component of $LaTeX: v$, $LaTeX: v_i,$ is called the Lagrange multiplier associated with the $LaTeX: i$-th constraint, $LaTeX: h_i(x)=0$. Extensions to remove the rank condition and/or allow inequality constraints were by Carathéodory, John, Karush, and Kuhn & Tucker. Also see the Lagrange Multiplier Rule.
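As a numerical illustration of the theorem (an example not from the source), consider maximizing $LaTeX: f(x) = -(x_1^2+x_2^2)$ subject to $LaTeX: h(x) = x_1+x_2-1 = 0$. The maximizer is $LaTeX: x^* = (1/2, 1/2)$ with multiplier $LaTeX: v = -1$, and the stationarity condition can be checked directly:

```python
def grad_f(x):   # gradient of f(x) = -(x1^2 + x2^2)
    return [-2 * x[0], -2 * x[1]]

def grad_h(x):   # gradient of h(x) = x1 + x2 - 1
    return [1.0, 1.0]

x_star = [0.5, 0.5]          # maximizer of f on {x : h(x) = 0}
v = -1.0                     # the Lagrange multiplier
residual = [gf - v * gh for gf, gh in zip(grad_f(x_star), grad_h(x_star))]
print(residual)   # [0.0, 0.0] -- grad f(x*) - v grad h(x*) = 0 holds
```

Here $LaTeX: \nabla h(x^*) = (1, 1)$ has full row rank, so the theorem's hypothesis is satisfied.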

# Lagrangian

For the mathematical program

$LaTeX: \min \{ f(x) : g(x) \ge 0, \, h(x) = 0, \, x \in X \},$
the Lagrangian is the function:

$LaTeX: L(x, u, v) = f(x) - u^T g(x) - v^T h(x).$

for $LaTeX: x \in X$ and $LaTeX: u \ge 0$. Note that the Lagrange Multiplier Rule can be written as the first-order conditions for $LaTeX: (x^*, u^*,v^*)$ to be a saddle point of $LaTeX: L$. In Lagrange's multiplier theorem (where $LaTeX: X=\mathbb{R}^n$ and $LaTeX: g$ is vacuous), this is simply that $LaTeX: \nabla L(x^*,v^*)=0$, which could be any type of stationary point.

# Lagrangian duality

See it in the list of particular duals.

# Lagrangian relaxation

Removing constraints and placing them in the objective, as in the Lagrangian function. Specifically, suppose we have

$LaTeX: \max \{ f(x): x \in X, \, g(x) \le 0, \, h(x) = 0\}.$

The Lagrangian relaxation is

$LaTeX: \max \{f(x) - u^T g(x) - v^T h(x): x \in X\},$

where $LaTeX: u \ge 0$ ($LaTeX: v$ is unrestricted).

This is a relaxation of the original mathematical program because the constraints have been removed (expanding the feasible region), and for $LaTeX: x$ feasible (i.e., satisfying these constraints), the relaxed objective satisfies:

$LaTeX: f(x) - u^T g(x) - v^T h(x) \ge f(x),$

because $LaTeX: v^T h(x)=0$ and $LaTeX: u^Tg(x) \le 0.$

The objective is the usual Lagrangian function when $LaTeX: X$ is simply $LaTeX: \mathbb{R}^n$. It provides a foundation for Lagrangian duality and the Generalized Lagrange Multiplier Method.

Note that $LaTeX: X$ could retain some constraints, such as in the separable form $LaTeX: X = X_1 \times \ldots \times X_n$. Now suppose $LaTeX: f(x) = f_1(x_1) + \ldots + f_n(x_n),$ $LaTeX: g(x) = g_1(x_1) + \ldots + g_n(x_n),$ and $LaTeX: h(x) = h_1(x_1) + \ldots + h_n(x_n).$ Then, the Lagrangian relaxation decomposes into $LaTeX: n$ independent mathematical programs for any multiplier value:

$LaTeX: \max \{f(x) - u^T g(x) - v^T h(x): x \in X\} = \sum_{j=1}^n \max \{f_j(x_j) - u_j g_j(x_j) - v_j h_j(x_j): x_j \in X_j \}.$
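A minimal sketch of a Lagrangian relaxation bound for a binary knapsack problem, $LaTeX: \max\{c^Tx : a^Tx \le b, \, x \in \{0,1\}^n\}$, dualizing the single constraint $LaTeX: g(x) = a^Tx - b \le 0$ with multiplier $LaTeX: u \ge 0$ (the instance data are hypothetical):

```python
def lagrangian_bound(c, a, b, u):
    """Upper bound on max{c.x : a.x <= b, x in {0,1}^n} from dualizing
    the knapsack constraint with multiplier u >= 0.  The relaxed problem
    decomposes: each x_j is set to 1 iff its reduced profit c_j - u a_j
    is positive."""
    assert u >= 0
    return u * b + sum(max(c_j - u * a_j, 0.0) for c_j, a_j in zip(c, a))

c, a, b = [10, 7, 4], [5, 4, 3], 8          # tiny instance; optimum is 14
print(lagrangian_bound(c, a, b, 0.0))       # 21.0 (constraint ignored)
print(lagrangian_bound(c, a, b, 2.0))       # 16.0 (a tighter bound)
```

Every choice of $LaTeX: u \ge 0$ gives a valid upper bound; minimizing the bound over $LaTeX: u$ is the Lagrangian dual.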

The tuple $LaTeX: (x^*, u^*, v^*)$ in $LaTeX: X \times \mathbb{R}^m \times \mathbb{R}^M$, with $LaTeX: u^* \ge 0$, is a saddlepoint of the Lagrangian $LaTeX: L$ if, and only if, the strong duality properties hold:

1. $LaTeX: x^* \in \arg\max \{L(x, u^*, v^*): x \in X\}$
2. $LaTeX: g(x^*) \le 0$ and $LaTeX: h(x^*) = 0$ ($LaTeX: x^*$ is feasible)
3. $LaTeX: (u^*)^T g(x^*) = 0$ ($LaTeX: x^*$ and $LaTeX: u^*$ satisfy complementary slackness)

See the supplement on Lagrangian Saddle Point Equivalence for further information.

# Lattice

A set $LaTeX: S$ that contains $LaTeX: x \vee y$ and $LaTeX: x \wedge y$ for all $LaTeX: x$ and $LaTeX: y$ in $LaTeX: S$.

# Lattice program

Minimizing a subadditive function on a lattice.

# Lattice search

This finds the maximum of a unimodal function on a finite set, $LaTeX: \{x_1, x_2, \ldots, x_m\}$, by evaluating the function at points placed by the modified Fibonacci sequence: $LaTeX: K_{n+2} = K_{n+1} + K_n + 1$. This modification comes from the improvement obtained by dropping the inferior end point after an evaluation: if the set is in $LaTeX: [a, b]$ and $LaTeX: f(x) < f(y)$, the new interval is $LaTeX: [x+1, b]$, dropping the point $LaTeX: x$ from the remaining search set.

The method proceeds the same as Fibonacci search, except the placements follow the modified Fibonacci sequence, which is equivalent to $LaTeX: K_n = F_{n+1} - 1$, reflecting the improvement that accumulates as $LaTeX: n$ increases, for the extra point dropped after each evaluation. The placement ratio, $LaTeX: K_{n-1} / K_n$, also approaches the golden mean, so lattice search also approaches the golden section search.

Some refer to Fibonacci search when they mean lattice search. The following table shows the difference. For example, with 20 evaluations, Fibonacci search can guarantee finding the maximum on a set of 10,946 points, while lattice search can find it on a set of 17,710 points. Golden section search is included for comparison as well. With 20 evaluations it can find the maximum on a set of only 9,349 points.

| $LaTeX: n$ | 5 | 10 | 15 | 20 |
|---|---|---|---|---|
| $LaTeX: F_n$ (Fibonacci search) | 8 | 89 | 987 | 10946 |
| $LaTeX: K_n$ (lattice search) | 12 | 143 | 1596 | 17710 |
| golden section search | 6 | 76 | 843 | 9349 |


Lattice search minimizes the maximum number of evaluations needed to find the maximum on a given set of points. If the number in that set is $LaTeX: K_N$, the number of evaluations is $LaTeX: N$. If the number in the set is not exactly some modified Fibonacci number, one can fill in dummy points at the end of the set. For example, if $LaTeX: K_{N-1} < m < K_N,$ let the set be $LaTeX: \{x_1, \ldots, x_m, y_1, \ldots, y_k\}$, where $LaTeX: k = K_N- m$. Then, the (minimum) number of evaluations to guarantee finding the maximum of any unimodal function is $LaTeX: N$.
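The two sequences and the identity $LaTeX: K_n = F_{n+1} - 1$ can be checked directly, using the table's convention $LaTeX: F_1 = 1, F_2 = 2$:

```python
def fib_and_lattice(n_max):
    """Fibonacci numbers with the table's convention F_1 = 1, F_2 = 2,
    and the modified sequence K_{n+2} = K_{n+1} + K_n + 1, K_1 = 1, K_2 = 2."""
    F = {1: 1, 2: 2}
    K = {1: 1, 2: 2}
    for n in range(3, n_max + 1):
        F[n] = F[n - 1] + F[n - 2]
        K[n] = K[n - 1] + K[n - 2] + 1
    return F, K

F, K = fib_and_lattice(21)
print(F[20], K[20])   # 10946 17710 -- the table's n = 20 column
assert all(K[n] == F[n + 1] - 1 for n in range(1, 21))   # K_n = F_{n+1} - 1
```

The identity follows by induction: $LaTeX: K_{n-1} + K_{n-2} + 1 = (F_n - 1) + (F_{n-1} - 1) + 1 = F_{n+1} - 1$.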

# Level set

For the function $LaTeX: f:X \rightarrow \mathbb{R},$ the $LaTeX: c$-level set is $LaTeX: \{x: x \in X, \, f(x) \le c\}$. Sometimes, the 0-level set is called simply the level set of $LaTeX: f$.

# Levenberg Marquardt algorithm

This is designed for solving nonlinear programs using the equation:

$LaTeX: (\nabla^2 f(x) + p I)d = -\nabla f(x),$

where $LaTeX: \nabla^2 f(x)$ is the Hessian. For unconstrained optimization, the solution $LaTeX: d$ serves as a direction vector for the iteration. This is also used in a trust region method. The parameter $LaTeX: p$ is set to give a balance between Newton's method $LaTeX: (p=0)$ and Cauchy's steepest descent $LaTeX: (p \gg 0)$. A high value of $LaTeX: p$ helps get through difficult landscape curvature, since it keeps $LaTeX: \nabla^2 f(x) + p I$ positive definite and guarantees a descent direction, while a low value yields fast, Newton-like steps near a solution.
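A sketch of one Levenberg-Marquardt step for a $LaTeX: 2 \times 2$ Hessian, solving $LaTeX: (\nabla^2 f(x) + pI)d = -\nabla f(x)$ by Cramer's rule (the quadratic example is hypothetical):

```python
def lm_step(grad, hess, p):
    """Solve (H + p I) d = -g for a 2x2 Hessian H by Cramer's rule.
    p = 0 gives the Newton step; large p approaches steepest descent."""
    a, b = hess[0][0] + p, hess[0][1]
    c, d = hess[1][0], hess[1][1] + p
    det = a * d - b * c
    g1, g2 = grad
    return [(-g1 * d + g2 * b) / det, (g1 * c - g2 * a) / det]

g, H = [4.0, 8.0], [[2.0, 0.0], [0.0, 8.0]]   # f = x1^2 + 4 x2^2 at (2, 1)
print(lm_step(g, H, 0.0))     # [-2.0, -1.0]: the Newton step, reaching the minimum
print(lm_step(g, H, 100.0))   # nearly -g/100: a short steepest-descent-like step
```

In practice $LaTeX: p$ is adapted between iterations: decreased when a step reduces $LaTeX: f$, increased when it does not.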

# Lexicographic Ordering constraint

is a global constraint that allows the comparison of sequences of variables. Given two sequences $LaTeX: x$ and $LaTeX: y$ of $LaTeX: n$ variables, $LaTeX: \langle x_1 \dots x_n \rangle$ and $LaTeX: \langle y_1 \dots y_n \rangle$, let $LaTeX: x \preceq_{lex} y$ denote the lexicographic ordering constraint. The constraint holds iff $LaTeX: n = 0$, or $LaTeX: x_1 < y_1$, or $LaTeX: x_1 = y_1$ and $LaTeX: \langle x_2 \dots x_n \rangle \preceq_{lex} \langle y_2 \dots y_n \rangle$.

# Lexicographic order

A nonzero vector is lexicographically positive if its first non-zero coordinate is positive. The vector $LaTeX: x$ is lexicographically greater than the vector $LaTeX: y$ if $LaTeX: x-y$ is lexicographically positive, and this defines the lexicographic order in $LaTeX: \mathbb{R}^n.$ This is a total ordering in that every two vectors are either equal, or one is lexicographically greater than the other.

This was first used in mathematical programming to resolve cycling in the simplex method. It also provides a way to obtain solutions for multiple objectives with the property that $LaTeX: x$ is a Pareto maximum if $LaTeX: f(x)$ is lexicographically greater than or equal to $LaTeX: f(y)$ for all feasible $LaTeX: y$.
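The definition translates directly into code; a minimal sketch:

```python
def lex_positive(z):
    """A nonzero vector is lexicographically positive if its first
    nonzero coordinate is positive."""
    for zi in z:
        if zi != 0:
            return zi > 0
    return False   # the zero vector is not lexicographically positive

def lex_greater(x, y):
    """x is lexicographically greater than y iff x - y is lex-positive."""
    return lex_positive([xi - yi for xi, yi in zip(x, y)])

print(lex_greater([0, 3, -5], [0, 2, 99]))   # True: first difference is 3 - 2 > 0
print(lex_greater([1, 0], [1, 0]))           # False: equal vectors
```

Note that later coordinates are irrelevant once a difference appears, which is what makes the order total.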

# Lifting

In integer programming, it is creating a valid inequality for the original set, given one for a lower dimensional set, by augmenting a binary variable that is absent from the original inequality. That is, suppose $LaTeX: S$ is a subset of $LaTeX: \{0,1\}^n$ and $LaTeX: a^T x \le b$ is a valid inequality for $LaTeX: S\cap \{x \in \{0,1\}^n: x_1=0\}$ (where it does not matter what $LaTeX: a_1$ is since $LaTeX: x_1=0$). Then, under certain conditions, the inequality is valid for $LaTeX: S$. The "conditions" are problem dependent, and $LaTeX: a_1$ is set as large as possible.

For example, suppose $LaTeX: x_2 + x_3 + x_4 \le 2$ is a valid inequality for $LaTeX: S$. A lifting is the strengthening, $LaTeX: 2x_1 + x_2 + x_3 + x_4 \le 2$, if it can be ascertained that this is a valid inequality for $LaTeX: S$. The coefficient of $LaTeX: x_1$ ($LaTeX: 2$ in the example) is determined by considering how large $LaTeX: a_1$ can be with $LaTeX: x_1=1$. (In this case, the lifted inequality is valid if $LaTeX: x_2 + x_3 + x_4 \le 2$ is valid for $LaTeX: \{x \in S: x_1=0\}$ and if $LaTeX: x_1=1$ implies $LaTeX: x_2 = x_3 = x_4 = 0.$) The motivation for doing this arises in a branch and cut algorithm strategy. At a node in the search tree we might have a valid inequality, $LaTeX: a^Tx \le b$, when the branch had $LaTeX: x_1=0$, and we want to make it valid even if $LaTeX: x_1=1$. The inequality is valid for all

$LaTeX: a_1 \le b - \max \{a_2 x_2 + \ldots + a_n x_n : x \mbox{ feasible and } x_1 = 1\}.$
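A brute-force sketch of computing the lifting coefficient by this formula, enumerating a hypothetical feasible set $LaTeX: S$ built to match the example above:

```python
from itertools import product

def lift_coefficient(S, a, b):
    """Largest a_1 keeping a_1 x_1 + a_2 x_2 + ... + a_n x_n <= b valid on S,
    computed by enumerating the points of S with x_1 = 1.
    S is a list of binary tuples; a[0] is ignored on input."""
    vals = [sum(ai * xi for ai, xi in zip(a[1:], x[1:]))
            for x in S if x[0] == 1]
    if not vals:
        return float('inf')   # x_1 = 1 never occurs, so any a_1 is valid
    return b - max(vals)

# Mimic the glossary's example: x_1 = 1 forces x_2 = x_3 = x_4 = 0, and
# x_2 + x_3 + x_4 <= 2 is valid when x_1 = 0.
S = [x for x in product([0, 1], repeat=4)
     if (x[0] == 0 and sum(x[1:]) <= 2) or (x[0] == 1 and sum(x[1:]) == 0)]
print(lift_coefficient(S, [0, 1, 1, 1], 2))   # 2, giving 2 x_1 + x_2 + x_3 + x_4 <= 2
```

Enumeration is only for illustration; in practice the inner maximization is itself an integer program solved (or bounded) heuristically.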

# Line search

Optimizing the objective along a line: $LaTeX: \mbox{opt} \{ f(x + \alpha \, d): \alpha \ge 0\}$, where $LaTeX: x$ is a given point and $LaTeX: d$ is a given direction vector ($LaTeX: \alpha$ is a scalar). A coarse tolerance could be used, so "opt" is not entirely accurate. (In fact, there are reasons to believe not optimizing is best.)

Particular line search methods appear as separate entries in this glossary.

# Line segment

The line segment with end points $LaTeX: x$ and $LaTeX: y$ in $LaTeX: \mathbb{R}^n$ is

$LaTeX: \{\alpha x + (1-\alpha)y: \alpha \in [0, 1]\},$

which is denoted $LaTeX: [x, y]$ and is the closed line segment. The open line segment is

$LaTeX: \{\alpha x + (1-\alpha)y: \alpha \in (0, 1)\}.$

# Linear combination

The vector $LaTeX: y$ is a linear combination of vectors $LaTeX: x_1, x_2, \ldots, x_n$ if $LaTeX: \textstyle y = \sum_{k=1}^n {a_k x_k}$, where $LaTeX: a_k$ is called the coefficient of $LaTeX: x_k$.

# Linear convergence

See convergence.

# Linear program

An optimization problem in which the objective and constraints are linear. Forms include $LaTeX: \min \{c^Tx: Ax = b,\, x \ge 0\}$ and $LaTeX: \max \{c^Tx : Ax \le b\}$. The standard form assumes $LaTeX: A$ has full row rank. Computer systems ensure this by having a logical variable $LaTeX: (y)$ augmented, so the form appears, for example, as $LaTeX: \min \{c^Tx: Ax + y = b,\, L \le (x, y) \le U\}$ (also allowing general bounds on the variables). The original variables $LaTeX: (x)$ are called structural. Note that each logical variable can be a slack, surplus, or artificial variable, depending on the form of the original constraint. This computer form also represents a range constraint with simple bounds on the logical variable. Some bounds can be infinite (i.e., absent); a variable (logical or structural) is free when both of its bounds are infinite.
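A minimal sketch of forming the computer form, augmenting one logical (slack) variable per constraint so that $LaTeX: Ax \le b$ becomes $LaTeX: Ax + y = b$ with $LaTeX: y \ge 0$ (the helper name is illustrative):

```python
def augment_logicals(A):
    """Append an identity column per row, so A x <= b becomes
    [A | I](x, y) = b with logical variables y >= 0."""
    m = len(A)
    return [row + [1.0 if i == j else 0.0 for j in range(m)]
            for i, row in enumerate(A)]

print(augment_logicals([[1, 2], [3, 4]]))
# [[1, 2, 1.0, 0.0], [3, 4, 0.0, 1.0]]
```

The augmented matrix $LaTeX: [A \;|\; I]$ always has full row rank, which is why computer systems use this form.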

# Linearity interval

When a univariate function is piece-wise linear, it has the form $LaTeX: a_i x + b_i$ for $LaTeX: x$ in the interval $LaTeX: [c_i, c_{i+1}]$, where $LaTeX: a_i$ is not equal to $LaTeX: a_{i+1}$. (The phrase usually means the function is continuous.) This arises in linear programming when considering the optimal value as a function of varying right-hand sides (or cost coefficients) in fixed proportions: $LaTeX: b+td$ (or $LaTeX: c+td$), where $LaTeX: d$ is an admissible change vector and $LaTeX: t$ is the (scalar) variable. Then, for the range of $LaTeX: t$ where the LP has a solution, the optimal value function has this piece-wise linear (continuous) form, and the intervals that comprise the domain are called linearity intervals.

# Linearization

The strategy of linearization is to reformulate a nonlinear program as a linear program through the introduction of auxiliary variables and constraints that are used to circumvent the nonlinearities. See standard linearization, Glover's linearization, and the Reformulation-Linearization Technique.

# Lipschitz continuous

The function $LaTeX: f: X \rightarrow \mathbb{R}$ is Lipschitz continuous if there exists a value $LaTeX: K$, called the Lipschitz constant, such that $LaTeX: |f(x) - f(y)| \le K \|x-y\|$ for all $LaTeX: x$ and $LaTeX: y$ in $LaTeX: X$. This relation is called the Lipschitz condition. It is stronger than continuity because it limits the slope to be within $LaTeX: [-K, K]$. The generalized Lipschitz condition is that there exists a monotonically increasing function $LaTeX: m : \mathbb{R} \rightarrow \mathbb{R}$, with $LaTeX: m(z) \rightarrow 0$ as $LaTeX: z \rightarrow 0$, such that there exists $LaTeX: K$ for which $LaTeX: |f(x)-f(y)| \le K m(\|x-y\|)$ for all $LaTeX: x$ and $LaTeX: y$ in $LaTeX: X.$

See convergence.
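A rough numerical check of the Lipschitz condition: the largest observed slope over sampled pairs is a lower estimate of the Lipschitz constant (the sample grid below is arbitrary):

```python
def lipschitz_estimate(f, xs):
    """Lower estimate of the Lipschitz constant of f from sample points:
    the largest observed slope |f(x) - f(y)| / |x - y|."""
    return max(abs(f(x) - f(y)) / abs(x - y)
               for x in xs for y in xs if x != y)

xs = [i / 10 for i in range(-20, 21)]   # grid on [-2, 2]
print(lipschitz_estimate(lambda x: abs(x), xs))   # ~1.0: |x| is 1-Lipschitz
print(lipschitz_estimate(lambda x: x * x, xs))    # ~3.9: slope 2|x| nears 4 at the ends
```

Such sampling can only underestimate $LaTeX: K$; certifying an upper bound requires a bound on the derivative (when it exists) over the whole domain.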

# Local optimum

A feasible point $LaTeX: x^*$ that is an optimal solution to the mathematical program whose feasible region is the intersection of the original region and some neighborhood of $LaTeX: x^*$.

# Locally convex function

There exists a subset of the domain, such as the neighborhood of a point, such that the function is convex on that subset.

# Lockbox problem

A firm wants a check that it receives to clear as quickly as possible (so it can use the money). To reduce the number of days to clear a check, the firm opens offices, called lockboxes, in different cities to handle the checks. There is a cost to open a lockbox, and there are average annual losses calculated from interest rates and the number of days to clear a check sent from source $LaTeX: i$ to lockbox city $LaTeX: j$. The problem is to choose which lockboxes to open, and to assign each source to an open lockbox, so as to minimize the total opening cost plus losses.

# Logical variable

See linear program.

# Lot size problem

This is one of the oldest mixed-integer programs in operations research, first presented by Wagner and Whitin in 1958. The problem is to minimize cost while satisfying product demands over (discrete) time. Let $LaTeX: y_t$ be the number of units produced in period $LaTeX: t$, for $LaTeX: t=1,\ldots,n$ ($LaTeX: n$ is called the planning horizon), and let $LaTeX: x_t = 1$ if a setup occurs in period $LaTeX: t$; $LaTeX: 0$ otherwise. Let $LaTeX: D_t$ be the demand in period $LaTeX: t$, and let the demand from period $LaTeX: i$ to period $LaTeX: j,$ inclusive, be $LaTeX: d_{ij} = \sum_{i \le t \le j} D_t.$ Then, a MILP formulation is:

$LaTeX: \min \left\{ c^T x + d^T y: x \in \{0,1\}^n, \, y \ge 0, \sum_{t\le i} y_t \ge d_{1i} \mbox{ for } i=1, \ldots,n-1, \, \sum_{t\le n} y_t = d_{1n}, \, d_{in} x_i - y_i \ge 0 \mbox{ for } i=1, \ldots,n \right\}.$

(This is preferred over the MILP that defines inventory balance equations. Even though they are equivalent, the LP relaxation of the above is sharper.)
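A brute-force check of the underlying problem on a tiny, hypothetical instance (this enumerates setup vectors directly rather than solving the MILP; holding costs can be folded into the per-period unit costs):

```python
from itertools import product

def lot_size_optimum(setup, unit, D):
    """Brute-force optimum of the uncapacitated lot-size problem:
    choose setup periods x, then serve each period's demand from the
    cheapest open setup at or before it.  Tiny instances only."""
    n = len(D)
    best = float('inf')
    for x in product([0, 1], repeat=n):
        cost = sum(s for s, xi in zip(setup, x) if xi)
        feasible = True
        for t in range(n):
            if D[t] == 0:
                continue
            open_costs = [unit[i] for i in range(t + 1) if x[i]]
            if not open_costs:
                feasible = False   # demand before the first setup
                break
            cost += D[t] * min(open_costs)
        if feasible and cost < best:
            best = cost
    return best

# Hypothetical 3-period instance: setup costs, unit costs, demands.
print(lot_size_optimum([10, 10, 10], [1, 1, 1], [5, 5, 5]))   # 25: one setup in period 1
```

With equal unit costs a single setup is optimal here: one setup (10) plus 15 units at cost 1 each.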

# Lower semi-continuity

Suppose $LaTeX: x^k \rightarrow x$. A function $LaTeX: f$ is lower semi-continuous at $LaTeX: x$ if

$LaTeX: \lim_{k \rightarrow \infty} \inf_{j \ge k} f(x^j) \ge f(x).$

For a point-to-set map $LaTeX: A$, let $LaTeX: N_{\varepsilon}[S]$ denote an $LaTeX: \varepsilon$-neighborhood of the set $LaTeX: S$. Then $LaTeX: A$ is lower semi-continuous at $LaTeX: x$ if, for each $LaTeX: \varepsilon > 0$, there exists $LaTeX: K$ such that $LaTeX: A(x)$ is a subset of $LaTeX: N_{\varepsilon}[A(x^k)]$ for all $LaTeX: k \ge K$. A pathology showing that the feasibility map need not be lower semi-continuous is:

$LaTeX: A(b) = L_b(g) = \{x \in [-1, 1]: g(x) \le b\},$

where $LaTeX: g(x) = x$ if $LaTeX: x$ is in $LaTeX: [-1, 0]$, and $LaTeX: g(x) = 0$ if $LaTeX: x \in [0,1].$

# Lower triangular matrix

A square matrix, $LaTeX: A$, such that $LaTeX: A_{ij}=0$ for $LaTeX: i < j$.