All Pages

From Glossary

Jump to: navigation, search

BFGS method

This is a method to solve an unconstrained nonlinear program that proceeds as follows.

  1. Start with any symmetric, negative definite matrix, say LaTeX: B^0, and any initial point LaTeX: x^0. Set LaTeX: k = 0
  2. Direction: Solve LaTeX:  B^k d^k = -\nabla f(x^k).
  3. Step size: LaTeX: s^k \in \arg\max \{ f(x^0 + td^k) : t \ge 0 \}.
  4. Change in position: LaTeX:  p^k = s^k d^k.
  5. New point: LaTeX:  x^{k+1} = x^k + p^k.
  6. Change in gradient: LaTeX: q^k = \nabla f(x^{k+1}) - \nabla f(x^k).
  7. Update LaTeX: B^k to form LaTeX: B^{k+1} with the BFGS update.
  8. Let LaTeX: k = k+1 to complete the iteration.

Compare this with the DFP method, and note a key difference is that DFP estimates the inverse hessian, whereas BFGS estimates the hessian (and solves the linear system). There is a correspondance, and empirical evidence suggests, that the BFGS method has less of a problem with error.

BFGS update

A way to update an approximation of the hessian, used for unconstrained optimization. The update is

B' = B + \frac{(q - Bp)(q - Bp)^T}{p^T(q - Bp)},

where LaTeX: B' is the update from LaTeX: B and LaTeX: p and LaTeX: q are defined in the BFGS method for the LaTeX: k-th iteration. Taking the inverse to compare with the DFP update,

H' = H + \frac{1 + q^T H q}{(q^T p)^2} \left( pp^T \right) - 
\frac{1}{p^T q} \left( pq^T H + H qp^T \right)

where LaTeX: H' is the update from LaTeX: H.


See basic feasible solution.


Originally, this was a set of propositions that must be true in every solution of a satisfiability problem. If there is no solution, one seeks a set of truth values that maximizes the number of clauses that are true. Then, the backbone is the set of propositions that must be true in every optimal solution. This has been extended to various combinatorial programs, for which the backbone is

\{j : x_j = 1 \mbox{ for all } x \in X^* \},

where LaTeX: x is a vector of binary variables in the IP formulation and LaTeX: X^* is usually the optimality region, although it could allow some ball around the objective value. Some results suggest that the difficulty of an NP-hard problem can be further analyzed by the size of itsbackbone.


A backtracking algorithm that, in contrast to classic backtracking algorithms, records partial justifications for inferences that are made during the tree search. These justifications allow the search to soundly jump back several levels in the search tree in case of an inconsistency.


An algorithm applied in tree search that builds up candidate solutions by making one decision at a time (i.e., branching, see branch and bound) and checks for feasibility at each decision. In case an infeasible state is reached,the algorithm "backtracks", returning to a previous decisions and changing it.

In constraint programming a narrower definition is often used: backtracking is a recursive algorithm that traverses the search tree depth-first, such that at each step a variable is assigned a value (all possible values are assigned in turn) and then the partial assignment is checked for consistency, i.e. the algorithm checks whether all constraints whose variable are fully assigned, are satisfied. If there is an inconsistency, the algorithm prunes the sub-tree and tracks back one step.

Backward Transformation

Backward Transformation (abbr. BTRAN). This applies to the factored system, LaTeX: u E_1 E_2 \ldots E_n = c, where each LaTeX: E_i is an elementary matrix. The recursion is:

LaTeX: u_n E_n = c
LaTeX: u_{n-1} E_{n-1} = u_n
LaTeX: u_{1} E_{1} = u_2,
for which LaTeX: u_1 is a solution to the original system.

Backward substitution

The recursion to solve a nonsingular upper triangular system, LaTeX: Ux = b. It starts with LaTeX: x_n = b_n / U_{n,n}, then

x_j = \frac{1}{U_{j, j}} \left( b_j - \sum_{i > j} U_{i,j} x_i \right)  \mbox{ for } j = n-1, \ldots ,1.

Backward triangularization

An algorithm to rearrange a matrix by recursively assigning a singleton column to the rear (with its only row as its pivot). The recursion applies to the submatrix defined by omitting the assigned column and row (and reducing remaining column counts accordingly). This results in a lower-triangular rearrangement if and only if such a rearrangement exists. Otherwise, the algorithm reaches a submatrix will all columns having more than one element, of which one is selected to be a spike that is assigned to the rear of the matrix. The process then continues on the submatrix.

Example 1: Let the nonzeros be denoted by *:

<pre> A = \begin{bmatrix}
               * &   &   &     \\
               * & * & * & *   \\
               * &   &   & *   \\
               * &   & * & *

The column counts are LaTeX: (4,1,2,3). We thus put column LaTeX: 2 at the rear, assigned to pivot on row LaTeX: 2, and consider the remaining LaTeX: 3\times 3 submatrix:

<pre>A^1 = \begin{bmatrix}
               * &   &   & \cdot &   \\
               * &   & * & \cdot &   \\
               * & * & * & \cdot &   \\
               \cdot & \cdot & \cdot & \cdot & \cdot \\
               * & * & * & \cdot & * 

The remaining column counts (for LaTeX: A^1) are LaTeX: (3,1,2), so column LaTeX: 2 is placed at the (new) rear, pivoting on row LaTeX: 3:

<pre>A^2 = \left[\begin{array}{cc|cc}
               * &   &   &     \\
               * & * &   &     \\  \hline
               * & * & * &     \\
               * & * & * & *

The remaining column counts are LaTeX: (2,1), so we reach the final rearrangement, which is triangular:

<pre>A^3 = \left[\begin{array}{cc cc}
               * &   &   &     \\
               * & * &   &     \\
               * & * & * &     \\
               * & * & * & *

Example 2: Column counts are shown below the matrix.

      A = \left[\begin{array}{cccc}
               * &   &   &     \\
               * & * & * & *   \\
               * & * &   & *   \\
               * &   & * & *
           \end{array}\right] \\
  \phantom{A = \left[\right.}~
              4 & 2 & 2 & 3

All column counts are greater than one, so we select a column to be a spike, and we assign it to the rear with a pivot row (some algorithms delay the pivot row assignment). For this example, we choose a column with minimum count and pick the pivot row with maximum count.

      A^1 = \left[\begin{array}{ccc | c}
               * &   &   &     \\
               * & * &   &     \\
               * &   & * & *   \\ \hline
               * & * & * & *
    & A^2 = \left[\begin{array}{cc |cc}
               * &   &   &     \\
               * & * &   & *   \\ \hline
               * &   & * &     \\
               * & * & * & *
           \end{array}\right] = A^3.

The final rearrangement (LaTeX: A^3) has one spike.

Balance equation

A constraint of the form LaTeX: I(x) - O(x) = 0, where LaTeX: I(x) is the input and LaTeX: O(x) is the output, each as a function of the decision variables, LaTeX: x. Typically this is linear, conserving flows, measured in units of volume or mass. In an ordinary network in which there are no gains or losses, this appears as:

I(x) - O(x) = \sum_{i} x_{i,j} - \sum_{k} x_{j,k} = 0.

Barrier function

LaTeX: P is a barrier function on the strict interior, say LaTeX: I, of the set LaTeX: S if LaTeX: P is continuous, and LaTeX: P(x) \rightarrow \infty as LaTeX: x approaches any point in LaTeX: S\backslash I. Barrier functions arises in penalty function methods for certain classes that include convex programs of the form:

\min \{ f(x) : g(x) \ge 0 \},

where LaTeX: I = \{ x : g(x) > 0 \}. Two popular choices are

P(x) = \sum_i 1 / g_i(x),


P(x) = -\sum_i \ln (g_i(x)).


This is the center of mass and is the point at which a mass (body) can be considered as being concentrated without altering the effect of the attraction that the earth has upon it. This is the average position of particles located at postions LaTeX: x^i weighted by their masses LaTeX: m_i,

\frac{\sum_i m_i x^i}{\sum_i m_i}.

In mathematical programming, this is usually used to mean the simple average of the extreme points of a polytope,

\frac{1}{m} \sum_i x^i,

where LaTeX:  \{x^1, \ldots, x^m\} is the set of extreme points. This could be a weighted average,

\sum_i w_i x^i,

where LaTeX:  w \in S_m.


Associated with a submatrix of LaTeX: A, say LaTeX: B, whose columns comprise a basis for LaTeX: \mathbb{R}^m (i.e., LaTeX: B consists of LaTeX: m linearly independent columns of LaTeX: A, which is a basis for LaTeX: \mathbb{R}^m).

Here are some related terms that arise in linear programming.

  • Adjacent basis. One that differs in exactly one column from a given basis.
  • Basic column. A column of the basis matrix.
  • Basic variable. The variable, say LaTeX: x_j, associated with the LaTeX: j-th column of the basis matrix.
  • Basic level. The value of a basic variable.
  • Basic solution. The solution, LaTeX: x, obtained by setting nonbasic values to some bound value, like 0, resulting in a unique solution for the basic variables. That is, LaTeX: Ax=b is equivalent to LaTeX: Bu + Nv = b, where LaTeX: A=[B \; N] and LaTeX: x=[u^T \; v^T]^T. Upon fixing the value of LaTeX: v, the nonsingularity of LaTeX: B gives the basic solution with LaTeX: u = B^{-1}(b - Nv). In a standard linear program, LaTeX: v=0, and hence LaTeX: u = B^{-1} b.
  • Basic feasible solution. A basic solution that is feasible --i.e., the basic values satisfy their bounds. (In a standard LP, this means LaTeX: B^{-1}b \ge 0.)
  • Basis kernel. After performing forward triangularization, if the basis does not triangularize completely, backward triangularization is applied. The result is a (rearranged) blocking of the basis into three segments:
                    | \ <--- Forward triangle
                    |__\ ______
                    |   |      |
                    |   |      | <--- Kernel
                    |   |______|
                    |          |\
                    |          | \ <--- Backward triangle
    Each row and column in the kernel has at least 2 nonzeroes.

Basic solution

See terms associated with being basic.

Benders decomposition

This decomposition separates complicating variables. The (original) semi-linear model is:

\max \{ c^Tx + F(y) : x \in X, \; y \in Y, \; Ax + G(y) \le b \}.

Generalized Benders decomposition considers the following mathematical program:

\max \{ f(x, y) : g(x, y) \le 0, \; h(x, y) = 0, \; x \in X, \; y \in Y \}.

Suppose this is not a convex program, but that with each fixed LaTeX:  y \in Y, the remaining mathematical program (with LaTeX: x being the only variable) is convex. The idea is to consider

v(y) = \sup \{ f(x, y) : g(x, y) \le 0, \; h(x, y) = 0, \; x \in X \},

and solve LaTeX: \max \{v(y) : y \in Y\}, with evaluations of LaTeX: v requiring the solution to the convex program.

Biconcave function

A function whose negative is biconvex.

Biconvex function

A function LaTeX: f(x, y) on LaTeX: X \times Y that is convex on LaTeX: X for each LaTeX: y \in Y and convex on LaTeX: Y for each LaTeX: x \in X. An example is LaTeX: f(x,y) = xy on LaTeX:  \mathbb{R}^2, which is not convex in both LaTeX: x and LaTeX: y together.

Biconvex program

A nonlinear program with each function biconvex or biconcave, such that if LaTeX: x or LaTeX: y is fixed, the resulting mathematical program in LaTeX: y or LaTeX: x, resp., is a convex program. A special case is the bilinear program.

Bidding algorithm

Also called an auction algorithm. This proceeds by generating a sequence of partial solutions (e.g., a partial assignment), terminating when the solution is complete. There is a pricing vector and a bidding increment whereby the solution is revised and extended by a myopic optimization scheme (which depends on the particular problem).

Big-M method

An alternative to using Phase I, a large number LaTeX: M is used as a linear penalty in the composite objective:

\min \{ c^T x + M \; e^T v: Ax + v = b, \; x, v \ge 0 \},

where LaTeX: v is an artificial variable and LaTeX: e = (1, 1, ..., 1)^T.

Bilevel program

A multilevel program with two levels (sets of variables).

Bilinear function

A special quadratic function, LaTeX: x^T Q y + c^Tx + b^Ty, that is linear in LaTeX: y for each LaTeX: x, and linear in LaTeX: x for each LaTeX: y.

Bilinear program

A nonlinear program whose objective and/or constraint functions are bilinear. An example is the pooling problem.

Bin packing problem

Partition a set of integers, LaTeX: \{c_1, \ldots, c_n\}, into bins such that the sum of all integers in each bin does not exceed the bin capacity, say LaTeX: b, so as to minimize the number of bins required. The problem is NP-hard.

Binary CSP

A constraint satisfaction problem is binary if all constraints are either unary or binary relations (wrt the decision variables). That is, the cardinality of the scope of the constraint is either 1 or 2. For example, LaTeX: x > 2 (unary) or LaTeX: x \neq y (binary).

Binary constraint

A constraint is binary if there are exactly two variables in its scope.

Binary relation

A subset of LaTeX: S \times T, say LaTeX: R. When LaTeX: T=S, we refer to just the relation on the set LaTeX: S.

Here are 5 properties that distinguish relations on a set (LaTeX: x, LaTeX: y, LaTeX: z are in LaTeX: S):

  1. antisymmetric - LaTeX: (x, y) \in R and LaTeX: x \ne y imply LaTeX: (y, x) \not\in R.
  2. irreflexive - LaTeX: (x, x) \not\in R.
  3. reflexive - LaTeX: (x, x) \in R.
  4. symmetric - LaTeX: (x, y) \in R implies LaTeX: (y, x) \in R.
  5. transitive - LaTeX: (x, y) and LaTeX: (y, z) both in LaTeX: R imply LaTeX: (x, z) \in R.

Binary variable

A decision variable with two feasible values, usually 0 or 1.

Binding constraint

A constraint whose removal changes the optimality region. (Some authors require that the removal results in a strict improvement in the objective value.)


Taking the midpoint of endpoints known to contain a solution. If the solution is in the interval (a,b), bisection chooses x = (a+b)/2 as the next point to evaluate. The result reduces the interval to (a,x) or (x,b), so the length is cut in half with each evaluation.

Bland rule

This is for pivot selection in the simplex method to avoid cycling:

  • If more than one (nonbasic) column has a negative (for minimization) reduced cost, choose the one with lowest index.
  • If more than one (basic) column has the same determining value to leave the basis, select the one with the lowest index.

Beale's example of cycling shows that Bland's Rule must apply to all candidates for entering the basis, not just those with most negative reduced cost, see the supplement on cycling. The fifth tableau is the first time Bland's Rule is needed to break the cycle. Variable LaTeX: x_1 is chosen to enter the basis rather than LaTeX: x_5.

Blending problem

A blend is some combination of materials to make another material. Given raw materials, their blends make intermediate materials, called stock, and/or final materials, called products (there can be other blending activities that make products from stocks). The raw materials have purchase costs and the blending activities have costs of operation and maintenance. The products have either fixed demands or selling prices, and the problem is to find blends that minimize total net cost (or maximize profit). This arises in refinery operations, in the food industry, and in other process industries. The problem can sometimes be solved with linear programming, but there can be complicating relations that require nonlinear programming, such as the pooling problem.

Block pivot

Given a detached coefficient form, a pivot is performed on a nonsingular submatrix of nonbasic variables (rather than just one element). Suppose the detached coefficient form is partitioned into 2 row blocks and 5 column blocks (with the last column corresponding to the right-hand side):

A = \left[ \begin{array}{ccccc}
I & 0 & P & R & e \\
0 & I & Q & S & f
\end{array} \right].
(Note: each LaTeX: I is an identity matrix of appropriate dimension.) If LaTeX: P is nonsingular, the basis can change by entering all (nonbasic) variables associated with the columns of LaTeX: P for all basic variables associated with the rows of LaTeX: P.

Blocking polyhedron

The blocking polyhedron of a polyhedron LaTeX: P is LaTeX:  \{ y \in \mathbb{R}^n_+ : y^T x \ge 1 \mbox{ for all } x \in P\}.

Bolzano-Weierstrass theorem

A fundamental existence theorem guaranteeing a maximum and a minimum.

A continuous function on a non-empty compact set achieves a minimum and a maximum over the set.

A useful corollary to the Bolzano-Weierstrass theorem is:

Suppose LaTeX: f is continuous on its non-empty feasible region, LaTeX: F, and that LaTeX: F is a closed set. Then, LaTeX: f achieves a maximum on LaTeX: F if there exists LaTeX: x^0 such that LaTeX:  \{ x \in F : f(x) \ge f(x^0) \} is bounded.

In particular, if we have a quadratic objective of the form LaTeX: f(x) = x^T Q x + c^T x , it is sufficient for LaTeX: Q to be negative definite. Although often credited to Weierstrass alone, it was proven by Bolzano in 1817, and it was known to Cauchy near the end of the 19th century.

Bordered hessian

The complete Hessian of the Lagrangian:

\left[ \begin{array}{cc}
H_x[L(x,u,v)] & G^T \\
G &  0
\end{array} \right],
G = \left[ \begin{array}{cc}
\nabla g(x) & 0 \\ 0 & \nabla h(x)
\end{array} \right].
Here we let LaTeX: H_x[L(x,u,v)] be the Hessian of the Lagrangian (L) with respect to LaTeX: x only. The southwest and northeast blocks are the constraint gradients because they give the cross derivative between LaTeX: x and the Lagrange multipliers LaTeX: (u,v).

Bottleneck TSP problem

The problem asks if there a traveling salesman problem tour whose value is less than some given value?

Bottleneck assignment problem

This is the assignment problem, where the "cost", LaTeX: c_{i, j}, is the rate at which person LaTeX: i can perform job LaTeX: j. The objective is to maximize the rate of production of the entire assembly line.

Bottleneck transportation problem

This is the transportation problem with a traversal time instead of arc cost, and the objective is to minimize the time it takes to satisfy the demands:

\min \left\{ \max \{ t_{ij} : x_{ij} > 0 \} : x \ge 0, \sum_j x_{ij} \le s_i, \sum_i x_{ij} \ge d_j \right\},

where LaTeX: x_{ij} is the flow from source LaTeX: i to destination LaTeX: j, and LaTeX: t_{ij} is the time it takes to ship across arc LaTeX: (i, j) (independent of the quantity shipped).

Bounds consistency

A consistency property for variables over integer domains where consistency is only defined on the minimum and maximum values of the integer domain.

Box constraint

Simple bounds of the form LaTeX: L \le x \le U.

Braess paradox

The phenomenon that in a traffic network in which travel times depend on link (arc) flows the addition of a link (arc) can increase travel time for every user if all users minimize their own travel time. Hence, the paradox is that building a new road can make traffic congestion worse. See Myths and Counter Examples in Mathematical Programming for an example.

Branch and bound

Born in integer programming (IP), branch and bound (B&B) is more generally used in global optimization, including continuous functions. A prototype is as follows.

  1. Initialize set list with feasible region (LaTeX: X plus all constraints), say LaTeX: S, with associated bounds LaTeX: -\infty, LaTeX: \infty (finite bounds on the objective LaTeX: f can be computed, if computationally preferable, and the initial set could be a superset of the feasible region). Initialize LaTeX: f^* equal to the lower bound (LaTeX: -\infty, unless an initial feasible solution is known).
  2. Branch by selecting the LaTeX: k-th set from list, say LaTeX: S_k, and splitting into a partition, say LaTeX: S_k = A \vee B such that the interior of LaTeX: A does not intersect the interior of LaTeX: B. (Could partition LaTeX: S_k into more than 2 subsets.)
  3. Bound LaTeX: f in LaTeX: A and/or in LaTeX: B, say LaTeX: L(S) \le f(x) \le U(S) for all LaTeX: x \in S, for LaTeX: S = A and/or LaTeX: B. Define LaTeX: U(S) = -\infty if it is determined that LaTeX: S is empty (in which case LaTeX: S will be discarded in the next step).
  4. Test if LaTeX: U(S) \le f^* + t, for LaTeX: S = A and/or LaTeX: B. If not, put LaTeX: S on list. (If the test passes, LaTeX: S is discarded by not being put on the list. Note: LaTeX: f^* is best solution value to date, and LaTeX: t is a tolerance that says we want to get a solution whose objective value is within LaTeX: t of the optimal value.) Further, if LaTeX: L(S) > f* and the associated point, LaTeX: x(S), is feasible (where LaTeX: f(x(S)) = L(S)), replace LaTeX: f^* with LaTeX: L(S) and LaTeX: x^* with LaTeX: x(S). If the list is not empty, go to branch.

B&B terminates when the list becomes empty, and the best feasible solution obtained is LaTeX: x^* with value LaTeX: f^*, which is within LaTeX: t of optimality. (If LaTeX: f^* = -\infty, then there is no feasible solution.) Notes:

  1. A particular bounding rule for IP is to solve an LP relaxation, whose value is an upper bound on the integer-valued maximum. The lower bounds do not change unless an integer solution is obtained from the LP, in which case the best solution to date can change.
  2. Branching rules can range from breadth-first, i.e. expand all possible branches from a node in the search tree before going deeper into the tree, to depth-first, i.e. expand the deepest node first (and backtrack, as necessary). For example, suppose we branch on the value of LaTeX: x_j, which could be LaTeX: 0 or LaTeX: 1. The breadth-first rule would solve the LPs for each of the two cases. The depth-first rule would solve for LaTeX: x_j=0 and postpone the complementary case, LaTeX: x_j=1 to consider another variable dichotomy, going deeper into the search tree. To illustrate, the following are search progressions in LaTeX: \{0, 1\}^2;.


             ____.                 ____.____          ____.____
            |                     |         |        |         |
          (x=0)                 (x=0)     (x=1)    (x=0)     (x=1)
                 1                     2                  3
             ____.____             ____.____          ____.____
            |         |           |         |        |         |
          (x=0)     (x=1)       (x=0)     (x=1)    (x=0)     (x=1)
           /\                    /\         /       /\         /\
       (y=0)(y=1)            (y=0)(y=1) (y=0)   (y=0)(y=1) (y=0)(y=1)
                 4                     5                  6


              ____.                ____.              ____.
             |                    |                  |
           (x=0)                (x=0)              (x=0)
                                 /                  /\
                             (y=0)              (y=0)(y=1)
                 1                     2                  3
              ____.____            ____.____          ____.____
             |         |          |         |        |         |
           (x=0)     (x=1)      (x=0)     (x=1)    (x=0)     (x=1)
            /\                   /\         /       /\         /\
        (y=0)(y=1)           (y=0)(y=1) (y=0)   (y=0)(y=1) (y=0)(y=1)
                 4                     5                  6
  3. A best-first search is to select the node (i.e., set on list) that seems most promising, such as one with the greatest upper bound. Typically, there are multiple criteria for branching, and these change depending on whether a feasible solution has been obtained.
  4. Backtracking need not go in exactly the reverse order (i.e., need not have a LIFO rule), but care must be taken when reordering the path.
  5. When the problem is not finite (not IP), branching rules need to consider the hyper-volume of each set in a partition to ensure convergence.
  6. The B&B methods we consider in OR are special cases of heuristic search in Artificial Intelligence (AI).

Branch and cut

A hybrid of Branch and bound and cutting plane methods.

Branch and price

The price operation is the same as column generation.

Branching heuristic

See branching rule.

Branching rule

A branching rule (or branching heuristic) specifies how a search tree is constructed during search. More specifically, it is a procedure or set of conditions for choosing the next variable to be branched on. It is part of the search strategy.

In constraint programming a branching rule is commonly referred to as a variable ordering heuristic.


A point where some property of a function changes, such as its slope. For example, a piece-wise linear, univariate function with LaTeX: k breakpoints has the form:

f(x) = a_i x + b_i \mbox{ for } c_i \le x < c_{i+1}
where LaTeX: i=1, ...,k and LaTeX: c_{k+1} is defined to be infinity and LaTeX: c_1 is the least value of LaTeX: x. This function is continuous if it is equal at breakpoints, but the slope changes if LaTeX: a_i \neq a_{i+1} for LaTeX: i=1, ..., k-1.

Broyden-Fletcher-Goldfarb-Shanno method

See BFGS method

Broyden-Fletcher-Goldfarb-Shanno update

See BFGS update.

Broyden family

This is a family of algorithms to solve an unconstrained nonlinear program where the objective function is in LaTeX: C^2;. The algorithms in this family differ by how they update the approximation of the hessian inverse (assumed to exist). These are of the form:

H = aH^{\mbox{BFGS}} + (1-a)H^{\mbox{DFP}},

with the two matrices being the Broyden-Fletcher-Goldfarb-Shanno update (BFGS) and the Davidon-Fletcher-Powell update (DFP), respectively. If LaTeX: a is in LaTeX: [0, 1], this is called the Broyden convex family.

Bundle method

A class of nonsmooth-optimization methods for the numerical minimization of a convex function, LaTeX:  \theta : \mathbb{R}^n \rightarrow \mathbb{R}. At the LaTeX: kth iteration the technique uses the available information contained in a bundle LaTeX: 
(\theta(v^1), g_1, \ldots , \theta(v^k), g_k) that is obtained from an oracle; here each LaTeX: g_i is a subgradient of LaTeX: \theta at LaTeX: v^i.

Important applications are Lagrangian relaxation and column generation applied to a primal problem

\max \{ p(x) : x \in X, \; h(x) = 0 \}.

In this context, for a given LaTeX: v \in \mathbb{R}n ,

\theta(u) = \max \{ p(x) - v^T h(x) : x \in X \}

results from the maximization of the Lagrange function; each LaTeX: g_i is LaTeX: -h(x_i), with LaTeX: x^i maximizing the Lagrangian at LaTeX: v^i. Minimizing LaTeX: \theta is the dual problem.

If minimizing LaTeX: \theta(v), a basic method to construct the next iterate LaTeX: v^{k+1} uses the cutting-plane paradigm: one minimizes the piecewise-linear function

\theta_k(v) = \max \{ \theta (v^i) + g_i^T (v - v^i) : i = 1, ... ,k \},

which can be viewed as a model of LaTeX: \theta (in fact LaTeX: \theta_k \le \theta). Minimizing LaTeX: \theta_k is a linear program, which is called the restricted master program in the context of column generation. One of the deficiencies of this method is instability -i.e. the sequence LaTeX: v^k usually oscillates uncontrollably. Several possibilities exist to stabilize the process; one, adopted in the bundle approach, replaces the minimization of LaTeX: \theta_k by the quadratic program

\min_v \{ \theta_k(v) + (1/(2t)) |v - C|^2 \}.

Here the positive LaTeX: t is a stability parameter that is usually adjusted at each iteration, and the stability center LaTeX: C is

some LaTeX: v^i (LaTeX: i \le k) taken from the bundle, typically one that gave the best LaTeX: \theta-value.

Personal tools