# Convergence

### From Glossary

Specifically, convergence of an algorithm. The algorithm is
represented as the point-to-set map,
, where there is some
*selection function* to choose
if has more than one member. Convergence means that the
sequence, , has a limit point, say ,
such that satisfies certain conditions. Ideally, the
conditions are that is an optimal solution, at
least locally, but this is often not the definition used in a
convergence theorem. (For example, in unconstrained optimization,
most algorithms converge to a stationary point,
which need not be even a local optimum.)

Let , where is a scalar series
pertaining to the sequence . Typically, , which is sometimes called *policy convergence* (to a solution, ). We could also
have , where is the objective function, in which case the concern is
with *value convergence* to an optimal value, .

Some related terms and concepts:

*Dual convergence*. Dual values, notably Lagrange multipliers, converge to a dual solution.*Geometric convergence*. Same as linear convergence, but usually used when the sequence is exactly a geometric progression: - is to the power of*Global convergence*. Usually means the same as globally convergent, but some mean that the algorithm convergences to a global solution.*Globally convergent*. Convergence to a solution from any starting point.*Linear convergence*. Order is 1 and rate is less than 1 (see below).*Local convergence*. Some mean locally convergent, but some mean that the limit point is a local optimum (or just satisfies necessary conditions, see Myths and Counter Examples in Mathematical Programming to avoid misconception).*Locally convergent*. There exists an open neighborhood of 0 such that {s^k}-->0 from any s^0 in the neighborhood.*Order of convergence*. The order is . The order being 1 is*linear*convergence and the order being 2 is*quadratic*convergence. Most (non-finite) algorithms to solve mathematical programs are between linear and quadratic.Let and suppose (for notational convenience). The term

*order*derives from the approximate equation, , where is the rate. If this equation were exact, we would have if , and if , for all . If and , we gain one decimal place each time: , , , etc. If , the number of accurate decimal places approximately doubles each iteration: , , , etc. Unfortunately, the underlying equation is not exact – see Myths and Counter Examples in Mathematical Programming to avoid misconception. Some authors call this*q-order*(or*Q-order*) convergence to distinguish it from variations of the definition of order. Each definition is designed to capture the notion of how many digits of accuracy are added as the sequence approaches its limit.- Rate of convergence. This is generally used to mean the limiting ratio: , given the order is .
- Sublinear convergence. Order is 1 and rate is 1 (slower than all linear convergence), e.g., .
- Superlinear convergence. Order is 1 and rate is 0 (faster than all linear convergence), e.g., .
- Stochastic convergence. This applies when the successor point is a random variable, as in simulated annealing. Here are the most common types of convergence for a sequence of random variables, .
- with probability or almost everywhere (abbr., a.e.). .
- in probability. for all .
- in distribution. The sequence of cumulative distribution functions (cdf), converges point-wise: for all at which is continuous, where is the cdf of and is the cdf of .
- in p-th mean. . For , this is called convergence in quadratic mean or in mean square.