This applies to the case of linear constraints: $LaTeX: \textstyle \max \left \{f(x): Ax = b, x \ge 0 \right \},$ where $LaTeX: A$ is $LaTeX: m \times n$ with $LaTeX: \textstyle \mbox{rank}(A) = m$ and $LaTeX: f \in C^2.$ The variables are partitioned into $LaTeX: x = (v,w),$ where $LaTeX: v$ has $LaTeX: m$ components, with the corresponding partition of $LaTeX: \textstyle A = \begin{bmatrix} B & C \end{bmatrix},$ such that the mathematical program is equivalent to: $LaTeX: \max \left \{ f(v, w): Bv + Cw = b, (v, w) \ge 0 \right \}.$
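
As a minimal numerical sketch (Python with NumPy), the instance below is hypothetical: the matrix $LaTeX: A,$ right-hand side $LaTeX: b,$ and the variable split are chosen purely for illustration. It sets up the partition $LaTeX: \textstyle A = \begin{bmatrix} B & C \end{bmatrix}.$

```python
import numpy as np

# Hypothetical instance (illustration only): m = 2 constraints, n = 3 variables.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

# Partition x = (v, w): v = (x1, x2) dependent (basic), w = (x3,) independent,
# with the matching column partition A = [B  C].
B, C = A[:, :2], A[:, 2:]
assert np.linalg.matrix_rank(B) == 2   # B nonsingular, as the method assumes
```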

The method assumes $LaTeX: B$ is nonsingular (i.e., $LaTeX: v$ must correspond to $LaTeX: m$ linearly independent columns of $LaTeX: A$), and the nondegeneracy assumption: $LaTeX: v > 0.$ Now $LaTeX: dw$ is the independent direction of change, and $LaTeX: \textstyle dv = -B^{-1}C\,dw,$ thus keeping $LaTeX: \textstyle x + dx = (v+dv, w + dw)$ on the hyperplanes -- i.e., $LaTeX: \textstyle A(x + dx) = Ax + A\,dx = b + 0 = b,$ since $LaTeX: \textstyle A\,dx = B\,dv + C\,dw = 0.$
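
Continuing the sketch, any $LaTeX: dw$ paired with $LaTeX: \textstyle dv = -B^{-1}C\,dw$ keeps the iterate on the constraint hyperplanes; the assertion below checks this for an arbitrary direction.

```python
# Any dw, paired with dv = -B^{-1} C dw, gives A dx = B dv + C dw = 0.
dw = np.array([0.5])                 # arbitrary independent direction
dv = -np.linalg.solve(B, C @ dw)     # induced dependent direction
dx = np.concatenate([dv, dw])
assert np.allclose(A @ dx, 0.0)      # step stays on {x : Ax = b}
```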

The method considers a direction for the independent variables, which then determines the direction for the dependent variables. In particular, $LaTeX: dw$ is derived from the gradient of the objective as a function of $LaTeX: w$ alone: substituting $LaTeX: \textstyle v = B^{-1}(b - Cw)$ turns the objective into $LaTeX: \textstyle f(B^{-1}(b - Cw),\, w).$ Its gradient with respect to $LaTeX: w,$ evaluated at the current point $LaTeX: x = (v, w),$ is called the reduced gradient: $LaTeX: \textstyle r = \nabla_w f(x) - \nabla_v f(x)\, B^{-1} C.$
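
Continuing the same hypothetical instance, the snippet below computes $LaTeX: r$ at a feasible point, using an assumed quadratic objective $LaTeX: \textstyle f(x) = -\|x - c\|^2$ (the objective and the point are illustrative choices, not prescribed by the method).

```python
def grad_f(x):
    # Gradient of the illustrative objective f(x) = -||x - c||^2, c = (1, 2, 1).
    return -2.0 * (x - np.array([1.0, 2.0, 1.0]))

x = np.array([2.0, 3.0, 0.0])        # feasible: Ax = b, x >= 0, and v = (2, 3) > 0
gv, gw = grad_f(x)[:2], grad_f(x)[2:]
r = gw - gv @ np.linalg.solve(B, C)  # reduced gradient: grad_w f - grad_v f B^{-1} C
```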

Then, $LaTeX: dw = r$ completes the first part of the iteration. The second part is to select a step size, which can be blocked by the non-negativity of $LaTeX: v.$ If the step is blocked because some component of $LaTeX: v$ reaches zero, the working set changes: the blocked variable leaves the dependent set and the partition is revised accordingly.
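
Finishing the sketch, a ratio test finds the largest step $LaTeX: t$ keeping $LaTeX: v + t\,dv \ge 0;$ a line search over $LaTeX: [0, t_{\max}]$ would then choose the actual step (the search itself is omitted here).

```python
dw = r                               # first part: move along the reduced gradient
dv = -np.linalg.solve(B, C @ dw)     # induced change in the dependent variables
blocking = dv < 0                    # only decreasing components can block
t_max = np.min(x[:2][blocking] / -dv[blocking]) if blocking.any() else np.inf
# Here dv = (-6, -6) and v = (2, 3), so t_max = 1/3: v1 hits zero first,
# x1 leaves the dependent set, and the partition must be revised.
```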

Also see the generalized reduced gradient method.