Friday, July 15, 2022

Unconstrained quadratic optimization: the algorithm

We finally have all the tools in place to describe a step-by-step algorithm for minimizing a (convex) quadratic objective function without any constraints.

  • Minimize $f(x)= c^Tx+\frac{1}{2}x^TCx$, where $C$ is a symmetric positive semidefinite $n\times n$ matrix.
We start with any seed value $x_0\in \mathbb R^n$. Recall that the modified Gram-Schmidt process starts with a set of independent vectors and gradually replaces them with an orthonormal set. We will start with the standard basis $e_1,\ldots,e_n$, recorded as the columns of a matrix $GS_0=I$. We use a vector $J\in\{Y,N\}^n$ to keep track of which columns have been orthonormalized. So to begin we have $J=(N,\ldots,N)$, and each time we work on a column to create a conjugate direction, we set the corresponding entry of $J$ to $Y$. Set $g_0=\nabla f(x_0)=c+Cx_0$ and $j=0$ to initialize, where $j$ counts the iterations of the algorithm.
  • Step 1: Computation of the search direction $s_j$: Let $GS_j=[c_{1j},\ldots, c_{nj}]$ and $J_j=(\alpha_{1j},\ldots,\alpha_{nj})$. If every component of $J_j$ equals $Y$, then stop with optimal solution $x_j$. Otherwise, compute the smallest index $k$ such that $$|g_j^Tc_{kj}|= \max\{|g_j^Tc_{ij}| \,:\, \alpha_{ij}=N\}.$$ If $g^T_jc_{kj}=0$, stop with optimal solution $x_j$. Otherwise set $s_j=\operatorname{sgn}(g^T_jc_{kj})\,c_{kj}$. Go to step 2.
  • Step 2: Computation of the step size $t_j$: Compute $\langle s_j,s_j\rangle = s_j^TCs_j$, the inner product defined by $C$. If it is $0$, then stop: the objective function is unbounded from below. Otherwise set $$t_j=\frac{g_j^Ts_j}{\langle s_j,s_j\rangle}$$ and go to step 3.
  • Step 3: Updating data: Set
    • $x_{j+1}:=x_j-t_js_j$.
    • $g_{j+1} := \nabla f(x_{j+1}) = c+Cx_{j+1}$
    • $f(x_{j+1}) = c^Tx_{j+1} +\frac{1}{2}x_{j+1}^TCx_{j+1}$
    • $d_j = Cs_j/\sqrt{\langle s_j,s_j\rangle}$, so that $d_j^Tv=\langle s_j,v\rangle/\sqrt{\langle s_j,s_j\rangle}$ for any vector $v$
    • $GS_{j+1}$ is formed from $GS_j$ by 
      • replacing the $k$th column $c_k$ by $c_k/(d_j^Tc_k)$ 
      • replacing all columns $c_i$ where $\alpha_{ij}=N$ by $c_i-\frac{d_j^Tc_i}{d_j^Tc_k}c_k$
      • leaving columns $c_i$ where $\alpha_{ij}=Y$ unchanged.
    • $J_{j+1}$ is formed from $J_j$ by setting $\alpha_{k}$ to $Y$ and leaving everything else unchanged.
    • Increment $j:=j+1$ and go to step 1!

To summarize, the algorithm works as follows. Pick some point $x_0$. Choose a search direction $s_0$ parallel to one of the coordinate axes, namely the axis along which the slope of $f$ is steepest. Locate the minimum of $f$ on the line through $x_0$ parallel to $s_0$ to get $x_1$. Comb the remaining basis vectors of $\mathbb R^n$ to be orthogonal to $s_0$ with respect to the inner product defined by $C$. Next find a search direction $s_1$ among these combed vectors that again gives the steepest slope, locate the minimum of $f$ on the line through $x_1$ parallel to $s_1$ to get $x_2$, and so on. After at most $n$ steps the algorithm terminates.
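To make this concrete, here is a tiny example I worked through by hand (the numbers are my own, chosen purely for illustration). Take $$C=\begin{pmatrix}2&1\\1&2\end{pmatrix},\qquad c=\begin{pmatrix}-1\\-1\end{pmatrix},\qquad x_0=\begin{pmatrix}0\\0\end{pmatrix}.$$ Then $g_0=c+Cx_0=(-1,-1)^T$, both columns of $GS_0=I$ give $|g_0^Te_i|=1$, and the tie-break picks $k=1$, so $s_0=\operatorname{sgn}(-1)\,e_1=-e_1$. Since $\langle s_0,s_0\rangle=s_0^TCs_0=2$, the step size is $t_0=g_0^Ts_0/2=1/2$ and $x_1=x_0-t_0s_0=(1/2,0)^T$. The Gram-Schmidt update with $d_0=Cs_0/\sqrt{2}$ replaces the second column of $GS$ by $e_2-\tfrac{1}{2}e_1=(-1/2,1)^T$, which is conjugate to $s_0$. The second iteration then searches along $s_1=(1/2,-1)^T$ with step size $t_1=1/3$ and lands on $x_2=(1/3,1/3)^T$, which solves $Cx=-c$, so the algorithm terminates after the promised $n=2$ iterations.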

I went ahead and programmed this in C# using the MathNet.Numerics.LinearAlgebra library, and it seems to work quite well.

First we have a helper function that updates the matrix $GS$, followed by the main minimization routine.

using System;
using MathNet.Numerics.LinearAlgebra;

// Step 3 helper: one sweep of the modified Gram-Schmidt update on GS.
// Column k is rescaled by 1/(d^T c_k); every column i that has not been
// processed yet (J[i] == false) has the right multiple of column k subtracted
// so that it becomes orthogonal to d; already-processed columns are untouched.
public Matrix<double> GramSchmidt(Matrix<double> matrix, Vector<double> d, int k, bool[] J)
{
    Matrix<double> A = matrix.Clone();
    double lambda = d.DotProduct(matrix.Column(k));
    A.SetColumn(k, matrix.Column(k) / lambda);
    for (int i = 0; i < A.ColumnCount; i++)
    {
        if (!J[i] && i != k)
        {
            double mu = d.DotProduct(matrix.Column(i));
            A.SetColumn(i, matrix.Column(i) - matrix.Column(k) * (mu / lambda));
        }
    }
    return A;
}
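To convince myself the helper really produces conjugate directions, I find a quick check like the following useful (the numbers are mine, and the call is made from inside the same class as GramSchmidt): after one update, the still-unprocessed columns should be orthogonal to the search direction with respect to $C$.

// Illustrative check (my own numbers): after one Gram-Schmidt update the
// remaining columns should be C-conjugate to the search direction s.
var C = Matrix<double>.Build.DenseOfArray(new double[,] { { 2, 1 }, { 1, 2 } });
var GS = Matrix<double>.Build.DenseIdentity(2);
var s = Vector<double>.Build.DenseOfArray(new double[] { -1, 0 });   // s_0 = -e_1
var d = (C * s) / Math.Sqrt(s.DotProduct(C * s));                    // d_0 = C s_0 / sqrt(<s_0, s_0>)
var J = new bool[] { false, false };
var updated = GramSchmidt(GS, d, 0, J);
Console.WriteLine((C * s).DotProduct(updated.Column(1)));            // prints 0 up to rounding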

public Vector<double> Minimize(Vector<double> c, Matrix<double> C, Vector<double> x0, out bool unbounded)
{
    unbounded = false;
    Vector<double> x = x0;
    int n = c.Count;
    bool[] J = new bool[n];                   // all false: no direction processed yet
    Matrix<double> GS = Matrix<double>.Build.DenseIdentity(n);
    Vector<double> g = c + C * x;             // gradient at the current iterate

    // Each pass processes one column of GS, so at most n iterations are needed.
    for (int j = 0; j < n; j++)
    {
        // Step 1: pick the unprocessed column with the largest |g^T c_i|.
        bool done = true;
        foreach (bool index in J) { done = done && index; }
        if (done) { return x; }

        double max = 0;
        int k = -1;
        for (int i = 0; i < n; i++)
        {
            if (!J[i] && Math.Abs(g.DotProduct(GS.Column(i))) > max)
            {
                max = Math.Abs(g.DotProduct(GS.Column(i)));
                k = i;
            }
        }
        if (k == -1) { return x; }            // every remaining g^T c_i is zero: x is optimal

        Vector<double> s = GS.Column(k);
        if (g.DotProduct(s) < 0) { s *= -1; }

        // Step 2: step size; <s,s> = s^T C s. If it vanishes, f is unbounded below.
        double normSquaredOfs = s.DotProduct(C * s);
        if (Math.Abs(normSquaredOfs) < 1e-7) { unbounded = true; return x; }
        double t = g.DotProduct(s) / normSquaredOfs;

        // Step 3: update the iterate, the gradient, and the Gram-Schmidt matrix.
        x -= t * s;
        g = c + C * x;
        Vector<double> d = (C * s) / Math.Sqrt(normSquaredOfs);
        GS = GramSchmidt(GS, d, k, J);
        J[k] = true;
    }

    return x;
}
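Finally, here is how I would call the routine on the small example from earlier (again the snippet is mine, run from inside the same class as Minimize):

// Minimize f(x) = -x1 - x2 + x1^2 + x1*x2 + x2^2,
// i.e. c = (-1,-1) and C = [[2,1],[1,2]]; the exact minimizer is (1/3, 1/3).
var C = Matrix<double>.Build.DenseOfArray(new double[,] { { 2, 1 }, { 1, 2 } });
var c = Vector<double>.Build.DenseOfArray(new double[] { -1, -1 });
var x0 = Vector<double>.Build.Dense(2);                // start at the origin
var xStar = Minimize(c, C, x0, out bool unbounded);
Console.WriteLine(xStar);                              // approximately (0.3333, 0.3333)
Console.WriteLine(unbounded);                          // False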
