We finally have all the tools in place to describe a step-by-step algorithm for minimizing a (convex) quadratic objective function without any constraints.
- Minimize $f(x)= c\cdot x+\frac{1}{2}x^TCx$ where $C$ is a positive semidefinite matrix.
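For concreteness, here is a small two-variable instance (the numbers are my own, chosen just for illustration) that we will revisit below:
$$c=\begin{pmatrix}-2\\-4\end{pmatrix},\qquad C=\begin{pmatrix}2&0\\0&4\end{pmatrix},\qquad f(x_1,x_2)=-2x_1-4x_2+x_1^2+2x_2^2.$$
Here $C$ is positive definite, so the unique minimizer solves $\nabla f(x)=c+Cx=0$, giving $x^*=(1,1)$ with $f(x^*)=-3$.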
We start with any seed value $x_0\in \mathbb R^n$. Recall that the modified Gram-Schmidt process starts with a set of independent vectors and gradually replaces them with an orthonormal set. We will start with the standard basis $e_1,\ldots,e_n$, recorded as the columns of a matrix $GS_0=I$. We use a vector $J\in\{Y,N\}^n$ to keep track of which columns have already been turned into conjugate directions. So to begin we have $J=(N,\ldots,N)$, and each time we use a column to create a conjugate direction, we set that entry of $J$ to $Y$. Set $g_0=\nabla f(x_0)=c+Cx_0$ and $j=0$ to initialize, where $j$ counts the iterations of the algorithm.
- Step 1: Computation of the search direction $s_j$: Let $GS_j=[c_{1j},\ldots, c_{nj}]$ and $J_j=(\alpha_{1j},\ldots,\alpha_{nj})$. If every component of $J_j$ equals $Y$, then stop with optimal solution $x_j$. Otherwise, compute the smallest index $k$ such that $$|g_j^Tc_{kj}|= \max\{|g_j^Tc_{ij}| : \alpha_{ij}=N\}.$$ If $g^T_jc_{kj}=0$, stop with optimal solution $x_j$. Otherwise set $s_j=\operatorname{sgn}(g^T_jc_{kj})\,c_{kj}$ and go to step 2.
- Step 2: Computation of the step size $t_j$: Compute $\langle s_j,s_j\rangle = s_j^TCs_j$. If it is $0$, then stop: the objective function is unbounded from below (in that case $Cs_j=0$ while $g_j^Ts_j>0$, so $f$ decreases without bound along the ray $x_j-ts_j$). Otherwise set $$t_j=\frac{g_j^Ts_j}{\langle s_j,s_j\rangle},$$ which minimizes $t\mapsto f(x_j-ts_j)=f(x_j)-t\,g_j^Ts_j+\tfrac{t^2}{2}\langle s_j,s_j\rangle$. Go to step 3.
- Step 3: Updating data: Set
- $x_{j+1}:=x_j-t_js_j$.
- $g_{j+1} := \nabla f(x_{j+1}) = c+Cx_{j+1}$
- $f(x_{j+1}) = c^Tx_{j+1} +\frac{1}{2}x_{j+1}^TCx_{j+1}$
- $d_j = Cs_j/\sqrt{\langle s_j,s_j\rangle}$, so that the ordinary dot products $d_j^Tc_i$ used below measure conjugacy with respect to $C$
- $GS_{j+1}$ is formed from $GS_j$ by
- replacing the $k$th column $c_k$ by $c_k/(d_j^Tc_k)$
- replacing every other column $c_i$ with $\alpha_{ij}=N$ (that is, $i\neq k$) by $c_i-\frac{d_j^Tc_i}{d_j^Tc_k}c_k$ (a quick check that this makes the updated columns conjugate to $s_j$ follows this list)
- leaving columns $c_i$ where $\alpha_{ij}=Y$ unchanged.
- $J_{j+1}$ is formed from $J_j$ by setting $\alpha_{k}$ to $Y$ and leaving everything else unchanged.
- Increment $j:=j+1$ and go to step 1!
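A quick aside on why the update in step 3 does what we want: for any column $c_i$ that gets replaced, the new column satisfies
$$d_j^T\left(c_i-\frac{d_j^Tc_i}{d_j^Tc_k}c_k\right)=d_j^Tc_i-\frac{d_j^Tc_i}{d_j^Tc_k}\,d_j^Tc_k=0,$$
and since $d_j$ is a positive multiple of $Cs_j$, this says exactly that every remaining column is conjugate to $s_j$, i.e. $s_j^TC\,c_i^{\text{new}}=0$. The columns already marked $Y$ are untouched, so they keep the conjugacy relations they already have.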
To summarize, the algorithm works as follows. Pick some point $x_0$. Choose a search direction $s_0$ parallel to the coordinate axis along which the gradient is steepest. Locate the minimum of $f$ on the line through $x_0$ parallel to $s_0$ to get $x_1$. Comb the remaining basis vectors of $\mathbb R^n$ to be orthogonal to $s_0$ with respect to the inner product defined by $C$. Next find a search direction $s_1$ among these combed vectors along which the gradient is steepest, locate the minimum of $f$ on the line through $x_1$ parallel to $s_1$ to get $x_2$, and so on. After at most $n$ steps the algorithm terminates.
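Tracing the small instance from above: start at $x_0=(0,0)$, so $g_0=c=(-2,-4)$. Since $|g_0^Te_2|=4>|g_0^Te_1|=2$ we take $k=2$ and $s_0=\operatorname{sgn}(-4)\,e_2=-e_2$. Then $\langle s_0,s_0\rangle=e_2^TCe_2=4$, $t_0=g_0^Ts_0/4=1$, and $x_1=x_0-t_0s_0=(0,1)$ with $g_1=c+Cx_1=(-2,0)$. Because $C$ is diagonal, $e_1$ is already conjugate to $e_2$, so the Gram-Schmidt step leaves the first column unchanged. In the second iteration the only remaining column gives $k=1$, $s_1=\operatorname{sgn}(-2)\,e_1=-e_1$, $\langle s_1,s_1\rangle=2$, $t_1=g_1^Ts_1/2=1$, and $x_2=x_1-t_1s_1=(1,1)$, which is exactly the minimizer computed earlier.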
I went ahead and programmed this into C# using the MathNet.Numerics.LinearAlgebra library and it seems to work quite well.
First we have a helper function that updates the matrix $GS$, followed by the main minimization routine.
```csharp
// Helper for the conjugation step: rescales column k so that d^T c_k = 1 and
// makes every not-yet-fixed column orthogonal to d.
public Matrix<double> GramSchmidt(Matrix<double> matrix, Vector<double> d, int k, bool[] J)
{
    Matrix<double> A = matrix.Clone();
    double lambda = d.DotProduct(matrix.Column(k));
    A.SetColumn(k, matrix.Column(k) / lambda);
    for (int i = 0; i < A.ColumnCount; i++)
    {
        if (!J[i] && i != k)
        {
            // Subtract the component of column i along column k, as measured by d.
            double mu = d.DotProduct(matrix.Column(i));
            A.SetColumn(i, matrix.Column(i) - matrix.Column(k) * (mu / lambda));
        }
    }
    return A;
}
```
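As a quick sanity check of the helper (my own test, not from the post, and assuming it is called somewhere the method is in scope), the columns that are not yet fixed should come out orthogonal to $d$ after one update:

```csharp
// Illustrative check: after one GramSchmidt update with pivot column k = 0,
// every not-yet-fixed column of the result should be orthogonal to d.
int n = 3;
var rnd = new Random(1);
var M = Matrix<double>.Build.DenseIdentity(n);
var d = Vector<double>.Build.Dense(n, i => rnd.NextDouble() + 0.1); // keep d^T c_k away from zero
var J = new bool[n]; // nothing fixed yet
var updated = GramSchmidt(M, d, 0, J);
for (int i = 1; i < n; i++)
    Console.WriteLine(d.DotProduct(updated.Column(i))); // zero up to rounding
```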
```csharp
public Vector<double> Minimize(Vector<double> c, Matrix<double> C, Vector<double> x0, out bool unbounded)
{
    unbounded = false;
    Vector<double> x = x0;
    int n = c.Count;
    bool[] J = new bool[n];          // all false: no column has been fixed yet
    Matrix<double> GS = Matrix<double>.Build.DenseIdentity(n);
    Vector<double> g = c + C * x;    // gradient at the current iterate
    for (int j = 0; j < n; j++)
    {
        // Step 1: pick the unused column with the largest |g^T column|.
        bool allFixed = true;
        foreach (bool index in J) { allFixed = allFixed && index; }
        if (allFixed) { return x; }
        double max = 0;
        int k = -1;
        for (int i = 0; i < n; i++)
        {
            if (!J[i] && Math.Abs(g.DotProduct(GS.Column(i))) > max)
            {
                max = Math.Abs(g.DotProduct(GS.Column(i)));
                k = i;
            }
        }
        // If every remaining column is orthogonal to the gradient, x is optimal.
        if (k == -1) { return x; }
        Vector<double> s = GS.Column(k);
        if (g.DotProduct(GS.Column(k)) < 0) { s *= -1; }
        // Step 2: exact line search along s; <s,s> is the C-inner product s^T C s.
        double normSquaredOfs = (s.ToRowMatrix() * C * s.ToColumnMatrix())[0, 0];
        if (Math.Abs(normSquaredOfs) < 1e-7) { unbounded = true; return x; }
        double t = g.DotProduct(s) / normSquaredOfs;
        // Step 3: update the iterate, the gradient, and the conjugated basis.
        x -= t * s;
        g = c + C * x;
        Vector<double> d = (C * s) / Math.Sqrt(normSquaredOfs);
        GS = GramSchmidt(GS, d, k, J);
        J[k] = true;
    }
    return x;
}
```
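And a quick illustrative run on the small instance from earlier (again assuming the methods above are in scope):

```csharp
// Minimize f(x) = c.x + (1/2) x^T C x for the 2x2 example; expected minimizer (1, 1).
var c = Vector<double>.Build.DenseOfArray(new double[] { -2, -4 });
var C = Matrix<double>.Build.DenseOfArray(new double[,] { { 2, 0 }, { 0, 4 } });
var x0 = Vector<double>.Build.Dense(2); // start at the origin

bool unbounded;
var xStar = Minimize(c, C, x0, out unbounded);

Console.WriteLine(xStar);     // expected: (1, 1)
Console.WriteLine(unbounded); // expected: False
```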