We finally have all the tools in place to describe a step-by-step algorithm for minimizing a (convex) quadratic objective function without any constraints.
- Minimize f(x)= c\cdot x+\frac{1}{2}x^TCx where C is a positive semidefinite matrix.
We start with any seed value x_0\in \mathbb R^n. Recall that the modified Gram-Schmidt process starts with a set of independent vectors and gradually replaces them with an orthonormal set. We will start with the standard basis e_1,\ldots,e_n, recorded as the columns of a matrix GS_0=I. We use a vector J\in\{Y,N\}^n to keep track of which columns have been orthonormalized. To begin we have J=(N,\ldots,N), and each time we finish turning a column into a conjugate direction we set the corresponding entry of J to Y. Set g_0=\nabla f(x_0)=c+Cx_0 and j=0 to initialize, where j is the iteration counter of the algorithm.
- Step 1: Computation of search direction s_j: Let GS_j=[c_{1j},\ldots, c_{nj}] and J_j=(\alpha_{1j},\ldots,\alpha_{nj}). If every component of J_j is equal to Y, then stop with optimal solution x_j. Otherwise, compute the smallest index k such that |g_j^Tc_{kj}|= \max\{|g_j^Tc_{ij}| : \alpha_{ij}=N\}. If g^T_jc_{kj}=0, stop with optimal solution x_j. Otherwise set s_j=\operatorname{sgn}(g^T_jc_{kj})c_{kj} and go to step 2.
- Step 2: Computation of step size t_j: Compute \langle s_j,s_j\rangle=s_j^TCs_j. If it is 0, then stop: the objective function is unbounded from below. Otherwise set t_j=\frac{g_j^Ts_j}{\langle s_j,s_j\rangle} and go to step 3. (A one-line derivation of this step size appears after the list.)
- Step 3: Updating data: Set
- x_{j+1}:=x_j-t_js_j.
- g_{j+1} := \nabla f(x_{j+1}) = c+Cx_{j+1}
- f(x_{j+1}) = c^Tx_{j+1} +\frac{1}{2}x_{j+1}^TCx_{j+1}
- d_j = Cs_j/\sqrt{\langle s_j,s_j\rangle}, so that d_j^Tc_i=\langle s_j,c_i\rangle/\sqrt{\langle s_j,s_j\rangle}
- GS_{j+1} is formed from GS_j by
- replacing the kth column c_k by c_k/(d_j^Tc_k)
- replacing all columns c_i where \alpha_{ij}=N by c_i-\frac{d_j^Tc_i}{d_j^Tc_k}c_k
- leaving columns c_i where \alpha_{ij}=Y unchanged.
- J_{j+1} is formed from J_j by setting \alpha_{k} to Y and leaving everything else unchanged.
- Increment j:=j+1 and go to step 1!
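For the record, the step size in step 2 is just the exact line search along s_j (here \langle u,v\rangle denotes the inner product u^TCv defined by C). Expanding the objective along the ray gives f(x_j-ts_j)=f(x_j)-t\,g_j^Ts_j+\frac{t^2}{2}\langle s_j,s_j\rangle, and setting the derivative with respect to t equal to zero yields t_j=\frac{g_j^Ts_j}{\langle s_j,s_j\rangle}. This also explains the stopping rule: step 1 guarantees g_j^Ts_j>0, so if \langle s_j,s_j\rangle=0 the expression above decreases without bound as t grows, and f is unbounded from below.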
To summarize, the algorithm works as follows. Pick some point x_0. Choose a search direction s_1 parallel to one of the coordinate axes along which the slope of f is steepest. Locate the minimum of f on the line through x_0 parallel to s_1 to get x_1. Comb the remaining basis vectors of \mathbb R^n to be orthogonal to s_1 with respect to the inner product defined by C. Next find a search direction s_2 among these combed vectors that maximizes the gradient slope, locate the minimum of f on the line through x_1 parallel to s_2 to get x_2, and so on. After at most n steps the algorithm terminates.
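To see the combing in action, here is a tiny example I worked out by hand (purely as an illustration; the numbers are mine). Take C=\begin{pmatrix}2&1\\1&2\end{pmatrix}, c=(-3,-3)^T and x_0=(0,0), so f(x)=-3x_1-3x_2+x_1^2+x_1x_2+x_2^2. The gradient at x_0 is (-3,-3); both axes give slope 3, the tie-break picks the first column, and the search direction is -e_1. Its C-norm squared is 2, so the step size is 3/2 and the next iterate is (3/2,0), where the gradient is (0,-3/2). Combing e_2 against this direction replaces it with (-1/2,1), which is conjugate to e_1. The second search direction is then (1/2,-1) with step size 1, and it lands on (1,1), where the gradient vanishes: the minimizer, found in n=2 steps.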
I went ahead and programmed this in C# using the MathNet.Numerics.LinearAlgebra library, and it seems to work quite well.
First we have a helper function that updates the matrix GS, followed by the main minimization routine. (Both methods assume using MathNet.Numerics.LinearAlgebra; and using System; directives at the top of the file.)
public Matrix<double> GramSchmidt(Matrix<double> matrix, Vector<double> d, int k, bool[] J)
{
    Matrix<double> A = matrix.Clone();
    // Normalize the pivot column k so that the new column has C-norm 1.
    double lambda = d.DotProduct(matrix.Column(k));
    A.SetColumn(k, matrix.Column(k) / lambda);
    // Make every column that has not yet been conjugated (J[i] == false)
    // orthogonal to the current search direction in the C-inner product.
    for (int i = 0; i < A.ColumnCount; i++)
    {
        if (!J[i] && i != k)
        {
            double mu = d.DotProduct(matrix.Column(i));
            A.SetColumn(i, matrix.Column(i) - matrix.Column(k) * (mu / lambda));
        }
    }
    return A;
}
public Vector<double> Minimize(Vector<double> c, Matrix<double> C, Vector<double> x0, out bool unbounded)
{
    unbounded = false;
    Vector<double> x = x0;
    int n = c.Count;
    bool[] J = new bool[n];             // all false: no direction has been conjugated yet
    Matrix<double> GS = Matrix<double>.Build.DenseIdentity(n);
    Vector<double> g = c + C * x;       // gradient at the current point
    for (int j = 0; j < n; j++)
    {
        // Step 1: pick the unconjugated column with the largest |g^T c_i|.
        bool allDone = true;
        foreach (bool flag in J) { allDone = allDone && flag; }
        if (allDone) { return x; }
        double max = 0;
        int k = -1;
        for (int i = 0; i < n; i++)
        {
            if (!J[i] && Math.Abs(g.DotProduct(GS.Column(i))) > max)
            { max = Math.Abs(g.DotProduct(GS.Column(i))); k = i; }
        }
        // If every remaining directional derivative is zero, x is already optimal.
        if (k == -1) { return x; }
        Vector<double> s = GS.Column(k);
        if (g.DotProduct(GS.Column(k)) < 0) { s *= -1; }
        // Step 2: exact line search along s.
        double normSquaredOfs = s.DotProduct(C * s);   // <s,s> in the C-inner product
        if (Math.Abs(normSquaredOfs) < 1e-7) { unbounded = true; return x; }
        double t = g.DotProduct(s) / normSquaredOfs;
        // Step 3: move to the new point, update the gradient, and conjugate the
        // remaining columns of GS against the direction just used.
        x -= t * s;
        g = c + C * x;
        Vector<double> d = (C * s) / Math.Sqrt(normSquaredOfs);
        GS = GramSchmidt(GS, d, k, J);
        J[k] = true;
    }
    return x;
}
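Here is a minimal usage sketch that runs the routine on the 2-variable example traced earlier; the class name QuadraticMinimizer is just a hypothetical container for the two methods above.

using System;
using MathNet.Numerics.LinearAlgebra;

class Program
{
    static void Main()
    {
        // f(x) = -3x1 - 3x2 + x1^2 + x1*x2 + x2^2, minimized at (1, 1).
        var C = Matrix<double>.Build.DenseOfArray(new double[,] { { 2, 1 }, { 1, 2 } });
        var c = Vector<double>.Build.DenseOfArray(new double[] { -3, -3 });
        var x0 = Vector<double>.Build.Dense(2);   // start at the origin

        var solver = new QuadraticMinimizer();    // hypothetical class holding the two methods above
        var x = solver.Minimize(c, C, x0, out bool unbounded);

        Console.WriteLine(unbounded ? "unbounded" : x.ToString()); // expect (1, 1)
    }
}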