Suppose we want to solve a system of linear equations $Ax=b$. There are many ways to go about this. If $A$ happens to be a square invertible matrix, you can just write $x=A^{-1}b$. If $A$ is not square, or is not invertible, then we have to look for other methods. One trick that often works is to multiply both sides of the equation $Ax=b$ by the transpose of $A$, to get
$$A^TAx=A^Tb$$
The matrix $A^TA$ is a square matrix, and in good cases, it is actually invertible, so we have
$$x=(A^TA)^{-1}A^Tb.$$
This is actually a special case of the Moore-Penrose pseudoinverse of a matrix $A^+$. When $A^TA$ is invertible, we have that $A^+=(A^TA)^{-1}A^T$, but $A^+$ is defined for any matrix.
It is not my intention to get deep into the theory of pseudoinverses, but I do want to point out some important properties. (See wikipedia for more detail.)
- The system $Ax=b$ has at least one solution if and only if $AA^+b=b$.
- If a solution exists, $x=A^+b$ is the solution with smallest Euclidean norm. (i.e. it is closest to the origin.)
- If a solution exists, one can obtain all solutions by $x=A^+b+(I-A^+A)w$ for an arbitrary vector $w$.
Calculation of pseudoinverses is widely supported in software libraries. For example, in C#, I use the MathNet numerics package, and it is a convenient way to solve linear systems.
Instead of finding the solution closest to the origin, what if you want to find the solution closest to some other point $x_0$?
Exercise: Let $A$ be an $m\times n$ matrix. What is a method for finding the solution to a system of equations $Ax=b$ which is closest to a given $x_0\in\mathbb R^n$.
Solution: Subtract $Ax_0$ from both sides of the equation to get: $$A(x-x_0)=b-Ax_0.$$
Then $x-x_0=A^+(b-Ax_0)$ is the solution such that $x-x_0$ is closest to $0$. Thus $$x=A^+(b-Ax_0)+x_0$$ is closest to $x_0$. $\Box$
We have just solved a simple quadratic optimization problem, that of minimizing the distance to a point of a solution to a linear system.
Let us consider a slightly more complex question.
Problem: Minimize the distance to a point $x_0$ of a solution to $Ax=b$, assuming each coordinate of $x$ is nonnegative.
This is a more difficult question, and even the simpler feasibility question is not so obvious:
Feasibility Problem: Does there exist a solution to $Ax=b$ where each coordinate of $x$ is nonnegative?
The feasibility problem can be solved using the technique of linear programming, which for our purposes means that we want to minimize some linear objective function $f\colon \mathbb R^n\to\mathbb R$ subject to the constraints $Ax=b$ and $x\geq 0$. (Although it looks like a typo, the convention is that $x\geq0$ means each coordinate of $x$ is nonnegative.) At first glance, this seems like a harder problem than determining feasibility, but bear with me. Once we have set up the algorithm for solving linear programs of the above form, we will be able to easily apply it to the feasibility issue.
In order to make these posts as readable as possible, I'll stop here and continue in the next post to talk about linear programs.
No comments:
Post a Comment