
Tuesday, July 12, 2022

Solving linear systems, the pseudoinverse, and linear programming.

Suppose we want to solve a system of linear equations Ax=b. There are many ways to go about this. If A happens to be a square invertible matrix, you can just write x=A^{-1}b. If A is not square, or is not invertible, then we have to look for other methods. One trick that often works is to multiply both sides of the equation Ax=b by the transpose of A, to get

A^TAx=A^Tb

The matrix A^TA is a square matrix, and in good cases, it is actually invertible, so we have

x=(A^TA)^{-1}A^Tb.

This is actually a special case of the Moore-Penrose pseudoinverse A^+ of a matrix A. When A^TA is invertible, we have A^+=(A^TA)^{-1}A^T, but A^+ is defined for any matrix.
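As a quick numerical check, here is a sketch in Python with NumPy (the post works in C# with Math.NET, so the choice of library here is just for illustration). For a tall matrix with full column rank, the normal-equations formula (A^TA)^{-1}A^Tb agrees with the pseudoinverse solution:

```python
import numpy as np

# A small overdetermined system: 3 equations, 2 unknowns, full column rank.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal-equations formula, valid when A^T A is invertible.
x_normal = np.linalg.inv(A.T @ A) @ A.T @ b

# The Moore-Penrose pseudoinverse, defined for any matrix.
x_pinv = np.linalg.pinv(A) @ b

assert np.allclose(x_normal, x_pinv)
```

When A^TA is singular, the first formula breaks down but `np.linalg.pinv` still returns A^+.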

It is not my intention to get deep into the theory of pseudoinverses, but I do want to point out some important properties. (See Wikipedia for more detail.)

  • The system Ax=b has at least one solution if and only if AA^+b=b.
  • If a solution exists, x=A^+b is the solution with smallest Euclidean norm. (i.e. it is closest to the origin.)
  • If a solution exists, one can obtain all solutions by x=A^+b+(I-A^+A)w for an arbitrary vector w.  
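The three properties above can be verified numerically on a small underdetermined system; this is a sketch in NumPy (again, a library choice of mine, not the post's):

```python
import numpy as np

# A wide system: 2 equations, 3 unknowns, so the solution set is a line.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])
A_pinv = np.linalg.pinv(A)

# Property 1: Ax=b is solvable if and only if A A^+ b = b.
assert np.allclose(A @ A_pinv @ b, b)

# Property 2: x = A^+ b is a solution (and has minimal Euclidean norm).
x_min = A_pinv @ b
assert np.allclose(A @ x_min, b)

# Property 3: every x = A^+ b + (I - A^+ A) w is also a solution.
rng = np.random.default_rng(0)
w = rng.standard_normal(3)
x_other = x_min + (np.eye(3) - A_pinv @ A) @ w
assert np.allclose(A @ x_other, b)

# The minimum-norm solution is no longer than any other solution.
assert np.linalg.norm(x_min) <= np.linalg.norm(x_other) + 1e-12
```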

Calculation of pseudoinverses is widely supported in software libraries. For example, in C# I use the Math.NET Numerics package, which provides a convenient way to solve linear systems.

Instead of finding the solution closest to the origin, what if you want to find the solution closest to some other point x_0?

Exercise: Let A be an m\times n matrix. What is a method for finding the solution to a system of equations Ax=b that is closest to a given x_0\in\mathbb R^n?

Solution: Subtract Ax_0 from both sides of the equation to get: A(x-x_0)=b-Ax_0.

Then x-x_0=A^+(b-Ax_0) is the choice for which x-x_0 is closest to 0. Thus x=A^+(b-Ax_0)+x_0 is the solution closest to x_0. \Box
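The shift-and-solve recipe is two lines in NumPy (a sketch; the example system and x_0 are my own):

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])
x0 = np.array([1.0, 1.0, 1.0])

A_pinv = np.linalg.pinv(A)
# Shift by x0, take the minimum-norm solution of the shifted system,
# then shift back.
x = A_pinv @ (b - A @ x0) + x0

assert np.allclose(A @ x, b)
# x is at least as close to x0 as the minimum-norm solution is.
x_min = A_pinv @ b
assert np.linalg.norm(x - x0) <= np.linalg.norm(x_min - x0) + 1e-12
```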

We have just solved a simple quadratic optimization problem: minimizing the distance from a given point to the solution set of a linear system.

Let us consider a slightly more complex question. 

Problem: Minimize the distance from a point x_0 to a solution of Ax=b, subject to each coordinate of x being nonnegative.

This is a more difficult question, and even the simpler feasibility question is not so obvious:

Feasibility Problem: Does there exist a solution to Ax=b where each coordinate of x is nonnegative?

The feasibility problem can be solved using the technique of linear programming, which for our purposes means that we want to minimize some linear objective function f\colon \mathbb R^n\to\mathbb R subject to the constraints Ax=b and x\geq 0. (Although it looks like a typo, the convention is that x\geq0 means each coordinate of x is nonnegative.) At first glance, this seems like a harder problem than determining feasibility, but bear with me. Once we have set up the algorithm for solving linear programs of the above form, we will be able to easily apply it to the feasibility issue.
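We will build up the algorithm in the next post, but as a preview, off-the-shelf LP solvers can already answer the feasibility question: minimize the zero objective subject to Ax=b and x\geq 0, and check whether the solver finds any feasible point. A sketch using SciPy's `linprog` (my choice of solver, not the post's):

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

# Zero objective: we only care whether Ax=b, x>=0 can be satisfied.
# linprog's default variable bounds are already x >= 0.
res = linprog(c=np.zeros(A.shape[1]), A_eq=A, b_eq=b)

feasible = (res.status == 0)  # status 0: an optimal (hence feasible) point was found
if feasible:
    assert np.allclose(A @ res.x, b)
    assert np.all(res.x >= -1e-9)
```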

In order to make these posts as readable as possible, I'll stop here and continue in the next post to talk about linear programs.
