Tuesday, July 12, 2022

Linear programming and the test for optimality.

[Note: I am using Linear Programming by Daniel Solow as a reference for this and future posts. This is an economical Dover Book which I highly recommend!]

Our goal in this post is to describe an algorithm for solving a particular type of linear program, one which is an optimization problem of the following form:

  • Minimize $f(x)=c\cdot x$ 
  • Subject to
    • $Ax=b$
    • $x\geq 0$
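Before getting into the simplex method itself, it may help to see this standard form as data. Below is a minimal sanity-check sketch using SciPy's off-the-shelf `linprog` solver (not the simplex implementation we are about to develop); the particular $A$, $b$, and $c$ are made up purely to illustrate the shapes involved.

```python
import numpy as np
from scipy.optimize import linprog

# A made-up standard-form instance: minimize c.x subject to Ax = b, x >= 0,
# with m = 2 constraints and n = 4 variables.
c = np.array([1.0, 2.0, 0.0, 0.0])
A = np.array([[1.0,  1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])
b = np.array([4.0, 2.0])

# bounds=(0, None) encodes x >= 0 for every variable.
result = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs")
print(result.x, result.fun)  # since c >= 0 here, the optimum is x = (0, 0, 4, 2) with value 0
```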

Let's take a step back and think about this problem for a second. $Ax=b$ defines an affine subspace of $\mathbb R^n$, and we are considering its intersection with the nonnegative orthant. This gives us some higher dimensional polytope, and we are trying to identify a point that minimizes $f$. Since $f$ is a linear function, if a minimum exists, it is attained at one of the vertices of this polytope. Unfortunately, a polytope can have exponentially many vertices in the number of constraints, so any method that seeks to enumerate all vertices first is going to be very slow.

I am going to outline a method, due to Dantzig, called the simplex method. There are two ingredients of this method that make it work:

  • There is a simple test to check if a point is optimal. So if you happen to land on an optimal point, you can easily check if you are done.
  • If you are not on an optimal point, you can find a direction where $f$ decreases, and walk along an edge of the polytope to find a new point. 
You do need an initial vertex to get the algorithm started (more on that later), but once you have it, you can keep walking along edges, decreasing $f$ each time, until you find the minimum.
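Schematically, the whole algorithm is just a loop around these two ingredients. The sketch below is only an outline with assumed names: `is_optimal` stands for the test developed in this post, and `improve`, the step that walks along an edge to a better vertex, is left abstract since it is the subject of the next post.

```python
def simplex(A, b, c, initial_bfs, is_optimal, improve):
    """Outline of the simplex loop (not a complete implementation).

    `initial_bfs` is a starting vertex, `is_optimal` is the optimality test
    discussed below, and `improve` walks along an edge of the polytope to a
    vertex with a smaller objective value (covered in the next post).
    """
    x = initial_bfs
    while not is_optimal(A, b, c, x):
        x = improve(A, b, c, x)  # decreases c . x at each step
    return x
```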

Next, let's think about what a vertex of this polytope is combinatorially. To begin, we will assume that the system $Ax=b$ is generic in the sense that the rows of $A$ are linearly independent. That means it carves out a codimension $m$ affine subspace of $\mathbb R^n$. Generically, this will intersect an $m$-dimensional coordinate subspace in a single point, so the corners of our polytope will come from setting $n-m$ coordinates equal to $0$. And indeed, if we set $n-m$ variables equal to $0$, we are left with the same number of unknowns as equations, and generically we expect there to be a unique solution. This motivates the definition of what is known as a "basic feasible solution", or bfs for short.

Definition: A basic feasible solution (bfs) to the linear system $Ax=b, x\geq 0$, is a vector $x$ for which
  • there is a nonsingular $m\times m$ submatrix $B$ of $A$, formed by choosing $m$ columns (the remaining columns forming the submatrix $N$), such that
  • $x_B=B^{-1}b$,
  • $x_B\geq 0$, and
  • $x_N=0$.

Here $x_B$ and $x_N$ are the subvectors of $x$ consisting of the coordinates corresponding to the columns of $B$ and $N$ respectively.

Again, a bfs is generically nothing more than a vertex of the solution polytope in the nonnegative orthant. If the rows of $A$ are not linearly independent, one could first row reduce the system and throw out the zero rows. There are other options available as well for producing a system with an initial bfs.
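To make the definition concrete, here is a small numpy sketch that computes the basic solution attached to a chosen set of basis columns and checks whether it is feasible; the function name and interface are placeholders chosen just for illustration.

```python
import numpy as np

def basic_solution(A, b, basis):
    """Compute the basic solution for a given list of m column indices.

    Returns the vector x with x_B = B^{-1} b on the basis coordinates and
    x_N = 0 elsewhere, or None if x_B has a negative entry (basic but not
    feasible).  Assumes A[:, basis] is nonsingular.
    """
    n = A.shape[1]
    B = A[:, basis]
    x_B = np.linalg.solve(B, b)   # x_B = B^{-1} b
    if np.any(x_B < 0):
        return None               # violates x >= 0, so not a bfs
    x = np.zeros(n)
    x[basis] = x_B                # nonbasic coordinates stay at 0
    return x
```

For instance, for the example at the end of this post, `basic_solution(np.array([[1., 1., 1.]]), np.array([1.]), [0])` returns the bfs $(1,0,0)$.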

Let us return now to the two ingredients I mentioned above, a test for optimality, and a method to determine a direction to walk if the current point is not optimal.

Test for optimality: Consider the linear program to minimize $f(x)=c\cdot x$ subject to $Ax=b, x\geq 0$. Let $x^*$ be a bfs with $(x_B^*,x_N^*)=(B^{-1}b,0)\geq 0$. If $$c_N-c_BB^{-1}N\geq0$$ then $x^*$ is optimal for this LP.

Proof:

Let $x=(x_B,x_N)$ be some other feasible solution. We want to show $c\cdot x\geq c\cdot x^*$.

$$ c\cdot(x-x^*) = c_B(x_B-x_B^*)+c_N(x_N-x_N^*) $$

$$\hspace{2em}=c_B[B^{-1}(b-Nx_N)-B^{-1}b]+c_N(x_N-0)$$

Here we use that $x$ is feasible, so $Ax=Bx_B+Nx_N=b$, and solving for $x_B$, we get $x_B=B^{-1}(b-Nx_N)$.

Continuing,

$$c_B[B^{-1}(b-Nx_N)-B^{-1}b]+c_N(x_N-0)=c_B[B^{-1}b-B^{-1}Nx_N-B^{-1}b]+c_Nx_N$$

$$\hspace{4em}=-c_BB^{-1}Nx_N+c_Nx_N$$

$$\hspace{4em}=(c_N-c_BB^{-1}N)x_N.$$

So to reiterate, for any feasible solution $x$, and any bfs $x^*$, we have the equation

$$c\cdot(x-x^*)=(c_N-c_BB^{-1}N)x_N.$$

By hypothesis, $(c_N-c_BB^{-1}N)\geq 0$, and by feasibility $x_N\geq 0$. So if $x^*$ passes the optimality test, then $c\cdot x\geq c\cdot x^*$ for any feasible solution. Hence it really is optimal. $\Box$
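The quantity $c_N-c_BB^{-1}N$ appearing in the test (often called the vector of reduced costs) is easy to compute. Here is a sketch in numpy; the function names are mine, but the formula is exactly the one in the statement of the test.

```python
import numpy as np

def reduced_costs(A, c, basis):
    """Return c_N - c_B B^{-1} N for the given list of basis column indices."""
    n = A.shape[1]
    nonbasis = [j for j in range(n) if j not in basis]
    B, N = A[:, basis], A[:, nonbasis]
    c_B, c_N = c[basis], c[nonbasis]
    # Compute c_B B^{-1} as the solution y of B^T y = c_B, so that y @ N = c_B B^{-1} N;
    # this avoids forming B^{-1} explicitly.
    y = np.linalg.solve(B.T, c_B)
    return c_N - y @ N

def passes_optimality_test(A, c, basis, tol=1e-9):
    """The bfs determined by `basis` is optimal if every reduced cost is >= 0."""
    return bool(np.all(reduced_costs(A, c, basis) >= -tol))
```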

Now, what about the converse? Can we show that if a feasible solution is optimal, then it is a basic feasible solution satisfying the optimality test? This is a good question which we will return to later. To get some intuition, let's do an example.

Example: Minimize $x+2y+3z,$ subject to $x+y+z=1$ and $x,y,z\geq0$. 

There are $3$ basic feasible solutions: $(1,0,0)$, $(0,1,0)$, and $(0,0,1)$. In each case $B$ is the $1\times 1$ matrix with a $1$ in it, so $B=B^{-1}=(1)$, while $N=(1,1)$. For the first basic feasible solution we have $c_N=(2,3)$ and $c_B=(1)$. So we get

$$c_N-c_BB^{-1}N=(2,3)-(1)(1)(1,1)=(1,2)\geq 0.$$

So indeed the bfs with $y=z=0$, namely $(1,0,0)$, is optimal. It is interesting to note that the test does not involve $b$ at all, so the same computation applies even in the degenerate case: if the right-hand side were $0$ instead of $1$, the only feasible point would be $x=y=z=0$, a degenerate bfs, and it would still be certified optimal.
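As a quick numerical check of this example, the sketch below recomputes $c_N-c_BB^{-1}N$ directly for each of the three possible bases.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])   # the constraint x + y + z = 1
b = np.array([1.0])
c = np.array([1.0, 2.0, 3.0])     # the objective x + 2y + 3z

for basis in ([0], [1], [2]):     # the three possible 1x1 bases
    nonbasis = [j for j in range(3) if j not in basis]
    B, N = A[:, basis], A[:, nonbasis]
    x_B = np.linalg.solve(B, b)                               # B^{-1} b
    reduced = c[nonbasis] - c[basis] @ np.linalg.solve(B, N)  # c_N - c_B B^{-1} N
    print(basis, x_B, reduced)

# Only the first basis (x basic, y = z = 0) gives reduced costs (1, 2) >= 0,
# matching the computation above.
```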

This seems like a good place to stop for this post. In the next post, we'll look at how to find an improved bfs if your existing bfs fails the test for optimality.


