LP 1: Constrained Optimization

Packages Used: No new packages are used in this notebook.

1. Unconstrained Linear Optimization

Ex 1: Coffee and Peppers Arbitrage Opportunity

Question: You are planning an international trip and have the opportunity to purchase some estate-grown coffee and high-quality dried peppers while there that you can sell upon your return to a friend who supplies several local restaurants. You can sell the coffee and peppers to your friend for \$6 and \$15 per pound, respectively, and can purchase them at your destination for \$2 and \$6 per pound, respectively. How many pounds of coffee and peppers should you purchase?

Solution Approach:

The objective is to maximize the total profit. Unit profits for coffee and peppers are $ \pi_1 = \left( p_1 - c_1 \right) = 6 - 2 = \$4 $ and $ \pi_2 = \left( p_2 - c_2 \right) = 15 - 6 = \$9 $ per pound, respectively, so total profit is the linear function

$ \quad \mbox{Total profit:} \quad \mathit{\Pi}(x_1, x_2) = \pi_1 x_1 + \pi_2 x_2 = 4x_1 + 9x_2$,

where the decision variables $x_1$ and $x_2$ are pounds of coffee and peppers purchased, respectively. Since no limit is given on the amount of coffee and peppers that can be purchased and sold, the optimal answer is to purchase as much of each as you can since the gradient is always positive:

$\quad \nabla\!\mathit{\Pi}(x_1,x_2) = \left[ \begin{array}{c} 4 \\ 9 \end{array} \right] $

Limits: If limits were placed on each item, then the optimal solution would be to purchase the limit amounts of each.

Question: Why does this simple procedure work in this case, while more complicated analytical or numerical procedures were needed for the other examples considered so far?

Ex 2: Unconstrained Nonlinear Optimization

If possible optimal solutions must remain within certain limits, then it is necessary to include those limits as constraints in the optimization process.

2. Constrained Optimization

Constraints on the feasibility of a solution can be incorporated into an optimization problem in two ways:

Polytope Region Penalty Method

A polytope (or polygon in 2-D) can be used to indicate feasibility, either requiring feasible solutions to be inside or outside of the polytope. A convex polytope can be represented by a series of linear inequalities: each hyperplane $ \mathbf{a}_i \mathbf{x} = b_i $ (line $i$ of a polygon) bounds half-space $i$ of the polytope. The inequality $ \mathbf{a}_i \mathbf{x} \le b_i $ indicates points on or below hyperplane $i$, and $ \mathbf{a}_i \mathbf{x} \ge b_i \implies -\mathbf{a}_i \mathbf{x} \le -b_i $ indicates points on or above it. Each half-space can be represented as a row of a matrix $\mathbf{A} = \left[ \mathbf{a}_i \right]$ and as an element of a vector $\mathbf{b} = \left[ b_i \right]$, allowing the boolean-valued indicator function $S$ to be defined to determine if a point $\mathbf{x}$ is on or inside of the polytope:

$ \quad \mbox{In polytope: } S(\mathbf{x}) = \begin{cases} \mathsf{true}, & \mbox{if } \mathbf{Ax \le b} \\ \mathsf{false}, & \mbox{otherwise } \end{cases} $
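As a sketch (assuming NumPy, which earlier notebooks already use; the helper name `make_indicator` is hypothetical), the indicator function $S$ reduces to a single row-wise comparison:

```python
import numpy as np

def make_indicator(A, b):
    """Return S(x): True iff A @ x <= b holds for every row, i.e., x is on or inside the polytope."""
    A, b = np.asarray(A, dtype=float), np.asarray(b, dtype=float)
    def S(x):
        return bool(np.all(A @ np.asarray(x, dtype=float) <= b))
    return S

# Rectangle 3 <= x1 <= 6, 1 <= x2 <= 4 (used in the next example)
S = make_indicator([[1, 0], [0, 1], [-1, 0], [0, -1]], [6, 4, -3, -1])
print(S([4, 2]))   # True: inside the rectangle
print(S([0, 0]))   # False: the row 0 <= -3 fails
```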

Ex 3: Constrain Minimum-distance Location to be Outside of Region

Any minimum-distance location is restricted to being outside of a polygon region. The region could represent, for example, a small lake, where it would not be feasible to locate. In the following example, the infeasible region is a rectangle defined by x- and y-axis upper and lower bounds:

$ \begin{eqnarray*} \quad \textit{x}\textrm{-axis UB:}\quad x_1 \le 6 &\;\Leftrightarrow\;& 1x_1 + 0x_2 \le 6 \;\Leftrightarrow\; a_{1,1}x_1 + a_{1,2}x_2 \le b_1 \;\Leftrightarrow\; \mathbf{a}_1 \mathbf{x} \le b_1 \\ \textit{y}\textrm{-axis UB:}\quad x_2 \le 4 &\;\Leftrightarrow\;& 0x_1 + 1x_2 \le 4 \;\Leftrightarrow\; a_{2,1}x_1 + a_{2,2}x_2 \le b_2 \;\Leftrightarrow\; \mathbf{a}_2 \mathbf{x} \le b_2 \\ \textit{x}\textrm{-axis LB:}\quad x_1 \ge 3 &\;\Leftrightarrow\;& -1x_1 + 0x_2 \le -3 \;\Leftrightarrow\; a_{3,1}x_1 + a_{3,2}x_2 \le b_3 \;\Leftrightarrow\; \mathbf{a}_3 \mathbf{x} \le b_3 \\ \textit{y}\textrm{-axis LB:}\quad x_2 \ge 1 &\;\Leftrightarrow\;& 0x_1 - 1x_2 \le -1 \;\Leftrightarrow\; a_{4,1}x_1 + a_{4,2}x_2 \le b_4 \;\Leftrightarrow\; \mathbf{a}_4 \mathbf{x} \le b_4 \\ \end{eqnarray*} $

$\quad\mathbf{Ax \le b}: \quad \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \\ a_{4,1} & a_{4,2} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ -1 & 0 \\ 0 & -1 \end{array} \right] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \left[ \begin{array}{r} 6 \\ 4 \\ -3 \\ -1 \end{array} \right]$

First, solve for the unconstrained location:
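The original code cells are not reproduced here; a minimal sketch of the unconstrained solve, assuming SciPy's Nelder-Mead and a hypothetical set of points `P` (the notebook's actual data is not shown), might look like:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical demand points P (placeholder for the notebook's data)
P = np.array([[1., 2.], [2., 1.], [8., 2.], [2., 5.]])

def dist2(x):
    """Total Euclidean distance from location x to every point in P."""
    return np.sum(np.sqrt(np.sum((P - x) ** 2, axis=1)))

# Start the search at the mean of P
res = minimize(dist2, P.mean(axis=0), method="Nelder-Mead")
print(res.x, res.fun)   # unconstrained minimum-distance location
```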

Next, modify the distance function dist2 by using the indicator function S to return a penalty cost of infinity (Inf) for any location in the infeasible region since this is a minimization problem with the constraint that the solution is outside of the region; otherwise, just return the distance:
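A sketch of this modification (again with hypothetical points `P`; the rectangle is the one defined above):

```python
import numpy as np
from scipy.optimize import minimize

P = np.array([[1., 2.], [2., 1.], [8., 2.], [2., 5.]])    # hypothetical points
A = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.]])  # infeasible rectangle
b = np.array([6., 4., -3., -1.])                          # 3 <= x1 <= 6, 1 <= x2 <= 4

def S(x):
    return bool(np.all(A @ np.asarray(x, dtype=float) <= b))

def dist2(x):
    return np.sum(np.sqrt(np.sum((P - x) ** 2, axis=1)))

def dist2_pen(x):
    # Inf penalty for any location on or inside the infeasible rectangle
    return np.inf if S(x) else dist2(x)

# start from a feasible point, not the (infeasible) mean of P
res = minimize(dist2_pen, [0., 0.], method="Nelder-Mead")
print(res.x, res.fun)
```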

The mean of P is inside of the infeasible region. If this mean is used as the starting point for the optimization (see white circle inside the region, below), the Nelder-Mead procedure stops after four function calls and no iterations (but still reports success!); this is because infinity is a hard penalty that cannot be improved. Instead, one needs to either make sure the starting point is not inside the region or, if it is, that a soft penalty is used so that progress can be made to a feasible point:
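One possible soft penalty (a sketch with hypothetical points `P` and a hand-picked penalty weight `mu`, neither of which is specified in the source) charges a cost proportional to how deep a point sits inside the region, so the simplex can still make progress toward the boundary:

```python
import numpy as np
from scipy.optimize import minimize

P = np.array([[1., 2.], [2., 1.], [8., 2.], [2., 5.]])    # hypothetical points
A = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.]])  # infeasible rectangle
b = np.array([6., 4., -3., -1.])                          # 3 <= x1 <= 6, 1 <= x2 <= 4

def dist2(x):
    return np.sum(np.sqrt(np.sum((P - x) ** 2, axis=1)))

def dist2_soft(x, mu=1e3):
    # depth > 0 only when x is strictly inside the rectangle; because the
    # penalty varies with depth, nearby points have different costs and
    # the search can "feel" its way out of the region
    depth = np.min(b - A @ np.asarray(x, dtype=float))
    return dist2(x) + mu * max(depth, 0.0)

# the mean of P lies inside the infeasible region, yet the search escapes it
res = minimize(dist2_soft, P.mean(axis=0), method="Nelder-Mead")
print(res.x, res.fun)
```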

Ex 4: Maximizing the Minimum Distance

A finite region surrounding a set of points P is needed to restrict possible solutions that maximize the minimum distance to the closest point in P; otherwise, the optimal solution is unbounded and Nelder-Mead fails to find a feasible solution:

Restricting possible solutions to a finite region defined by $-2 \le x \le 9$ and $-2 \le y \le 8$ results in Nelder-Mead being able to find feasible solutions; in this case, there are multiple local optima that can be reached from different starting points:

Similar to what was done in the previous example, the distance function dist2 is modified, but now the indicator function S is used to return a penalty cost of -Inf for any location outside of the feasible region since this is a maximization problem with the constraint that the solution be inside of the region; otherwise, just return the distance:
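A sketch of this variant (hypothetical points `P`; since SciPy minimizes, the -Inf penalty for the maximization is expressed as +Inf on the negated objective):

```python
import numpy as np
from scipy.optimize import minimize

P = np.array([[1., 2.], [2., 1.], [8., 2.], [2., 5.]])    # hypothetical points
A = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.]])  # feasible box
b = np.array([9., 8., 2., 2.])                            # -2 <= x1 <= 9, -2 <= x2 <= 8

def S(x):
    return bool(np.all(A @ np.asarray(x, dtype=float) <= b))

def neg_min_dist(x):
    # -Inf penalty for the maximization becomes +Inf for the minimizer
    if not S(x):
        return np.inf
    return -np.min(np.sqrt(np.sum((P - x) ** 2, axis=1)))

res = minimize(neg_min_dist, [0.5, 0.5], method="Nelder-Mead")
print(res.x, -res.fun)   # a local maximin location and its minimum distance
```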

3. Penalty-Constrained Optimization vs Linear Programming

Ex 5: Constrained Coffee and Peppers Arbitrage Opportunity

Question: You are planning an international trip and have the opportunity to purchase some estate-grown coffee and high-quality dried peppers while there that you can sell upon your return to a friend who supplies several local restaurants. You can sell the coffee and peppers to your friend for \$6 and \$15 per pound, respectively, and can purchase them at your destination for \$2 and \$6 per pound, respectively. You have a budget of \$120 to spend on these purchases, and you have estimated that you will be able to pack up to 30 pounds, in total, of both items in your knapsack since there is a maximum weight limit of 50 pounds on the one allowed checked bag and your clothes and the knapsack together weigh 20 pounds. How many pounds of coffee and peppers should you purchase?

Solution Approach:

The objective is to maximize the total profit from a mix of both items. Unit profits for coffee and peppers are $ \pi_1 = \left( p_1 - c_1 \right) = 6 - 2 = \$4 $ and $ \pi_2 = \left( p_2 - c_2 \right) = 15 - 6 = \$9 $ per pound, respectively, so total profit is the linear function

$ \quad \mbox{Total profit:} \quad \mathit{\Pi} = \pi_1 x_1 + \pi_2 x_2 = 4x_1 + 9x_2$,

where the decision variables $x_1$ and $x_2$ are pounds of coffee and peppers purchased, respectively. The budget and weight constraints define a polygonal region of feasible solutions:

$ \begin{eqnarray*} \quad\mbox{Budget constraint:} \quad \mathbf{a}_1 \mathbf{x} \le b_1 &\;\Leftrightarrow\;& c_1 x_1 + c_2 x_2 \le 120 \;\Leftrightarrow\; 2x_1 + 6x_2 \le 120 \\ \mbox{Weight constraint:} \quad \mathbf{a}_2 \mathbf{x} \le b_2 &\;\Leftrightarrow\;& x_1 + x_2 \le 30 \\ \mathbf{Ax \le b} &\;\Leftrightarrow\;& \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \;\Leftrightarrow\; \begin{bmatrix} 2 & 6 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} 120 \\ 30 \end{bmatrix} \end{eqnarray*}$

Also, the two rows $ \begin{bmatrix} -1 & 0 \end{bmatrix} $ and $ \begin{bmatrix} 0 & -1 \end{bmatrix} $ can be added to $\mathbf{A}$ and two $0$'s to $\mathbf{b}$ to represent the lower-bound constraints $x_1 \ge 0$ and $x_2 \ge 0$.

Display Feasible Region and Constraints
Graphical Solution

The display of the feasible region and the contour plot indicates that the values of the decision variables that maximize total profit are at the intersection of the Budget and Weight constraints. To solve for their intersection:
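The intersection of the two binding constraints is the solution of a 2×2 linear system (a sketch assuming NumPy):

```python
import numpy as np

# Budget and Weight constraints, both holding with equality at the optimum
A_act = np.array([[2., 6.],
                  [1., 1.]])
b_act = np.array([120., 30.])

x_opt = np.linalg.solve(A_act, b_act)
print(x_opt)                         # [15. 15.]
print(4 * x_opt[0] + 9 * x_opt[1])   # 195.0 -> maximum total profit
```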

Penalty-Constrained Optimization

Using the matrix $A$ and vector $b$ to define the feasible region, modify the objective function fobj by using the indicator function S to return a penalty cost of infinity for any point outside the feasible region:
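A sketch of the penalized objective (assuming SciPy; profit is maximized by minimizing its negative):

```python
import numpy as np
from scipy.optimize import minimize

# feasible region: budget, weight, and nonnegativity constraints as A x <= b
A = np.array([[2., 6.], [1., 1.], [-1., 0.], [0., -1.]])
b = np.array([120., 30., 0., 0.])

def S(x):
    return bool(np.all(A @ np.asarray(x, dtype=float) <= b))

def fobj(x):
    # negative profit to minimize; Inf penalty outside the feasible region
    return -(4 * x[0] + 9 * x[1]) if S(x) else np.inf

res = minimize(fobj, [0., 0.], method="Nelder-Mead")
print(res.x, -res.fun)   # compare against the known optimum [15, 15] with profit 195
```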

Starting at point [0, 0], Nelder-Mead terminates successfully, but it has not found the known (global) optimal solution at [15, 15], which corresponds to a total profit of \$195:

L-BFGS, which uses finite differences to estimate the gradient, did not work, even though it reports "success"! One can try tightening the tolerance used by Nelder-Mead to decide when to terminate (from 1e-8 to 1e-12), but the only result is that the number of iterations increases while the solution stays the same:

Using different starting points that are located more towards the interior of the feasible region, the known optimal solution is found, but the number of iterations required by Nelder-Mead increases even though the starting point is getting closer to the optimal point at [15, 15]:

Linear Programming

Clearly, Nelder-Mead and gradient-based methods are not effective procedures for solving problems with a linear objective function and linear constraints. Luckily, there are specialized procedures like the simplex method (1947) for solving these problems effectively.


The basic idea of the simplex method is as follows:

  1. Know (i.e., can prove) that the (global) optimal solution exists at one of the corner points of the feasible region.
  2. Pick a starting corner point.
  3. If the current corner point is optimal, STOP; otherwise,
  4. move to an adjacent corner point that improves the objective function and then repeat Step 3.

The procedure will eventually terminate at the optimal corner point as long as the feasible region is bounded; otherwise, it will report that the problem is unbounded and that no finite optimal solution exists. It will always terminate because, at each corner point considered, the objective is strictly improving (e.g., strictly increasing if it is a maximization and strictly decreasing if it is a minimization problem).

Using the Simplex Method

The simplex method utilizes the same matrix $A$ and vector $b$ that were used by the penalty method to define the feasible region, except that the lower bounds on the decision variables can be removed because they are assumed to be greater-than or equal to zero by default:

$ \begin{eqnarray*} \quad \mbox{Maximize} \quad 4x_1 + 9x_2 \\ \quad \mbox{subject to} \quad 2x_1 + 6x_2 &\le& 120 \\ x_1 + x_2 &\le& 30 \\ x_i &\ge& 0 \mbox{,} \end{eqnarray*} $
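For reference, SciPy's `linprog` can solve this LP directly; it minimizes by convention, so the profit coefficients are negated, and the bounds $x_i \ge 0$ are its default:

```python
from scipy.optimize import linprog

res = linprog(c=[-4, -9],               # negate to maximize 4*x1 + 9*x2
              A_ub=[[2, 6], [1, 1]],
              b_ub=[120, 30])
print(res.x)      # [15. 15.]
print(-res.fun)   # 195.0
```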

Linear algebra is used to solve a series of equations and, as such, each (inequality) constraint must first be transformed into an equation by adding a slack variable, converting the problem to 4-D:

$ \begin{eqnarray*} \quad \mbox{Maximize} \quad 4x_1 + 9x_2 + 0x_3 + 0x_4 \\ \quad \mbox{subject to} \quad 2x_1 + 6x_2 + x_3 &=& 120 \\ x_1 + x_2 + x_4 &=& 30 \\ x_i &\ge& 0 \mbox{.} \end{eqnarray*} $

Pick a starting corner point: Since this is a maximization problem and the point (0,0) is part of the feasible region, the two additional slack variables make it easy to define an initial basic feasible solution (0, 0, 120, 30) for the 4-D problem:
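This starting point can be verified directly (a sketch assuming NumPy): with $x_1 = x_2 = 0$, the slack variables absorb the entire right-hand side:

```python
import numpy as np

# standard-form constraint matrix with slack columns for x3 and x4
A_std = np.array([[2., 6., 1., 0.],
                  [1., 1., 0., 1.]])
b = np.array([120., 30.])

x0 = np.array([0., 0., 120., 30.])   # initial basic feasible solution (all-slack basis)
print(np.allclose(A_std @ x0, b))    # True: both equations hold
print(bool(np.all(x0 >= 0)))         # True: all variables are nonnegative
```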

The two variables $x_3$ and $x_4$ define the initial basis; in the final optimal solution, the simplex method will have moved both of these variables out of the basis so that only the real, non-slack variables define the basis. The following procedure is termed the revised simplex method because it is more computationally efficient than the original simplex method (in particular, the basis inverse can be updated incrementally rather than recomputed from scratch at each step).

Check if optimal; otherwise move to new corner point:

Putting all of these steps into a single procedure:
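The notebook's actual implementation is not reproduced here; the following is a compact sketch of the revised simplex method for a maximization in the form $\max\; \mathbf{c}\mathbf{x}$ subject to $\mathbf{Ax \le b}$, $\mathbf{x \ge 0}$ with $\mathbf{b \ge 0}$. For clarity it recomputes the basis inverse each iteration, whereas a production implementation would update it incrementally:

```python
import numpy as np

def revised_simplex(c, A, b, tol=1e-9):
    """Maximize c @ x subject to A x <= b, x >= 0, assuming b >= 0 so that
    the all-slack basis is an initial basic feasible solution."""
    A, b, c = np.asarray(A, float), np.asarray(b, float), np.asarray(c, float)
    m, n = A.shape
    A_std = np.hstack([A, np.eye(m)])            # one slack column per constraint
    c_std = np.concatenate([c, np.zeros(m)])     # slacks earn zero profit
    basis = list(range(n, n + m))                # start at corner point (0, ..., 0)
    while True:
        B_inv = np.linalg.inv(A_std[:, basis])   # recomputed here for clarity only
        x_B = B_inv @ b                          # current basic variable values
        y = c_std[basis] @ B_inv                 # simplex multipliers
        reduced = c_std - y @ A_std              # reduced costs of every column
        j = int(np.argmax(reduced))              # most-improving entering column
        if reduced[j] <= tol:                    # no improving column: optimal
            x = np.zeros(n + m)
            x[basis] = x_B
            return x[:n], float(c @ x[:n])
        d = B_inv @ A_std[:, j]                  # entering column in the basis frame
        if np.all(d <= tol):
            raise ValueError("problem is unbounded")
        # ratio test: the tightest row limits how far x_j can increase
        ratios = np.where(d > tol, x_B / np.where(d > tol, d, 1.0), np.inf)
        basis[int(np.argmin(ratios))] = j        # pivot: entering replaces leaving

x, z = revised_simplex([4, 9], [[2, 6], [1, 1]], [120, 30])
print(x, z)   # [15. 15.] 195.0
```

On the arbitrage problem this takes two pivots: $x_2$ enters first (largest reduced cost, 9), then $x_1$, ending with both slacks out of the basis as described above.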

Revised Simplex Method