NL Opt 1: Univariate Optimization

Packages Used: Functions from the following package are used in this notebook for the first time: Optim

1. Nonlinear Optimization

Nonlinear optimization refers to the problem of optimizing an objective function whose arguments are points in a subset of Euclidean space. Unconstrained nonlinear optimization is when there are no constraints on the possible solutions; when there are constraints, the problem is more commonly referred to as nonlinear programming. Linear programming is when the objective function and constraints are all linear, and mathematical programming refers collectively to linear and nonlinear programming. Univariate optimization, the optimization of a function of a single variable on a bounded interval, is the most basic type of nonlinear optimization and is used as a building block in most nonlinear programming procedures.

Univariate Optimization

$ \begin{eqnarray*} x^* &=& \mathrm{arg}\underset{x}{\operatorname{min}} \bigl\{ \, f(x) : L\!B \leq x \leq U\!B \bigr\} \\ T\!C^* &=& f(x^*) \end{eqnarray*}$

Convex vs Nonconvex Functions

[Figure: examples of convex and nonconvex functions]

2. User-Defined Functions

User-defined functions can be used to extend the capabilities of Julia beyond its basic functions. Although developing a set of functions to solve a particular problem is at the heart of using Julia to program, the easiest way to create a function is to do it in an incremental, scripting manner: write each line, execute it, and, only if it works, copy it into the function.

Ex 1: Distance Function

Given a 1-D point x and m other 1-D points in p, create a function dist1 to determine the distance d from x to each of the m points in p:

$ \begin{array}{l} \quad\bf{d} = \left[ \begin{array}{c} \lvert x - p_{1} \rvert \\ \vdots \\ \lvert x - p_{m} \rvert \end{array} \right] \end{array} $

The best way to start is to create some example data for which you know the correct answer:
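For example (a minimal sketch; the values of x and p are assumptions, chosen to match the points used in the later examples):

```julia
x = 3                # example 1-D point (assumed value)
p = [1, 2, 5, 9]     # m = 4 other 1-D points (assumed values)
```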

The first step is to subtract x from each row in p:
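Broadcasting makes the subtraction elementwise (continuing the assumed data above):

```julia
x .- p               # elementwise subtraction: [2, 1, -2, -6]
```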

Then take the absolute value of the result and assign the result to d:
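A sketch of this step:

```julia
d = abs.(x .- p)     # elementwise absolute value: [2, 1, 2, 6]
```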

A function can be created and then called to assign the result to d:
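A sketch of the function, wrapping the steps above:

```julia
function dist1(x, p)
    d = abs.(x .- p)   # distance from x to each point in p
    return d
end

d = dist1(x, p)        # [2, 1, 2, 6]
```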

The function dist1 can now be used inside of any expression; e.g., to calculate the sum of the distances from x to the points in p:
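For example:

```julia
sum(dist1(x, p))       # total distance: 2 + 1 + 2 + 6 = 11
```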

One-Line Functions

One-line functions provide a convenient means to define simple functions. The function dist1 can be defined on a single line:
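A one-line equivalent of the function above:

```julia
dist1(x, p) = abs.(x .- p)
```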

The sum of the distance from x to the points in p can be defined as the function sumdist1 with only x as an input argument and the current values of p used inside the function:
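A sketch (p is captured from the enclosing scope, so only x is an argument):

```julia
sumdist1(x) = sum(dist1(x, p))
```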

An array comprehension can be used to create an array of outputs from sumdist1:
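For example, evaluating sumdist1 at the integers 0 through 10 (the range is an assumption):

```julia
[sumdist1(x) for x in 0:10]
```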

The sum of the distances sumdist1 could have been defined directly without using dist1, but first defining dist1 allows other distance-related functions to be defined in addition to the summation:
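One possibility, assuming sum2dist1 (used below) is the sum of squared distances, matching Ex 4:

```julia
sum2dist1(x) = sum(dist1(x, p).^2)   # assumed definition
```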

Function Plotting

The function sum2dist1 of a single variable x can be plotted between LB and UB:
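A sketch, assuming the Plots package and the interval LB, UB = 0, 10 used in the later examples:

```julia
using Plots

LB, UB = 0, 10
plot(sum2dist1, LB, UB, xlabel="x", legend=false)
```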

Anonymous Functions

To avoid having to first define and name a function (like sum2dist1), an anonymous function can be used instead when the result is only going to be used once as an input argument inside another function:
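For example, the earlier plot can be produced without naming the function:

```julia
plot(x -> sum(dist1(x, p).^2), LB, UB, legend=false)
```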

3. Analytical Optimization

In some special cases, it is possible to determine optimal values for a function using analytical techniques, although numerical techniques work in most cases.

Ex 2: Local Optima and Inflection Points

$ \begin{eqnarray*} \quad y &=& x - x^3 \\ \dfrac{dy}{dx} &=& 1 - 3x^2 \\ \dfrac{d^{2}y}{dx^{2}} &=& -6x \\ x^* &=& \pm \sqrt{\dfrac{1}{3}} \end{eqnarray*}$
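The second derivative is negative at $+x^*$ (local maximum) and positive at $-x^*$ (local minimum). A quick numerical check (the function name y is an assumption):

```julia
y(x) = x - x^3
xstar = sqrt(1/3)
y(xstar), y(-xstar)    # ≈ (0.385, -0.385): local max, local min
```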

4. Derivative-free Numerical Minimization

All numerical optimization methods are designed to minimize a function $f(x)$ since the function can be maximized by minimizing its negation: $-f(x)$. These methods are only guaranteed to find the minimum for unimodal functions.

Interval Search

Interval search is a robust but slow method that successively tightens a search interval containing the minimum point until the width of the interval is less than some specified tolerance.
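A minimal sketch of such a search, assuming myintervalsearch places its two test points one-third in from each end of the interval:

```julia
function myintervalsearch(f, LB, UB; tol=1e-5)
    while UB - LB > tol
        x1 = LB + (UB - LB)/3     # one-third in from LB
        x2 = UB - (UB - LB)/3     # one-third in from UB
        if f(x1) < f(x2)
            UB = x2               # minimum cannot lie in (x2, UB]
        else
            LB = x1               # minimum cannot lie in [LB, x1)
        end
    end
    return (LB + UB)/2
end
```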

Ex 3: Unimodal Function
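Applying the search to a simple unimodal test function (the function is an assumption for illustration):

```julia
f(x) = (x - 2)^2 + 1          # unimodal on [0, 10], minimum at x = 2
myintervalsearch(f, 0, 10)    # ≈ 2.0
```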

Is it optimal to divide the search interval into thirds, as was done above in myintervalsearch? At each iteration, the interval is reduced by a third. It turns out, following Moler, that the maximum reduction is $\rho = 2 - \phi = (3 - \sqrt{5})/2 \approx 0.382$, where $\phi$ is the "golden ratio".

[Figure: golden-section reduction of the search interval [0, 1] with interior points x₁ and x₂]

This can be seen by setting the LB and UB of the interval to 0 and 1, respectively, and then determining what $\rho$ would maintain the following ratio as the interval is reduced:

$\quad \dfrac{1 - \rho}{1} = \dfrac{\rho}{1 - \rho}$,

or, cross-multiplying to get $(1 - \rho)^2 = \rho$ and rearranging,

$\quad \rho^2 - 3\rho + 1 = 0$.

Solving for $\rho$ gives

$\quad \rho = 2 - \phi = \dfrac{3 - \sqrt{5}}{2} \approx 0.382$,

where $\phi \approx 1.618$ is the golden ratio, so that

$ \begin{eqnarray*} \quad x_1 &=& 0 + \rho(1 - 0) = 0 + \dfrac{3 - \sqrt{5}}{2}(1 - 0) \\ x_2 &=& 0 + (1 - \rho)(1 - 0) = 0 + \dfrac{\sqrt{5} - 1}{2}(1 - 0) \end{eqnarray*}$

Picking values for x1 and x2 based on the golden ratio minimizes the number of iterations required in interval search:
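A sketch of golden section search, following the same conventions as myintervalsearch; each iteration reuses one interior point, so only a single new function evaluation is needed:

```julia
function goldensection(f, LB, UB; tol=1e-5)
    ρ = (3 - sqrt(5))/2                     # 2 - φ ≈ 0.382
    x1, x2 = LB + ρ*(UB - LB), UB - ρ*(UB - LB)
    f1, f2 = f(x1), f(x2)
    while UB - LB > tol
        if f1 < f2
            UB, x2, f2 = x2, x1, f1         # old x1 becomes new x2
            x1 = LB + ρ*(UB - LB)
            f1 = f(x1)
        else
            LB, x1, f1 = x1, x2, f2         # old x2 becomes new x1
            x2 = UB - ρ*(UB - LB)
            f2 = f(x2)
        end
    end
    return (LB + UB)/2
end

goldensection(f, 0, 10)    # ≈ 2.0 for the test function above
```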

Parabolic Interpolation

A parabola can be fit to any three non-collinear points on a function (termed a "triplet"), and then the minimum value of the fitted parabola can be used to replace one of the points to (hopefully) narrow the search. This can potentially require fewer function evaluations as compared to Golden Section Search.

Given three points $(x_1,y_1),(x_2,y_2),(x_3,y_3)$ on the function, three equations can be created and solved for the three coefficients of the parabola that passes through the points:

$ \begin{eqnarray*} \quad y_1 &=& \alpha_1 x_1^2 + \alpha_2 x_1 + \alpha_3 \\ \quad y_2 &=& \alpha_1 x_2^2 + \alpha_2 x_2 + \alpha_3 \\ \quad y_3 &=& \alpha_1 x_3^2 + \alpha_2 x_3 + \alpha_3 \end{eqnarray*}$

The minimum point of the resulting parabola is then $\dfrac{-\alpha_2}{2\alpha_1}$.
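A sketch of a single parabolic-interpolation step, solving the system above for the coefficients and returning the vertex:

```julia
function parabolamin(x1, y1, x2, y2, x3, y3)
    A = [x1^2 x1 1; x2^2 x2 1; x3^2 x3 1]   # one row per point
    α = A \ [y1, y2, y3]                    # coefficients α1, α2, α3
    return -α[2] / (2 * α[1])               # vertex of the parabola
end

parabolamin(0, 5, 1, 2, 3, 6)   # vertex at x = 1.4 for these three points
```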

Brent's Method

Brent's method (1973) is the default method used for univariate optimization in many software packages. It uses parabolic interpolation whenever possible because it can converge to the optimal solution much faster than interval search. However, the parabola's minimum may lie outside the endpoints of the triplet (as seen above); when this is detected, Brent's method switches to interval search for the next step. Also, the triplet points may be collinear, making it impossible to fit a parabola; when this is detected, Brent's method switches to secant search for the next step, which requires only two, possibly collinear, points.

Optimize Function

The function optimize in the Optim package performs general-purpose nonlinear minimization. Note: maximization corresponds to the minimization of the negative of the function to be maximized.
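A sketch of its use on the sum-of-distances function (the interval is an assumption from the examples below; for bounded univariate problems, optimize uses Brent's method by default):

```julia
using Optim

res = optimize(x -> sum(dist1(x, p)), 0.0, 10.0)
```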

5. Examples using dist1

Ex 4: Minimize the Sum of Squared Distances

Given a bounded search interval [LB, UB], determine a value for x that minimizes the sum of squared distances to each point in p. In this case, the centroid/center-of-gravity/mean of p corresponds to the point that minimizes the sum of squared distances:
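The analytical solution (assuming p from Ex 1, whose mean is 4.25):

```julia
using Statistics

xbar = mean(p)    # 4.25
```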

This can also be determined using optimize, with the advantage that it can also minimize other functions for which there is no known analytical solution:
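A sketch:

```julia
res = optimize(x -> sum(dist1(x, p).^2), 0.0, 10.0)
```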

To assign the optimal location and minimum total distance values to variables:
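For example (using the Optim accessor functions):

```julia
xopt, TC = Optim.minimizer(res), Optim.minimum(res)
```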

To return only the optimal location:
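One way (a sketch):

```julia
xopt = Optim.minimizer(optimize(x -> sum(dist1(x, p).^2), 0.0, 10.0))
```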

Can then get the minimum total distance if needed:
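For example, by evaluating the objective at xopt:

```julia
sum(dist1(xopt, p).^2)    # minimum sum of squared distances
```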

Ex 5: Minimize the Maximum Distance
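A sketch of the natural formulation suggested by the title (an assumption):

```julia
res = optimize(x -> maximum(dist1(x, p)), 0.0, 10.0)
Optim.minimizer(res)    # 5.0: midpoint of the extreme points 1 and 9
```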

Ex 6: Maximize the Minimum Distance

The goal is to determine a location x between 0 and 10 that is as far away as possible from any of the points 1, 2, 5, and 9 in p. This corresponds to maximizing the minimum distance from x to any point in p.
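A sketch, maximizing by minimizing the negation:

```julia
res = optimize(x -> -minimum(dist1(x, p)), 0.0, 10.0)
Optim.minimizer(res), -Optim.minimum(res)
```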

Beware! The global maximum of 2 at x = 7 was not found. The local maximum of 1.5 at x = 3.5 can be cut off by increasing the LB used in optimize to 4:
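A sketch with the tighter interval:

```julia
res = optimize(x -> -minimum(dist1(x, p)), 4.0, 10.0)   # LB raised to 4
Optim.minimizer(res), -Optim.minimum(res)               # ≈ (7.0, 2.0)
```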

Ex 7: Minimize Sum of Square-Root Distances
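A sketch of the objective suggested by the title:

```julia
res = optimize(x -> sum(sqrt.(dist1(x, p))), 0.0, 10.0)
Optim.minimizer(res)
```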

6. Finding Roots and Intersection Points

Ex 8: Roots of an Equation
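A hedged sketch, assuming the Roots package is used (the notebook may take a different approach); find_zero brackets a root of the Ex 2 function:

```julia
using Roots

y(x) = x - x^3            # function from Ex 2; roots at 0, 1, and -1
find_zero(y, (0.5, 2.0))  # bracketed root at x = 1
```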

Ex 9: Intersection Points of Two Functions
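An intersection point of two functions can be found as a root of their difference (g here is a hypothetical second function, continuing the Roots sketch above):

```julia
g(x) = x/2                                # hypothetical second function
find_zero(x -> y(x) - g(x), (0.1, 1.5))   # intersection at x = sqrt(1/2) ≈ 0.707
```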