Convex optimization

The General Idea

Over this past summer (2020), I took Stanford’s EE364A course, which is about a subdiscipline of mathematical optimization called convex optimization. I learned that it has myriad applications all over engineering, finance, medicine, signal processing, and many other seemingly disconnected fields. In this post, I want to discuss what convex optimization is and what makes it so useful as a problem solving technique.

In as general a sense as possible, we humans solve optimization problems all the time. We find ourselves in situations where we have to make decisions about which choice will lead to the best outcome (or the least bad outcome in some unfortunate cases), but most of the time, the space of possible decisions is constrained. When we find ourselves saying things like “If the stars aligned, I would…,” or, “In a perfect world, I’d choose…,” we’re usually lamenting the fact that the decision we’re facing would be easier to make if our options were not constrained.

The field of mathematical optimization attempts to bring mathematical formalism to the above-described decision making process. A mathematical optimization problem has a few components:

Decision variable: This represents the space of decisions you have to make. Solving an optimization problem involves finding the “best” value of a decision variable, where best is defined by your choice of…
Cost/Profit (a.k.a. objective) function: This represents some negative/positive utility of a decision.
Constraints: These constrain the space of decisions you can make. The constraints imply a feasible set, which is a set of allowed decisions.

Solving an optimization problem with these components amounts to finding a feasible value of the decision variable that minimizes (or maximizes) the cost (or profit) function. We will see examples of this below.

While mathematicians have managed to develop a deep, rich mathematical theory around how to reason about and solve different kinds of mathematical optimization problems, unfortunately, many (most) optimization problems are computationally very difficult to find optimal solutions for. For problems of significant enough size, this effectively renders them intractable.

There is, however, a certain interesting class of optimization problems that can be solved very efficiently: these are called convex optimization problems. What makes a convex optimization problems easier to work with is the “shape” of its objective function and its feasible set. In order for an optimization problem to be convex, the objective function and feasible set must be convex.

Convex sets and convex functions

In an effort to not use any math symbols in this post, I’m going to resort to pictures to describe what convexity means and looks like for sets and functions. A set is convex if anytime you pick two points in a set, the line between those points is also inside the set. A picture from Chapter 2 of Convex Optimization by Boyd and Vandenberghe provides the idea:

In the left set (all the points on the inside of the hexagon), pick any two points you want. Now draw a line segment between them. Notice that no matter what pair of points you pick, all points along the line segment between them will lie inside the hexagon! In the right set, on the other hand, we see pair of points (there are actually many, can you find another pair?) for which the line segment between them is not fully contained within the set. The set on the right is thus not convex.

Loosely speaking, a function is convex if it has upward curvature. The simplest example that gets the point across is to think of a smile. In three dimensions, a simple example is a bowl. In more mathematical parlance, upward curvature means that any tangent line (or, in higher dimensions, plane) that you draw through any point on the function lies below the function itself. In the picture on the next page, the red and green lines are tangent to the black curve. Notice that the red, green, and any other tangent line you might draw lie or would lie beneath the black curve.

That’s it! Those properties and some clever reasoning are enough to ensure that if a problem is convex, you can probably solve it efficiently with the right software. As long as the space of allowed decisions is a convex set and the objective function is convex, you’re off to the races!

Examples

Before closing this post, I want to describe two examples of how you might translate an industry problem into a convex optimization problem.

The first has to do with radiation treatment planning for cancer patients. The goal is to figure out how to schedule a radiation treatment plan (over some period of time) that trades off damage to a patient’s health with shrinking the size of a tumor. For this problem, we set it up as follows:

The decision we have to make is how much treatment to administer in each time period.
The goal is to minimize the maximum damage to the patient over the entire course of the treatment.
We are constrained by:
- A maximum allowed dose in each period.
- Wanting the tumor to be below some maximum tumor size.
- The way the patients health changes with treatment over time.

With a few mathematical tricks, this problem can be formulated as a convex problem and solved very efficiently. As you might imagine, balancing patient health and tumor size is obviously critical for the health and longevity of patients. I think this is a great example of one case in which convex optimization allows us to threat that needle very precisely. (Note: This problem was actually a problem on the final exam in the course I took. It was derived from real research by the professor.)

Another example of the effectiveness of convex optimization comes from a very different application: finance. In this example, we want to choose a portfolio that minimizes risk while achieving a particular expected return. In its simplest form, this is a convex optimization problem that can be formulated as follows:

The decision we have to make is to pick a portfolio (from a universe of a bunch of stocks).
The cost of the portfolio is the amount of risk the portfolio holds. We want to minimize this quantity.
We are constrained by the expected return that we want to achieve. Any portfolio we pick has to achieve a particular expected return.

We can add other constraints (e.g., no short-selling) and add terms to the cost function (e.g., tax liability, transaction costs, which we would also want to minimize), but optimization problem we just formulated, in some form or other, is at the core of most of the portfolio construction being done in industry today.

Conclusion

Examples of convex optimization’s uncanny effectiveness and ubiquity are everywhere, but there’s an important point I want to stress before we close. In each of the examples above, the accuracy and utility of the output depends on very human choices about the problem setup. In the treatment planning example, it depends on the models the mathematicians and doctors come up with for how patient health changes and how tumor size changes. In the portfolio construction example, it depends on how good our projections of expected returns are. Some of what I showed and talked about in this post is heavily mathematical, but modeling these problems well is truly an art. So while the math is important and the engine that makes all of this possible, none of it really works without consistent communication and collaboration with domain experts.

Finally, I want to thank Stephen Boyd, Anqi Fu, and the rest of the EE364A staff for a fantastic course. I really cannot overstate how excellent the class was. If you’re interested in seeing what some of this is about in more detail, a free version of the class is available on YouTube and the lecture slides and course textbook are available for free here.

I did it! No math symbols!

Convex optimization

The General Idea

Convex sets and convex functions

Examples

Conclusion

Further Reading

Simulated annealing

Solving sudoku as a linear program

Probabilistic interpretation of regularization

Trending Tags