Introduction
In this post, I want to demonstrate how helpful visual intuition can be. To do this, we are going to think about how to extend a technique called linear interpolation from one dimension to two. Loosely speaking, techniques for interpolation allow us to use information that we know to hopefully make reasonable estimates of quantities we don’t know. In the rest of this post will first discuss linear interpolation in one dimension, and then use some pictures to figure out what it would mean to linearly interpolate in two dimensions.
Linear Interpolation: 1D
Suppose that you have two points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ and a value $x_1 \leq x \leq x_2$ whose corresponding value $f(x)$ we want to estimate. The first think you might think to do is to assume that $f$ is linear. You would then find the slope $m$ and intercept $b$ of the line connecting the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$, and then use that line to estimate that $f(x) = mx + b$. This is shown visually in the figure below.
It turns out that by rearranging the expression $f(x) = mx + b$ (with $m$ and $b$ expanded as shown in the figure), we can actually express $f(x)$ in a different way:
where $\theta = \frac{x - x_1}{x_2 - x_1}$ is the fraction of the total distance between $x_1$ and $x_2$ that is between $x$ and $x_1$. This formulation furnishes another way to think about what linear interpolation does: it estimates $f(x)$ by mixing some amount of $f(x_1)$ with some amount of $f(x_2)$. The amounts of each that are used depends on how close to $x_1$ (or $x_2$) $x$ lies. (To be precise, how much of $f(x_1)$ we use actually depends on the size of the distance between $x$ and $x_2$. As $x$ moves further from $x_2$, the coefficient on $f(x_1)$ should get bigger.)
Linear interpolation: 2D
Now suppose that instead of $(x, f(x))$ pairs, we have $((x, y), f(x, y))$ pairs. Whereas in the prior section, the domain of $f$ is the set of real numbers, in this section, the domain is actually points in the plane. The setting for interpolation in two dimensions is that we have four points in the plane $(x_1, y_1)$, $(x_1, y_2)$, $(x_2, y_1)$, and $(x_2, y_2)$ whose $f$ values we know. We are then given another point, $(x, y)$, and we are trying to estimate the value of $f(x, y)$ (again assuming that $f$ is linear). This setup is shown graphically below.
In this scenario, $f$ actually defines a surface (shown in gray), rather than a curve. In order to estimate the value of $f(x,y)$, we want to come up with a formula for linear interpolation in two dimensions. There are various ways to derive the formula for this**, but here I want to discuss one that I think has an elegant and very intuitive visual interpretation. It turns out that we can borrow the mixture idea from the the 1D case, but instead of a mixture based on the distances along a line, we are going to use areas of subrectangles.
The key here is that we are using areas here as a proxy for 2D “distance”. To sanity check this intuition, note that if $(x, y)$ is one of our four known points, say $(x_1, y_1)$, the area of the subrectangle corresponding to it will be equal to the total area of the larger rectangle, using this method, we can easily see that in this case,
as we expect.
This intuition can be extended to an arbitrary number of dimensions. In 3D, for instance, we would use areas of 3D sub-rectangular prisms rather than subrectangles.
Conclusion
This isn’t an especially deep idea from a mathematical standpoint, but I thought that it was a nice illustration of how sometimes, visual intuition can take us a very long way. If you’re ever trying to solve some challenging problem and you don’t know where to start, drawing some pictures might be a great way to get the juices flowing.
**The usual, and sort of messy, way to derive the formula for bilinear interpolation is to first interpolate in one of the variables and then the other. Just to give a sense for the way that this get’s pretty cumbersome, we will briefly show how to do it. First we compute $f(x_1, y)$ and $f(x_2, y)$
Then we compute $f(x,y) = \frac{x-x_1}{x_2 - x_1} f(x_2, y) + \frac{x_2-x}{x_2 - x_1}f(x_1, y)$, plugging in (1) and (2). In my opinion this in more dimensions quickly becomes unwieldy, and the intuition becomes less clear the more dimensions you try to think about.