Optimal Transport Mapping
Wasserstein distance is a measure of dissimilarity between two probability distributions, which has a variety of applications in modern machine learning. This distance is calculated by finding the path that minimizes the total cost of transferring mass from one distribution to the other, referred to as the optimal transport map. This problem was first introduced by French mathematician Gaspard Monge in 1781.
As an analogy, suppose we view one distribution X as a collection of factories from a single company and the other distribution Y as a collection of stores that receive shipments from the factories. Suppose we also have a cost function c such that c(xi,yj) calculates the cost of transporting one shipment of material from factory xi to store yj.
An employee in charge of product distribution notices that the company's current delivery plan designed by her predecessor is wasting a lot of resources. Her goal is to rearrange the delivery plan to minimize the total cost of the shipments. With so many factories and store locations, there are numerous potential delivery paths that she can choose. She must also consider the amount products produced at each factory and the supply demand at each store. With these factors in mind, how should the employee decide which path to choose? This is exactly the premise of optimal transport mapping.
The following series of interactive simulations introduces the concepts of optimal transport, from the very basic case to the more complex cases.
Optimal Assignment Problem
Let's start by looking at the simplest case of optimal transport known as the optimal assignment problem. In this sitation, we assume that there is an equal number of factories and stores. Each factory produces the same amount of products and they each deliver to a single store. Due to the uniform distribution of products/supply demand, when transporting, we only need to account for the Euclidean distance between the factories and stores to calculate the cost. In this case, finding the optimal transport map between factories and stores is equivalent to a one-to-one assignment problem where we try to find a set of paths that would minimize the total distance traveled.
Below, try to find the optimal assignment by mapping the red point set (factories) to the blue point set (stores) — for each highlighted red point, click on a blue point to assign it to and once all points have been assigned, check out if the transport map created is indeed the optimal!
click on a store to transport the products from the factory:
We can see that this is a relatively easy problem to optimize since the only factor in play is the distance! But what if the company has more store locations than factories or vice versa? Or, if each factory produces different amount of products?
Monge Problem
Now let's consider a more general case of the previous problem where the number of factories and stores are not necessarily the same, and the amount of products produced at each factory and required at each stores are also not always equal. This case of optimal transport is known as the Monge problem. Since number of factories ≠ number of stores, we cannot simply do a one-to-one assignment and minimize only the distance between the points. We must now also take into account the individual weights of each point when finding the optimal map, so now our cost is calculated as distance times mass.
Keep in mind the following rules:
1. every store must receive the exact amount of products that it needs
2. each factory can still only deliver to a single store
3. each store can receive shipments from multiple factories
Below, try to find the optimal Monge map by again clicking on a blue point to map each highlighted red point to. Note that each point has a labeled mass (representing the amount of products produced/needed), and this time taking into account the above constraints. Once all points have been assigned, check out if the transport map created is indeed the optimal!
click on a store to transport the products from the factory:
hint: take into account the amount of products produced at each factory and needed at each store
We can see that having different numbers of points with different masses makes it more challenging to obtain the optimal map. However, to implement the best possible delivery plan, we must take it one step further. What if we allow each factory to deliver to multiple store loctions?
Kantorovich Relaxation
Notice that although the Monge problem allows a single store to receive multiple shipments, but it requires that a single factory only delivers to one store. Because of this, there sometimes may be no optimal Monge map between the factories and stores if there are more store locations than factories, or, if a factory produces more than any single store needs.
Then, in the 1940s, Soviet mathematician Leonid Kantorovich proposed the idea of "relaxing" the deterministic nature of the Monge problem, referred to as Kantorovich relaxation. In particular, Kantorovich proposed that mass at a point xi could be split up and dispatched to different locations, meaning that the products produced at a single factory can be delivered to multiple stores. Note that the cost function for each delivery in this situation is still calculated as distance times mass. Now, the rules are the following:
1. every store must receive the exact amount of products that it needs
2. each factory can deliver to multiple stores
3. each store can receive shipments from multiple factories
Below, once again try to find the optimal transport map by clicking on a blue point to map each highlighted red point to. Once all points have been assigned, check out if the transport map you created is indeed the optimal!
NOTE: In order to visualize mass splitting, only 1 unit product is transported by each click. For example, the first highlighted red point has a mass of 4 units; to distribute 2 units to the blue point on to its left, click that blue point two times. The arrow between the two points will have a label with the amount of mass that is currently being transported between them and will update on each click.
click on a store to transport the products from the factory:
note: each click delivers 1 unit of product from the factory
This version of the problem above is the current formulation of the optimal transport problem today.
With what we've learned...
With her knowledge of optimal transport mapping, the employee utilizes the Python package POT: Python Optimal Transport to successfully compute a delivery plan for her company that would optimize their resources! Click on the transport!! button below to see a snippet of her delivery plan!
Now, that we've explored the different layers of the optimal transport mapping, hopefully you have a better conceptual understanding of the problem. Note that the examples above are simple data sets to introduce the basics of the problem. As you worked through the set of problems in the last section, you might have had to attempt several times to achieve the optimal map since there are many possible combinations of paths. In practice, computational algorithms for finding the optimal transport map often becomes more and more costly as the size and dimensions of the data increases. One ongoing challenge in modern machine learning is finding more efficient ways to calculate the optimal transport map using various methods such as neural networks and different mathematical approaches.
Here are some resources to learn more about the mathematical formulations for optimal transport:
Computational Optimal Transport, G. Peyré and M. Cuturi (2018)
Introduction to Optimal Transport, M. Thorpe (2018)
. . .
[view the GitHub repository for this visualization]
created by Ashley Ho & Mizuho Fukuda