In this chapter, we consider some additional concepts and problems in graph theory, and discuss some classical results and algorithms. These concepts and algorithms will be useful in the sequel.
6.1 The Max-Flow Problem
We start by considering the max-flow problem. Consider a set of nodes , and assign an integer “capacity” to every pair of nodes such that for every ; we refer to the pair as a capacitated graph. Given a capacitated graph , let denote the set of edges such that ; we refer to as the directed graph induced by (the graph is directed as the capacity of an edge may be different from the capacity of an edge ). We say that is a source (or a sink, respectively) in a capacitated graph if it is so in the directed graph induced by . (Recall that a source is a node with no incoming edges, and a sink is a node with no outgoing edges.)
Now, given a capacitated graph , a source , and a sink , consider the problem of finding the maximum flow from to in : that is, finding the maximum amount of some imaginary resource (water, electricity, car traffic, computer network traffic, etc.) that we can route from to while respecting the capacity constraints of each edge. For instance, for the case of network traffic, the capacities correspond to the network bandwidth between nodes (servers) and the max flow yields the maximum bandwith for network traffic from to . For car traffic, the capacity of an edge (corresponding to a road) may instead specify the number of cars per second that can enter the road when the road is fully utilized and cars are driving at the speed limit (i.e., the “car bandwidth” on that road); in this case, the max flow from to gives the maximum number of cars (all wanting to travel from to ) that can enter at the source .
To formalize this, we start by defining the flow, , between any two nodes . This notion captures the “net” amount of unit of flow being transferred between two nodes ; for example, if we are sending 8 units from to , and 3 units from to , then the “net flow” from to is 5, and the “net flow” from to is . More formally,
Definition 6.1. An -flow in a capacitated graph , where is a source and is a sink, is a function such that the following conditions hold:
(That is, for every “internal” node , the total “incoming” flow equals the total “outgoing” flow.)
The value of an -flow , denoted , is the total flow coming out of ; that is,
An -flow is said to be a max--flow in if its value is at least as big as the value of any other -flow in .
Let us turn to describing an algorithm that finds the maximum flow in a capacitated graph . On a high level, the idea is to incrementally push flow from to until no more flow can be pushed through. Toward formalizing this approach, given some flow for , let denote the “residual capacity” that remains along edge after pushing through the flow ; that is,
We refer to as the residual graph (resulting from pushing through the flow ). Note that, due to the “flow symmetry” condition, once we push some positive flow along an edge , there is an equal negative flow along the “reverse edge” , and thus the residual graph will have positive capacity along the reverse edge even if the original graph did not. (In more detail, if , then and thus .) This corresponds to the fact after some positive flow is pushed through , we always have the option of removing that flow, which corresponds to pushing a flow along ; thus, the capacity along should be increased by in the residual graph.
The following claim, which follows directly from the construction of the residual graph (and the above discussion), formalizes the usefulness of the residual graph construction.
Claim 6.1. Let be a capacitated graph, let be a source-sink pair, let be an -flow for , and let be an -flow for the residual graph . Then the combined flow , defined as , is also an -flow for .
Proof. Consider two nodes . Since by construction , we have that and thus so . Thus, satisfies the capacity constraint condition. Since and satisfy flow symmetry and conservation of flow, it directly follows that the same is true for the sum of them.
■
With these preliminaries we are now ready to formalize the algorithm for finding a max -flow. The Augmenting Path Algorithm proceeds as follows given a capacitated graph and a source-sink pair :
and for each directed edge , let,
See Figure 6.1 for an illustration of the algorithm.
6.2 The Max-Flow Min-Cut Duality
To analyze the Augmenting Path Algorithm, we need to introduce some additional definitions.
Definition 6.2. An -cut of a capacitated graph is a partition of into two sets such that and . The capacity of an -cut is the sum of the capacities of all the edges such that and (i.e., all edges leaving ). An -cut is said to be a min--cut in if its capacity is smaller than, or equal to, the capacity of any other -cut in .
Given an -flow and an -cut , let denote the “net flow” from to ; that is, . Since is an -cut, any flow traveling from to must pass through the edges leaving , and thus ought to equal . This is formalized in the next lemma.
Lemma 6.1. Let be a capacitated graph, a source-sink pair, an -flow for , and an -cut of . Then .
Proof. Recall that
where the second equality follows from the fact that is a source, and the third equality follows from the fact that by the flow conservation condition, for every “internal” node , . Since is a partition of , we can further decompose as:
By the flow symmetry condition, the first term is 0 (since if is in the sum, then so is ), and, by definition, the second term is ; thus, we conclude that .
■
Note that by definition, is at most the capacity of the cut . As a consequence, we directly get the following useful corollary.
Corollary 6.1. Let be a capacitated graph, a source-sink pair, an -flow for , and an -cut of . Then is at most the capacity of .
We are now ready to analyze the Augmenting Path Algorithm.
Theorem 6.1. Given a capacitated graph and a source-sink pair , the Augmenting Path Algorithm outputs some max--flow in . Furthermore:
Proof. We start by observing that by Claim 6.1 and by an induction over the number of iterations of the algorithm, remains a valid flow for throughout the execution of the algorithm. Furthermore, since we are assuming capacities are integers, the extra flow “pushed” along any augmenting path must be an integer, and consequently the residual graph will always have integral capacities. It follows (again by induction) that the flow remains integral throughout the execution of the algorithm. Additionally, every time we increase the flow along an augmenting path by , the value of the flow increases by ; since is an integer, the value of the flow must increase by at least 1 at each iteration. It follows that the number of iterations is bounded by the value, , of the max -flow. Moreover, by using BFS and Theorem 2.1, each iteration of the algorithm can be performed in polynomial time in ; thus, we conclude that the overall running time of the algorithm is polynomial in .
We now show that the final flow is a maximum flow and along the way also prove bullet 3 of Theorem 6.1. By Corollary 6.1, the value of the max- flow must be less than or equal to the capacity of every -cut in . Hence, if we show that the flow from to at the end of the execution is equal to the capacity of some -cut, we have also shown that the flow is maximal; furthermore, this also shows that the value of a max flow equals the capacity of some (and thus any) min cut.
Recall that the algorithm ends when there are no more augmenting paths from to . Consider the final residual graph at this point. Let denote the set of nodes that are reachable from in the directed graph induced by (i.e., through edges with positive capacities). Let be the remaining nodes.
We conclude that the value of the final flow equals the capacity of the -cut , which concludes the proof.
■
We mention that special instantiations of the Augmenting Path Algorithm (where we are more careful about which augmenting path the algorithm selects in each iteration) have an even faster running time for large capacities [EK72], but this will not be of importance to us in this course.
6.3 Bipartite Graphs and Maximum Matchings
Let us now turn to a seemingly different problem (which will turn out to be closely related to the max flow problem): that of finding matchings in bipartite graphs. A bipartite graph is simply a graph where the nodes are divided into two groups—“left nodes” and “right nodes” and edges can only exist between a node in one group and a node in the other. See Figure 6.2 for an example; here the left nodes correspond to a set of professors, the right nodes corresponds to a set of classrooms, and we draw an edge between a professor and the classroom if the classroom is acceptable to the professor.
We here focus only on undirected bipartite graphs:
Definition 6.3. An undirected bipartite graph is an undirected graph where and .
A particular type of bipartite graph of interest is one where the edge set is the full set of edges ; we refer to such a graph as a complete bipartite graph over .
Let us turn to defining the notion of a matching; we provide a general definition that applies to all graphs (and not just bipartite ones), but we shall here focus mostly on bipartite graphs.
Definition 6.4. A matching in an undirected graph is a set of edges such that every node has at most degree 1 with respect to (i.e., at most 1 edge in connected to it). We define the size of the matching as . We say is a maximum matching in if its size is at least as large as the size of any other matching in .
An example of a matching in a bipartite graph can be found in Figure 6.2. An equivalent way to define a matching in an undirected graph is as a function such that defines the node that gets matched to, or (where denotes an “error” symbol) if is “unmatched.” More precisely, is a function such that (1) for any , if then and , and (2) for any , there exists at most one such that . To simplify our presentation, we abuse of notation and use both of these formulations/notations of matchings interchangeably.
We now show how to use the Augmenting Path Algorithm to efficiently find maximum matchings in bipartite graphs.
Theorem 6.2. There exists an efficient algorithm that finds the maximum matching in any undirected bipartite graph in polynomial time in .
Proof. Given an undirected bipartite graph , create an expanded capacitated graph with two extra nodes and such (a) that the capacity between and every node in is 1, (b) the capacity between every node in and is 1, and (c) the capacities along edges are set to .1 (See Figure 6.3 for an illustration.) Note that if has a matching of size , then there exists an -flow in with value —simply push a flow of 1 from to every node that gets matched, and next from to ’s match, , and finally from to . In other words, the size of the maximum matching in is at most the max -flow in .
Now, consider the following algorithm: find the maximum -flow in using the Augmenting Path Algorithm, and finally output the edges between and which have flow on them. Since the capacities of the edges connecting and to the bipartite graph are 1 (and since there are no edges connecting two nodes in or two nodes in ), by the capacity and conservation of flow constraints, at most 1 unit of flow can pass through each node in and . By Theorem 6.1, the output max flow is integral; hence, the algorithm always outputs a valid matching. Additionally, it follows that the size of the matching is the value of the max flow in , and consequently, the matching is of maximum size (since, as argued above, the size of the maximum matching is at most the value of the max flow).
■
6.4 Perfect Matchings and Constricted Sets
Consider now a bipartite graph where . We are interested in understanding when such a bipartite graph has a perfect matching, where every node gets matched.
Definition 6.5. A matching in a bipartite graph where is said to be perfect if all nodes in get matched in —that is, is of size .
For instance, if is a set of men and a set of women, and the edges exist between a man-woman pair if they would find a marriage acceptable; a perfect matching exists if and only if everyone can get “acceptably” married.
Note that we can use the above-described maximum matching algorithm to determine when a perfect matching exists. But if the algorithm notifies us that the maximum matching has size , it gives us little insight into why a perfect matching is not possible.
Constricted sets The notion of a constricted set turns out to be crucial for characterizing the existence of perfect matchings. Given a graph , let denote the set of neighboring nodes to any node in (that is, if and only if there exists some and an edge .) A constricted set is a set of nodes that has less neighbors than elements in :
Definition 6.6. Given a bipartite graph , a constricted set is a subset such that .
Clearly, if a constricted set exists, no perfect matching can be found—a set of men cannot have acceptable marriages with women. The “marriage theorem” proves also the other direction; a perfect matching exists if and only if no constricted set exists. We will show not only this theorem but also how to efficiently find the constricted set when it exists.
Theorem 6.3. Given a bipartite graph such that , a perfect matching exists in if and only if no constricted set exists in . Additionally, whenever such a constricted set exists, it can be found in polynomial time in .
Proof. As already noted above, if there exists a constricted set, then clearly no perfect matching exists. For the other direction, consider a bipartite graph without a perfect matching. We shall show how to efficiently find a constricted set in this graph, which concludes the proof of the theorem.
Consider the expanded capacitated graph from the proof of Theorem 6.2; use the max-flow algorithm from Theorem 6.1 to find the max--flow, and consider the min--cut, , induced by this flow in the proof of Theorem 6.1. Recall that the set was obtained by considering the set of nodes that are reachable from in the final residual graph; this set of nodes can be found in polynomial time using breadth-first search (see Theorem 2.2).
We now claim that is a constricted set.
where the inequality follows since, as argued above, the min cut is strictly smaller than .
This proves that is a constricted set.
■
Notes
We refer the reader to [CLRS09] and [KT05] for a more in-depth study of graph algorithms. The Augmenting Path Algorithm is due to Ford and Fulkerson [FF62]. The “marriage theorem” is due to Hall [Hal35].
1For this proof, we can set the capacities along edges in to any value , but for our later usage of this construction it will be useful that the capacities are sufficiently large.