- James S. Plank
- Original Notes: November, 2014
- Directory: **/home/plank/cs302/Notes/MST**

Read the Wikipedia page for applications of spanning trees.

In this set of lecture notes, we'll teach you two algorithms for finding the minimum spanning tree: Prim's algorithm and Kruskal's algorithm.

In Prim's algorithm, you start by setting the current spanning tree to an arbitrary node. Note that this node, with no edges, is indeed a spanning tree of that solitary node.

You then proceed to iterate, adding one node and one edge to the current spanning tree. What you'll do is find the minimum weight edge from a node in the current spanning tree to a node not in the current spanning tree. You add that node and edge to the current spanning tree. Keep doing that until all nodes are added to the spanning tree.

To implement Prim's algorithm, you maintain a multimap of edges, just like Dijkstra's algorithm. However, the keys are the weights of the edges, rather than path lengths like they are in Dijkstra's algorithm. At each step, you process the smallest edge in the multimap by adding the edge and its destination node to the current spanning tree. Then you process the node's adjacency list, adding more edges to the multimap. You may need to delete edges from the multimap when you do this, because you may be improving a node's distance to the current spanning tree.
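Here's a minimal C++ sketch of that procedure. It is not the code from the lecture directory: the `Graph` typedef, `add_edge()`, `prim_mst()` and the integer node numbering are all assumptions made for illustration, and the graph is assumed to be connected and undirected. It keeps at most one multimap entry per node outside the tree, and replaces that entry when a cheaper edge shows up.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical representation: adj[u] holds (neighbor, weight) pairs.
typedef std::vector<std::vector<std::pair<int, double> > > Graph;

// Helper (an assumption, not from the notes): store an undirected edge
// in both endpoints' adjacency lists.
void add_edge(Graph &adj, int a, int b, double w)
{
  adj[a].push_back(std::make_pair(b, w));
  adj[b].push_back(std::make_pair(a, w));
}

// Returns the total weight of a minimum spanning tree of a connected graph.
// The chosen edges are stored in tree_edges as (tree node, new node) pairs,
// in the order in which they were added.
double prim_mst(const Graph &adj, int start,
                std::vector<std::pair<int, int> > &tree_edges)
{
  typedef std::multimap<double, int> EdgeMap;  // key: weight, val: node outside the tree
  size_t n = adj.size();
  EdgeMap edges;
  std::vector<bool> in_tree(n, false);
  std::vector<int> parent(n, -1);                         // tree-side end of each node's best edge
  std::vector<EdgeMap::iterator> where(n, edges.end());   // each node's multimap entry, if any
  double total = 0;
  int u = start;

  tree_edges.clear();
  in_tree[u] = true;  // The current spanning tree is the single node "start".

  while (true) {
    // Process u's adjacency list, adding or improving edges in the multimap.
    for (size_t i = 0; i < adj[u].size(); i++) {
      int v = adj[u][i].first;
      double w = adj[u][i].second;
      if (in_tree[v]) continue;
      if (where[v] != edges.end()) {
        if (where[v]->first <= w) continue;  // The existing edge is at least as good.
        edges.erase(where[v]);               // Delete the worse edge before reinserting.
      }
      where[v] = edges.insert(std::make_pair(w, v));
      parent[v] = u;
    }

    if (edges.empty()) break;  // No nodes left to add.

    // Process the smallest edge: add it and its destination node to the tree.
    EdgeMap::iterator it = edges.begin();
    u = it->second;
    total += it->first;
    tree_edges.push_back(std::make_pair(parent[u], u));
    edges.erase(it);
    where[u] = edges.end();
    in_tree[u] = true;
  }
  return total;
}
```

To try it, build a `Graph` with one adjacency list per node, call `add_edge()` for each undirected edge, and pass any node as `start`. The `where` vector of iterators is what makes the delete-and-reinsert step cheap, since erasing through an iterator doesn't require searching the multimap.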

Let's process an example -- you've seen this graph before:

I think it's safe to say that it's hard to eyeball this graph to find the minimum
spanning tree. So, let's run Prim's algorithm on it. We'll arbitrarily put *s*
onto the current spanning tree. In the pictures below, I'm going to denote the nodes
in the current spanning tree in pink, and the edges in the current spanning tree in red.
I'll also draw the multimap of edges. We'll start with *s* in the current spanning
tree:

We process the first edge in the multimap. That will add the edge *s-n02* and the
node *n02* to the current spanning tree. We add the edges to
*n00*,
*n01* and *n03* to the multimap, and we update the edge to *n04*, because the
edge from *n02* is smaller than the one that is currently there (from *s*):

Again, we process the smallest edge in the multimap. That adds the edge *n02-n03*,
plus the node *n03* to the current spanning tree. When we process edges, we see
that the edges to *n00* and *n04* need to be updated in the map:

Next, we add the edge *n03-n04*,
plus the node *n04* to the current spanning tree. When we process edges, the only
modification is that we change the edge to *n00*:

I'll draw the remaining pictures without comment, until you see the final spanning tree:

Here's the final spanning tree:

The running time of Prim's algorithm is identical to that of Dijkstra's algorithm. If we assume
that the graph is connected, then the running time is *O(|E|log(|V|))*. To derive that,
consider that each time that we visit an edge, we may be deleting and inserting an edge
into the multimap. The maximum size of the multimap is one entry per node -- hence the
*log(|V|)* term.

In Kruskal's algorithm, you sort the edges from smallest to largest and process them in that order. For each edge, determine if the edge spans two different connected components in the current spanning tree. If it doesn't, you ignore it. If it does, then you add the edge to the spanning tree, and the two connected components become one.

You repeat this process until you have just one connected component in the graph.

Implementing Kruskal's algorithm is straightforward -- sort edges in a multimap, and use Disjoint Sets to identify connected components. Let's go through the same example above, but using Kruskal's algorithm. I'll draw the current spanning tree, and keep the sorted list of edges to the right of the drawing. Here's our starting point:

We start with the smallest edge: *[n02-n03]*, and add it to the current spanning
tree. The graph now has six connected components rather than seven:

We'll process the next two edges: *[n04-n03]* and
*[n00-t]*. Our graph now has four connected components:

Let's process the next two edges: *[n02-n01]* and
*[n03-n01]*. We don't add the second of these two to the graph, because it does not
span connected components:

Similarly, edge *[n04-n02]* does not span connected components, so we ignore it.
The next two edges, *[n00-n04]* and *[s-n02]* do span connected components, so we
add them to our spanning tree. At that point, we are left with one component, so we're done!
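The procedure we just walked through can be sketched in C++ as follows. This is an illustration under assumptions, not the course's code: the `DisjointSets` struct, the `EdgeList` typedef, `kruskal_mst()` and the integer node numbering are all names made up here. The disjoint sets use union by size with path halving, which is one of several ways to implement them.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Minimal disjoint sets: union by size, with path halving in find().
struct DisjointSets {
  std::vector<int> parent, size;

  explicit DisjointSets(int n) : parent(n), size(n, 1) {
    for (int i = 0; i < n; i++) parent[i] = i;  // Every node starts as its own set.
  }

  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // Point x at its grandparent.
      x = parent[x];
    }
    return x;
  }

  // Merges the sets holding a and b. Returns false if they were already the same set.
  bool unite(int a, int b) {
    a = find(a);
    b = find(b);
    if (a == b) return false;
    if (size[a] < size[b]) std::swap(a, b);
    parent[b] = a;
    size[a] += size[b];
    return true;
  }
};

// The multimap keeps edges sorted by weight. Each value is the edge's two endpoints.
typedef std::multimap<double, std::pair<int, int> > EdgeList;

// Returns the total weight of the edges chosen. For a connected graph, those
// edges form a minimum spanning tree, and they are stored in tree_edges.
double kruskal_mst(int num_nodes, const EdgeList &edges,
                   std::vector<std::pair<int, int> > &tree_edges)
{
  DisjointSets ds(num_nodes);
  int components = num_nodes;
  double total = 0;

  tree_edges.clear();
  for (EdgeList::const_iterator it = edges.begin();
       it != edges.end() && components > 1; ++it) {
    // Keep the edge only if it spans two different connected components.
    if (ds.unite(it->second.first, it->second.second)) {
      tree_edges.push_back(it->second);
      total += it->first;
      components--;
    }
  }
  return total;
}
```

Inserting each edge into the `EdgeList` with its weight as the key does the sorting for you. The loop stops as soon as there is one connected component, so any remaining larger edges are never examined.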

The running time of Kruskal's algorithm is
*O(|E|log(|E|))* to sort the edges, and
*O(|E|α(|V|))* to process the edges. Since the first term is larger than the second, we
can ignore the second term:
*O(|E|log(|E|))*.

Does that mean that Kruskal's algorithm is worse than Prim's algorithm? Well, we can play games with math.
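In the worst case, *|E| = O(|V|^{2})*, so:

*O(|E|log(|E|))* = *O(|E|log(|V|^{2}))*

= *O(2|E|log(|V|))*

= *O(|E|log(|V|))*

So in the worst case, the two running times are asymptotically the same.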

- If the edge weights are all unique, then so is the minimum spanning tree.
- Negative edge weights don't matter. In fact, you can add the same constant to every edge, and the minimum spanning tree will remain the same.
- Finding the maximum spanning tree is the same problem with the comparison reversed (or, equivalently, with every edge weight negated), and both algorithms apply.
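
To see why adding a constant changes nothing, recall that every spanning tree of a connected graph has exactly *|V|-1* edges. Here is a short derivation, where *w(T)* for the total weight of a spanning tree *T*, *w'(T)* for its weight after the change, and *c* for the added constant are symbols introduced just for this illustration:

```latex
w'(T) \;=\; \sum_{e \in T} \bigl( w(e) + c \bigr)
      \;=\; \sum_{e \in T} w(e) \;+\; c\,(|V| - 1)
      \;=\; w(T) + c\,(|V| - 1)
```

Every spanning tree's weight shifts by the same amount, *c(|V|-1)*, so the tree that was smallest before is still smallest afterward.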