I. Shortest Path Problem: Find the path in a weighted graph that (a) connects two vertices x and y, such that (b) the sum of the weights of all the edges in the path is minimized over all such paths.
    A. Important Insight: A greedy algorithm will work. A greedy algorithm tries to solve a problem in stages by doing what appears to be the best thing at each stage. In this case the greedy approach is to select, at each stage, the unknown vertex v whose currently known path from x is the shortest among all remaining unknown vertices. It can be proven that this path is indeed the shortest path to v, provided that no edge weight is negative.
    B. Converting the insight to an algorithm: We want to do some type of graph search. We will call the fringe the set of vertices that are not yet marked known but which have been seen at least once during the search. The next vertex to choose must come from this fringe set, since the paths to all unseen vertices are not yet known and hence these vertices should not be considered. We can write this insight as follows:
        Insight: If our search always selects the fringe vertex that has the shortest path from x, then when we visit y, we will have found the shortest path from x to y.
    C. Nailing down the algorithm: We want to perform a priority-first search that starts from x. The priority of a fringe vertex is computed as the length of the currently known shortest path from x to that vertex (i.e., at each step we visit the vertex on the fringe that is closest to x).
        1. Maintain a dist field with each vertex that keeps track of the length of the currently known shortest path from x to that vertex.
        2. At each step visit the vertex with the minimum dist value.
        3. Each time a vertex is visited (i.e., becomes a known vertex), recompute the shortest paths for every unknown vertex adjacent to the visited vertex. If vertex k is visited and vertex m is an unknown vertex adjacent to vertex k, the computation is:
            m.dist = min(m.dist, k.dist + edge_weight(k,m))
    D. Time Complexity:
        1. In a sparse graph, priority-first search can compute the shortest path from x to any vertex in O(E lg V) time.
        2. In a dense graph, priority-first search can compute the shortest path from x to any vertex in O(V^2) time.
    E. Code--see pp. 344-345 of Weiss
    F. Code Notes
        1. This algorithm is called Dijkstra's algorithm in honor of its creator.
        2. If the graph is sparse, you should use a priority queue to find the next vertex to mark known.
            a. The decrease function in Fig 9.32 would add a vertex to the priority queue if it was previously unseen (i.e., its dist field is infinity).
            b. The decrease function would reposition the vertex in the priority queue if it is already in the queue.
                i. One way to reposition a vertex is to push it up the priority queue. The drawback to this approach is that we need to be able to locate the vertex in the binary heap, which means we need to find its location in the array representing the binary heap. To do that, we need to add a field to each vertex that denotes its location in the binary heap. Keeping this field up-to-date can be rather messy.
                ii. A second way to reposition a vertex is to simply insert it into the queue again. Multiple copies of the vertex may then be in the queue, so after the first copy is removed, we don't want to process the remaining copies. Hence, after deletemin returns a vertex, we should check whether the vertex is already known. If it is, we repeat the deletemin operation until we obtain an unknown vertex.
                iii. The second approach is easier to implement than the first approach but it may be less efficient because the priority queue could contain up to |E| entries rather than |V| entries. This means that inserts and deletes may require log |E| rather than log |V| time. However, |E| < |V|^2, so log |E| < 2 log |V|. Hence the running time of the second approach should be at most a constant factor slower than that of the first approach.
                    It is unlikely that the priority queue will grow as large as |E|, so in practice the second approach may be faster, because its code is simpler. If I were implementing the shortest path problem in practice, I would implement approach 2 first because it is easier. Only if I had a performance problem would I implement the first approach, and I would do that only after using a profiler to ensure that the performance problem was caused by the second approach.
        3. If the number of edges, E, is proportional to V^2, then priority-first search using a priority queue is O(V^2 lg V).
            a. Using adjacency matrices, we can get O(V^2) running time.
            b. Insight: Each time we choose a vertex from the fringe, we will have to update O(V) priorities. In other words, in all likelihood, we will have to examine almost all the other vertices in the graph.
            c. Translating this insight to an idea: Since we probably have to look at most of the other vertices anyway, we might as well visit all the vertices. Visiting all the vertices is still O(V) time. In that single pass we can both:
                1. update the costs of the vertices that are adjacent to this newly visited vertex, and
                2. figure out which vertex to visit next by keeping track of the minimum cost vertex
            d. Each time we visit a vertex v, we will scan through the row in the adjacency matrix for v. For each unknown vertex w in that row, we will
                i. update w's dist entry if w is adjacent to v (i.e., there is an edge from v to w)
                ii. keep track of the minimum cost vertex scanned so far (this will be the unknown vertex with the smallest dist value, and hence the next vertex to visit)
II. Spanning Trees
    A. Spanning Tree of a Graph: A subgraph that contains all the vertices of the graph, but only enough edges to form a tree.
    B. Minimum Spanning Tree Problem: Find a set of edges that connects all the vertices of a graph such that the sum of the weights of the edges is at least as small as the sum of the weights of any other collection of edges connecting all the vertices.
        1. Sample Application: Wire the cable outlets in a home using the least amount of cable possible.
    C. Important Property/Insight: Given any division of the vertices of a graph into two sets, the minimum spanning tree contains the shortest of the edges connecting a vertex in one of the sets to a vertex in the other set.
    D. Converting this insight to an algorithm:
        1. Begin with an arbitrary vertex in the graph.
        2. Choose the minimum weight edge emanating from this vertex.
        3. Add the vertex at the other end of this edge to the set of visited vertices (i.e., to the set of tree vertices).
        4. Repeat steps 2 and 3, always choosing the lowest cost edge connecting a tree vertex to a fringe vertex.
    E. Nailing down the algorithm even further: The search described in D is really a priority-first search of the graph--at each step we visit the fringe vertex with the minimum weight edge to a tree vertex. We can use the same code we did for the shortest path problem except that now the calculation is:
            m.dist = min(m.dist, edge_weight(k,m))
       In other words, now we simply want the lowest cost edge between m and the vertices currently in the spanning tree.
    F. This algorithm is called Prim's algorithm in honor of its creator.
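
Illustration: The notes observe that the shortest path search and the spanning tree search share the same priority-first code and differ only in how m.dist is recomputed. The following is a minimal Python sketch of that idea, not the code from Weiss. It assumes the graph is a dict that maps each vertex to a dict of {neighbor: edge_weight}, that weights are non-negative, and that the graph is undirected when used for spanning trees. It uses the "insert it again" approach to repositioning, skipping copies of vertices that are already known. The names priority_first_search, shortest_paths, minimum_spanning_tree, and example_graph are hypothetical and chosen only for this example.

```python
import heapq
import math


def priority_first_search(graph, start, priority):
    """Priority-first search from start (a sketch, not the textbook code).

    graph:    dict mapping vertex -> {neighbor: edge_weight} (assumed format).
    priority: function (k_dist, weight) -> candidate dist value for a neighbor.
    Returns (dist, parent) dictionaries keyed by vertex.
    """
    dist = {v: math.inf for v in graph}   # infinity means "unseen"
    parent = {v: None for v in graph}
    known = set()
    dist[start] = 0
    # Heap entries are (dist, vertex); ties compare vertices, so vertices
    # are assumed to be mutually comparable (e.g., all strings).
    heap = [(0, start)]
    while heap:
        _, k = heapq.heappop(heap)        # deletemin
        if k in known:
            continue                      # stale duplicate copy: skip it
        known.add(k)
        for m, weight in graph[k].items():
            if m in known:
                continue
            candidate = priority(dist[k], weight)
            if candidate < dist[m]:       # m.dist = min(m.dist, candidate)
                dist[m] = candidate
                parent[m] = k
                heapq.heappush(heap, (candidate, m))  # re-insert, no reposition
    return dist, parent


def shortest_paths(graph, start):
    # Shortest path rule: m.dist = min(m.dist, k.dist + edge_weight(k,m))
    return priority_first_search(graph, start, lambda k_dist, w: k_dist + w)


def minimum_spanning_tree(graph, start):
    # Spanning tree rule: m.dist = min(m.dist, edge_weight(k,m))
    dist, parent = priority_first_search(graph, start, lambda k_dist, w: w)
    return [(parent[v], v, dist[v]) for v in graph if parent[v] is not None]


# A small undirected example graph (made up for this illustration).
example_graph = {
    "a": {"b": 4, "c": 1},
    "b": {"a": 4, "c": 2, "d": 5},
    "c": {"a": 1, "b": 2, "d": 8},
    "d": {"b": 5, "c": 8},
}

dist, parent = shortest_paths(example_graph, "a")
print(dist)                                       # {'a': 0, 'b': 3, 'c': 1, 'd': 8}
print(minimum_spanning_tree(example_graph, "a"))  # [('c', 'b', 2), ('a', 'c', 1), ('b', 'd', 5)]
```

In this sketch the only difference between the two searches is the function passed as priority, which mirrors the one-line change in the dist calculation described above. The heap may hold several copies of a vertex at once, which is the trade-off discussed under the second repositioning approach.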