Greedy Algorithms, Graph and Others (Text Ch 9)

We first look at several graph problems where a greedy approach can be proved to give a globally best solution. It turns out the several basic data structures support these graph algorithms, heaps, minheaps with reduce key, and union-find dynamics sets. They are discussed in more detail, too.

We will look at two quite different algorithms with different advantages and drawbacks. First, Prim's algorithm which builds up a larger and larger connected graph:

Start with an arbitrary vertex s in V.

Maintain three disjoint sets (or three classifications for vertices):

- T: the vertices in the currently connected subgraph,
initially empty

- F: fringe (adjacent to the current tree) initially s. For each vertex w in the fringe remember the lowest weight edge from the tree T to w. Initialy set s to have weight 0.
- U: unseen vertices, initially all but s

while F is not empty:

select and remove a vertex v from F with minimum weight

add v to T

for each neighbor w of v:

if w is in U:

take w from U and add it to F with the weight of the edge vw

p(w) = v

else if w is in F:

if the weight of the current edge from T to w is higher than the weight of vw:

identify fringe vertex w with the weight of edge vw

p(w) = v

When the algorithm is over, all vertices with paths from s are in T, and p maps all these vertices other than s to their parent in the tree T. Do an example.

See prim_mst.py where you can execute the algorithm. Before considering the implementation in detail, note that selelcting the fringe element with minimal weight suggests storage in a priority queue. One implementation of a priority queue is a heap. A basic heap implementation is in heap.py. That, however is not efficient in this situation. Not only are elements added and remove from the priority queue(basic operations), there is also the need to decrease the weight of an element already in the priority queue when a better edge from the tree to a fringe element is found. A shortening of a "priority queue where you can decrease the value associated with a key" is a "priority queue with a decrease key operation".

Without specializing to a particular priority queue implementation yet, the time for a graph with n vertices and m edges, assuming m > n, is

O(n*insertItem + n*(popitem) + m(decreaseKey + getWeight))

In a basic heap there is not a way to find a specific current element,
other than a linear time search. In some cases the number of times that
a decrease key operation must be performed is Θ(|E|Time analysis: In this case the general expression from above

O(n*insertItem + n*(popitem) + m(decreaseKey + getWeight))

becomes, with the elaborated heap,
O(n*lg n + n*(log n) + m(log n + 1)) = O((n + m) log n)

There are other more complicated priority queue approaches where decrease key is O(1). The text has references. We will not pursue them in this course.

To prove the correctness of Prim's algorithm we need a theorem, that we will also use again in the future (not in book, but is in DasGupta as the Cut Property!):

MST Cut Theorem

Suppose edges X are part of a minimum spanning tree of G = (V, E ). Pick any subset of vertices S such that X does not cross between S and V − S, and let e be the lightest weight edge across this partition. Then X ∪ {e} is part of some MST.

Proof: Suppose G, X, S are as above, and T is an MST of G that contains X, and vw is a minimum weight edge between V-S and S. If T also contains vw, we are done, so suppose vw is not in T. If we add vw to T, a cycle is created with the path P in the tree from v to w, plus the edge vw. Removing an edge from this cycle recreates a spanning tree. Choose the edge to remove as follows: Since one edge of the cycle crosses between S and V-S, another edge, v'w' must cross back Let T' = T + vw - v'w', another spanning tree. Since vw is a minimum weight edge from V-S to S, and v'w' is another W(vw) ≤ W(v'w') and hence W(T') ≤ W(T), but T is an MST so W(T') = W(T), and T' is also an MST.

Theorem: Prim's algorithm produces an MST.

Proof by induction on the number of steps so far, that the tree so far is a part of an MST, and hence the final step is an MST for G.

Basis: Initially no edges, certaining a subset of any MST.

Induction step: Let T be the tree constructed by Prim's algorithm at some step before the final one, and let vw be the next edge added, with v in T and w not in T. Use the Cut Theorem with T = X, and S the vertices in T, so v is in s and w is in V-S. Then since vw has minimal weight, T+vw must be part of an MST.

In proving things about MST's the construction in the MST Cut Theorem is standard: adding and edge, causing a cycle, and removing an edge somewhere else in the cycle. Be sure you follow the argument.

Beside examples in the text, there is an example in picture from DasGupta, Section 4.4 of DasGupta, page 110, or about the 5th page of the chapter. See eithr of these for a proof that Dijiksta's a;gprithm works. Be sure to note where the nonnegative edge weight comes in.

Essentially the same analysis as Prim for time, O((n+m)log n).

and the set of edges R remaining, initially all of E

while R is not empty:

Remove lightest edge vw from R

if vw does not make a cycle in F

add vw to F

return F

Proof it works - easy from MST Cut Theorem.

Time: getting lightest edges takes time O(m lg m) in worst case. The checking for components is trickier. We have already done this efficiently in the static case, but repeating that is not efficient in the dynamic case. There is a data structure and union-find algorithm that is better. See pictures in our text or Ch 5 of Das Gupta. See kruskal.py, which uses unionfind.py.

So back to the time: with the efficient version of union-find and a maximum of 2m finds, n-1 unions, making a union-find program of length 2m+n-1. If we assume the usual case that m >= n, this contributes O(m lg* m), so sorting edges is the main contribution O(m lg m).

Compare Prim: With the algorithms indicated, both Prim and Kruskal have the same order. Depending on the specific situation, either could be faster in practice. With fancier algorithms Prim could be faster. If for some reason you have the edge weights already sorted, Kruskal ends up faster.

It is useful to understand the situation when there are multiple MST's for one graph:

MST Edge Swapping Theorem (not in book!)

Let G be a weighted graph. If S and T are MST's for G, with k elements of S not in T, then there is a sequence of MST's T = T

Example progression:

Proof: This follows immediately by induction from the following. Basically if you have two MST's that are k edges apart, then you can find an MST that is one edge from one of the MST's and k-1 edges from the other:

Lemma for the MST Swapping Theorem:

If S and T and G are as in the MST Edge Swapping Theorem, and there are k edges in S that are not in T, with k > 0, then there is an MST T' for G that is either formed from S by removing an edge of S that is not in T and adding an edge from T that is not in S, or is formed from T by removing an edge of T that is not in S and adding an edge from S that is not in T.

Proof of Lemma: Homework exercise

If you want to go up one level of abstraction, you could restate the theorem in terms of a graph S

Union Find for Kruskal, Step by Step

The word rank or weight rather than depth in used purposefully inthe discussion of in-trees: Remember that the efficient union-find gets part of its efficiency from compression-find only traversing up from the the specified starting node up to the root, not looking through the whole in-tree somehow. Though one path upward may compress, you cannot assume that the compression of one path reduces the depth, without having a full check of the in-tree. The maximum depth (looking at the whole in-tree) may decrease, but there is no computational time given to determining that. The rank is a simple-to-keep upper bound on the actual depth. The only way rank changes is through the simple calculation when a union of two trees of the same rank is done, increasing the rank by 1. Rank never decreases, so after some compression the rank may be larger than the depth.

Assume that when there is a union of two trees with the same rank, that the first root alphabetically stays as root. The Kruskal sequence with the graph is at the left. The union-find data structure with weighted union and comprtession find is on the right. Read down the column first.

Huffman - follow our text or Ch 5 of Das Gupta pages 12..16 of the chapter.

An important characteristic of any Huffman encoding is that no bit string used to represent a character is the prefix of any other bit string used to represent a character.

Hypothetical Code:

A: 01

B: 011

C: 110

D: 10

How do you decode

0 1 1 1 0

0 1 1 1 0

vs

0 1 1 1 0

You avoid this problem by keeping data only at leaves. Another clear optimization: So you do not waste nodes, there is no need going through a node with just one child. This means the tree structure is a 2-tree with all nodes having 0 or 2 children, never 1. (DasGupta calls this a full tree) for a start at efficient, unambiguous decoding. There are still a lot of 2-trees.

Two ways of looking at counts are important in forming an optimal 2-tree:

1. Sum of all path lengths to leaves, with frequencies. This show the two with the smallest frequencies should go at the bottom.

2. Also label so all interior nodes except the root with a frequecy which is the sum of their children. Each number shows the number of paths that follow the leg from the node's parent to itself. The sum of all path lengths with frequency is the sum of the frquency of use of every edge, which is the sum of the values at each node. This allows the easy reduction to a smaller problem, replacing branch with two smallest, just with the parent node.

Consider “BAA BAA BLACK”

Character frequencies:

A 5

B 3

_ 2 (use _ as a shorthand for blank)

C 1

L 1

K 1

build tree, code, encode

Class practice:

- 9.1 1 [old 2]
- Given an undirected graph G(V, E) with all edges of weight 1, find the number of distinct shortest paths between vertices v and w. (DasGupta Ex 4.5)
- DasGupta Ex 5.9abef

More creative problems from earlier sections:

1. The diameter of a graph is the max of the distance between any two points. In particular it makes sense for a tree. Find an O(n) algorithm for a tree.

2. Given n professional wrestlers, with r rivalries specified between pairs. Find an O(n+r) algorithm succeeding or acknowledging impossibility of declaring a partition into good guys and bad guys, and have all rivalries between a good guy and a bad guy.

3. Create an O(n) algorithm and
explain
why the algorithm has that order:

boolean sumMatch(int[] A, int n, int
val)

// n > 0; A is an array with n elements, and A[0] < A[1] < A[2] < … A[n-1].

// Return true if there are indices i and j such that A[i] + A[j] == val ; return false otherwise.

// Examples: If A contains -3, 1, 2, 5, 7, then sumMatch(A, 5, 6) is true (1+5 is 6), and

// sumMatch(A, 5, 12) is true but sumMatch(A, 5, 11) and sumMatch(A, 5, 0) are both false.

// n > 0; A is an array with n elements, and A[0] < A[1] < A[2] < … A[n-1].

// Return true if there are indices i and j such that A[i] + A[j] == val ; return false otherwise.

// Examples: If A contains -3, 1, 2, 5, 7, then sumMatch(A, 5, 6) is true (1+5 is 6), and

// sumMatch(A, 5, 12) is true but sumMatch(A, 5, 11) and sumMatch(A, 5, 0) are both false.

What if we allow negative edge weights? Section
4.6 of DasGupta p 128. Note this is part of the course not in Levitin.

Class: See usefulness with
DasGupta Ex 4.21, 22.

HW will include Das Gupta 4.8, 4.9

Next All pairs shortest paths and dynamic programming