Problems based on Appendix B or 04A_Recurrence.html: Find the Θ order of T. Assume integer division in the recurrence relation. Recall logb bk = k.
Also these problems not in the book:
Problem A: Suppose my non recursive quicksort algorithm is used, that minimizes stack height. Suppose the tree below shows the size of the lists partitioned from each larger list (assuming quicksort is used all the way down to the lists of size 2). The real stack would need to show both starting indices and length at each step, but the order of processing only depends on the lengths, so I just show lengths and omit the starting indices:
33 / \ 6 26 / \ / \ 3 2 1 24 / \ \ / \ 1 1 1 12 11 / \ / \ 8 3 5 5 / \ / \ / \ / \ 4 3 1 1 2 2 2 2 / \ / \ / / \ \ 2 1 1 1 1 1 1 1 \ 1
Only lengths greater than 1 go on the stack, and when a list is split into two parts that do go on the stack, the smaller goes to the top of the stack.
I show the states at the beginning of the first few times through the main loop, one per line (with only sizes), placing the top of the stack at the left:
33 6 26 2 3 26 3 26
In the same format show the whole sequence of stacks and mark the line showing the longest stack, appending " <<<" to that line.
Problem B: My Java quicksort algorithm has inner loops:
while (data[iHi] > pivot) iHi--; while (data[iLo] < pivot) iLo++;
These loops will both stop at a pivot value, and you could end up swapping two elements that are equal to the pivot. This seems wasteful. An alternative is to include the pivot in only one of the categories (say consider it a large element), and make one of the inequalities not strict:
while (data[iHi] >= pivot) iHi--; while (data[iLo] < pivot) iLo++;
You need one more small change to avoid an infinite loop: As the book discusses, you need to place a sentinel at the beginning of the array, a value smaller than anything in the array (like the minimum possible integer value), and have your actual data start at index 1.
Analyse the running time of these two versions if all the data is equal (bigger than the artificial sentinel value).
Problem C: The square of a matrix A is its product with itself, AA.
Show that five multiplications are sufficient to compute the square of a 2 × 2 matrix of integers.
What is wrong with the following algorithm for computing the square of an n × n matrix?
“Use a divide-and-conquer approach as in Strassen’s algorithm, except that instead of getting 7 subproblems of size n/2, we now get 5 subproblems of size n/2 thanks to part (a). Using the same analysis as in Strassen’s algorithm, we can conclude that the algorithm runs in time O(nlg 5).” You are likely to want my Hint.
Problem D: Modify Mergesort.java or mergesort.py so mergesort returns the number of inversions in the original list, and does not increase the asymptotic order of the function, as discussed in class. Modify the testing code to show off the new version. This version in mergesort.py is verbose, but quite standard, and fairly easy to modify.
Problem E: What would be a good sort to use? Explain
9.4: 1
9.4: 4 (Explain clearly)
9.4: 10
Problem L: Party problem. You know who knows who among n people that you would like to have come to a party. Suppose you number the prople 0 ... (n-1), and have a symmetric nxn boolean matrix Knows of who-knows-who, Knows[i][j] is true if person number i knows person number j. It is symmetric because we assume "knows" is symmetric: If I know you, you know me. You make further requirements that you want each person to have at least 5 new people to meet at the party, and also, so nobody feels too isolated, each person should already know at least 5 people at the party. Your original list may not satisfy these extra two conditions, so you may need to eliminate some people from the invitation list (or maybe you cannot have a party at all with these restrictions). Clearly state an algorithm to find a largest possible subset of the n people that you could invite and satisfy the the other two requirements. For the basic problem, you may find an O(n^3) algorithm and explain its order and its logic. O(n^2) without using hash tables would be even better, but not required! Be careful of your time bound calculations in dealing with collections.
Hint for L: If a person would not satisfy the extra conditions for a list that you are currently considering, could that person satisfy the conditions for any reduced list?
Problem M: As in class, follow Kruskal's algorithm for the graph above, keeping track of the union-find data structure to support it. Note that the union-find in-tree data structure you are asked to keep track of is totally separate from the MST itself. When edges have equal weight, consider them in alphabetical order. Have a step for each edge being considered, and update the in-trees for the two compression-finds and the union (when a union happens). Label each step with the edge being considered. Put a * beside the edge if it gets considered, but not included in the MST. Keep track of the rank for each root vertex. Remember the only rank that can change is the rank for a root of a new union. In doing unions, if the two parts have equal rank, make the vertex EARLIER in the alphabet be the root. This is important for the full illustration of options as well as for checking against my answers. Also note that while edges that would make cycles do not affect the MST, the finds required to discover that they are in the same component can cause compression, so the in-tree data structure can change even for an edge that is only considered for the MST but not added to the MST. You only need to draw the final MST graph, just once at the end of your steps. At the intermediate steps, just list the edge you are considering.
Problem N: Consider a possible divide-and-conquer approach to minimal spanning trees: Suppose the vertices of the connected weighted graph G are partitioned into nonempty disjoint sets V1 and V2 such that the weighted graphs G1 and G2 restricting G to V1 and V2 respectively are each connected. Suppose we find minimal spanning trees T1 for G1 and T2 for G2. Finally, find a minimal weight edge vw from V1 to V2. Let T be T1 + T2 + vw.
Problem P: See the MST Edge Swapping Theorem added to the notes and the Lemma right after it. Prove the Lemma.
Hint for P: First, since the number of edges in an MST for any one component of a graph must be one less than the number of vertices in the component, the number of edges in T not in S is also k. Let vw be the edge of lowest weight in either S or T that is not in the other. Because of the symmetry of the descriptions, we can assume this lowest weight edge is in S.... See the proof of the MST Cut Theorem for inspiration if you need it.
DG4.8: Here is a proposed way to deal with negative weight edges in finding minimum distance paths from one vertex s:
Does this always work? Give a proof or a counter-example.
8.4[8.2]: 1 (Just find the transitive closure matrix. You do not need to use Warshall - likely easiest drawing a picture, completing the transitive closure visually, and translating back to a graph.)
8.4[8.2]: 2b, 6
8.1: 7[9],
8.1: 12[10] (This problem has been simplified merely in the way it was stated. If P(i,j) had not already been defined, you would probably have first thought to quantify probabilities in terms of the number of games won so far, which is not useful.)
8.2[8.4]: 1a, 3
Problem Q (DasGupta): Suppose you are given text with spaces, capitalization, and punctuation removed, like 'whatintheworld'. You also have a dictionary, that in O(1) returns whether a sequence of letters forms a legal word. Give an algorithm to determine whether the text that you are given can be completely split into a sequence of legal words. For a sequence of n letters, your algorithm should be O(n2). (This is important for AI understanding speech, since words tend to run together. The actual problem is much more complicated, since you want to look for words that make sense together in context.) The subproblems should be pretty obvious. What data will you store in your table?
Problems R, S stated in the Dynamic Programming (Ch 8) notes. You can do it in clear, complete pseudocode, but given the Python codes, it may save you checking time to make it executable Python!
Problem T (DasGupta): I want to travel along a route with n motels at mileage m1 < m2 < ... < mn. I start at mile 0, will stay at a motel each night along my way, and end up at the last motel. Ideally I would travel 400 miles a day, but that may not be possible given the spacing of motels. In planning my itinerary, I assume a penalty measure of (400-d)2 if I travel a distance d in a day. I want to minimize the sum of penalties for each day's travel. For example if there are motels at mileage 300, 375, 600, 850, and 1210, I could stay at the motels at mileages 375, 850, and 1210, for a total penalty of
(400-(375-0))2 + (400-(850-375))2 + (400-(1210-850))2 = 252 + 752 + 402.
Hint: Use the most obvious subproblems. Similar, but simpler, than lineBreak.py.
Problem U: Longest common subsequence. Given two sequences,
a1, a2, a3, ..., an andb1, b2, b3..., bm,
determine the length of the longest common subsequence. For instance in
7, 2, 5, 3, 4, 5, 1, 7 and1, 4, 2, 3, 7, 4, 1, 3, 9,
the longest common subsequence has length 4. A maximal length subsequence is: 2, 3, 4, 1. Find an O(mn) solution. With the same comment as in problem R, take it one step further to produce a longest subsequence.
Hint: Use the suggested subproblem data from the notes for this situation. Palindrome was also trying to match two sequences....
11.3: 5
11.3: 6
11.3: 7a (Note that the assumption made in 7a is to reduce your work, but checking legality is required in general to confirm a solution),
11.3: - 9ac (Show polynomial reducibility in both directions for clique and independent set; you can skip the middle one)
11.3: 11[10]
11.3: 12a[11a]
Problem W
- A Hamiltonian Path is a path that visits each vertex once, but being a path rather than a cycle, it starts and ends at different vertices. Show that the undirected Hamiltonian path decision problem is polynomially reducible to the directed one.
- Based on your reduction from part a, and the knowledge that the directed problem in NP complete, could you conclude that the undirected problem is NP complete?
- Same question as b. with except with the italicized directed and undirected swapped.
W Hint: The mapping should be easy.