We might think of growing a graph one vertex at a time. Add a new vertex k; the shortest path from i to j may now go through k, with some p just before and r just after k, so you could look for paths i to p to k to r to j. Unfortunately, looking at all possibilities for p and r is too slow; the trick is that you only need the already-computed shortest distances i to k and k to j: d_{ij} = min(d_{ij}, d_{ik} + d_{kj}).

http://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm

Transitive closure (Warshall's algorithm)

Suppose we do not care about distance, but only whether you can get there at all. The transitive closure of a graph G is a graph with the same vertices as G and an edge vw whenever there is a path from v to w in G. For a small graph you can work this out by eye easily (with or without a special algorithm). The book's initial example is below. The thinner edges are the ones added to make the transitive closure. The totally connected component ABD ends up with edges between all pairs in both directions. Since D connects to C, all other vertices in ABD also connect to C. In general, if any edge goes from one totally connected component to another, then in the transitive closure every vertex in the first totally connected component connects to every vertex in the other.

When the graphs get big, you cannot just do them by eye. An obvious approach for transitive closure is just to do a DFS from each vertex. This is useful in some cases (see homework).

There is an easy adaptation of Floyd-Warshall that works, using boolean values a_{ij} (true when a path from i to j is known):

a_{ij} = a_{ij} or (a_{ik} and a_{kj} )
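This update can be sketched directly in Python (a minimal sketch; the function name and matrix representation are my own, not from the course files):

```python
def warshall(adj):
    """Transitive closure of a directed graph.

    adj[i][j] is True if there is an edge from i to j.
    Returns a matrix reach with reach[i][j] True whenever there is
    a path (of length >= 1) from i to j.
    """
    n = len(adj)
    reach = [row[:] for row in adj]   # copy; do not modify the input
    for k in range(n):                # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach
```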

Warshall's algorithm for transitive closure is short to write, but not the lowest order possible. Interpret matrix multiplication of boolean matrices by substituting AND for multiplication and OR for addition, applied to an adjacency matrix A. The identity matrix I gives all the vertices reachable in 0 steps (just the vertices themselves). Let M = I + A. Then M gives all vertices reachable in 0 or 1 steps.

Consider M^2: it gives all vertices reachable in at most 2 steps, and in general M^k gives all vertices reachable in at most k steps. Since a shortest path repeats no vertices, M^{n-1} is the transitive closure, and repeated squaring (M^2, M^4, M^8, ...) computes it with only about log_2 n boolean matrix multiplications. With a fast matrix multiplication algorithm, this beats the O(n^3) of Warshall's algorithm.
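The repeated-squaring idea can be sketched as follows (helper names are my own; the boolean product uses AND for * and OR for +):

```python
def bool_mat_mult(X, Y):
    """Boolean matrix product: AND for multiplication, OR for addition."""
    n = len(X)
    return [[any(X[i][k] and Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def closure_by_squaring(adj):
    """Reachability in 0 or more steps via M = I + A and repeated squaring."""
    n = len(adj)
    # M = I + A: everything reachable in 0 or 1 steps
    M = [[adj[i][j] or i == j for j in range(n)] for i in range(n)]
    steps = 1                       # M covers paths of length <= steps
    while steps < n - 1:
        M = bool_mat_mult(M, M)     # squaring doubles the covered length
        steps *= 2
    return M
```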

Floyd's algorithm for shortest distance is very similar to Warshall's, but the trick with matrix multiplication works only for transitive closure, since multiplying distances does not help at all.
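For comparison, Floyd's shortest-distance version replaces OR/AND with min/+ (again a sketch with my own names, not the course files):

```python
INF = float('inf')

def floyd(dist):
    """All-pairs shortest distances.

    dist[i][j] is the direct edge weight, INF if there is no edge,
    and 0 on the diagonal.  Works on a copy of the input.
    """
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):                # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```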

Dynamic programming, general introduction

Dynamic programming is a whole approach to problem solving. Getting an intuitive and practical feel for the idea of dynamic programming is more important than the individual examples. You can read 8.1. For an introduction to the approaches, see fib.py. This ridiculously simple example allows the techniques to be displayed with no complications obscuring them.
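fib.py is not reproduced here, but the contrast it presumably draws can be sketched: naive recursion, a memoized version, and a bottom-up version in topological (increasing) order:

```python
from functools import lru_cache

def fib_naive(n):
    # exponential time: the recursion "tree" recomputes the same values
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # only n + 1 distinct subproblems, each computed once
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

def fib_bottom_up(n):
    # iterative, in increasing (topological) order, O(1) extra space
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```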

All these ideas apply in more complicated cases. The central idea is that results in the recursion "tree" are reused a lot. Essentially that means imagining a tree of cases is the wrong picture, since the branches rejoin at common subproblems. More generally, the dependent problems form a DAG. Taking advantage of the DAG structure usually involves a tradeoff between time and space, and keeping the extra space requirement to a minimum is an issue.

In most problems, the DAG is implicit, and the connections are found by digging into the code. To emphasize the centrality of DAGs, we can state the problem for an explicitly given DAG, where we explore a sequence of critical path solutions: critPathRec.py, critPathDFS.py, critPathTopo.py. Though we did a version of the last one first, it is the most sophisticated one, improving on the others. The recursive version recalculates for each vertex every time it is reached, so it is inefficient if nodes are used multiple times. The DFS version memoizes, recording data for each node; the idea of checking whether a node is visited becomes checking whether its data still needs to be calculated or has already been computed. The reverse topological sort version avoids the memoizing.
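The critPath files are not reproduced here; a sketch of the memoized (DFS-style) version, assuming the goal is the longest total edge weight of any path leaving each vertex, with the graph as an adjacency dict (all names hypothetical):

```python
def crit_path(graph):
    """Longest-path cost from each vertex in a weighted DAG.

    graph maps each vertex to a list of (successor, edge_weight) pairs.
    Memoizing means each vertex is solved once, even if reached many
    times along different paths.
    """
    memo = {}
    def longest(v):
        if v not in memo:                 # data still needs calculating?
            memo[v] = max((w + longest(u) for u, w in graph[v]), default=0)
        return memo[v]
    for v in graph:
        longest(v)
    return memo
```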

- Tends to be used for recursive optimizations where subproblems are reused, and there are not too many subproblems.
- Recursion makes logical sense if the optimum/answer for subproblems carries over as a part of the optimum/answer for a larger problem containing the subproblem.
- Dynamic programming is distinguished from regular recursion by the reuse of subproblems. It depends on setting up the subproblems in a way that the space for the results of all subproblems needed at one time is small enough to fit in the memory allocation for the problem.

- In order to store results there must be an easy way to specify the parameters that distinguish one subproblem from another. These parameters are usually just the parameters that change in recursive calls in the recursive version of the solution.

- Getting started: state a recursive algorithm that gives the problem-specific relation of larger problems to smaller problems. For this part you do not need to worry about the basic dynamic programming idea of conserving the time spent on reused problems (except that you do want a recursive structure that does not have too many subcases). People seem to have trouble with recursion in general, AND this is the creative part of the solution (always the main trouble). We will spend a lot of time on this and you will need to get your recursion abilities honed!

- Given a clearly stated relationship of larger to smaller problems, and not having too many smaller problems total, it is a pretty mechanical process to convert a recursive algorithm to dynamic programming, but students still have a problem with it; I'm not sure why. Be sure to ask questions. This is a basic ability that I will certainly expect of you! With little thought you can memoize. We will see pretty standard patterns for topological orders. The most time consuming part is keeping a secondary table to remember the optimal choice of subproblems if you want to reproduce an optimal sequence. This is still pretty mechanical but admittedly messy: initially setting up pointers to the previous step, and then traversing the optimal linked list at the end.

My code generally includes a pure recursive solution and a dynamic programming one with a topological order, and sometimes a longer version allowing you to reproduce the optimal path. All the examples are good ones for practice: before looking at my dynamic programming solution, start with the recursive version and do the conversion to dynamic programming yourself, until you are sure you clearly have the mechanics; you can also make sure you have the mechanics of keeping the secondary table to reproduce the optimal path.

Consider in class the contiguous subsequence problem, finding a maximum sum: (DasGupta 6.1):

Given real numbers a_1, a_2, ..., a_n: for each i, j with 1 <= i <= j <= n there is a contiguous subsequence a_i, a_{i+1}, ..., a_j. Find the maximum possible sum of such a subsequence.

...

Hint: The max subsequence must end somewhere. Use D(i) for the maximal sum of a subsequence ending at index i; it may include no elements and be 0, so its minimum value is 0.

...

D(i) is either 0 or based on a sequence ending with a_i:

D(i) = max(0, D(i-1) + a_i)

Since subproblems do not get used multiple times, this is not really a dynamic programming problem, but the recursion idea is still important, and it illustrates a generally useful point: the general problem is just to find the largest sum anywhere in the sequence, but we defined D(i) with an extra condition, that the summed sequence ends at i.
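The D(i) recurrence, plus a running maximum to answer the general question, fits in a few lines (a sketch; only O(1) extra space is needed, since D(i) depends only on D(i-1)):

```python
def max_contiguous_sum(a):
    """Maximum sum of a contiguous subsequence (possibly empty, so >= 0)."""
    best = 0
    d = 0                      # D(i): best sum ending exactly at index i
    for x in a:
        d = max(0, d + x)      # D(i) = max(0, D(i-1) + a_i)
        best = max(best, d)    # running maximum answers the general question
    return best
```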

This is a common distinction to consider. The possibilities:

- The subproblems can be expressed exactly like the overall problem ( just with a smaller data set)
- The subproblem is more specific. This is often the case when the general problem allows some data elements to be included or not, or to be the end of the part used in the optimal choice or not. The subproblem may specify that the last element in the partial-problem dataset IS used or IS the ending parameter in the optimal solution. In these cases extra information needs to be maintained (commonly a running maximum or minimum) to produce the final answer to the more general question.

- With a single sequence of input data like a_{1}, a_{2}, ... , a_{n}, there are two common situations for subproblems and a less common one:

  - a_{1}, a_{2}, ... , a_{i}, for some i = 1, 2, ... n
  - a_{i}, a_{i+1}, ... , a_{j}, for some i <= j
  - all subsets of the data. This is automatically exponential in the size of the data, so it is not really good, but sometimes the alternative is O(n!), which is much larger.

- Another possible input situation is to have two data sequences, a_{1}, a_{2}, ... , a_{n} and b_{1}, b_{2}, ... , b_{m}. Here the data for a subproblem is often a_{1}, a_{2}, ... , a_{i} and b_{1}, b_{2}, ... , b_{j}.
- If the problem can be stated in terms of a rooted tree, subproblems often correspond to subtrees.
- Sometimes there are multiple data elements, but they make sense as a part of all problems, and there is a much simpler dependency: if the problem depends on one or more positive integer parameters with specific meanings (not part of a set), the dependency may just be on the same problem with smaller values for these parameters.

- The overall time is related to the total number of subproblems needing to be considered.
- If you need a table of the result of all these problems, there is an issue of space.

The next example is the longest palindrome subsequence of a string. A palindrome subsequence of abracadabra is a b a c a b a. Another is a r a d a r a.

Develop table: sizing, initializing, filling, step by step in class!

Stub to build on basic recursive case, palindromeStub.py, using printtables.py for display:

Check later: complete Palindrome subsequence in Python, using printtables.py for display, with extra versions reconstructing the palindrome, and showing off the data used for reconstruction.

Consider how to do a topological ordering in the table. Assume i is the row and j is the column. The base cases give the main diagonal as 1's. The 0-length intervals correspond to i = j+1 (just below the main diagonal). The entry for (i, j) depends on table locations (i+1, j-1), (i, j-1), and (i+1, j). For instance, the position with the '?' depends on the 3 locations already filled in, marked D.

           j
 -----------+
 1          |
 01         |
  01        |
i  01 D?    |
    01DD    |
     01     |
      01    |
       .    |
        .   |
         01 |
          01|

The dependency on positions down and to the left means you want to start as far down and to the left as possible, and work from there. The rows are shown with an alphabetic sequence indicating the order in which they are filled, and arrows indicating the direction within each row:

           j
 -----------+
 1>>>>>>>>>z|
 01         |
  01        |
i  01  ...  |
    01      |
     01>>>>e|
      01>>>d|
       01>>c|
        01>b|
         01a|
          01|
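Following that order (bottom row up, left to right within each row), the whole table fill is short. A sketch of the basic version, without the secondary table for reconstructing the palindrome:

```python
def longest_palindrome_subseq(s):
    """Length of the longest palindrome subsequence of s.

    D[i][j] is the answer for s[i..j]; rows are filled bottom up and
    left to right within a row, matching the dependency picture.
    """
    n = len(s)
    D = [[0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):      # start as far down as possible
        D[i][i] = 1                     # main diagonal of 1's
        for j in range(i + 1, n):       # left to right within the row
            if s[i] == s[j]:
                # matching ends extend the inside by 2
                D[i][j] = 2 + (D[i + 1][j - 1] if j - i > 1 else 0)
            else:
                D[i][j] = max(D[i + 1][j], D[i][j - 1])
    return D[0][n - 1]
```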

My old alternate notes on two of the book's examples follow: knapsack, and the best order for a sequence of matrix multiplications.

Simple example: Knapsack holds 5kg (W = 5) and there are 4 items to choose from:

Index | Weight | Value
------+--------+------
  1   |   2    |  12
  2   |   1    |  10
  3   |   3    |  20
  4   |   2    |  15

What is the brute force approach? What is the order of the time? What is the final solution for the data above?

What is a greedy algorithm? What answer does it give for the data above?

It turns out that a very similar problem is much easier: Suppose all the valuables are precious metal powders, so any fraction of an item can be considered. Then the obvious greedy algorithm works to give the optimal solution. Proving this is a reasonable exercise.
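The obvious greedy algorithm for the fractional variant can be sketched like this (taking items in order of value per unit weight, a sketch with my own names):

```python
def fractional_knapsack(weights, values, W):
    """Greedy for the fractional knapsack: best value-per-weight first,
    splitting the last item that does not fully fit."""
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0],
                   reverse=True)
    total = 0.0
    for w, v in items:
        if W <= 0:
            break
        take = min(w, W)              # whole item, or the fraction that fits
        total += v * (take / w)
        W -= take
    return total
```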

Back to the discrete all-or-nothing case. Suppose W is way less than 2^n.

. . .

Clearly it is a simpler problem with fewer items. We do not want to consider all 2^n subsets separately,

so there are subproblems for the first k items, k <= n. How can we relate the problem for n items and max weight W to smaller problems that do not involve the last item? How does W come in? How does subproblem optimality come in?

If the nth item is part of the optimal solution, then removing it must leave an optimal selection from the first n-1 items, with the weight limit reduced by w_n.

Let V(i, j) be the maximum value selecting among item 1, 2, ... i, if the weight limit is j.

Then in the situation described above,

V(n, W) = v_n + V(n-1, W - w_n)

The above equation only makes sense if (1) the nth item is helpful and (2) W - w_n >= 0.

If we were not to include the nth item, then V(n, W) = V(n-1, W).

. . .

Assume V(i, j) = 0 if i = 0 (no items) or j = 0 (0 weight limit). Then for i = 1, 2, 3, ..., n and j = 1, 2, ..., W:

if w_i > j:

V(i, j) = V(i-1, j)

else:

V(i, j) = max(V(i-1, j), v_i + V(i-1, j - w_i))

Clearly the extra space is on the order of nW; since the time for each entry (once its dependencies are calculated) is O(1), the total time is on the order of nW also. Note this dynamic programming algorithm is only guaranteed helpful if the order of nW is smaller than 2^n.
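The recurrence translates directly into a table fill (a minimal sketch of the value-only computation, without the display code of the course files):

```python
def knapsack(weights, values, W):
    """0/1 knapsack: V[i][j] = best value using items 1..i, weight limit j."""
    n = len(weights)
    V = [[0] * (W + 1) for _ in range(n + 1)]   # row 0 and column 0 are 0
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for j in range(W + 1):
            if w > j:                           # item i cannot fit
                V[i][j] = V[i - 1][j]
            else:                               # best of skipping or taking it
                V[i][j] = max(V[i - 1][j], v + V[i - 1][j - w])
    return V[n][W]
```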

With the simple sample data given above, we start with the top row and left column of 0's, and work out one row at a time:

Best Value

Index | Weight | Value | Maximum Weight j
      |        |       |  0   1   2   3   4   5
------+--------+-------+------------------------
  0   |   -    |   -   |  0   0   0   0   0   0
  1   |   2    |  12   |  0
  2   |   1    |  10   |  0
  3   |   3    |  20   |  0
  4   |   2    |  15   |  0

generating:

Best Value

Index | Weight | Value | Maximum Weight j
      |        |       |  0   1   2   3   4   5
------+--------+-------+------------------------
  0   |   -    |   -   |  0   0   0   0   0   0
  1   |   2    |  12   |  0   0  12  12  12  12
  2   |   1    |  10   |  0  10  12  22  22  22
  3   |   3    |  20   |  0  10  12  22  30  32
  4   |   2    |  15   |  0  10  15  25  30  37

so the final maximum, choosing among all 4 items with a weight limit of 5, is 37. (This was a reasonable scale to demonstrate the dynamic technique, though in this small case there are more entries in the table above than would be considered by brute force!) See knapsack.py, using the already referenced printtables.py for display.

Homework problem R: Add Python code or pseudocode to print pairs (value, weight) for each item in the optimal set, in increasing order of index. This can be recovered from the basic table, and the lists of values and weights, without generating a separate table, though you can generate an extra table if you like. Hint at end.

Matrix multiplication is associative, so a product like A_1 A_2 A_3 A_4 can be computed in any parenthesized order, for example ((A_1 A_2) A_3) A_4 or A_1 ((A_2 A_3) A_4), always with the same result, but the amount of arithmetic can vary enormously.

To multiply a k by m matrix by an m by n matrix needs k*m*n scalar multiplications by the definition (ignoring Strassen and other such fancy multiplication techniques).

Suppose the dimensions of A_i are d_{i-1} by d_i, so the sequence of dimensions d_0, d_1, ..., d_n describes the whole product.

If the sequence of dimensions is 30 1 40 10 25, then for example:

((A_1 A_2) A_3) A_4 takes 30*1*40 + 30*40*10 + 30*10*25 = 20700 multiplications

(A_1 (A_2 A_3)) A_4 takes 1*40*10 + 30*1*10 + 30*10*25 = 8200 multiplications

A_1 ((A_2 A_3) A_4) takes 1*40*10 + 1*10*25 + 30*1*25 = 1400 multiplications

One approach is to just follow the order of computation: pick an adjacent pair to multiply, removing one dimension, and do this recursively, testing each position. The work to multiply p by r and r by s matrices (by definition, not Strassen) is O(prs). See multMat1.py

Time?

What if we reuse results?

Another approach: how are numerical calculations generally viewed in a compiler or expression evaluator? As a tree:

      *
     / \
    1   *
       / \
      *   4
     / \
    2   3

The recursive approach to trees suggests looking at the root and subtrees, and all subtrees that could replace a given subtree. How can we parametrize the different subproblems? Each subtree will handle a contiguous range of the indices. Let the subproblems be f(i, j), which gives the minimum number of multiplications for the matrices involving dimension indices i..j. The principle of optimality applies: the optimal solution for a problem involves the optimal solution for subproblems. Given the matrices for a range of dimension indices from i through j, the question remains, where will the root of the tree be? In other words, what will be the dimension index k, so the left subtree comes from dimension indices i..k and the right subtree comes from k..j? You need to test each k and take the optimal one. Each involves a root multiplication costing d_i * d_k * d_j, so

f(i, j) = min over i < k < j of ( f(i, k) + f(k, j) + d_i * d_k * d_j )

See the beginning of multMat2.py (the recursive version only).
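That recursion can be sketched with memoization bolted on (a sketch in the spirit of the course files, not the actual multMat2.py):

```python
from functools import lru_cache

def min_mults(d):
    """Minimum scalar multiplications to compute A_1 ... A_n,
    where A_i is d[i-1] by d[i].  f(i, j) covers dimension indices i..j."""
    @lru_cache(maxsize=None)
    def f(i, j):
        if j - i == 1:                  # a single matrix: nothing to do
            return 0
        # try every root position k; d[i]*d[k]*d[j] is the root multiplication
        return min(f(i, k) + f(k, j) + d[i] * d[k] * d[j]
                   for k in range(i + 1, j))
    return f(0, len(d) - 1)
```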

How to make a topological order?

0 entries on diagonal are immediately known

To calculate entry at ? you depend only on the entries labeled D

 ---------+
 0        |
  0       |
   0DDDD? |
    0   D |
     0  D |
      0 D |
       0D |
        0 |
         0

think before looking

...

Order of calculation, showing the row order and the direction within rows, similar to the palindrome example:

 07----->
  06---->
   05--->
    04-->
     03->
      02>
       01
        0

multMat2.py also uses this method, and ends with a version keeping track of the multiplication order.

Homework Problem S: Write an algorithm to take the last array and produce the correct fully parenthesized multiplication expression for the matrix multiplication, in a form like

((1 2) (3 (4 5))) or a postfix (Reverse Polish) notation like

1 2 * 3 4 5 * * *

For brevity and printing ease, instead of using the full subscripted matrix names as in (A_1 A_2), just use the index of each matrix, as in (1 2).

The text covers the optimal binary search tree problem. This is explicitly a tree problem, and it immediately suggests the form of the subproblems. The algebra is more involved, until you see how it simplifies. This section may be treated below as a problem. It is not a really realistic application: with the static distribution, a hash table could be used easily.

Problems to possibly consider in class:

In all problems, besides pseudocode, analyse the order of the space and time requirements (here and in homework).

1. Making change: Given coin values 1 = c_1 < c_2 < ... < c_k and an amount n, find the minimum number of coins (repetitions allowed) needed to make change for n.

2. Determining line splits in a paragraph with a fixed width font, and a penalty function for the amount the width of all the words differs from the column width W. For instance, words of length 3, 2, 4 fit on a line of width 13. They take up space, with the intervening blanks, of 3+1 + 2+1 + 4 = 11, with 2 spaces left over to give a penalty. The simple transformation is to include the space after each word in the length of the word, and assume the maximum width W is also one larger, so the corresponding situation would have words of length 4, 3, and 5 on a line of length 14, and the extra space is 14 - (4+3+5) = 2 again. We assume this transformation.

So this gives lengths w_1, w_2, ..., w_n, each including its trailing blank, and a maximum line width W, also one larger than the original column width.

...

Brute force is ridiculous. We consider an iterative approach building on shorter lists. We could grow the list from either end, back or front, but it turns out that the special treatment of the last line's penalty makes this not symmetric, and it is slightly easier to imagine starting from the end and working backwards. As long as all the words fit on one line, there is no penalty, but then.... Solution: lineBreak.py
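A sketch of the backwards computation, assuming a squared-leftover penalty on every line except the last (the notes leave the exact penalty function open, so this is one plausible choice, not necessarily what lineBreak.py does):

```python
def line_break_cost(w, W):
    """Minimum total penalty splitting words of lengths w (each length
    including its trailing blank) into lines of width W (also enlarged).

    best[i] = minimum penalty for laying out words i..n-1; the last
    line incurs no penalty.  Stays infinite if a single word exceeds W.
    """
    n = len(w)
    best = [0.0] * (n + 1)            # best[n] = 0: no words left
    for i in range(n - 1, -1, -1):    # work backwards from the end
        length = 0
        best[i] = float('inf')
        for j in range(i, n):         # put words i..j on one line
            length += w[j]
            if length > W:
                break
            if j == n - 1:            # all remaining words fit: last line free
                best[i] = 0.0
                break
            best[i] = min(best[i], (W - length) ** 2 + best[j + 1])
    return best[0]
```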

I repeat: the idea of transforming a recursive solution to dynamic programming is very basic. Be sure you can do that. Besides the homework, you can start with the recursive example solutions I give and derive the dynamic solution yourself.

- - - - -

Other possible examples:

edit distance (done in DasGupta)

optimal binary search tree (8.3): Python recursive only: optSearch1.py; dynamic: optSearch2.py. The latter needs printtables.py.

Longest increasing subsequence (done in DasGupta): Given numbers a_1, a_2, ..., a_n, find the longest subsequence whose values are strictly increasing.

DasGupta 6.3, 6.6, 6.5, 6.12, 6.21

Hints:

R: Since there are only a couple of possibilities for the reason for each new entry, you can just look at the table to see which place is feeding the current entry, and work backwards as usual, though the question asks for the final answer to be stated forward through the items.

S: Given the last matrix, this is concisely stated recursively. Another chance to work on your recursion!

Next: P and NP