Stable sorts:

It sounds like "stable sort" was not defined generally in 271, and the example in the book turns out to be pretty poor, so here is a more complete one:

Consider 2 implementations of selection sort for a linked list. I use letter suffixes to distinguish different entries with the same value.

unsorted: 4a 5a 2 5b 4b sorted:

If I remember the location of the last copy of the largest element and pull it out and move it to the beginning of a new sorted list, and keep repeating, I would get

unsorted: 4a 5a 2 4b sorted: 5b

unsorted: 4a 2 4b sorted: 5a 5b

unsorted: 4a 2 sorted: 4b 5a 5b

unsorted: 2 sorted: 4a 4b 5a 5b

unsorted: sorted: 2 4a 4b 5a 5b

5a comes before 5b, and 4a before 4b, as they did in the unsorted list.

If I pull out the *first* maximal element, and otherwise do the same thing, I get

unsorted: 4a 2 5b 4b sorted: 5a

unsorted: 4a 2 4b sorted: 5b 5a

unsorted: 2 4b sorted: 4a 5b 5a

unsorted: 2 sorted: 4b 4a 5b 5a

unsorted: sorted: 2 4b 4a 5b 5a

The order of the 4's and 5's is reversed. That means this version is not stable. To show instability, we only need an example where at least one pair of equal values is reversed, not all pairs; this particular implementation happens to reverse the order of every pair of equal elements.

You can easily show that the first version will be stable in general.
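The two variants can be sketched in Python (my own sketch, using a list of (value, tag) pairs in place of a linked list; the tags just label duplicate values):

```python
def select_sort(items, keep_last):
    """Selection sort by repeatedly removing a maximal element from the
    unsorted list and prepending it to the sorted list.
    keep_last=True removes the LAST maximal element (the stable version);
    keep_last=False removes the FIRST maximal element (unstable)."""
    unsorted = list(items)
    result = []
    while unsorted:
        best = 0
        for i in range(1, len(unsorted)):
            better = (unsorted[i][0] >= unsorted[best][0] if keep_last
                      else unsorted[i][0] > unsorted[best][0])
            if better:
                best = i
        result.insert(0, unsorted.pop(best))
    return result

data = [(4, 'a'), (5, 'a'), (2, ''), (5, 'b'), (4, 'b')]
print(select_sort(data, keep_last=True))   # stable: 4a before 4b, 5a before 5b
print(select_sort(data, keep_last=False))  # unstable: equal pairs reversed
```

Running it reproduces the two traces above.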

The basic recursive algorithms should be familiar. We can do a
pretty direct translation; see the particularly concise Python 3.1 code in mergesort.py.
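The direct recursive translation looks roughly like this (a sketch, not necessarily the exact code in mergesort.py):

```python
def merge_sort(a):
    """Recursive mergesort; returns a new sorted list. Stable."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:       # <= keeps the merge stable
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

print(merge_sort([4, 5, 2, 5, 4]))  # [2, 4, 4, 5, 5]
```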

There are much more sophisticated algorithms that handle sorted or nearly sorted data better. Anything
recursive can be converted to a stack-based algorithm. A more elaborate version looks for natural monotonic runs,
and decides when to push or pop a run based on the
relative lengths. This is significant for non-random data, where
you may have long monotonic runs. The algorithm used by
Python has interesting optimizations that make it efficient both on
totally random data and on data with long monotonic runs. I
have copied a description of the
algorithm from the Python source repository. (Not required, but interesting.)
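A much simplified sketch of the natural-run idea (Python's real algorithm, timsort, is far more elaborate: it uses a run stack, length invariants, galloping, and insertion sort for short runs):

```python
def natural_merge_sort(a):
    """Bottom-up mergesort on natural runs: split the input into
    maximal nondecreasing runs, then merge runs pairwise until
    one run remains."""
    if not a:
        return []
    runs, start = [], 0
    for i in range(1, len(a)):      # find maximal nondecreasing runs
        if a[i] < a[i - 1]:
            runs.append(a[start:i])
            start = i
    runs.append(a[start:])
    while len(runs) > 1:            # merge runs pairwise
        merged = []
        for i in range(0, len(runs) - 1, 2):
            merged.append(merge_runs(runs[i], runs[i + 1]))
        if len(runs) % 2 == 1:      # odd run carried to the next round
            merged.append(runs[-1])
        runs = merged
    return runs[0]

def merge_runs(left, right):
    """Standard stable merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(natural_merge_sort([1, 2, 5, 3, 4, 0, 9]))  # [0, 1, 2, 3, 4, 5, 9]
```

On already-sorted data this finds a single run and does no merging at all, which is exactly why run detection pays off on non-random data.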

On to quicksort. Everyone familiar with it?

The worst case,
as in the book, is when the pivot always ends up at one end or the other; then it is
very slow, O(n^{2}). This O(n^{2}) behavior is always
possible with quicksort, even if a larger (but limited) number of
elements is examined first to choose the pivot.

Improvements in
choosing the pivot are to use the middle element, a random element,
or the median of a few elements - the simplest being the median of the first, last, and
middle elements. A larger group can also be used, leading to a secondary
problem of finding the median efficiently. How do you do
that?
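The simplest case, the median of first, last, and middle, can be sketched like this (a hypothetical helper, not taken from the handouts):

```python
def median_of_three(a, lo, hi):
    """Index of the median of a[lo], a[mid], a[hi], by case analysis."""
    mid = (lo + hi) // 2
    x, y, z = a[lo], a[mid], a[hi]
    if x <= y:
        if y <= z:
            return mid            # x <= y <= z
        return hi if x <= z else lo
    if x <= z:
        return lo                 # y < x <= z
    return hi if y <= z else mid

a = [9, 1, 5, 7, 3]
print(a[median_of_three(a, 0, len(a) - 1)])  # median of 9, 5, 3 is 5
```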

See 04AvgQSort.html for
the details of the calculation of a bound on the average number of
comparisons in Quicksort. The result is:

A(n) ≤ 2 n ln n, for all n > 0.

In terms of the more common
log in algorithms, lg (log base 2):

A(n) ≤ (2 ln 2) n lg n ≈ 1.386 n lg n

For mergesort, the coefficient
is 1, but quicksort is generally faster.
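A quick experiment (my own sketch) counting comparisons on random data illustrates the bound in action:

```python
import math
import random

def qsort_count(data):
    """Quicksort a copy of data, returning the number of element
    comparisons made.  Pivot is the first element of each range,
    as in the basic analysis."""
    a = list(data)
    count = 0
    def sort(lo, hi):                       # sorts a[lo:hi]
        nonlocal count
        if hi - lo <= 1:
            return
        pivot = a[lo]
        i = lo + 1
        for j in range(lo + 1, hi):
            count += 1
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[lo], a[i - 1] = a[i - 1], a[lo]   # move pivot to its final spot
        sort(lo, i - 1)
        sort(i, hi)
    sort(0, len(a))
    assert a == sorted(data)
    return count

random.seed(1)
n = 1000
avg = sum(qsort_count(random.sample(range(n), n)) for _ in range(20)) / 20
print(avg <= 2 * n * math.log(n))  # the 2 n ln n bound holds on this sample
```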

Examples worked out
in class:

1. approach to median

2. counting
the number of inversions, text problem 5.1 #9

Quicksort
Variations and Improvements

It is not essential to separate the pivot from the rest of the elements, as in
AltQSort.cs. In this form, however, there are technical problems with a proof of the average time.

An algorithm is *in place* if
it requires only O(1) additional memory.

Is book's
quicksort in place?

Easier analysis of the non-recursive version: put the start and end indices of the parts left to do on a stack.

The worst case for time is when one part is always small, but which part goes on the stack first is
significant. Suppose the first part goes on the stack first, and I illustrate:

Initial:

0 n

Pop, split, push back

3 n < top

0 1

Pop, split, push back

6 n < top

3 4

0 1

Pop, split, push back

9 n < top

6 7

3 4

0 1

Pop, split, push back

12 n < top

9 10

6 7

3 4

0 1

How big will the stack get?

Compare if the second part goes on the stack first:

Initial:

0 n

Pop, split, push back

0 1 < top

3 n

Pop 0 1, completely finish this base case

3 n < top

Pop, split, push back

3 4 < top

6 n

Pop 3 4, completely finish

6 n < top

Pop, split, push back

6 7 < top

9 n

How big will the stack get?

How do we make something closer to the latter case always happen?

How big can the stack get then, in general?

Size n

Size < n

size < n/2

Size < n

size < n/2

size < n/4

…

size is < lg n + 1
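This policy (keep working on the smaller part, push the larger part for later) can be sketched non-recursively; the sketch also measures the maximum stack depth:

```python
def qsort_iter(a):
    """Non-recursive quicksort that always pushes the LARGER part onto
    the stack and keeps working on the smaller part.  Returns the
    maximum stack size observed (counting the current range)."""
    stack = [(0, len(a) - 1)]
    max_depth = 1
    while stack:
        lo, hi = stack.pop()
        while lo < hi:
            pivot = a[hi]                  # simple last-element pivot
            i = lo
            for j in range(lo, hi):
                if a[j] < pivot:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[i], a[hi] = a[hi], a[i]      # pivot now in final position i
            if i - lo < hi - i:            # continue with the smaller part,
                stack.append((i + 1, hi))  # push the larger for later
                hi = i - 1
            else:
                stack.append((lo, i - 1))
                lo = i + 1
            max_depth = max(max_depth, len(stack) + 1)
    return max_depth

import random
random.seed(2)
a = random.sample(range(10000), 10000)
depth = qsort_iter(a)
print(depth)  # stays near lg n, far below the linear growth of the bad order
```

Each pushed entry comes from a range at most half the size of the previous pushed entry, which is where the lg n + 1 bound comes from.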

Another general feature of divide-and-conquer algorithms is that the method for a large data set may not be efficient for a small one. In the case of quicksort, a simpler sort makes sense for small data sets. In general it could be any of a variety of simpler sorts, applied to each piece that is small enough. Switching to that algorithm takes some time, though. Knuth noted that the time spent in insertion sort is bounded in terms of the maximum distance any element needs to move. After partitioning down to a small size, no element needs to move far, so the insertion sort can be deferred until the end and done in one sweep through the entire array.
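A sketch of the deferred insertion-sort idea (my own Python sketch, not the Java handout code): partition only down to a cutoff size, then finish with one insertion-sort sweep over the whole array.

```python
CUTOFF = 10  # tuning parameter; the best value is found experimentally

def quicksort_partial(a, lo, hi):
    """Quicksort, but leave ranges of size <= CUTOFF unsorted."""
    while hi - lo + 1 > CUTOFF:
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:                  # Hoare-style partition
            while a[i] < pivot: i += 1
            while a[j] > pivot: j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1; j -= 1
        quicksort_partial(a, lo, j)    # recurse on one part,
        lo = i                         # loop on the other

def insertion_sort(a):
    """One final sweep; no element is more than CUTOFF - 1 slots away."""
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def hybrid_sort(a):
    quicksort_partial(a, 0, len(a) - 1)
    insertion_sort(a)

import random
random.seed(3)
a = random.sample(range(1000), 1000)
hybrid_sort(a)
print(a == sorted(a))  # True
```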

See my Java code, where you are just comparing integers. It includes the elaboration of choosing the pivot by sorting three elements initially, is non-recursive, limits the stack size, and has a parameter for switching to insertion sort. I also include a testing class, where you can interactively choose the best value for the parameter (the size of sequences left for the final insertion sort), plus see the behavior on various sorts of lists. The links in the text here are to .txt files that are more likely to display well in a browser. There are also copies ending in .java in the handouts.

Another issue for homework: The standard partition stops at pivot values in both the upward and downward sweeps, so it ends up swapping pivot values unnecessarily. Why not just swap the pivot values one way?

You should be familiar with binary search. The worst case number of 3-way comparisons is about lg n (⌊lg n⌋ + 1, exactly).

Aside: why is a 3-way comparison particularly reasonable? Know Java compareTo or C# CompareTo?
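Python has no three-way compareTo, but the idea can be sketched with an if/elif chain standing in for one 3-way comparison per step:

```python
def binary_search(a, x):
    """Return an index i with a[i] == x, or -1 if x is absent.
    Each loop iteration makes one conceptual 3-way comparison
    of x against the middle element."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if x < a[mid]:          # one 3-way comparison:
            hi = mid - 1        #   less, equal, or greater
        elif x > a[mid]:
            lo = mid + 1
        else:
            return mid
    return -1

a = [2, 3, 5, 7, 11, 13]
print(binary_search(a, 7), binary_search(a, 6))  # 3 -1
```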

Now let's look at the average time for binary search, assuming the
element is found, and all are equally likely. To keep things easy,
assume n = 2^{k} – 1.

1st comparison: done immediately if looking for the middle element of the
2^{k} – 1:

1 element found in 1 comparison

next problems of size 2^{k-1} – 1:

2 elements found
in 2 comparisons

next problems of size 2^{k-2} – 1:

4 elements found
in 3 comparisons

…

finally, when the sizes are just 1:

2^{k-1} elements found in k
comparisons

Adding up and dividing by the total number of possibilities:

avg = [1 + 2*2 + 3*2^{2} + 4*2^{3} + … + k*2^{k-1}
]/( 2^{k} – 1)

The sum is almost the one in the back of the book, #6. It matches the numerator perfectly if you multiply and divide by 2; the result is

avg = [(k-1)2^{k} +1]/( 2^{k} – 1)

adjusting the first part to include 2^{k} – 1, to make
dividing through by 2^{k} – 1 easier:

avg = [(k-1)(2^{k} – 1) +k]/( 2^{k} – 1) =
k-1 + k/( 2^{k} – 1)

The maximum is k, so the average is within 1 of the maximum. For general n, it is within 2.
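The formula can be checked numerically by counting comparisons for every element (a sketch):

```python
def comparisons_to_find(a, x):
    """Number of 3-way comparisons binary search uses to find x in a."""
    lo, hi, count = 0, len(a) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1                  # one 3-way comparison
        if x < a[mid]:
            hi = mid - 1
        elif x > a[mid]:
            lo = mid + 1
        else:
            return count
    return count

k = 5
n = 2 ** k - 1                      # 31 elements, a perfectly balanced case
a = list(range(n))
avg = sum(comparisons_to_find(a, x) for x in a) / n
print(avg, k - 1 + k / (2 ** k - 1))  # both are 129/31 = 4.161...
```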

The book lists this technique under "divide by a constant factor", which is just divide and conquer when only one smaller problem needs to be considered:

def power(a, n):
    '''Assume integer n >= 0 for simplicity'''
    if n == 0:
        return 1
    b = power(a, n//2)
    if n % 2 == 0:
        return b*b
    return b*b*a

The recursion goes to depth ⌊lg n⌋. This is useful in cryptography and various other places.

Large integers: The base 2 version in the beginning of DasGupta, chapter 2, is directly applicable to computers, with their shift operations. The base 10 version in Levitin is still completely analogous.

Both the integer algorithm (first page in DG ch. 2) and the matrix algorithm (12th page) are useful, and are good examples of applying the Master Theorem. They are not such good models of creative thinking for this class – we are not likely to see their match anywhere! I have no idea how they were dreamed up.

**Class problem** (DasGupta): Given a sorted array of distinct integers A[1, . . . , n], you want to find out whether there is an index i for which A[i] = i. Give a divide-and-conquer algorithm that runs in time O(log n).

These are useful algorithms if you are dealing with geometric
situations. Be aware of them.

**Class problem** (DasGupta): An array A[1 . . . n] is said to have a majority element if more than half of its entries are the same. Given an array, the task is to design an efficient algorithm to tell whether the array has a majority element, and, if so, to find that element. The elements of the array are not necessarily from some ordered domain like the integers, and so there can be no comparisons of the form "is A[i] > A[j]?". However, you can answer questions of the form "is A[i] = A[j]?" in constant time.

(a) Show how to solve this problem in O(n log n) time. (Hint: Split the array A into two arrays A1 and A2 of about half the size. Does knowing the majority elements of A1 and A2 help you figure out the majority element of A? If so, you can use a divide-and-conquer approach.)

(b) Can you give a linear-time algorithm with o(n) extra data storage? (Hint: this is more a reduce-by-a-constant-factor algorithm, Ch. 4.4.)

Pair up the elements of A arbitrarily, to get ⌊n/2⌋ pairs, plus possibly one odd element left over.

Look at each pair: if the two elements are different, discard both of them; if they are the same, keep just one of them

Show that after this procedure there are at most ⌊n/2⌋ elements left from the pairs, plus possibly the odd element, and that solving the majority problem for what is left helps solve the problem for A.
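One way the pairing reduction can be coded, if you want to check your solution (a sketch using equality tests only; a final counting pass verifies the candidate):

```python
def majority(a):
    """Return the majority element of list a, or None if there is none.
    Uses only equality tests: discard differing pairs, keep one copy of
    each equal pair, and handle a possible odd element by a direct count."""
    def candidate(xs):
        while len(xs) > 1:
            if len(xs) % 2 == 1:            # odd element: test it directly
                odd = xs[-1]
                if xs.count(odd) * 2 > len(xs):
                    return odd
                xs = xs[:-1]
            kept = []
            for i in range(0, len(xs), 2):
                if xs[i] == xs[i + 1]:      # equal pair: keep one copy
                    kept.append(xs[i])
            xs = kept                       # differing pairs are discarded
        return xs[0] if xs else None
    c = candidate(list(a))
    if c is not None and a.count(c) * 2 > len(a):
        return c                            # verified majority
    return None

print(majority([3, 1, 3, 2, 3, 3, 1]))  # 3
print(majority([1, 2, 3]))              # None
```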