Back to Intro
Consider the following algorithm for finding the distance between the two
closest elements in an array of real numbers or integers.
Algorithm MinDistance(A[0..n − 1])
//Input: Array A[0..n − 1] of numbers
//Output: Minimum distance between two of its elements
dmin ← ∞
for i ← 0 to n − 1 do
for j ← 0 to n − 1 do
if i != j and |A[i] − A[j]| < dmin
dmin ← |A[i] − A[j]|
return dmin
How many times does the if statement execute? What is the order of the time for the algorithm in terms of n?
How could you tweak the algorithm and make it more efficient? How many if statement executions now? Same or different order?
If you started over, allowing a more complicated algorithm, what better order could you have? Describe briefly.
For average analysis, we need some probability:
Let
U be a finite set of all possible elementary outcomes, with a
probability function P defined on U,
where 0 <= P(e) <= 1
for each e in U, and
where ∑_{ e in U} (P(e)) = 1
NOTE: I like HTML - it is much more compact than pdf or MS-Word docs, and it is portable and easy to display and print. Fancy math is not easy to display conventionally however in HTML. In the past I have used images for fancy math, but this is cumbersome. This semester I would like to try a different approach: using substitute notation wherever feasible that is hopefully readable and easily displayable. ∑_{ e in U} (P(e)) is the first example. I put the range of the sum in a subscript, and the general term of the sum in parentheses, so this is the sum for every e in U of P(e).
Intuitively, P(e) is the proportion of the time that the random outcome e occurs, and the sum for all events is 1, since all the time something must occur.
For rolling a die, elementary outcomes are generally considered to be number on up face 1, 2, ...6.
The simplest case is when all elementary outcomes are equally likely (equal probability). Suppose there are n elements in U; P(e) = p for all e in U. Then np = 1 and the common probability p = 1/n.
This is 1/6 for each elementary outcome for a die.
In general an event is a subset of U. The even event
for rolling a die is E = {2, 4, 6}. I will generally use e for
an event which must be elementary, and E for any event, which may or
may not be elementary.
The probability of E = P(E) = the sum
of the probabilites of each elementary event in E = ∑_{ e
in E }(P(e)) For the even event P(E) = P(2) +
P(4) + P(6) = 1/6 + 1/6 + 1/6 = 3/6
Note: If I write P(E) using the definition of E, I would have P({2, 4, 6}). This notation is simplified often to P(2, 4, 6), omitting the braces {}.
P(E) = (size of set E)/(size of U) -- general rule when
all elementary events are equally likely.
A random
variable f is a real function defined on U. The expected value
of f is E(f) = ∑_{e in U}(f(e)*P(e)).
This is the long run average value of f.
For example, if
random variable f on die throws is the square of the outcome,
E(f) = f(1)*P(1) + f(2)*P(2) + f(3)*P(3) + f(4)*P(4) + f(5)*P(5)
+ f(6)*P(6)
= 1*1/6 + 4*1/6
+ 9*1/6 + 16*1/6 + 25*1/6 + 36*1/6 = 91/6
On to the average number of comparisons in linear search.
Outcomes are U = {E_{f} , E_{0} , E_{1}
, E_{2} , E_{3} , ... E_{n-1} } where E_{f}
(f for fail) means the target is not found, and E_{i} means
the target is at index i (i = 0, 1, 2, ...n-1). Assume the
probability of success in the search is s = P(E_{0} , E_{1}
, E_{2} , E_{3} , ... E_{n-1} ), and each
location for the target is equally likely, so
P(E_{i}) =
s/n i = 0, 1, ... n-1
P(E_{f}) = 1- s.
The
number of comparisons C, is a random variable based on the outcome.
The average number of comparisons is
E(C) = P(E_{f})*C(E_{f})
+ ∑_{i = 0 to n-1} (P(E_{i})*C(E_{i})) .
= (1-s)n + ∑_{ i = 0 to n-1} ((s/n)*(i+1)) =
(1-s)(n) + (s/n)[n(n+1)/2]
= (1-s)n - s(n+1)/2
If s= 1
(sure to find it), E(C) = (n+1)/2 (expect about in the
middle!).
The text takes an important approach to counting for order estimates: counting one type of operation. Using this as the only model is strange. Counting just one kind of operation, like comparisons of two pieces of input data in a sort, is relatively easy, gives an explicit count, and is particularly useful for discussing classes of algorithms, where you do not have the details of individual ones, and it makes sense if one particular operation related to your data is particularly expensive.
For analyzing the asymptotic order for the time of a particular algorithm, the counting approach is pretty limiting, however. If a block of code has no loops, function calls, or array initializations then the whole block of code is O(1). Higher orders come from loops and recursion. If the body of a loop takes time that is O(f(n)) and the loop operates O(g(n)) times, then the total time is O(f(n)*g(n)).... We will get to counting time with recursion shortly.
Note that the O(f(n)*g(n)) in the loop description is is an upper bound! If some of operations of the inner loop have a lower order, you must be careful if you want a better bound.
Example 1:
n + n-1 + n-2 + … + 3 + 2 + 1
contains n terms, each at most n, and even though it contains a 1,
it is O(n^{2}) and not O(n) , as we have seen from explicit
calculation: The sum is n(n+1)/2. A quick way to see a lower bound for the order without an
explicit summation formula is that the first n/2 terms are all at least of size
n/2, so the total is at least n^{2} / 4, which is in O( n^{2}). Here the big oh symbol is not very communicative. (see next section)
Example 2:
n + n/2^{1} + n/2^{2} + n/2^{3} + … + n/2^{n-2} + n/2^{n-1}
contains n terms, each at most n, so it is trivially O(n^{2})
but it is also O(n): The sum is a geometric series with total n(2 -
1/2^{n}).
Read 2.2. Its discussion of Big Oh notation should be mostly review. We will have little need to apply the formal definition with n_{0} and c. See slides ch02n.ppt 11-17.
The book goes on to other classifications of function growth that we will use and will discuss in class, and that you may not have seen before:
f ∈ Ω(g) iff g ∈ O(f); (order of f >= g means the same as order of g <= f)
g ∈ Θ(f) iff g ∈ O(f) AND f ∈ O(g). Order of f = g means order of f <= g and order of g <= f)
That implies g ∈ Θ(f) iff f
∈ Θ(g). (Informally, equal order is reflexive.)
With n integral and nonnegative, the n > N part of the formal definition is unnecessary for the definition of O(n) and the other variants.
The idea of "lower order" is often useful, and the intuitive idea for a definition, analogous to numerical inequalities, that the order of f is lower than g iff not(f = Ω(g)) is NOT useful. A more careful definition: g is of lower asymptotic order than f, written
g ∈ o(f) (little oh of f) if
for any c > 0 there is an N
> 0, such that
g(n) <= cf(n) for all n > N
Note the difference from big Oh: rather than there being some one c that works as for O(f), for o(f) each c > 0 must work for large enough n. As n grows, g must get arbitrarily small relative to f.
Most simple closed form functions can be checked for asymptotic order relationships using limits from calculus, often with the help of L'Hopital's Rule, but this is really NOT necessary for the limited situations considered in class. It is sufficient to build with the following rules:
Assume 0 < a < b; 1 < c < d. Functions in asymptotic order from highest to lowest: n!, d^{n}, c^{n}, n^{b}, n^{a}, log n, log log n, log log log n, 1
Multiplication by a positive constant (as outermost operation) can be ignored.
A finite number of terms with lower order are
negligible when added to a higher order. A help for recognizing
product terms of lower order:
If f ∈ o(g) and F ∈ O(G)
then f*F ∈ o(g*G)
With these further useful results for the common operations on positive functions mixing with order classes, you can handle most common expression forms. The variants are also true with big O everywhere replaced by Ω or everywhere replaced by Θ:
One simple consequence is that if p(n) is any polynomial of degree k with positive leading term, then p(n) ∈ Θ(n^{k}).
Using the various simplification rules together,
Explanation: The order of log(n+5) is the same as log n (by 4d.), which is lower than the order of n (by 1.), so 1000n^{2}log (n + 5) has lower order than n^{3} (by 4b. and 2.). While 7n^{3} ∈ Θ(n^{3}) (by 2.), all the other terms are of lower order and can be ignored (by 3), so the result is in Θ(n^{3}).
Read Appendix A carefully. From here on formulas from Appendix A can apply. Most are identities that you should be very familiar with. Some are useful in this course and not so common. Pay extra attention to them.
For divide and conquer techniques, logs
come in strongly. Do not get stumped because of these basic
relationships! Keeping unfamiliar formulas in front of you may help.
The approximation symbols in the
summation formulas are sloppy/inconsistent:
All the summation approximations come
from the definite integral bounds, but you can use the result. You do not need to do the derivation via calculus.
We will often combine quantities that must be integers with fractions and logs, so the floor and ceiling formulas will appear fairly often.
When speaking generally, I will mean algorithms where you reduce to
smaller case(s) of the same algorithm. The final implemetation may
explicitly be recursive, or it may be implemented with a simple loop,
but the idea is recursive reuse of the basic problem.
There are lots of ways to reduce a problem to smaller versions.
Here I take only the most general overview. All involve these steps,
given the problem for a general size n:
In some algorithms either part 1 or 3 is totally trivial.
We will consider the more complicated situations later. Meanwhile,
the simplest way to get a smaller problem is to replace n by n-1.
I will emphasize the exact algebraic calculations less than the book except in the case corresponding to a recursive algorithm that calls itself once for a one smaller value,
For instance counting multiplications in factorial n! = n*(n-1)!,
0! = 1:
Let C(n) be the number of multiplications in calculating n!
C(0) = 0
More generally find a non recursive expression for
T(n) = T(n-1) + f(n) for n > a
T(a) = f(a) (known)
In the simple example for factorial
C(n) = 1 + C(n-1)
C(0) = 0
C(1) = C(0) + 1 = 0 + 1
C(2)
= C(1) + 1 = 1 + 1 = 2
C(3) = C(2) + 1 = 2 + 1 = 3
...
C(n)
= n
In the generalization it is more convenient to work backwards. Given
T(a) = f(a)
(some basis)
T(n) = T(n-1) + f(n) for n> a
= f(n) +
f(n-1) + T(n-2)
= f(n) + f(n-1) + f(n-2) + T(n-3) and so on so
= f(n) +
f(n-1) + ... + f(a+1) + f(a)
= ∑ _{i=a to n }
f(i)
In our example f(n) = 1, a = 1, and the sum is n*1
= n
In class example:
a. Find a non-recursive
formula for T(n) if
T(n) = T(n-1) + n^{2}, for n > 1
T(1) = 1
b. What is the order? What is the exact value (no summation symbol or "...")
Another example, with inequalities: The book does exact answer for number of moves in Tower or Hanoi, with recursion
C(1) = 1
C(n) = 2C(n-1) + 1
Simpler question: Show C(n) ∈ Ω( 2^{n})
C(n) > 2*C(n-1)
By induction
C(n) > 2^{k}C(n-k) so C(n) > 2^{n-1}C(1) = 2^{n}C(1)/2 = 2^{n }*constant
so C(n) ∈ Ω( 2^{n}).
Algorithm to reverse the order of an array A[0...n-1].
for i = 0, 1, … (n-1)/2
swap A[i] and A[n – i – 1]
clearly O(n) Is this the best order? Why?
Is the algorithm I gave for finding the minimum distance between two array elements the best order possible?
More in Ch 11
More recursion in 5.3 with trees - an obvious place given the recursive definition of binary trees:
Other recursion finding combinatorial objects: permutations, combinations, subsets. If you want more practice, see recursion.html (taken from an old 271 course of mine).
Then to Ch 3 and basic searches and brute force techniques