Algorithms Outline for Final Exam

4 sides of notes allowed.

Exam emphases

Individual topics that are new since the last quiz will be more emphasized than the topics you have been examined on before, probably meaning 20-25% of the exam will be on new topics.
The best characterization of the course is the course itself, but I have tried to give you homework on all topics, so reviewing homework is a good review. Looking at old exams or quizzes is a quick but not complete way to review the older material: You should have covered much more in all your work than there was space for in the midterm and quizzes. Obviously if you missed something on an exam, it would be good to make sure you know it now, but exams involve a number of arbitrary choices and omissions, and different choices are likely to be made on the final. Major topics are likely to reappear, but often be treated from a somewhat different angle than last time, or combined in different ways. A mostly different collection of secondary topics is likely to be on the final.
I repeat: the best review of what you need to be able to do is to go over homework. If you need further exercises on any subject, let me know.

Topics from previous topic guides plus:

Seeing the data structure and algorithm underneath a problem description.
Dynamic Programming:
- Seeing recursive relationships and concise solutions
- Dynamic solution by memoizing a recursive solution
- Dynamic solution with reverse topological ordering
- Fancier solutions reconstructing the details of the optimal solution
- Calculating time and space requirements; minimizing their orders.
Unlike HW, where I often asked you to do all parts, I'm likely to ask you to just get a recursive solution or be given one version of a solution and modify it into another version.
P-NP, Decision problems
- Definition of a decision problem; relationship to optimization problems
- Understand the famous problems associated with assorted standard names: CNF and 3-CNF satisfiability, Hamiltonian Cycle, Hamiltonian Path, Graph coloring, Knapsack, Subset Sum, Travelling Salesperson.
- Follow other problems in 13.3.2 if their statements are repeated.
- The class P
- Measuring input size
- Polynomial time solutions
- The class NP: checking a correct solution in polynomial time
- Polynomial reducibility
- Reduction of a problem to a problem in P shows it is in P
- NP hardness, NP completeness
- Reduction of an NP complete problem to Q shows Q is NP hard
- Do very simple reductions
- Does P = NP? (just kidding)

Problems

Unlike the party problem, assume “knowing is not symmetric for this celebrity problem: Given n people, a celebrity is one who knows nobody else but everybody else knows the celebrity. To determine if there is a celebrity in a group of n people, you can ask questions only of the form, “Does A know B?” Suppose I say I can come up with an algorithm involving O(n) such questions to determine if and who is a celebrity within the group. Is it possible to do better than O(n)?
For a set of symbols x₁, x₂, x₃, ... x_n, you are given a combination of m constraints: some equality constraints of the form x_i= x_j, and some inequality constraints, x_i≠ x_j. Clearly explain and give pseudo-code for an O(n+m) algorithm to satisfy all of them with integer values, or state it is impossible. For instance, the constraints: x₁= x₂, x₂= x₃, x₃= x₄, x₁≠ x₄, cannot be satisfied. The constraints x₁= x₂, x₁= x₃, x₁≠ x₄, can be satisfied with 1, 1, 1, 2. Explain why your algorithm is O(n + m). Note: You are given only the symbols and relations. You are NOT given the values for the symbols.

Hint: You cannot treat the = and ≠ relations in a similar way. Where is the better place to start?

Answers

below

...

Solutions

If you ask fewer than n/2 questions, you get no information about at least one member. That person may or may not know any individual and may or may not be know by any individual, allowing for or disallowing anyone as celebrity. Hence there must be at least O(n) questions. You cannot have a better order than the one proposed.

You can make a finer argument for more than n/2, but the short argument above is sufficient to get O(n).
N distinct objects with relations in pairs - sure looks like a something to state as a graph. How to deal with equalities vs inequalities? The situation is not symmetric: Equalities give an equivalence relation; inequalities do not, so try just using the equalities initially. Because of the equivalence relation with equality, any two vertices connected by a path must be equal, so there cannot be an inequality between them. How to check all this efficiently? Run the component numbering variation of the DFS once. Let all variables then get as value the number of their component (cc[i] below). This is the answer if all the inequalities are satisfied. Otherwise there is no solution. Details in pseudo-code:
```
Create empty lists neighbors[i] i = 1...n           #O(n)
for each equality constraint xi= xj:                #O(m), yielding no more than m edges
    Add i to neighbors[j] and j to neighbors[i]
Run the component numbering algorithm, generating an array cc,
                where cc[i] is the component number for xi.   #O(n+m)
for each inequality constraints, xi≠ xj:                      #O(m)
    if cc[i] == cc[j]:
         return False
return True   #The values in cc are a solution                #total time still O(n+m)
```