Algorithms Outline for Final Exam

4 sides of notes allowed + APPENDIX A.

Exam emphases

Individual topics that are new since the last quiz will be emphasized more than topics you have already been examined on, probably meaning 30-35% of the exam will be on new topics.
Problems from later in the semester generally include skills needed from early in the semester implicitly, so most questions will not be straight from the early part of the course, though there may be some topics from earlier in the semester that did not get used much in the later part of the course.
The best characterization of the course is the course itself, but I have tried to give you homework on all topics, so reviewing homework is a good review. Looking at old exams or sample exams is a quick but not complete way to review the older material: You should have covered much more in all your work than there was space for in the midterm and quizzes. Obviously if you missed something on an exam, it would be good to make sure you know it now, but exams involve a number of arbitrary choices and omissions, and different choices are likely to be made on the final. Major topics are likely to reappear, but often be treated from a somewhat different angle than last time, or combined in different ways. A mostly different collection of secondary topics is likely to be on the final.
I repeat: the best review of what you need to be able to do is to go over homework. If you need further exercises on any subject, let me know.

Topics from previous topic guides plus:

Intree data structure
Seeing the graph and algorithm underneath a problem description.
Floyd, Warshall, and Bellman-Ford algorithms and when to use them
Dynamic Programming:
- Seeing recursive relationships and concise solutions
- Dynamic solution with memoizing
- Dynamic solution with reverse topological ordering
- Fancier solutions reconstructing the details of the optimal solution
- Calculating time and space requirements; minimizing their orders.
Unlike HW, where I often asked you to do all parts, I'm likely to ask you to just get a recursive solution or be given one version of a solution and modify it into another version.
P-NP
Decision problems
- Understand the famous problems associated with assorted standard names: CNF and 3-CNF satisfiability, Hamiltonian Cycle, Hamiltonian Path, Graph coloring, Knapsack, Subset Sum, Travelling Salesperson.
- Follow other problems in 11.3 if their statements are repeated.
- The class P
- Measuring input
- Polynomial time solutions
- The class NP: checking a correct solution in polynomial time
- Polynomial reducibility
- Reduction of a problem to a problem in P shows it is in P
- NP hardness, NP completeness
- Reduction of an NP complete problem to Q shows Q is NP hard
- Do very simple reductions
- Does P = NP? (just kidding)
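
To illustrate "checking a correct solution in polynomial time" from the NP item above, here is a minimal Python sketch of a checker for Hamiltonian Cycle. The function name `is_hamiltonian_cycle`, the adjacency-set representation, and the choice of certificate (a proposed ordering of the vertices) are assumptions made for illustration, not taken from the course materials. The check does a constant amount of work per vertex, so it is polynomial in the size of the graph, even though no polynomial algorithm is known for finding such an ordering.

```python
from typing import Dict, List, Set


def is_hamiltonian_cycle(graph: Dict[int, Set[int]], order: List[int]) -> bool:
    """Check a proposed certificate for Hamiltonian Cycle.

    graph: undirected simple graph as vertex -> set of neighbors (assumed format).
    order: proposed cycle, listing every vertex exactly once.
    """
    n = len(graph)
    # A cycle in a simple undirected graph needs at least 3 vertices.
    if n < 3:
        return False
    # The certificate must list each vertex exactly once.
    if len(order) != n or set(order) != set(graph):
        return False
    # Each consecutive pair, wrapping around at the end, must be an edge.
    return all(order[(k + 1) % n] in graph[order[k]] for k in range(n))
```

For example, on the 4-cycle `{0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}`, the ordering `[0, 1, 2, 3]` is accepted and `[0, 2, 1, 3]` is rejected because 0 and 2 are not adjacent.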

Problems

Unlike the party problem, assume “knowing” is not symmetric for this celebrity problem: Given n people, a celebrity is one who knows nobody else but whom everybody else knows. To determine if there is a celebrity in a group of n people, you can ask questions only of the form, “Does A know B?” Suppose I say I can come up with an algorithm involving O(n) such questions to determine whether there is a celebrity within the group, and if so who it is. Is it possible to do better than that?

Find an O(n) solution.
For a set of symbols x₁, x₂, x₃, ... x_n, you are given some equality constraints of the form x_i= x_j, and some inequality constraints, x_i≠ x_j. Is it possible to satisfy all of them with integer values? For instance, the constraints: x₁= x₂, x₂= x₃, x₃= x₄, x₁≠ x₄, cannot be satisfied. Suppose the number of constraints is m. Clearly explain and give pseudo-code for an O(n + m) algorithm to determine if such a set of constraints can be satisfied. Explain why your algorithm is O(n + m). Note: You are given only the symbols and relations. You are NOT given the values for the symbols. The question is about the ability to find consistent values for the symbols.

Hint: You cannot treat the = and ≠ relations in a similar way. Where is the better place to start?

Answers

below

...

Solutions

Each question involves only two people, so if you ask fewer than n/2 questions, at least one member is never mentioned and you get no information about that person. That person may or may not know any individual and may or may not be known by any individual, which allows or rules out anyone as the celebrity. Hence at least n/2 questions are needed, which is Ω(n). You cannot have a better order than the one proposed.

You can make a finer argument for a bound larger than n/2, but the short argument above is sufficient to show that the number of questions must grow at least linearly in n.

An O(n) solution starts with a greedy elimination in which each question eliminates one person: If A knows B, A is eliminated as a celebrity. If A does not know B, B is eliminated as a celebrity. If you keep asking questions, and never involve someone already determined not to be the celebrity, then it takes n-1 questions to eliminate all but one candidate. Then if that person knows nobody (max n-1 more questions) and everyone knows that person (max n-1 more questions), the person is a celebrity; otherwise there is no celebrity. This determination takes no more than 3(n-1) questions, definitely O(n).
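
The elimination and verification steps above can be sketched in Python roughly as follows. This is one possible rendering, not the only one: the function name `find_celebrity`, the 0-based numbering of people, and the `knows(a, b)` callback standing in for the question "Does a know b?" are all assumptions for illustration.

```python
from typing import Callable, Optional


def find_celebrity(n: int, knows: Callable[[int, int], bool]) -> Optional[int]:
    """Return the celebrity among people 0..n-1, or None if there is none."""
    if n == 0:
        return None
    # Elimination phase: n-1 questions, each ruling out one person.
    candidate = 0
    for other in range(1, n):
        if knows(candidate, other):
            # The candidate knows someone, so the candidate is eliminated.
            candidate = other
        # Otherwise `other` is not known by the candidate and is eliminated.
    # Verification phase: at most 2(n-1) more questions.
    for other in range(n):
        if other == candidate:
            continue
        if knows(candidate, other) or not knows(other, candidate):
            return None
    return candidate
```

The single surviving candidate is only a candidate: the verification loop is still needed because the elimination phase never confirms that everyone knows the survivor.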
n distinct objects with relations in pairs - sure looks like something to state as a graph. How to deal with equalities vs inequalities? The situation is not symmetric: Equalities give an equivalence relation; inequalities do not, so try just using the equalities initially. Make each symbol a vertex and each equality constraint an edge. Because equality is an equivalence relation, any two vertices connected by a path must be equal, so there cannot be an inequality between them. How to check all this efficiently? Run the component numbering variation of the DFS once. All variables in the same component must have the same value. Then check each inequality to make sure that the variables in the pair are not in the same component. Details in pseudo-code:
```
Create empty lists neighbors[i], i = 1...n           #O(n)
for each equality constraint xi = xj:                #O(m), yielding no more than m edges
    Add i to neighbors[j] and j to neighbors[i]
Run the component numbering algorithm, generating an array cc,
    where cc[i] is the component number for xi       #O(n+m)
for each inequality constraint xi ≠ xj:              #O(m)
    if cc[i] == cc[j]:
        return False
return True   #The values in cc are a solution       #total time still O(n+m)
```
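
For anyone who wants to run it, the pseudo-code above might translate into Python along these lines. This is a sketch under assumptions: the function name `satisfiable`, the representation of constraints as lists of index pairs, and the use of an explicit stack instead of recursive DFS are choices made here, not part of the original solution.

```python
from typing import List, Tuple


def satisfiable(n: int,
                equalities: List[Tuple[int, int]],
                inequalities: List[Tuple[int, int]]) -> bool:
    """Decide whether constraints on symbols x1..xn can all be satisfied."""
    # Build adjacency lists from the equality constraints: O(n + m).
    neighbors: List[List[int]] = [[] for _ in range(n + 1)]
    for i, j in equalities:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Component numbering with an iterative DFS: O(n + m).
    cc = [0] * (n + 1)  # cc[i] == 0 means xi has not been visited yet
    component = 0
    for start in range(1, n + 1):
        if cc[start]:
            continue
        component += 1
        cc[start] = component
        stack = [start]
        while stack:
            v = stack.pop()
            for w in neighbors[v]:
                if not cc[w]:
                    cc[w] = component
                    stack.append(w)

    # An inequality inside a single component cannot be satisfied: O(m).
    return all(cc[i] != cc[j] for i, j in inequalities)
```

On the example from the problem statement, `satisfiable(4, [(1, 2), (2, 3), (3, 4)], [(1, 4)])` returns `False`, since all four symbols land in one component.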