Theory of Algorithms
CSCI 400, Fall semester, 2016
To
the Bottom of the Page
Instructor: Dr. Shieu-Hong Lin
Email:
Class: T Th: 10:30-11:45 Location: LIB 141
Office Hours:
M~Th
8:30am~10:30am, MW 11:30~1:00pm Math & CS department at Grove 8
Contact Dr. Lin to set up an appointment
in advance
TAs: Alvin Suh, William Tan TA
hours: TBA
Textbook:
- J.
Kleinberg and E. Tardos. Algorithm Design,
Addison Wesley, 2005.
Course Syllabus
*************************************************************************************************
Late policy:
·
The
submission link remains open for 2 more days after the due date as a grace
period, but 1 point will be deducted
for late submission after the due date while the submission link is still open.
·
You will receive no points after the submission link on
canvas is closed unless it is something like
a serious health issue with statements from the doctor as proof.
*************************************************************************************************
Week 1: Time Complexity
of Algorithms
Reading #1 due: Thursday, Sept. 1: (i) Sections 2.1~2.2, and 2.4 of Algorithms Design.
(ii) Read and think about this similarity-search problem.
Homework #1A due: Thursday,
Sept. 1: Counting 2n numbers
using a binary counter of n bits.
- Efficiency of polynomial time
algorithms: Download and unzip this C++ project
to feel the amount of time taken by O(n), O(n log n), O(n2),
and O (n5) algorithms.
Try n=1,000,000, n=10,000,000, n=100,000,000, and n=1,000,000,000 if
possible and measure the amount of time taken respectively if possible.
Record your findings in the report in Step 3.
- Efficiency of super-polynomial time
algorithms: Add your own code to the project so that we can
simulate the amount of time taken by a O(n * 2n) algorithm.
To accomplish this, you should create a vector of n bits (or an array of n
integers) as a counter to represent a binary number of n bits. Initialize
the contents of the all elements to 0’s to represent the value 0 in the
binary system (base 2). Then use an outer loop to increase the value of
the binary number by 1 each time by checking and changing the contents of
the vector accordingly using an inner loop, until the point where the
contents of the vector are all 1’s to represent the value 2n-1.
Try n=20, n=30, n=40, and n=50 if possible, and measure the amount of time
taken respectively if possible. Record your findings in the report in Step
3.
- Submission:
(i)
Compress your entire Program 1A folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report, including your findings of computational time needed in the
experiments into the report and upload it through Biola
Canvas.
- Note: The
binary counter mechanism you implement in homework #1A will be needed to
deal with the exhaustive search for the mixture sequence problem for Homework #1B next.
*************************************************************************************************
Week 2: Introduction to
Greedy Algorithms
- Reading #2:
Due Thursday, Sept. 8
- Sections 4.1, 4.2 and 4.4 of Algorithms Design.
- Read and think about this truck refueling problem.
Homework #1B: due Thursday,
Sept. 8.
- Extend your implementation of the binary counter in
homework #1A to conduct exhaustive search for solving the mixture sequence
problem.
- Submission:
(i)
Compress your entire Program 1B folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report, including your findings of computational time needed in the
experiments into the report and upload it through Biola
Canvas.
*************************************************************************************************
Week 3: Classical Problems
Solved by Greedy Algorithms
- Reading #3:
Due Thursday, Sept. 15
1.
Carefully review Sections 4.1, 4.2 and 4.4
of Algorithms Design.
2.
Read Section 4.5 of Algorithms Design.
3. Continue to think about this truck refueling problem.
Homework #1C
(Branch and bound): Due Thursday, Sept. 15.
- Conduct branch and bound search for solving the
mixture sequence problem more efficiently.
- Submission:
(i)
Compress your entire Program 1C folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report, including your findings of computational time and answers
required by the homework into the report and upload it through Biola Canvas.
*************************************************************************************************
Week 4: Greedy Algorithms
and Proofs of Correctness
- Reading #4:
Due Thursday, Sept. 22
1.
Carefully review the proofs of correctness
for the algorithms by exchange arguments in Sections 4.2~4.3, especially the
one in Section 4.2.
2.
Read Sections 4.5 ~4.6 of Algorithms
Design.
Homework #2A:
Due Thursday, Sept. 22.
- This is the first exploration in search of a possible
greedy algorithm.
- Submission:
(i)
Compress your entire Program 2A folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report, including all your answers and findings as required in
Homework #2A into the report and upload it through Biola
Canvas.
*************************************************************************************************
Week 5: Greedy Algorithms
and Proofs of Correctness
- Reading #5:
Due Thursday, Sept. 29
1. Read
Sections 4.7~4.8
2. Carefully
review the algorithms and problems in Sections 4.1~ 4.6 of Algorithms Design
and especially the proofs of correctness in Sections 4.1~4.3.
Homework #2B: due: Thursday,
Sept. 29
- There is actually a simple greedy criterion for
determining the optimal solutions for the scheduling problem in Homework
#2A. Write down the following
things in a WORD document: (i) the greedy criterion, (ii) the greedy algorithm,
and (iii) a proof to show
that the greedy criterion
always leads to optimal solutions. Save them in a WORD document and upload
it. For submission, upload the
file under Canvas.
- Hint: Note
that this problem is very similar to the scheduling problem in Section
4.2. The proof to show that the greedy
criterion always lead to an optimal solution for this problem is also very
similar to the proof for the scheduling problem in Section 4.2. You should
carefully study the scheduling problem and the proof in Section 4.2.
*************************************************************************************************
Week 6: Introduction to
Divide and Conquer.
- Reading #6:
due Thursday, Oct. 6
- Read Sections 5.1-5.3 on merge sort, recurrence,
and inversion counting.
- Read the description of Master Theorem on the first
two pages of this
document.
*************************************************************************************************
Week 7: More on Divide
and Conquer I.
- Reading #7: due Thursday, Oct. 13
- Read
Sections 5.4-5.5 on closest pairs and integer multiplication.
- Read
and review the description of Master Theorem on the first two pages of this
document.
*************************************************************************************************
Week 8: Test #1 | Torrey Conference
Test #1: Preparation for the test
- #1. Capability
of presenting proof of correctness (of greedy algorithms) using the
exchange argument: (i) Review the related
proof for the max-lateness scheduling problem in Section 4.2. (ii) Review
the related proof in Homework 2B regarding the sum of weighted finishing
time scheduling problem.
- #2.
Understanding of tractability and asymptotic notation and running time:
Sections 2.1, 2.2, 2.4
- #3. Understanding of (i)
Dijkstra’s algorithm for the shortest path problem in Section 4.4 and
(ii) the algorithms for the minimum spanning tree problem in Section 4.5.
Test #1 (open-book test): Tuesday Oct. 18.
- At
10:29am on Tuesday Oct.
18, log into Canvas => Assignment => Test 1 to see and work
on the problem set.
- What are allowed: It is an open-book
test. You can only refer to the textbook during the test.
- What are NOT allowed: You are not allowed to
use any note such as the class notes on proof for Homework #2B.
- Submission: Write down you
answers on blank paper as answer sheets.
*************************************************************************************************
Week 9: More on Divide
and Conquer II. Progress report
- Reading #9: due Thursday, Oct. 27
(Submission link open till Nov. 10)
- Review
Sections 5.1-5.5 on merge sort, recurrence, inversion counting, closest
pairs and integer multiplication.
- Read
and review the description of Master Theorem on the first two pages of this
document.
Homework #3A (Implementing integer
operations with unlimited precision, Part A): due: Thursday, Oct. 27
- #1: Implement a function bool add(vector<char> & n1,
vector<char> & n2, vector<char > & n3)
such that (i) the caller should
provide two integers (with unlimited number of digits and possibly with a
positive sign + or a negative sign –) represented by the two vectors n1
and n2 where each digit or sign associated with the numbers is represented
as a single character in the vectors, (ii) the function should check
whether n1 and n2 encode two valid integers in them, (iii) if not, the function
returns false, (iv) if yes, the
function should then simulate the addition
of these two integer to add these two numbers and store the result in n3
and then return true.
- #2: Implement a function bool subtract(vector<char> &
n1, vector<char> & n2, vector<char > & n3)
such that (i) the caller should
provide two integers (with unlimited number of digits and possibly with a
positive sign + or a negative sign –) represented by the two vectors n1
and n2 where each digit or sign associated with the numbers is represented
as a single character in the vectors, (ii) the function should check
whether n1 and n2 encode two valid integers in them, (iii) if not, the
function returns false, (iv) if
yes, the function should then simulate the subtraction operation (of subtracting the number in n2 from
the number in n1) and store the result in n3 and then return true.
- #3: In your main function, declare three vectors of characters
a, b, and c for encoding integers of unlimited precision. Then set up a
loop to repeated do the following: (i) set up a
loop for the user to enter the contents of vector a, (ii) set up another loop for the user to enter the contents
of vector b, (iii) call add(a,
b, c) to do the addition operation and store the result into
vector c, (iv) if add(a,
b, c) returns true, display
the contents of the vector c to
see the result of addition, (v) if add(a, b, c) returns false, display a message to
indicate either the contents of vector a or the contents of vector b do not
represent valid integers, (iii) call subtract (a, b, c) to do the
subtraction operation and store the result into vector c, (iv) if subtract (a, b, c) returns
true, display the contents of
the vector c to see the result
of subtraction, (v) if subtract (a, b, c) returns false, display a message to
indicate either the contents of vector a
or the contents of vector b does
not represent valid integers.
- Note on the algorithms: Use the algorithms you learned in the
elementary school to implement the addition operation and the subtraction
operation.
- Submission:
(i)
Compress your entire Program 3A folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report and upload it through Biola Canvas.
*************************************************************************************************
Week 10: Dynamic
Programming (I).
- Reading #10: due Thursday, , Nov.
3 (Submission link open till Nov.
10)
- Read
Sections 6.1~6.2 on dynamic programming.
Homework #3B (Implementing integer
operations with unlimited precision, Part B): due: Thursday, Nov. 3 (Submission link open
till Nov. 10)
- #1: Implement a function bool multiply(vector<char> &
n1, vector<char> & n2, vector<char > & n3)
such that (i) the caller should
provide two integers (with unlimited number of digits and possibly with a
positive sign + or a negative sign –) represented by the two vectors n1
and n2 where each digit or sign associated with the numbers is represented
as a single character in the vectors, (ii) the function should check
whether n1 and n2 encode two valid integers in them, (iii) if not, the
function returns false, (iv) if
yes, the function should then simulate the multiplication of these two integer to add these two numbers
and store the result in n3 and then return true.
- #2: In your main function, declare three vectors of characters
a, b, and c for encoding integers of unlimited precision. Then set up a
loop to repeated do the following: (i) set up a
loop for the user to enter the contents of vector a, (ii) set up another loop for the user to enter the contents
of vector b, (iii) call multiply(a,
b, c) to do the addition operation and store the result into
vector c, (iv) if multiply(a,
b, c) returns true, display
the contents of the vector c to
see the result of multiplication, and (v) if multiply (a, b, c) returns
false, display a message to
indicate either the contents of vector a or the contents of vector b does
not represent valid integers.
- Note on the algorithms: Use the algorithm you learned in the
elementary school to implement the multiplication operation.
- Submission:
(i)
Compress your entire Program 3B folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report and upload it through Biola Canvas.
Homework 4: Divide
and conquer: the good vs the bad (i.e. How you divide
and conquer does matter). Thursday,
Nov. 3 (Submission link open till Nov. 10)
- (i) Download and play with the binary executables
from two implementations (fast version, slow version)
of a Tree class (for binary search trees) from our CSCI 106 Data
Structures class. The only essential difference is in the implementation
of the method int Tree::BranchHeight(TreeNode * ptrNode) in tree.cpp: (fast
version, slow version). (ii) Examine the difference. When
you run the program and select the tree-height experiment option to
examine the average heights of randomly generated binary search trees,
the slow version is very slow even for handling random binary search
trees of 4000 nodes while the fast version has no problem in handing
random binary search trees hundreds of thousand nodes very fast.
- For
this homework, provide your explanations of how this could happen in two
situations: (i) when the tree is simply a
linked list and (ii) when the trees are full
or complete binary search trees (i.e. every node has two children,
except for the leaves at the bottom level). You should put down the recurrence
relations of the running time in those two situations and then try to
determine the running time based on the recurrence relation.
*************************************************************************************************
Week 11: Dynamic
Programming (II).
- Reading #11: due: Thursday,
Nov. 10.
- Read Sections 6.3~6.4 on dynamic programming
- Download and examine the sample executable that
solves the 0-1 Knapsack problem using dynamic programming with the
options of (i) recursion with memoization, (ii) recursion without memoization
(slow), and (iii) no-recursion with memoization.
*************************************************************************************************
Week 12: Dynamic
Programming (III).
- Reading #12: due: Thursday,
Nov. 17.
- Read Sections 6.5~6.6 on dynamic programming
- See the description here for finding the energy-minimization secondary
structures.
- Demo Program: Download this zip file for a sample executable and sample RNAs. Run the
enclosed executable to analyze the sample RNAs inside and see the 2D
layout of the energy-minimization secondary structures in the resulting
text files.
Homework #3C (Implementing integer
operations with unlimited precision, Part C): due: Thursday, Nov. 17
- Implement the function void multiply(vector<char> & n1,
vector<char> & n2, vector<char > & n3) just as
specified in Homework 3B but
this time implement it based on the
O(n1.59) divide-and-conquer algorithm in Section 5.5. Note that
instead of base 2 you are dealing with base 10 in the implementation as
discussed in the class. To make it simple for your
implementation of recursion, let’s assume the contents of the vectors do
represent valid integers and thus there is no need to check the validity
of the contents this time.
- Experiment 3C: (i) Write code in
your main function for Homework #3C to generate 5 pairs of random integers
in terms of vectors of 1000, 5000, 10000, 50000, and 100000 digits
respectively. For each pair, call your multiply function in
Homework #3C to multiply the pair of numbers, measure the amount of time for
multiplication, and record it in
the self-evaluation report. (ii) As a comparison, write code in your main
function for Homework #3B to generate 5 pairs of random integers in terms
of vectors of 1000, 5000, 10000, 50000, and 100000 digits respectively.
For each pair, call your multiply function in Homework
#3B to multiply the pair of numbers,
measure the amount of time for multiplication, and record it in the self-evaluation
report.
- Submission:
(i)
Compress your entire Program 3B folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report together with the results from Experiment 3C and upload it through Biola
Canvas.
*************************************************************************************************
Week
13: Computational Complexity: P, NP,
and NP-Complete.
Reading #13: Sections 8.1~8.4 on P, NP, and NP-Complete due: Thursday Dec.
1.
Homework #5. Due
date: Thursday, Dec. 1.
- Study the O(nw) dynamic-programming algorithm
described in Section 6.4 of the textbook for solving the 0-1 knapsack
problem. Implement the algorithm and then solve the two problem instances
of the 0-1 knapsack problem here using the new O(nw)
dynamic-programming algorithm. Your program should print out what the
selected items are to accomplish the maximal total value without violating
the knapsack capacity limit. Observe and record for each case (i) an optimal solution to the case and (ii) the amount
of time it takes.
- Verification: You can check whether your
implementation is working properly by comparing your results with the
solution to the following two cases: Case 1 and Case 2. The time efficiency of your program
should be compatible with the
sample executable that solves the 0-1 Knapsack problem using dynamic
programming with the options of (i) recursion plus memoization
or (iii) no-recursion plus memoization.
- Submission:
(i)
Compress your entire Program 3B folder into a zip file and upload it
through Biola Canvas. (ii) Carefully fill out this self-evaluation
report together with the results from Experiment 3C and upload it through Biola
Canvas.
*************************************************************************************************
Test #2 (in-class open-book
test): Tuesday Dec. 6.
- Divide and conquer: recurrence
relations, master theorem, the inversion couting problem, the
multiplication problem
- Dynamic programming: memoization and complexity analysis, the Knapsack problem, the RNA
secondary structures problem
- What are allowed: It is an open-book
test. But you can only refer to the textbook during the test.
- Submission: Write down you
answers on blank paper as answer sheets.
Practice Problems in preparation
for the final exam:
- The mixture
sequence problem: Implement a program to solve the mixture sequence
problem by dynamic programming, show the recursive problem reduction to subproblems, abd analyze the
complexity.
- P, NP, NP-Complete regarding the mixture sequence problem
and the knapsack problem.
- The decision version of the knapsack problem: Whether
there is solution with a total value (from items selected) greater than a
given amount.
- Show both the mixture sequence problem and (the
decision version of) the knapsack problem are in
P and in NP.
- Find and describe a polynomial-time reduction from the
mixture sequence problem to the decision version of the knapsack problem.
- Find and describe a polynomial-time reduction from to
the decision version of the knapsack problem to the mixture sequence
problem.
- We know the knapsack problem is NP-Complete. Prove
that the mixture sequence problem is NP-Complete.
- The RNA
secondary structure problem: Implement a program to solve the RNA
secondary structure problem by dynamic programming. Demo Program:
Download this zip
file
for a sample executable and sample RNAs. Run the enclosed executable to
analyze the sample RNAs inside and see the 2D layout of the
energy-minimization secondary structures in the resulting text files.
Final
Exam (Test #3): Take-home open-book test):
Monday Dec. 19.
- Points: Maximal points: 90 points. Getting 70 points will be counted as 100%.
a) Problem #1 20 points: 10 points for the polynomial-time
reduction in 1.(ii); 5 points each for 1.(i) and 1.(iii).
b) Problem #2 20 points: 5 points each for 2.(i) and 2.(ii) . 10
points for the implementation for 2.(iii).
c) Problem #3 20 points: 5 points each for 3.(i) and 3.(ii) . 10
points for the implementation for 3.(iii).
d) Problem #4 30 points: 5 points each for 4.(i) and 4.(ii) . 10
points for the proof in 2.(iii). 10 points for the implementation for 2.(iiv).
- Submission:
(i)
For the implementation described in 2.(iii) 3.(iii), 4.(ii) and 4.(iv) of
the problem set, for each program please fill out one
self-evaluation report separately and compress the program
folder into a zip file and upload it through Biola
Canvas. (ii) For the other
parts of the test, put down your answers in a WORD document and upload it
through Biola Canvas.
*************************************************************************************************
TA hours:
T Th
1:00~4:00pm (Alvin Suh, William Tan)
*************************************************************************************************
References for C++ STL algorithms and
more: (i)
the STL
documentation by SGI, (ii) the summary table of generic algorithms and
examine example programs in the section of
function objects from the book website of The C++ Standard Library - A
Tutorial and Reference by Josuttis, and
(iii) Appendix A.2 in C++ Primer for an overview of the generic algorithms.
*************************************************************************************************
To the Top
of the Page