Course Syllabus
Instructor: Dr. Shieu-Hong Lin
Class: MW 12:00-1:15 pm in Busn 210
Submission of all your work: under Biola Canvas
Your grades: see them under Biola Canvas
*************************************************************************************************
Week 1. Overview of the Landscape of Machine Learning
Reading 1: Report due Wednesday, Sept. 12
Showcase: Application of Hidden Markov Models (HMMs) as Bayesian
Networks
Lab #1 (Rock-Paper-Scissors): Report due Wednesday, Sept. 12
1. Collecting data: Download, unzip, and run rock-paper-scissor Agent #1 (or this alternative x64 executable) a few times. Each time, the program requires you to play 100 matches against the agent and writes a transcript file RPS_transcript.txt recording the outcomes of those 100 matches in the same folder. Rename these text transcripts and then combine them into a single transcript file of all matches. What percentage of the matches did you win? What percentage did you lose?
2. Learning from data: Try to learn from the results in the transcript of matches to improve your chance of winning the game. Then play against Agent #1 again based on what you learned from the data in Step #1. Record (i) what you learned from the data and (ii) whether it helped you improve your chance of winning in a Word or text document.
3. Submission of your work: Upload the combined transcript from Step #1 and the document of your thoughts and exploration from Step #2 under Canvas.
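One way to approach Step #2 is to tally the agent's moves and move-to-move transitions from the combined transcript. The sketch below is a minimal illustration, assuming you have already extracted the agent's moves from the transcript into a Python list (the transcript's exact file format is not specified here, so the parsing step is left out).

```python
from collections import Counter, defaultdict

# Which move beats which (standard rock-paper-scissors rules).
BEATS = {"Rock": "Paper", "Paper": "Scissors", "Scissors": "Rock"}

def analyze(agent_moves):
    """Count the agent's overall move frequencies and first-order
    transitions (which move tends to follow which)."""
    freq = Counter(agent_moves)
    trans = defaultdict(Counter)
    for prev, nxt in zip(agent_moves, agent_moves[1:]):
        trans[prev][nxt] += 1
    return freq, trans

def best_reply(freq):
    """Play the move that beats the agent's most frequent move."""
    most_common = freq.most_common(1)[0][0]
    return BEATS[most_common]

# Toy data standing in for moves parsed from RPS_transcript.txt.
moves = ["Rock", "Rock", "Paper", "Rock", "Scissors", "Rock"]
freq, trans = analyze(moves)
print(best_reply(freq))  # the agent favors Rock here, so play Paper
```

If the frequency table is nearly uniform, the transition counts may still reveal a pattern worth exploiting (e.g. the agent repeating its previous move).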
*************************************************************************************************
Week 2. Probabilistic Models for Reasoning: An
Introduction to Hidden Markov Models (HMMs) and Bayesian Networks
Reading 2: Report due Wednesday, Sept. 19
Programming #1A: due Wednesday, Sept. 19.
*************************************************************************************************
Week 3. Hidden Markov Models (HMMs) for Spelling
Recognition: Implementation of the keyboard model and the spelling model
Reading 3: Report due Wednesday, Sept. 26
Programming #1B: due Wednesday, Sept. 26.
*************************************************************************************************
Week 4. Simulation of Typing using Hidden Markov
Models (HMMs) + Basics of Probabilistic Reasoning Using HMMs
Reading 4: Wednesday, Oct. 3.
Programming #2A: due Wednesday, Oct 3.
Homework #1: due Wednesday, Oct. 3.
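Simulating typing with an HMM amounts to sampling observations (typed keys) from hidden states (intended keys). The sketch below is a toy illustration of that idea; the keyboard layout, the correctness probability, and all names here are illustrative assumptions, not the course's actual keyboard model.

```python
import random

# Toy keyboard model: the hidden state is the intended key; the
# observation is the key actually typed. Each intended key is typed
# correctly with probability CORRECT_P; otherwise one of its
# (hypothetical) physical neighbors is hit instead.
CORRECT_P = 0.9
NEIGHBORS = {"c": "xv", "a": "sq", "t": "ry", "s": "ad"}

def simulate_typing(word, seed=0):
    rng = random.Random(seed)
    typed = []
    for ch in word:
        if ch not in NEIGHBORS or rng.random() < CORRECT_P:
            typed.append(ch)                         # correct keystroke
        else:
            typed.append(rng.choice(NEIGHBORS[ch]))  # neighbor mis-hit
    return "".join(typed)

print(simulate_typing("cats"))  # a length-4 string, usually "cats"
```

Running the simulator over a long document produces exactly the kind of corrupted text that the later spelling-recognition assignments try to reverse.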
*************************************************************************************************
Weeks 5-6. Hidden Markov Models (HMMs): Simulation of
Typing + More on Probabilistic Reasoning
Reading 5-6: Wednesday, Oct. 17.
Programming #2B: due Wednesday, Oct. 17.
Lab #2 (Supervised learning for classification using WEKA): Wednesday, Oct. 17.
*************************************************************************************************
Week 7. Supervised Learning: Naïve Bayes Classification | Forward Algorithm for Probabilistic Reasoning on HMMs
Reading 7: Report due Wednesday, Oct. 24.
Homework #2: Forward algorithm for solving the first HMM problem: Wednesday, Oct. 24.
On the forward algorithm for probabilistic reasoning
Programming #2C: due Wednesday, Oct. 24.
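The forward algorithm in Homework #2 computes the likelihood of an observation sequence by summing over all hidden-state paths. A minimal sketch, using a made-up two-state HMM (the numbers below are illustrative assumptions, not the course's model):

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Likelihood P(obs) of an observation sequence under an HMM,
    computed by dynamic programming in O(T * N^2) time."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())

# Toy two-state weather HMM with integer observations.
states = ("Hot", "Cold")
start_p = {"Hot": 0.8, "Cold": 0.2}
trans_p = {"Hot": {"Hot": 0.7, "Cold": 0.3},
           "Cold": {"Hot": 0.4, "Cold": 0.6}}
emit_p = {"Hot": {1: 0.2, 2: 0.4, 3: 0.4},
          "Cold": {1: 0.5, 2: 0.4, 3: 0.1}}
print(forward((3, 1), states, start_p, trans_p, emit_p))  # ≈ 0.1004
```

For long sequences the probabilities underflow, so a real implementation should work in log space or rescale alpha at each step.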
*************************************************************************************************
Week 8. Supervised Learning: Decision Trees | Implementation of the Forward Algorithm
Reading 8: Wednesday, Oct. 31.
Homework #3 (Naïve Bayes classification): Wednesday, Oct. 31.
Programming #3A: due Wednesday, Nov. 7.
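For Homework #3, the core of Naïve Bayes classification is combining a class prior with per-feature likelihoods under a conditional-independence assumption. A minimal multinomial sketch with add-one (Laplace) smoothing, on made-up toy data (the documents and labels below are illustrative assumptions):

```python
from collections import Counter
import math

def train_nb(docs):
    """docs: list of (word_list, label) pairs. Returns class counts,
    per-class word counts, and the shared vocabulary."""
    labels = Counter(lab for _, lab in docs)
    word_counts = {lab: Counter() for lab in labels}
    vocab = set()
    for words, lab in docs:
        word_counts[lab].update(words)
        vocab.update(words)
    return labels, word_counts, vocab

def classify(words, labels, word_counts, vocab):
    """Pick the label maximizing log P(label) + sum log P(word | label),
    with add-one smoothing so unseen words never zero out a class."""
    total_docs = sum(labels.values())
    best, best_lp = None, float("-inf")
    for lab in labels:
        lp = math.log(labels[lab] / total_docs)
        denom = sum(word_counts[lab].values()) + len(vocab)
        for w in words:
            lp += math.log((word_counts[lab][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = lab, lp
    return best

docs = [(["win", "money"], "spam"), (["meeting", "notes"], "ham"),
        (["win", "prize"], "spam"), (["lunch", "notes"], "ham")]
model = train_nb(docs)
print(classify(["win"], *model))  # "spam"
```

Working in log space avoids the numerical underflow that multiplying many small probabilities would cause.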
*************************************************************************************************
Week 9. Supervised Learning: Basics of Linear Models | Identity Recognition Based on Typing/Spelling Behaviors
Reading 9: Wednesday, Nov. 7.
Homework #4 (Decision tree induction based on entropy and information gain): Wednesday, Nov. 7
Programming #3B: due Wednesday, Nov. 14.
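The entropy and information-gain calculations in Homework #4 can be sketched in a few lines. The toy rows below are illustrative assumptions; the functions implement the standard definitions (gain = entropy of the labels minus the size-weighted entropy of each split group).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, labels):
    """Information gain from splitting `rows` on attribute column `attr`."""
    n = len(rows)
    groups = {}
    for row, lab in zip(rows, labels):
        groups.setdefault(row[attr], []).append(lab)
    # Weighted entropy remaining after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

rows = [("sunny", "high"), ("sunny", "low"), ("rain", "high"), ("rain", "low")]
labels = ["yes", "yes", "no", "no"]
print(entropy(labels))             # 1.0 bit of label uncertainty
print(info_gain(rows, 0, labels))  # column 0 predicts perfectly: gain 1.0
print(info_gain(rows, 1, labels))  # column 1 is uninformative: gain 0.0
```

A decision-tree inducer like J48/C4.5 repeatedly picks the attribute with the highest gain (or gain ratio) and recurses on each branch.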
*************************************************************************************************
Week 10. Supervised Learning: Support Vector Machines and More on Linear Models | Identity Recognition Based on Typing/Spelling Behaviors
Reading 10: Wednesday, Nov. 14.
*************************************************************************************************
Weeks 11-12. Neural Networks and Deep Learning | Learning HMM Models I
Faith and Learning Integration Assignment on Creation and Computer Science due: Monday, Nov. 19
- Dr. Lin will be out of town for a conference on Nov. 19. Please use the class time for the reflection needed to do this assignment.
- Write down what you come up with in the reflection process according to the requirements in the assignment.
- Submit your reflection report accordingly through Canvas.
Reading 11: Wednesday, Nov. 21. (submission open till Nov. 26 without penalty)
Homework #5 (Linear Regression and Linear Models): Wednesday, Nov. 21. (submission open till Nov. 26 without penalty)
Reading 12: Wednesday, Nov. 28.
Programming #4A: Wednesday, Nov. 28.
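For the linear-models portion of Homework #5, ordinary least squares can be sketched as solving for weights in a design matrix whose first column of ones supplies the intercept. The data below are made up so the fit is exact.

```python
import numpy as np

# Fit y = w0 + w1 * x by ordinary least squares on toy data.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])           # exactly y = 1 + 2x
X = np.column_stack([np.ones_like(x), x])    # design matrix with bias column
w, *_ = np.linalg.lstsq(X, y, rcond=None)    # minimizes ||Xw - y||^2
print(w)  # w ≈ [1.0, 2.0]
```

The same `lstsq` call handles any number of input features; you just add more columns to the design matrix.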
*************************************************************************************************
Week 13. Machine Learning Using Scikit-learn vs. WEKA | Learning HMM Models II
Reading 13: Wednesday, Dec. 5
*************************************************************************************************
Week 14. Machine Learning Using Scikit-learn vs. WEKA
Reading 14: Wednesday, Dec. 12
Programming #4B: Spelling Recognition with Training Data. Due Wednesday, Dec. 12
- Demo executable: Please download and carefully play with the demo executable for automatic recovery of a message X described below.
- Demo executable and the programming task: Please download and carefully play with options L, R, T, and U provided in the new demo executable for automatic recovery of a message X described below. It correctly ranks the 4 most likely candidate words (for each corrupted word) in descending order of their probabilities. It also correctly calculates the accuracy rates of the recovered message according to the top-1 list, the top-2 list, the top-3 list, and the top-4 list, respectively.
- Programming task: In Programming #4A you already implemented Option L; now enhance the new options R, T, and U so that you can go through the steps to recover, from corruptedMessage1.txt and corruptedMessage2.txt, the unknown document recorded in messageX.txt that Mr. X tried to type twice, given that all the original words are in vocabulary.txt.
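The top-1 through top-4 accuracy rates reported by the demo can be computed with a simple helper once each corrupted word has a ranked candidate list. A minimal sketch on made-up words (the candidate rankings below are illustrative, not the demo's actual output):

```python
def top_k_accuracy(true_words, candidate_lists, k):
    """Fraction of word positions where the true word appears among
    the k highest-probability candidates for that position."""
    hits = sum(1 for w, cands in zip(true_words, candidate_lists)
               if w in cands[:k])
    return hits / len(true_words)

# Toy example: true message and a ranked 4-candidate list per word.
truth = ["the", "cat", "sat"]
cands = [["the", "then", "they", "them"],   # correct at rank 1
         ["car", "cat", "can", "cap"],      # correct at rank 2
         ["sit", "set", "sat", "say"]]      # correct at rank 3
for k in (1, 2, 3, 4):
    print(k, top_k_accuracy(truth, cands, k))
```

By construction, accuracy is non-decreasing in k, which is a handy sanity check on your own options R, T, and U.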
Lab #3: due Wednesday, Dec. 12.
1) Knowing more about the WEKA Explorer: Read the manual of WEKA 3.7.10 to see (i) how you can open datasets in CSV (comma-separated values) files and save them in ARFF format for classification tasks and (ii) how you can apply classifiers to datasets to learn predictive models and run cross-validation experiments. Read the description of cross validation here.
2) Getting the retention datasets and signing the agreement: Log into Canvas to download 2016_Project.zip under Files. Unzip the file and explore its contents. Use the cleaned datasets in the folder with files in the ARFF format. Please carefully read the enclosed confidentiality agreement and sign it before you use the retention datasets for this lab assignment. Copy and paste the agreement and your signature into your report.
3) Learning tree models using J48: Use WEKA and apply the J48 method under decision tree to each of the training datasets in the Data folder separately to learn decision trees as predictive models. Put the resulting decision trees in your report.
4) Cross-validation experiments using J48: Do (3) above again and conduct 10-fold cross-validation experiments accordingly. Based on the results of the cross-validation experiments, report the expected precision and recall of the prediction model in (3) in your report.
5) Cross-validation experiments using Naïve Bayes: Instead of the J48 classifier under decision tree, use a Naïve Bayes classifier under bayes, do (4) above again, and conduct 10-fold cross-validation experiments accordingly. Based on the results of the cross-validation experiments, report the expected precision and recall of the prediction model in (5) in your report.
6) Cross-validation experiments using IBk: Instead of the J48 classifier under decision tree, use the IBk classifier under lazy, do (4) above again, and conduct 10-fold cross-validation experiments accordingly. Based on the results of the cross-validation experiments, report the expected precision and recall of the prediction model in (6) in your report.
7) Submission: Upload your report for Lab #3 with the results from Step 2 to Step 6 under Canvas.
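Since Weeks 13-14 compare scikit-learn with WEKA, the Lab #3 workflow has a rough scikit-learn analog: J48 corresponds roughly to `DecisionTreeClassifier`, Naïve Bayes to `GaussianNB`, and IBk to `KNeighborsClassifier`, each evaluated with 10-fold cross validation. The sketch below uses the bundled iris dataset as a stand-in, since the retention datasets are confidential.

```python
# Rough scikit-learn analog of the WEKA cross-validation workflow above.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # stand-in for the retention data
for name, clf in [("tree (J48-like)", DecisionTreeClassifier(random_state=0)),
                  ("naive bayes", GaussianNB()),
                  ("kNN (IBk-like)", KNeighborsClassifier(n_neighbors=3))]:
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold CV accuracies
    print(name, round(scores.mean(), 3))
```

Per-class precision and recall, like WEKA's detailed accuracy table, are available via `sklearn.metrics.classification_report` on held-out predictions.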
*************************************************************************************************
Final take-home exam: Submission due: Dec. 19
- Open-book test. To be announced and discussed in class on Dec. 12. Find the problem set under Canvas | Files | FinalTest.zip
*************************************************************************************************
Links to online resources
*************************************************************************************************