Machine Learning and Data Analytics

CSCI 480, Spring semester, 2019

 


 

Instructor:                 Dr. Shieu-Hong Lin

Course Syllabus

Class:                        TR 1:30-2:45 pm at Lim 41

Office Hours:             Lim 137, MW 1:30-3:30 pm, TR 3:00-5:00 pm

 

Submission of all your work: go to Biola Canvas

Your grades: see them under Biola Canvas

 

*************************************************************************************************

 

Week 1. Overview of the Landscape of (i) Machine Learning, (ii) the WEKA toolkit, and (iii) the SciPy ecosystem for Data Science

 

Lab #1: WEKA: Report due: Thursday Jan. 24

 

Reading 1: Report due: Thursday Jan. 24

 

 

Thoughts about Project: Rock-Paper-Scissors as an example

 

*************************************************************************************************

 

Week 2. Intro to Python and SciPy (I)  |  Machine Learning: Decision Trees

 

Presentation #1 (10-20 minutes each person): Tuesday Jan. 29

 

Lab #2 (Entropy, information gain, and numpy basics):

Exploration: Work through the Jupyter notebook Lab2.ipynb to see how to calculate the entropy of a given distribution stored in a numpy array.
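As a rough illustration of the kind of calculation this lab explores, here is a minimal sketch of Shannon entropy (in bits) for a distribution held in a numpy array. It is not the content of Lab2.ipynb; the function name entropy and the choice to accept either counts or probabilities are assumptions made for this example.

```python
import numpy as np


def entropy(distribution):
    """Shannon entropy in bits of a discrete distribution.

    `distribution` may hold probabilities or raw counts (an assumption of
    this sketch); it is normalized to sum to 1 before use.
    """
    p = np.asarray(distribution, dtype=float)
    p = p / p.sum()      # normalize so that raw counts also work
    p = p[p > 0]         # drop zero entries: 0 * log2(0) is treated as 0
    return float(-np.sum(p * np.log2(p)))


fair_coin = entropy([0.5, 0.5])   # a fair coin carries exactly 1 bit
class_counts = entropy([9, 5])    # counts are normalized internally
print(fair_coin, class_counts)
```

Dropping the zero entries before taking the logarithm avoids the warning and the nan value that np.log2(0) would otherwise produce.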

 

Reading 2: Report due: Thursday Jan. 31

 

 

*************************************************************************************************

 

Week 3. Numpy (I)  |  Machine Learning: Naïve Bayes

 

Presentation #2 (10-20 minutes each person): Tuesday Feb. 5

 

Homework #1: (Decision tree induction based on entropy and information gain): Thursday Feb. 7

 

Reading 3: Report due: Thursday Feb. 7

 

 

*************************************************************************************************

 

Week 4. Numpy (II) and Pandas (I) 

 

Presentation #3 (10-20 minutes each person): Tuesday Feb. 12

 

Reading 4: Report due: Thursday Feb. 14

 

Lab #3 (Finding information gain given the distribution information stored in a 2-dimensional numpy array): Thursday Feb. 14
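A minimal sketch of the idea behind this lab, under an assumed layout: each row of the 2-dimensional array corresponds to one value of the attribute being tested, each column to one class label, and each cell holds a count. The layout and the function names entropy and information_gain are assumptions made for illustration, not the lab's actual specification.

```python
import numpy as np


def entropy(counts):
    """Shannon entropy in bits of a distribution given as counts."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]         # 0 * log2(0) is treated as 0
    return float(-np.sum(p * np.log2(p)))


def information_gain(table):
    """Information gain of splitting on an attribute.

    Assumed layout: table[i, j] is the number of examples with
    attribute value i and class label j.
    """
    table = np.asarray(table, dtype=float)
    total = table.sum()
    parent = entropy(table.sum(axis=0))    # class entropy before the split
    weights = table.sum(axis=1) / total    # fraction of examples per attribute value
    # Weighted average of the class entropy inside each attribute value;
    # empty rows are skipped to avoid dividing by zero.
    children = sum(w * entropy(row) for w, row in zip(weights, table) if w > 0)
    return parent - children


# Example counts: three attribute values (rows) by two classes (columns).
counts = np.array([[2, 3],
                   [4, 0],
                   [3, 2]])
gain = information_gain(counts)
print(round(gain, 4))
```

The gain is the class entropy before the split minus the weighted average class entropy after it, so an attribute whose rows all share the same class proportions yields a gain of zero.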

 

*************************************************************************************************

 

Week 5.  More on Numpy (II) and Pandas (I) 

 

Presentation #3 Continued (10-20 minutes each person): Tuesday Feb. 19

 

Reading 5: Report due: Thursday Feb. 21

 

Homework #2 (Naïve Bayes classification): Thursday Feb. 21

 

Lab #4 (Analysis of rock-paper-scissors transcripts using numpy): Thursday Feb. 28

 

 

*************************************************************************************************

 

Weeks 6-7.  More on Pandas (II)  | Spring Break

 

Presentation #4 Continued (10-20 minutes each person): Thursday March 14

 

Reading 6: Report due: Thursday Feb. 28

 

Reading 7: Report due: Thursday March 14

 

Lab #5 (Analysis of information gain from rock-paper-scissors transcripts using numpy): Thursday March 14

 

 

*************************************************************************************************

 

Weeks 8-9.  Review  |  Test 1  |  Missions Conference

 

Test #1 (Develop and use Naïve Bayes classifiers based on rock-paper-scissors transcripts using numpy): Thursday March 28

 

Review: Naïve Bayes classification

 

*************************************************************************************************

 

Week 10. More on Pandas (III)

 

Presentation #4 Continued (10-20 minutes each person): Tuesday April 2

 

Reading 10: Thursday, April 4

 

 

*************************************************************************************************

 

Week 11. More on Pandas (IV)  |   Supervised Learning: Linear Models

 

Reading 11: Thursday, April 11

 

Lab #6 (Naïve Bayes classification using Pandas): Thursday, April 11

 

*************************************************************************************************

 

Week 12. Matplotlib | Supervised Learning: Linear Models

 

Presentation #4 Continued (10-20 minutes each person): Tuesday April 16

 

Reading 12: Thursday, April 18

 

 

*************************************************************************************************

 

Week 13.  Machine Learning Using scikit-learn (I)

Reading 13: Thursday, April 25

 

Test #2 (Develop and use Naïve Bayes classifiers based on rock-paper-scissors transcripts using Pandas): Thursday April 25

 

Homework #3 (Linear Regression): Thursday, April 25

 

*************************************************************************************************

 

Week 14.  Machine Learning Using scikit-learn (II)

 

Reading 14: Thursday, May 2

 

Test #3: due: Thursday, May 9

·         Problem set: See here (updated version posted 17:20 April 30).

·         Overview: Rock-Paper-Scissors and classification algorithms using scikit-learn. In the previous tests, you implemented functions for Naïve Bayes classification based on Numpy and Pandas. You also applied those functions to Rock-Paper-Scissors transcripts to learn to predict the behavior of a specific agent based on either (i) the actions of the two sides in the previous two matches or (ii) the actions of the two sides in the previous match only. In this test, you'll be given the transcripts of 3 different agents to work with. Your task is to load the agent transcripts as Pandas DataFrame objects and use scikit-learn to apply Naïve Bayes (see Chapter 5 of Python Data Science Handbook) to analyze the behavior of the agents. Applying Naïve Bayes using scikit-learn, you should, for each agent, (i) learn a classifier for prediction and (ii) obtain an empirical estimate of the prediction accuracy based on the classifier's accuracy on the testing data.

·         Overview: Linear regression and the kernel trick using scikit-learn. You need to use scikit-learn to apply linear regression (covered in reasonable depth in Chapter 5 of Python Data Science Handbook) to conduct some basic learning tasks.

·         Submission of your work: Fill out this self-evaluation report. Upload the self-evaluation report and your Jupyter notebooks (to show your code and results) under Canvas.
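For orientation only, the Naïve Bayes workflow described above might be sketched as follows. Everything specific here is an assumption: the transcript is synthetic, the column names (agent, opponent, prev_agent, prev_opponent, action) are hypothetical, and the actual transcript format in the problem set may differ. The sketch uses CategoricalNB, which requires scikit-learn 0.22 or later; the handbook's chapter demonstrates other Naïve Bayes variants.

```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

MOVES = ["R", "P", "S"]
BEATS = {"R": "P", "P": "S", "S": "R"}   # value is the move that beats the key

# Synthetic transcript (not course data): a hypothetical agent that always
# plays the move that beats the opponent's previous move.
rng = np.random.default_rng(0)
opponent = rng.choice(MOVES, size=300).tolist()
agent = ["R"] + [BEATS[m] for m in opponent[:-1]]
transcript = pd.DataFrame({"agent": agent, "opponent": opponent})

# Features: the actions of the two sides in the previous match.
# Label: the agent's action in the current match.
data = pd.DataFrame({
    "prev_agent": transcript["agent"].shift(1),
    "prev_opponent": transcript["opponent"].shift(1),
    "action": transcript["agent"],
}).dropna()                              # the first match has no previous match

# CategoricalNB expects each feature encoded as integers 0..n_categories-1.
encoder = OrdinalEncoder(categories=[MOVES, MOVES])
X = encoder.fit_transform(data[["prev_agent", "prev_opponent"]])
y = data["action"]

# Hold out part of the transcript to estimate prediction accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model = CategoricalNB().fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Extending the feature set to the previous two matches would amount to adding shift(2) columns in the same way; the held-out score is only an empirical estimate and will vary with the split.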

 

*************************************************************************************************

Links to online resources

*************************************************************************************************

 

 

 


 

 
