On the Naïve Bayes classifier

 

 

Table 1 is a weather data set with 5 attributes (Outlook, Temp, Humidity, Windy, and Play) and 14 records. The last attribute, Play, is the class attribute. Our goal is to find a way to classify any new record, for which the values of the first 4 attributes are known, into one of the two classes Play=Yes or Play=No. In other words, we want to learn how to predict whether Play=Yes or Play=No from the values of the first 4 attributes of the record.

 

Outlook   Temp  Humidity  Windy  Play (Class)
sunny     hot   high      FALSE  no
sunny     hot   high      TRUE   no
overcast  hot   high      FALSE  yes
rainy     mild  high      FALSE  yes
rainy     cool  normal    FALSE  yes
rainy     cool  normal    TRUE   no
overcast  cool  normal    TRUE   yes
sunny     mild  high      FALSE  no
sunny     cool  normal    FALSE  yes
rainy     mild  normal    FALSE  yes
sunny     mild  normal    TRUE   yes
overcast  mild  high      TRUE   yes
overcast  hot   normal    FALSE  yes
rainy     mild  high      TRUE   no

 

 

Introduction to the Naïve Bayes classifier:

 

Carefully read the introduction to the Naïve Bayes classifier here, which uses the data set in Table 1 as an example. It is based on Sections 4.1-4.2 of Data Mining: Practical Machine Learning Tools and Techniques and describes how the naïve Bayes approach collects statistics from existing data in order to classify new instances. It shows, step by step, how to collect the statistics the naïve Bayes method needs from the weather data set in Table 1 above, and how to apply them to classify a new case like the following one.

 

Outlook  Temp  Humidity  Windy  Play (Class)
Sunny    Cool  High      True   ???
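
As a rough illustration of the procedure described above, the following Python sketch counts the class and attribute-value frequencies in Table 1 and then scores the new case for each class. The names `TABLE_1`, `collect_statistics`, and `classify` are made up for this example and do not come from the referenced introduction. The sketch assumes plain relative frequencies with no smoothing, and it writes the new case using Table 1's spelling of the values (lowercase, `TRUE`/`FALSE`).

```python
from collections import Counter

# Table 1 rows: (Outlook, Temp, Humidity, Windy, Play)
TABLE_1 = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def collect_statistics(rows):
    """Count class frequencies and (attribute index, value, class) frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = Counter()
    for row in rows:
        label = row[-1]
        for index, value in enumerate(row[:-1]):
            value_counts[(index, value, label)] += 1
    return class_counts, value_counts


def classify(rows, case):
    """Return normalized class probabilities for `case` (assumes no smoothing)."""
    class_counts, value_counts = collect_statistics(rows)
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior: fraction of records in this class
        for index, value in enumerate(case):
            # conditional: fraction of this class's records with this value
            score *= value_counts[(index, value, label)] / n_label
        scores[label] = score
    norm = sum(scores.values())  # would be zero if every class scored zero
    return {label: score / norm for label, score in scores.items()}


# The new case, written with Table 1's spelling of the attribute values.
new_case = ("sunny", "cool", "high", "TRUE")
probabilities = classify(TABLE_1, new_case)
print(probabilities)
```

With the counts from Table 1, the unnormalized scores are 2/9 × 3/9 × 3/9 × 3/9 × 9/14 for yes and 3/5 × 1/5 × 4/5 × 3/5 × 5/14 for no, which normalize to roughly 20.5% for yes and 79.5% for no, so this sketch predicts Play=No for the new case.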

 

 

 

Things to do for this homework:

 

Table 2 is another weather data set for this homework.

 

Outlook   Temp  Humidity  Windy  Play (Class)
Sunny     hot   high      FALSE  no
Sunny     hot   high      TRUE   yes
Overcast  hot   high      FALSE  no
Rainy     mild  high      FALSE  no
Rainy     cool  normal    FALSE  yes
Rainy     cool  normal    TRUE   no
Overcast  cool  normal    TRUE   no
Sunny     mild  high      FALSE  no
Sunny     cool  normal    FALSE  yes
Rainy     mild  normal    FALSE  yes
Sunny     mild  normal    TRUE   no
Overcast  mild  high      TRUE   yes
Overcast  hot   normal    FALSE  yes
Rainy     mild  high      TRUE   yes

 

 

 

Show, step by step, how you determine the statistics needed to apply the naïve Bayes method to the revised weather data set in Table 2, and show how you would classify the new case below.

 

 

Outlook  Temp  Humidity  Windy  Play (Class)
Sunny    Cool  High      True   ???