On the Naïve Bayes classifier
Table 1 below is a weather data set with 5 attributes (Outlook, Temp, Humidity, Windy, and
Play) and 14 records. The last attribute, Play, is the class attribute. Our goal is to find a
way to classify any new record whose first 4 attribute values are known into one of the
two classes, Play=Yes or Play=No. In other words, we want to learn to predict whether
Play=Yes or Play=No from the values of the first 4 attributes of the record.
Outlook | Temp | Humidity | Windy | Play (Class)
sunny | hot | high | FALSE | no
sunny | hot | high | TRUE | no
overcast | hot | high | FALSE | yes
rainy | mild | high | FALSE | yes
rainy | cool | normal | FALSE | yes
rainy | cool | normal | TRUE | no
overcast | cool | normal | TRUE | yes
sunny | mild | high | FALSE | no
sunny | cool | normal | FALSE | yes
rainy | mild | normal | FALSE | yes
sunny | mild | normal | TRUE | yes
overcast | mild | high | TRUE | yes
overcast | hot | normal | FALSE | yes
rainy | mild | high | TRUE | no
Introduction to the Naïve Bayes classifier:
Carefully read the overview of the Naïve Bayes classifier here,
which uses the data set in Table 1 as an example. It is based on Section 4.2 of
Data Mining: Practical Machine Learning
Tools and Techniques and describes how the naïve Bayes approach collects
statistics from existing data to classify new instances. It shows, step by
step, how to collect the statistics needed by the naïve Bayes method using
the weather data set in Table 1 above, and how to apply them to
classify a new case like the following one.
Outlook | Temp | Humidity | Windy | Play (Class)
Sunny | Cool | High | True | ???
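For reference, the calculation for this new case can be written out as below. This is a sketch that assumes the usual naïve Bayes rule (the class prior multiplied by the per-attribute relative frequencies within that class, then normalized) with no smoothing of zero counts. The fractions are counted directly from Table 1, and the symbol E is only shorthand introduced here for the new case (Sunny, Cool, High, True).

```latex
\begin{align*}
\Pr[\text{yes} \mid E] &\propto
  \Pr[\text{sunny} \mid \text{yes}] \cdot \Pr[\text{cool} \mid \text{yes}] \cdot
  \Pr[\text{high} \mid \text{yes}] \cdot \Pr[\text{TRUE} \mid \text{yes}] \cdot \Pr[\text{yes}] \\
  &= \tfrac{2}{9} \cdot \tfrac{3}{9} \cdot \tfrac{3}{9} \cdot \tfrac{3}{9} \cdot \tfrac{9}{14}
  \approx 0.0053 \\[4pt]
\Pr[\text{no} \mid E] &\propto
  \Pr[\text{sunny} \mid \text{no}] \cdot \Pr[\text{cool} \mid \text{no}] \cdot
  \Pr[\text{high} \mid \text{no}] \cdot \Pr[\text{TRUE} \mid \text{no}] \cdot \Pr[\text{no}] \\
  &= \tfrac{3}{5} \cdot \tfrac{1}{5} \cdot \tfrac{4}{5} \cdot \tfrac{3}{5} \cdot \tfrac{5}{14}
  \approx 0.0206 \\[4pt]
\Pr[\text{yes} \mid E] &\approx \frac{0.0053}{0.0053 + 0.0206} \approx 0.205,
\qquad
\Pr[\text{no} \mid E] \approx \frac{0.0206}{0.0053 + 0.0206} \approx 0.795
\end{align*}
```

Under these assumptions the new case would be classified as Play=No.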
Things to do for this homework:
Table 2 is another weather data set for this homework.
Outlook | Temp | Humidity | Windy | Play (Class)
Sunny | hot | high | FALSE | no
Sunny | hot | high | TRUE | yes
Overcast | hot | high | FALSE | no
Rainy | mild | high | FALSE | no
Rainy | cool | normal | FALSE | yes
Rainy | cool | normal | TRUE | no
Overcast | cool | normal | TRUE | no
Sunny | mild | high | FALSE | no
Sunny | cool | normal | FALSE | yes
Rainy | mild | normal | FALSE | yes
Sunny | mild | normal | TRUE | no
Overcast | mild | high | TRUE | yes
Overcast | hot | normal | FALSE | yes
Rainy | mild | high | TRUE | yes
Show, step by step, how to determine the statistics needed to apply the naïve
Bayes method for classification to the revised weather data set in Table 2, and
show how you would classify the new case below.
Outlook | Temp | Humidity | Windy | Play (Class)
Sunny | Cool | High | True | ???
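To check hand-computed statistics, a small script along the following lines may help. It is a minimal sketch, not the method prescribed by the reading: the names `TABLE_1`, `collect_statistics`, and `classify` are hypothetical, values are lower-cased so that "Sunny" and "sunny" match, and no smoothing is applied to zero counts. The rows shown are those of Table 1; substituting the rows of Table 2 would produce the statistics for the homework data.

```python
from collections import Counter, defaultdict

# Rows of Table 1: (Outlook, Temp, Humidity, Windy, Play)
TABLE_1 = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def collect_statistics(records):
    """Count records per class and attribute values per (attribute index, class)."""
    class_counts = Counter(label for *_, label in records)
    value_counts = defaultdict(Counter)
    for *values, label in records:
        for i, value in enumerate(values):
            value_counts[(i, label)][value.lower()] += 1
    return class_counts, value_counts


def classify(records, new_case):
    """Return normalized naive Bayes scores per class (no smoothing)."""
    class_counts, value_counts = collect_statistics(records)
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(new_case):
            # relative frequency of this attribute value within the class
            score *= value_counts[(i, label)][value.lower()] / n
        scores[label] = score
    norm = sum(scores.values())
    if norm == 0:
        return scores  # every class hit a zero count; nothing to normalize
    return {label: s / norm for label, s in scores.items()}


if __name__ == "__main__":
    print(classify(TABLE_1, ("Sunny", "Cool", "High", "True")))
```

For the Table 1 data and the new case (Sunny, Cool, High, True), the scores come out at roughly 0.205 for yes and 0.795 for no. Because the sketch does no smoothing, any attribute value that never occurs with a class drives that class's score to zero.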