On the Naïve Bayes classifier
Table 1 below is a weather data set with 5 attributes (Outlook, Temp, Humidity, Windy, and
Play) and 14 records. The last attribute, Play, is the class attribute, and our goal is to find a
way to classify any new record whose first 4 attribute values are known
into one of the two classes, Play=Yes or Play=No. In other words, we want to learn how
to predict whether Play=Yes or Play=No based on the values of the first 4 attributes
of the record.
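| Outlook | Temp | Humidity | Windy | Play (Class) |
|---|---|---|---|---|
| sunny | hot | high | FALSE | no |
| sunny | hot | high | TRUE | no |
| overcast | hot | high | FALSE | yes |
| rainy | mild | high | FALSE | yes |
| rainy | cool | normal | FALSE | yes |
| rainy | cool | normal | TRUE | no |
| overcast | cool | normal | TRUE | yes |
| sunny | mild | high | FALSE | no |
| sunny | cool | normal | FALSE | yes |
| rainy | mild | normal | FALSE | yes |
| sunny | mild | normal | TRUE | yes |
| overcast | mild | high | TRUE | yes |
| overcast | hot | normal | FALSE | yes |
| rainy | mild | high | TRUE | no |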
Introduction to the Naïve Bayes classifier:
Carefully read the overview of the Naïve Bayes classifier here,
which uses the data set in Table 1 as an example. It is based on Section 4.2 of
Data Mining: Practical Machine Learning
Tools and Techniques and describes how the naïve Bayes approach
collects statistics from existing data in order to classify new instances. It shows,
step by step, how to collect the statistics needed by the naïve Bayes method
from the weather data set in Table 1 above, and how to apply them to
classify a new case like the following one.
| Outlook | Temp | Humidity | Windy | Play (Class) |
|---|---|---|---|---|
| Sunny | Cool | High | True | ??? |
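The two steps described above (collect the statistics, then apply them to a new case) can be sketched in Python as follows. This is a minimal illustration based on Table 1, not the code from the referenced overview: the names `RECORDS`, `collect_statistics`, and `classify` are made up for this example, attribute values are written in lower case (with Windy as `"TRUE"`/`"FALSE"`), and probabilities are plain relative frequencies with no correction for zero counts.

```python
from collections import Counter, defaultdict

# Table 1, one tuple per record: (Outlook, Temp, Humidity, Windy, Play)
RECORDS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def collect_statistics(records):
    """Count records per class, and attribute values per (attribute, class)."""
    class_counts = Counter(record[-1] for record in records)
    value_counts = defaultdict(Counter)  # key: (attribute index, class label)
    for *values, label in records:
        for index, value in enumerate(values):
            value_counts[(index, label)][value] += 1
    return class_counts, value_counts


def classify(case, class_counts, value_counts):
    """Return the normalized naive Bayes probability of each class for `case`."""
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for index, value in enumerate(case):
            # relative frequency of this value among records of this class
            score *= value_counts[(index, label)][value] / n_label
        scores[label] = score
    norm = sum(scores.values())  # assumes at least one class has a nonzero score
    return {label: score / norm for label, score in scores.items()}


class_counts, value_counts = collect_statistics(RECORDS)
posterior = classify(("sunny", "cool", "high", "TRUE"), class_counts, value_counts)
for label, probability in posterior.items():
    print(f"Play={label}: {probability:.3f}")
```

For the new case above, this gives a probability of about 0.205 for Play=yes and 0.795 for Play=no, so the record would be classified as Play=no.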
Things to do for this homework:
Table 2 is another weather data set for this homework.
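| Outlook | Temp | Humidity | Windy | Play (Class) |
|---|---|---|---|---|
| Sunny | hot | high | FALSE | no |
| Sunny | hot | high | | yes |
| Overcast | hot | high | FALSE | no |
| Rainy | mild | high | FALSE | no |
| Rainy | cool | normal | FALSE | yes |
| Rainy | cool | normal | | no |
| Overcast | cool | normal | | no |
| Sunny | mild | high | FALSE | no |
| Sunny | cool | normal | FALSE | yes |
| Rainy | mild | normal | FALSE | yes |
| Sunny | mild | normal | | no |
| Overcast | mild | high | | yes |
| Overcast | hot | normal | FALSE | yes |
| Rainy | mild | high | | yes |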
Show, step by step, how you determine the statistics needed to use the naïve
Bayes method for classification based on the revised weather data set in Table 2, and show how you would classify the new case below.
| Outlook | Temp | Humidity | Windy | Play (Class) |
|---|---|---|---|---|
| Sunny | Cool | High | True | ??? |
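As a reminder of what has to be computed for this case, the naïve Bayes comparison can be written as below. This is the standard form of the rule, not a formula taken from the assignment text: the symbols $E$ (the evidence, i.e. the four known attribute values) and $L(c)$ (the unnormalized likelihood of class $c$) are notation introduced here. Each conditional probability is estimated as the number of records in class $c$ with that attribute value divided by the number of records in class $c$, and $P(c)$ as the fraction of all records in class $c$, with all counts taken from Table 2.

```latex
\begin{align*}
E &= (\text{Outlook}=\text{Sunny},\ \text{Temp}=\text{Cool},\ \text{Humidity}=\text{High},\ \text{Windy}=\text{True}) \\
L(c) &= P(\text{Sunny} \mid c)\, P(\text{Cool} \mid c)\, P(\text{High} \mid c)\, P(\text{True} \mid c)\, P(c),
  \qquad c \in \{\text{yes}, \text{no}\} \\
P(c \mid E) &= \frac{L(c)}{L(\text{yes}) + L(\text{no})}
\end{align*}
```

The new case is assigned to whichever class has the larger $P(c \mid E)$.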