Principles of Data Mining

Data Mining, the automatic extraction of implicit and potentially useful information from data, is increasingly used in commercial, scientific and other application areas. This book explains and explores the principal techniques of Data Mining: for classi


2.1 What is Classification?

Classification is a task that occurs very frequently in everyday life. Essentially it involves dividing up objects so that each is assigned to one of a number of mutually exhaustive and exclusive categories known as classes. The term ‘mutually exhaustive and exclusive’ simply means that each object must be assigned to precisely one class, i.e. never to more than one and never to no class at all.

Many practical decision-making tasks can be formulated as classification problems, i.e. assigning people or objects to one of a number of categories, for example

– customers who are likely to buy or not buy a particular product in a supermarket
– people who are at high, medium or low risk of acquiring a certain illness
– student projects worthy of a distinction, merit, pass or fail grade
– objects on a radar display which correspond to vehicles, people, buildings or trees
– people who closely resemble, slightly resemble or do not resemble someone seen committing a crime
– houses that are likely to rise in value, fall in value or have an unchanged value in 12 months’ time
– people who are at high, medium or low risk of a car accident in the next 12 months
– people who are likely to vote for each of a number of political parties (or none)
– the likelihood of rain the next day for a weather forecast (very likely, likely, unlikely, very unlikely).

We have already seen an example of a (fictitious) classification task, the ‘degree classification’ example, in the Introduction.

In this chapter we introduce two classification algorithms: one that can be used when all the attributes are categorical, the other when all the attributes are continuous. In the following chapters we come on to algorithms for generating classification trees and rules (also illustrated in the Introduction).
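The requirement stated at the start of this section, that each object is assigned to precisely one class, can be illustrated with a minimal Python sketch based on the student project example. The grade boundaries (70, 60 and 40) and the function name classify_project are assumptions made purely for illustration; they do not come from the text and are not a recommended marking scheme.

```python
# Hypothetical grade boundaries, chosen only for illustration.
GRADE_BOUNDARIES = [
    (70, "distinction"),
    (60, "merit"),
    (40, "pass"),
]


def classify_project(mark: float) -> str:
    """Assign a project mark (0-100) to exactly one grade class."""
    if not 0 <= mark <= 100:
        raise ValueError("mark must lie between 0 and 100")
    # Boundaries are checked from highest to lowest, so the first match wins
    # and no mark can fall into two classes (exclusive).
    for lower_bound, grade in GRADE_BOUNDARIES:
        if mark >= lower_bound:
            return grade
    # Anything below the lowest boundary still receives a class (exhaustive).
    return "fail"


marks = [82, 65, 40, 12]
grades = [classify_project(m) for m in marks]
print(grades)  # ['distinction', 'merit', 'pass', 'fail']
```

Because the function always returns exactly one of the four labels for any valid mark, the classes are both exclusive and exhaustive in the sense described above. A real classifier would learn such a mapping from data rather than have it written by hand.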

2.2 Naïve Bayes Classifiers

In this section we look at a method of classification that does not use rules, a decision tree or any other explicit representation of the classifier. Rather, it uses the branch of Mathematics known as probability theory to find the most likely of the possible classifications.

The significance of the first word of the title of this section will be explained later. The second word refers to the Reverend Thomas Bayes (1702–1761), an English Presbyterian minister and Mathematician whose publications included “Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures” as well as pioneering work on probability. He is credited as the first Mathematician to use probability in an inductive fashion.

A detailed discussion of probability theory would be substantially outside the scope of this book. However the mathematical notion of probability corresponds fairly closely to the meaning of the word in everyday life. The probability of an event, e.g. that the 6.30 p.m. train from London to your local station arrives on time, is a number from 0 to 1 inclusive, with 0 indicating ‘impossible’ and 1 indicating ‘certain’.
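As a rough illustration of what ‘finding the most likely of the possible classifications’ means, the sketch below takes a probability for each possible class and picks the largest. The three class names, the probability values and the function name most_likely_class are all hypothetical. The sketch covers only the final selection step and says nothing about how a Naïve Bayes classifier would arrive at the probabilities.

```python
import math
from typing import Mapping

# Hypothetical probabilities for the 6.30 p.m. train; not taken from the text.
class_probabilities = {
    "on time": 0.70,
    "late": 0.20,
    "cancelled": 0.10,
}


def most_likely_class(probabilities: Mapping[str, float]) -> str:
    """Return the class with the highest probability."""
    for name, p in probabilities.items():
        # Every probability is a number from 0 (impossible) to 1 (certain).
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability of {name!r} is outside [0, 1]")
    # The classes are mutually exclusive and exhaustive, so the
    # probabilities should add up to 1 (allowing for rounding error).
    if not math.isclose(sum(probabilities.values()), 1.0):
        raise ValueError("probabilities must sum to 1")
    return max(probabilities, key=lambda name: probabilities[name])


prediction = most_likely_class(class_probabilities)
print(prediction)  # on time
```

The two checks mirror the text: each value lies between 0 and 1 inclusive, and because exactly one of the classes must occur, the values together account for every possibility.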