Descriptive Data Mining


This book addresses the basic aspect of data mining, descriptive analytics. As stated in the preface, this concerns studying what has happened, looking at various forms of statistics to gain understanding of the state of whatever field is being examined. The book begins with a chapter on knowledge management, seeking to place analytics in the context of the overall framework of information management.

Chapter 2 focuses on the general topic of visualization. Of the many ways visualization is used to show people what statistics can reveal, we look at data mining software visualization tools, as well as simple spreadsheet graphs that aid understanding of time series data. These graphs of energy data offer rich opportunities for students to further study important societal issues.

Chapter 3 describes how basic cash register information, recorded by sale, has been used by retail organizations to infer which items tend to be purchased together. This can be useful to support product positioning in stores, as well as other business applications. Market basket analysis is among the most primitive forms of descriptive data mining.

Chapter 4 addresses a basic marketing tool that has been around for decades. Retailers have found that how recently a customer has made a purchase is important in gauging that customer's value to the firm, as are how often the customer has made purchases and the amount purchased. Recency/frequency/monetary (RFM) analysis provides a quick and relatively easy-to-implement methodology to categorize customers. There are better ways of analysis, and there is a lot of data transformation work involved, but this methodology helps in understanding how descriptive data can be used to support retail businesses.

Chapter 5 deals with the first real data mining tool: generation of association rules by computer algorithm. The basic Apriori algorithm is described, and R software support is demonstrated. A hypothetical representation of e-commerce sales is used for demonstration.
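The measures that association rules rest on can be illustrated in a few lines. The following is a minimal sketch in Python rather than the R support demonstrated in the book. The baskets are hypothetical, and the function names (`support`, `rule_metrics`, `frequent_pairs`) are our own. It scores a single rule and lists frequent item pairs, and does not implement the full level-wise Apriori search.

```python
from itertools import combinations

# Hypothetical market baskets (illustrative data only, not from the book)
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), baskets)
    s_antecedent = support(antecedent, baskets)
    s_consequent = support(consequent, baskets)
    confidence = s_both / s_antecedent   # P(consequent | antecedent)
    lift = confidence / s_consequent     # above 1 suggests positive association
    return s_both, confidence, lift


def frequent_pairs(baskets, min_support):
    """Item pairs whose support is at least min_support (pairs only)."""
    items = sorted(set().union(*baskets))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, baskets)
        if s >= min_support:
            result[pair] = s
    return result


if __name__ == "__main__":
    s, c, l = rule_metrics({"bread"}, {"milk"}, transactions)
    print("bread -> milk:", round(s, 4), round(c, 4), round(l, 4))
    print(frequent_pairs(transactions, min_support=0.5))
```

A lift below 1, as for the bread-to-milk rule in this toy data, indicates that the two items appear together slightly less often than independence would predict, which is why confidence alone can mislead.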

© Springer Nature Singapore Pte Ltd. 2017 D.L. Olson, Descriptive Data Mining, Computational Risk Management, DOI 10.1007/978-981-10-3340-7_8


Table 8.1  Descriptive data mining methods

Method                       Descriptive process    Basis                  Software
Visualization                Initial exploration    Graphical statistics   Spreadsheet
Market basket analysis       Retail cart analysis   Correlation            Spreadsheet
Recency/Frequency/Monetary   Sales analysis         Volume                 Spreadsheet manipulation
Association rules            Grouping               Correlation            APriori, others
Cluster analysis             Grouping               Statistics             Data mining (R, WEKA)
Link analysis                Display                Graphics               PolyAnalyst, NodeXL
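The Recency/Frequency/Monetary method listed in Table 8.1 amounts to straightforward manipulation of sales records, which can be sketched as follows. This is a simplified illustration in Python, assuming a hypothetical purchase log and our own function names (`rfm_table`, `rank_scores`, `rfm_scores`). It scores customers by simple rank, which is an assumption made to keep the example small and not necessarily the binning used in the book.

```python
from collections import defaultdict
from datetime import date

# Hypothetical purchase log: (customer id, purchase date, amount)
purchases = [
    ("A", date(2017, 1, 5), 40.0),
    ("A", date(2017, 3, 20), 60.0),
    ("B", date(2016, 11, 2), 200.0),
    ("C", date(2017, 3, 28), 15.0),
    ("C", date(2017, 2, 14), 25.0),
    ("C", date(2017, 1, 30), 10.0),
]


def rfm_table(records, as_of):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last = {}
    freq = defaultdict(int)
    total = defaultdict(float)
    for customer, day, amount in records:
        last[customer] = max(last.get(customer, day), day)  # latest purchase
        freq[customer] += 1                                  # number of purchases
        total[customer] += amount                            # total spent
    return {c: ((as_of - last[c]).days, freq[c], total[c]) for c in last}


def rank_scores(values, higher_is_better=True):
    """Score customers 1..n by rank, where n is the best score."""
    ordered = sorted(values, key=lambda c: (values[c], c),
                     reverse=not higher_is_better)
    return {c: i + 1 for i, c in enumerate(ordered)}


def rfm_scores(records, as_of):
    """Return {customer: (R score, F score, M score)}."""
    table = rfm_table(records, as_of)
    # Fewer days since the last purchase is better, so recency is reversed.
    r = rank_scores({c: v[0] for c, v in table.items()}, higher_is_better=False)
    f = rank_scores({c: v[1] for c, v in table.items()})
    m = rank_scores({c: v[2] for c, v in table.items()})
    return {c: (r[c], f[c], m[c]) for c in table}


if __name__ == "__main__":
    print(rfm_scores(purchases, as_of=date(2017, 3, 31)))
```

In this toy data, customer C buys often and recently but spends little, while customer B made one large purchase long ago. Separating the three dimensions makes that distinction visible.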

Chapter 6 is a long chapter, presenting the basic algorithms used in cluster analysis, followed by analysis of typical bank loan data by three forms of open source data mining software. More powerful tools such as self-organizing maps are briefly discussed. Finally, the use of link analysis is shown with two forms of software. First, basic social network metrics are presented. An open s