Data Visualization

In this chapter, we use a broad range of simulations and hands-on activities to highlight some of the basic data visualization techniques using R. A brief discussion of alternative visualization methods is followed by demonstrations of histograms, density, pie, jitter, bar, line and scatter plots, as well as strategies for displaying trees, more general graphs, and 3D surface plots. Many of these are also used throughout the textbook to address the graphical needs of specific case-studies. It is practically impossible to cover all options of every visualization routine. Readers are encouraged to experiment with each visualization type, change input data and parameters, explore the function documentation using R-help (e.g., ?plot), and search online for new R visualization packages and functionality, which are continuously being developed. We will cover (1) one specific classification of visualization methods, (2) composition (e.g., density, histogram), comparison (e.g., jitter, bar, correlation) and relationship (e.g., line) plots, (3) 2D kernel density and 3D surface plots, and (4) 3D and 4D visualization of solids and (hyper)volumes.
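As a first taste of the kind of experimentation encouraged here, the following minimal sketch draws a histogram of simulated data, overlays a kernel density estimate, and then redraws the histogram with coarser bins. The simulated sample, the seed, and the parameter values are assumptions chosen only for illustration; they are not taken from the chapter.

```r
# Simulated data (an illustrative assumption, not a dataset from the chapter)
set.seed(1234)
x <- rnorm(1000, mean = 10, sd = 2)

# Histogram on the density scale (freq = FALSE) so a density curve can be overlaid
h <- hist(x, breaks = 30, freq = FALSE,
          main = "Histogram with kernel density estimate",
          xlab = "Simulated values")

# Kernel density estimate drawn on top of the histogram
d <- density(x)
lines(d, col = "blue", lwd = 2)

# Changing one parameter (the suggested number of breaks) changes the picture
h2 <- hist(x, breaks = 10, freq = FALSE,
           main = "Coarser binning",
           xlab = "Simulated values")
```

Running ?hist or ?density in the R console lists the remaining arguments that can be varied in the same way.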

4.1 Common Questions

• What exploratory visualization techniques are available to graphically interrogate my specific data?
• How do we examine paired associations and correlations in a multivariate dataset?
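One possible starting point for the second question is sketched below, assuming the built-in mtcars dataset and an arbitrary choice of four columns (neither is prescribed by the chapter). It computes the pairwise Pearson correlations with cor() and displays all paired scatter plots with pairs().

```r
# mtcars ships with base R; the four columns are an arbitrary illustrative choice
vars <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Matrix of pairwise Pearson correlations
cm <- cor(vars)
print(round(cm, 2))

# Scatter plot matrix showing every paired association at once
pairs(vars, main = "Pairwise scatter plots (mtcars subset)")
```

The correlation matrix gives a numeric summary, while the scatter plot matrix can reveal nonlinear patterns or outliers that a single coefficient may hide.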

© Ivo D. Dinov 2018 I. D. Dinov, Data Science and Predictive Analytics, https://doi.org/10.1007/978-3-319-72347-1_4

4.2 Classification of Visualization Methods

Scientific data-driven or simulation-driven visualization methods are hard to classify. The following criteria can be used to characterize alternative data visualization strategies:

• Data type: structured/unstructured, small/large, complete/incomplete, time/space, ASCII/binary, Euclidean/non-Euclidean, etc.
• Task type: the task a visualization supports is one aspect of classification; it provides a means of interaction between the researcher, the data, and the display software/platform
• Scalability: visualization techniques are subject to limitations, such as the amount of data that a particular technique can display
• Dimensionality: visualization techniques can also be classified according to the number of attributes
• Positioning and attributes: the distribution of attributes on the chart may affect the interpretation of the display, e.g., in correlation analysis, where the relative distance among the plotted attributes is relevant for observation
• Investigative need: the specific scientific question or exploratory interest may also determine the type of visualization:
  – Examining the composition of the data
  – Exploring the distribution of the data
  – Contrasting or comparing several data elements, relations, and associations
  – Unsupervised exploratory data mining

Common data visualization methods can also be tabulated according to task type (Fig. 4.1). We introduce common