Presentation Graphics


4.1 Introduction

One of the most attractive aspects of the R system is its capability to produce state-of-the-art statistical graphics. All of the chapters of this book illustrate the application of a variety of R functions for graphing quantitative and categorical data for one or several dimensions. The purpose of this chapter is to describe methods for adjusting the attributes of the graph and for interacting with the graph that will enable the user to produce a publication-level graphical display. We focus on the methods that we have found useful in our own work. The book by Murrell [37] provides a good description of the traditional graphics system. Sarkar [43] and Wickham [52] provide overviews respectively of the lattice and ggplot2 packages for producing graphics that will be introduced at the end of this chapter.

Example 4.1 (Home run hitting in baseball history).

We begin with an interesting graph that helps us understand the history of professional baseball in the United States. Major League Baseball has been in existence since 1871 and counts of particular baseball events, such as the number of hits, doubles, and home runs, have been recorded for all of the years. One of the most exciting events in a baseball game is a home run (a batted ball typically hit out of the ballpark) and one may be interested in exploring the pattern of home run hitting over the years. The data file "batting.history.txt" (extracted from the baseball-reference.com web site) contains various measures of hitting for each season of professional baseball. We read in the dataset and use the attach function to make the variables available for use.

> hitting.data = read.table("batting.history.txt", header=TRUE,
+     sep="\t")
> attach(hitting.data)
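For readers who do not have the data file at hand, the read step can be tried on a stand-in. The sketch below is only an illustration: the data frame toy and its values are invented (they are not the baseball-reference.com figures), and the file is written to a temporary path rather than named "batting.history.txt". It also shows with() as one possible alternative to attach; that is our assumption about style, not the approach taken in the text.

```r
# Hypothetical stand-in for batting.history.txt: the values are invented
# for illustration and are NOT the real baseball-reference.com data.
toy <- data.frame(Year = c(1900, 1950, 2000),
                  HR   = c(0.10, 0.80, 1.10))

# Write a tab-delimited file with a header row to a temporary path.
tmp <- tempfile(fileext = ".txt")
write.table(toy, tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# Same call pattern as in the text: header row, tab separator.
hitting.toy <- read.table(tmp, header = TRUE, sep = "\t")
str(hitting.toy)

# with() evaluates an expression using the data frame's columns
# without attaching the data frame to the search path.
mean.hr <- with(hitting.toy, mean(HR))
```

Using with() keeps the variable lookup local to one expression, which can avoid name clashes when several data frames share column names.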

J. Albert and M. Rizzo, R by Example, Use R, DOI 10.1007/978-1-4614-1365-3__4, © Springer Science+Business Media, LLC 2012


The variables Year and HR contain, respectively, the baseball season and the number of home runs per game per team. We construct a time series plot of home runs against season using the plot function; the result is shown in Figure 4.1.

> plot(Year, HR)
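As a small sketch of how the arguments of plot map to the axes, the following uses a hypothetical data frame toy with invented values (not the real home run data). It shows the positional form used in the text alongside the formula form, which looks up the variables in a data frame and so does not require attach.

```r
# Hypothetical values for illustration only (not the real data).
toy <- data.frame(Year = seq(1880, 2000, by = 20),
                  HR   = c(0.1, 0.2, 0.4, 0.5, 0.8, 0.7, 1.1))

# Positional form: the first argument goes on the horizontal axis,
# the second on the vertical axis.
plot(toy$Year, toy$HR)

# Formula form: response ~ predictor, with columns looked up in `data`.
plot(HR ~ Year, data = toy)
```

By default the axis labels are taken from the expressions supplied, so the first call labels the axes toy$Year and toy$HR while the second uses Year and HR.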

Generally we see that home run hitting has increased over time, although there are more subtle patterns of change that we’ll notice in the following sections.

[Figure 4.1: scatterplot of HR (vertical axis, ticks from 0.2 to 1.2) against Year (horizontal axis, ticks from 1880 to 2000).]

Fig. 4.1 Display of the average number of home runs hit per game per team through all of the years of Major League Baseball.

4.2 Labeling the Axes and Adding a Title

To communicate a statistical graphic, it is important that the horizontal and vertical scales are given descriptive labels. In addition, it is helpful to give a graphic a good title so the reader understands the purpose of the graphical dis