stattutorials.com
Statistics Tutorials
for SAS, SPSS, WINKS, Excel, and R



Graphing Measurement (Quantitative) data using R - Histograms

 

 

These R statistics tutorials briefly explain the use and interpretation of standard statistical analysis techniques for medical, pharmaceutical, clinical trials, marketing, or scientific research. The examples include how-to instructions for R software. Although there are millions of R users around the world, there is a substantial learning curve involved in mastering the program. These tutorials are an introduction to R statistical software that can be used in an applied statistics course or as a self-paced tutorial.

If you have suggestions, or if you encounter errors in any of these tutorials, please contact us.

See www.stattutorials.com/RDATA for files mentioned in these tutorials, © TexaSoft, 2007-11. All rights reserved.


Basic Histograms in R

This tutorial illustrates some of the ways you can create histograms from numeric data, to describe the distribution of a set of data.

(This tutorial uses the raw data file CARSMPG.CSV. Download this file here. The program assumes the data file is in the folder C:\RDATA. Make changes to the R program if you save the file to a different folder.) See also Import Data into R from Excel

The following code reads the CSV file into a data frame named cars, then uses the hist() function to draw a histogram of the HWYMPG (highway miles per gallon) variable.

cars<-read.csv(file="C:\\RDATA\\CARSMPG.CSV",header=TRUE,sep=",")
hist(cars$HWYMPG)

Or, to avoid having to type cars$ in front of each variable name, you can first use the attach() function.

attach(cars)
hist(HWYMPG)

Either way will create the following basic histogram:

R Histogram
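As a hedged aside, attach() leaves the data frame on the search path until it is detached, so some users prefer with() for one-off calls. The sketch below shows the three approaches side by side. The data frame demo and its values are made up for illustration (they are not taken from CARSMPG.CSV) so that the code runs without the data file.

```r
# Made-up stand-in for the cars data; these values are NOT from CARSMPG.CSV
demo <- data.frame(HWYMPG = c(22, 25, 27, 30, 31, 33, 35, 24, 28, 29))

# The same histogram three ways: $ notation, with(), and attach()/detach()
h1 <- hist(demo$HWYMPG)
h2 <- with(demo, hist(HWYMPG))
attach(demo)
h3 <- hist(HWYMPG)
detach(demo)   # remove demo from the search path when finished

# hist() also returns (invisibly) the bin breaks and counts that it plotted
print(h1$breaks)
print(h1$counts)
```

All three calls bin the data identically; only the way the variable is looked up differs.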

Adding Labels to the histogram

Use the xlab= and ylab= options to define the x-axis and y-axis labels, and the main= option to define a title. You can also control the limits of the x-axis with the xlim= option, as shown in the example.

hist(cars$HWYMPG, xlab="Highway Miles Per Gallon", main="2005 Cars Database", xlim=c(0,60), ylab="Count")

R Histogram with labels
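One point that may be worth checking for yourself: xlim= changes only the range of the axis that is drawn, not how the data are binned. The following minimal sketch compares the breaks with and without xlim=. The vector mpg is made-up illustrative data, not values from CARSMPG.CSV.

```r
# Made-up highway MPG values for illustration only
mpg <- c(22, 25, 27, 30, 31, 33, 35, 24, 28, 29)

# Default axis range
h_default <- hist(mpg)

# Same data with labels, a title, and a wider x-axis
h_labeled <- hist(mpg, xlab = "Highway Miles Per Gallon", ylab = "Count",
                  main = "Made-up MPG data", xlim = c(0, 60))

# The bins are the same in both plots; only the drawn axis differs
print(identical(h_default$breaks, h_labeled$breaks))
```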

Plot a subset of the data

By defining the variable as

cars$HWYMPG[cars$SUV=="1"]

you limit the data to the rows of the cars dataset where SUV equals 1, so the histogram includes only those vehicles.

hist(cars$HWYMPG[cars$SUV=="1"], xlab="Highway Miles Per Gallon", main="SUVs Only", xlim=c(10,40),ylab="Count")

 

R Histogram Subset
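The subsetting step can be tried on its own before plotting. This is a minimal sketch on a made-up data frame (demo and its values are hypothetical, not from CARSMPG.CSV) showing that the bracket notation and the subset() function select the same values.

```r
# Made-up data: HWYMPG values with a 0/1 SUV indicator
demo <- data.frame(
  HWYMPG = c(22, 25, 27, 30, 31, 33, 35, 24, 28, 29),
  SUV    = c(1, 1, 0, 0, 0, 0, 0, 1, 0, 1)
)

# A logical vector that is TRUE for the SUV rows
is_suv <- demo$SUV == 1

# Two equivalent ways to pull out HWYMPG for SUVs only
suv_mpg  <- demo$HWYMPG[is_suv]
suv_mpg2 <- subset(demo, SUV == 1)$HWYMPG
print(suv_mpg)

# Histogram of the subset
hist(suv_mpg, xlab = "Highway Miles Per Gallon",
     main = "SUVs Only (made-up data)", ylab = "Count")
```

Comparing a numeric column with the string "1", as in the tutorial code, gives the same result because R converts the number to text before comparing.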

 

Add a normal curve to a histogram

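To add a normal curve to the plot, first calculate the mean and standard deviation of the data. Two hist() options are used in this example:

prob=TRUE plots densities rather than counts, so the total area of the bars equals 1 and the histogram is on the same scale as the normal curve
density=20 sets the shading density for the bars, in lines per inch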

The statement

curve(dnorm(x, mean=m, sd=std),add=TRUE) 

defines the curve that is superimposed on the graph. The dnorm() function returns values of the normal probability density function with the calculated mean and standard deviation, and the add=TRUE option causes the curve to be added to the existing plot. The density= option of hist() sets the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines.

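# Mean and standard deviation of highway MPG
m<-mean(cars$HWYMPG)
std<-sd(cars$HWYMPG)
# Density-scaled histogram with shaded bars
hist(cars$HWYMPG, density=20, prob=TRUE,
main="Histogram with normal curve")
# Overlay the fitted normal density
curve(dnorm(x, mean=m, sd=std), add=TRUE)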

R Histogram Normal Curve
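To see why prob=TRUE is needed for the overlay, the sketch below checks that the bars of a density-scaled histogram have a total area of 1, which is the same scale as dnorm(). It uses simulated values (the vector mpg, the seed, and the mean and sd passed to rnorm() are arbitrary assumptions, not the CARSMPG data).

```r
# Simulated stand-in for highway MPG; not the CARSMPG data
set.seed(1)
mpg <- rnorm(200, mean = 28, sd = 5)

m   <- mean(mpg)
std <- sd(mpg)           # same value as sqrt(var(mpg))

# Density-scaled histogram, then the normal curve on top
h <- hist(mpg, prob = TRUE, density = 20,
          main = "Histogram with normal curve (simulated data)")
curve(dnorm(x, mean = m, sd = std), add = TRUE)

# Total bar area = sum of (bar height * bar width); should be 1 up to rounding
area <- sum(h$density * diff(h$breaks))
print(area)
```

Without prob=TRUE the bars would show counts, and the normal curve would appear nearly flat along the bottom of the plot.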

Create a plot lattice and define value labels for the SUV variable

# If needed, read in the data
cars<-read.csv(file="C:\\RDATA\\CARSMPG.CSV",header=TRUE,sep=",")

# Define value labels for the SUV variable (0 = Non-SUV, 1 = SUV)
cars$SUV <- factor(cars$SUV,levels = c(0,1),labels = c("Non-SUV", "SUV"))
# Draw one histogram panel per SUV group
library(lattice)
histogram(~ HWYMPG | SUV, data=cars, type="count", col="red")

R histogram lattice
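The relabeling step can be checked before plotting. This is a minimal sketch on a made-up data frame (demo and its values are hypothetical, not from CARSMPG.CSV). The lattice call is wrapped in a check so the sketch still runs if the lattice package happens not to be installed.

```r
# Made-up data: HWYMPG values with a 0/1 SUV indicator
demo <- data.frame(
  HWYMPG = c(22, 25, 27, 30, 31, 33, 35, 24, 28, 29),
  SUV    = c(1, 1, 0, 0, 0, 0, 0, 1, 0, 1)
)

# Convert the 0/1 codes to a labeled factor
demo$SUV <- factor(demo$SUV, levels = c(0, 1), labels = c("Non-SUV", "SUV"))

# Count the rows in each group to confirm the labels were applied
print(table(demo$SUV))

# One histogram panel per group, if lattice is available
if (requireNamespace("lattice", quietly = TRUE)) {
  p <- lattice::histogram(~ HWYMPG | SUV, data = demo,
                          type = "count", col = "red")
  print(p)
}
```

Note that once SUV has been converted to a factor, a comparison such as SUV=="1" no longer matches any rows; compare against the label instead, for example SUV=="SUV".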

For more information on the hist() function, type ?hist at the R prompt.
For more information on the curve() function, type ?curve at the R prompt.

 

- End of Tutorial -

See also...

Part 1: Describing and Examining Measurement (Quantitative) data using R

Part 3: Performing a statistical test to assess normality

Part 4: Examining data by group.

 

(c) Alan C. Elliott, 2011