stattutorials.com
Statistics Tutorials
for SAS, SPSS, WINKS, Excel, and R

Statistical
Video Training
DVDs

Against All
Odds VIDEOS

Teaching Videos from Annenberg/PBS.
A video instructional series on statistics for college and high school classrooms.
Special Pricing \$20 off!

WINKS Statistical Software
Affordable. Reliable. Relevant.
www.texasoft.com

## Import data into R from an Excel File

These R statistics tutorials briefly explain the use and interpretation of standard statistical analysis techniques for Medical, Pharmaceutical, Clinical Trials, Marketing or Scientific Research. The examples include how-to instructions for R Software. Although there are millions of R users around the world, there is a substantial learning curve involved in mastering the program.These tutorials are an introduction to using R statistical software that could be used in an applied statistics course or as your own self-paced tutorial.

## Import Excel Data into R

Creating and distributing data in an Excel file is common, and when you use this data in R, you must import it into R before using it in any analysis. This tutorial describes how to prepare your data in Excel and then how to import it into R.

Preparing an Excel File to Import into R

1. Row 1 of your Excel spreadsheet should contain variable names that are compatible with R naming conventions. Variable names should only be on row 1. Do not extend names to row 2. Here are some guidelines: (I’ve made these guidelines general in nature so that the Excel data could also be imported into SAS, SPSS or WINKS.)
1. Begin variable names with a letter.
2. Keep names short (no more than 64 characters.)
3. Do not use blanks or special characters in names (for example, !, ?, ', and *). Underscore is okay.
4. Variable names must be unique; duplication is not allowed.
5. Do not use typical keywords as names: ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, WITH.
6. Case matters in R. NAME is different from Name or name.
2. Each row (line) in the Excel spreadsheet (other than the variable row) should represent a single subject or observed entity.
3. Avoid blank rows – it will complicate your import and analysis
4. If you have missing data, do not leave that cell in Excel blank. Define a missing value code and place that code in any cell that contains missing data. Or use the code NA.
5. Always use date variables with 4 digit year formats. That is, enter date in Excel using the format 01/01/2005 and not 01/01/05. Otherwise the old Y2K problem can still be a problem for date calculations.
6. For more information, including how to do “best practice) double data entry in Excel, refer to the paper:

Importing an Excel file into R

Use an intermediate “.CSV” file. (Recommended.)

Step 1. Open your data file in Excel. For example, open the file CARSMPG.XLS (download it here). It is shown here in Execl 2007:

Step 2. Save the date as a .CSV (common separated variables) file. By choosing File, Save As…, Excel Workbook. In the “Files of Type” option on the Save dialog box, select the option “Comma delimited” .CSV. Save the file under the name CARSMPG.CSV.

The first few records in this file are as follows

BRAND,MODEL,MINIVAN,WAGON,PICKUP, ... etc...
TOYOTA,Prius,0,0,0,1,1.5,4,60,51,0,0,1
HONDA,Civic Hybrid,0,0,0,1,1.3,4,48,47,0,0,1
HONDA,Civic Hybrid,0,0,0,1,1.3,4,47,48,0,0,1
HONDA,Civic Hybrid,0,0,0,0,1.3,4,46,51,0,0,1
HONDA,Civic Hybrid,0,0,0,0,1.3,4,45,51,0,0,1
VOLKSWAGEN,Golf,0,0,0,0,1.9,4,38,46,0,0,0
VOLKSWAGEN,Jetta,0,0,0,0,1.9,4,38,46,0,0,0
VOLKSWAGEN,New Beetle,0,0,0,0,1.9,4,38,46,0,0,0

Note that the data are listed one record per row, with commas separating the data values. The first line contains the names of the varables (appbrviated here.)

Step 3. In R, use the function read.csv() to read the data into a data set. For example, if your data are in the Windows folder C:\MYRDATA, enter the following command at the R > prompt to import the data into an R data set named cars.

This imports the data into the cars dataset. To list the contents of the data set enter

> cars
BRAND                                MODEL MINIVAN WAGON PICKUP
1           TOYOTA                                Prius       0     0      0
2            HONDA                         Civic Hybrid       0     0      0
3            HONDA                         Civic Hybrid       0     0      0
4            HONDA                         Civic Hybrid       0     0      0
5            HONDA                         Civic Hybrid       0     0      0

etc…

Note: Instead of using the path as a part of the data file name, you can change the default path in R first using

setwd("C:/MYRDATA/")

then use the simpler command

To import the data.

For information on the xlsReadWrite routine, which allows you to read Excel files directly, go to the following link