SAS, SPSS, WINKS, Excel
Excel Tutorials Index
For more professional
WINKS Statistical Software
Affordable. Reliable. Relevant.
Analyze with WINKS.
Double Data Entry Using
These tutorials briefly explain the use
and interpretation of standard statistical analysis techniques. The
examples include how-to instructions for Excel. Although there are
different version of Excel in use, these should work about the same for
most recent versions. They also assume that you have installed the Excel
Analysis Pak which is free and comes with Excel (Go to Tools,
Addins... if it is not already installed in your version of Excel.)
for files mentioned in this tutorial,
© TexaSoft, 2008-2010.
Double Data Entry Example
The gold standard for professional data entry is to enter data not once, but twice. The two data sets are then compared, differences are examined and corrections are made. This is a “good practice” technique that will help you enter your data with fewer errors. An IBM programmer, Wilf Hey, coined a term in 1966; “Garbage in- Garbage out.” Don’t enter garbage data. If you want any analysis of your data to be valid, your data itself must be valid. Although there are specific program designed for data entry, the ubiquitous Excel provides many data analysts a quit and cheap alternative. If you choose to use Excel, at least do it correctly. This example shows you how.
To use the Excel double data entry method, create two identical blank data entry spreadsheets, with only the labels at the top of the column as shown in the figure below. For this example, the first worksheet will be in the tab Sheet1 and the second (identical sheet) in Sheet2. Preparing the spreadsheet for data entry is discussed in the referenced paper by Elliott et al (2007).
Data should then be entered into the spreadsheets by two different people. If it is impossible to use two different people, at least enter the data at two different sessions. Data records must be entered in the same order, so in this example, each person (ID) should be entered in order on the rows starting with row number 2. An example spreadsheet (Sheet1) with data enterd is shown below.
For this example the following information is entered into Sheet2. You may notice that there are a few typos.
Once the data are entered into both spreadsheets, the procedure described here can be used to compare the two spreadsheets for differences.
If the two spreadsheets containing the entered data are not in the same worksheet file, copy the second spreadsheet and paste it into Sheet2 of the original worksheet. For this example to work, the spreadsheets must be in the same workbook, one in Sheet1 and the other in Sheet2.
To compare these two spreadsheets follow these steps:
- If there is not already a Sheet3 tab at the bottom of your worksheet, insert a third worksheet (Sheet3) by right-clicking on the Sheet1 tab and selecting Insert. Name the spreadsheet Sheet3.
- Copy the labels (row 1) from the Sheet1 worksheet to the Sheet3 worksheet.
- In Sheet3 place the cursor in cell A2 and (carefully) enter the following Excel formula:
Briefly, this formula uses Excel’s “IF” statement to compare the entry in the first spreadsheet SHEET1!A2 to the entry in the second sheet SHEET2!A2 for an exact match. If the match is exact, the IF statement puts a 0 (zero) in the cell. If the two cells do not match, the SHEET1!A2&"/"&SHEET2!A2 statement puts the information from sheet1 followed by a slash and the information from sheet2 in the cell. You’ll see how this works as you continue the example…
- Copy the Excel formula to all cells to compare in your worksheets (in this case from A2 to F5). One method of copying this formula in Excel is to place the cursor in cell A2 and press CTRL-C (Copy). Then highlight the cells from A2 to F5 and press CTRL-V (Paste). This copies the formula to all of the specified cells. The “difference” spreadsheet (Sheet3) looks like the one illustrated below.
- Notice the cells in the difference spreadsheet. Cells containing a 0 indicate a match. If the values between the two spreadsheet do not match, the cell displays the actual data values from the two sheets displayed so that the differences are apparent. For example, the digits for AGE in cell B3 are reversed on the two sheets (32 versus 23). Notice in the date comparison in cell D4 that date codes (33429/33430) are displayed rather than actual dates. Since these numbers are one digit apart it means that the dates on Spread1 and Spread2 are one day apart. The original spreadsheet contains the date as July 10, 1974 and the other spreadsheet contains it as July 11, 1974.
Using the information in the difference spreadsheet, you should refer to your original data entry documents. You or an arbitrator must then decide which of the data entries are correct. Make corrections as needed. Once you have verified that the two spreadsheets are identical, all cells in the difference spreadsheet will display a 0.
Once your data is verified, you should have a data set with an improved level of accuracy than if it had been entered only once. This technique will, of course, not pick up errors that were written down on original documents, or if the same error in data entry was made in both spreadsheets. For further information on preparing data in Excel for data analysis, refer to Elliott (2007).
- English Dictionary. Retrieved July 18, 2005 from http://www.english-dictionary.us/meaning/wilf_hey.asp.
- Elliott, AC, Hynan LS, Reisch JS, Smith J (2007) “Preparing Data for Analysis using Microsoft Excel,” Journal of Investigative Medicine, pp 334-341.
An example Excel worksheet showing this technique is at www.stattutorials.com/EXCELDATA
For more tutorials see
For more information... we recommend:
- WINKS -- a simple to use and affordable statistical software program that will help you analyze, interpret and write-up your results. Download a free trial copy. WINKS reads Excel files, so you can read your Excel data into WINKS and do many more types of analysis than in Excel alone.
- SAS Essentials: Mastering SAS for Research - provides an introduction to SAS statistical software, the premiere statistical data analysis tool for scientific research. Through its straightforward approach, the text presents SAS with step-by-step examples. SAS Essentials introduces a step-by-step approach to mastering SAS software for statistical data analysis. It's also a valuable reference tool for any researcher currently using SAS. Designed for those new to SAS and filled with illustrative examples, the book shows how to read, write and import data; prepare data for analysis; use SAS procedures; evaluate quantitative data; analyze counts and crosstabulation tables; and compare means using the t-test. The book also provides instruction and examples on analysis of variance, correlation and regression, nonparametric analysis, logistic regression, creating graphs, controlling outputs using ODS, as well as advanced topics in SAS programming.ISBN: 0470461292. Order from publisher -Jossey-Bass/Wiley. Barnes & Noble. Amazon.
- Against All Odds VIDEOS - Now in DVD format -- Teaching Videos from Annenberg/PBS --"This highly engaging primer on statistical methods and inference introduces the practical applications of statistics. Produced by the Consortium for Mathematics and Its Applications and Chedd-Angier." Click here for info
- BeSmartNotes Reference sheets for SAS, SAS ODS, SPSS and WINKS - Click here for info.
- Statistical Analysis Quick Reference Guidebook: With SPSS Examples is a practical "cut to the chase" handbook that quickly explains the when, where, and how of statistical data analysis as it is used for real-world decision-making in a wide variety of disciplines. It contains examples using SPSS Statistics software. In this one-stop reference, the authors provide succinct guidelines for performing an analysis, avoiding pitfalls, interpreting results, and reporting outcomes. Paperback. Sage Publishers ISBN: 1412925606 Order book from Amazon