Two-sample t-test Using SPSS
See www.stattutorials.com/SPSSDATA for files mentioned in this tutorial. © TexaSoft, 2008
These SPSS statistics tutorials briefly explain the use and interpretation of standard statistical analysis techniques for Medical, Pharmaceutical, Clinical Trials, Marketing or Scientific Research. The examples include how-to instructions for SPSS Software.
This example is adapted from information in Statistical Analysis Quick Reference Guidebook (2007).
The two-sample (independent groups) t-test is used to determine whether the unknown means of two populations are different from each other based on independent samples from each population. If the two sample means are sufficiently different from each other, then the population means are declared to be different.
The samples for a two-sample t-test can be obtained from a single population that has been randomly divided into two subgroups with each subgroup subjected to one of two treatments (e.g., two medications) or from two separate populations (e.g., male and female). In either case, for the two-sample t-test to be valid, it is necessary that the two samples are independent, i.e., unrelated to each other.
The characteristics of the t-tests in the above examples are:
 A two-sample t-test compares means: In an experiment designed to use the two-sample t-test, you want to compare means from a quantitative variable such as height, weight, amount spent, or grade. In other words, it should make sense to calculate the mean of the observations.
 You are comparing independent samples: The two groups contain subjects (or objects) that are not paired or matched in any way.
 The t-test assumes normality: A standard assumption for the t-test to be valid when you have small sample sizes is that the outcome variable measurements are normally distributed.
 Are the variances equal? Another consideration that should be addressed before using the t-test is whether the population variances can be considered to be equal.
The two-sample t-test is robust against moderate departures from the normality and equal-variance assumptions, but independence of samples must not be violated.
Hypotheses for a two-sample t-test
Typical hypotheses for the comparison of the means in a two-sample t-test are as follows:
$H_0: \mu_1 = \mu_2$ (In words: The population means of the two groups are the same.)
$H_a: \mu_1 \neq \mu_2$ (The population means of the two groups are different.)
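One-tailed tests: If your experiment is designed so that you are only interested in detecting whether one mean is larger than the other, you may choose to perform a one-tailed (sometimes called one-sided) t-test. Since SPSS always reports a two-tailed p-value, you must modify the reported p-value to fit a one-tailed test by dividing it by 2. This halving applies only when the observed difference between the sample means is in the direction stated in your alternative hypothesis; if it is in the opposite direction, the one-tailed p-value is 1 minus half the reported value.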
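The conversion from a two-tailed to a one-tailed p-value can be sketched in a few lines of Python. This is only an illustration: the function name one_tailed_p and its arguments are hypothetical, the numbers in the usage lines are made up, and nothing here is part of SPSS.

```python
def one_tailed_p(two_tailed_p, observed_diff, expect_positive):
    """Convert a two-tailed p-value to a one-tailed p-value.

    two_tailed_p    -- the two-tailed p-value reported by the software
    observed_diff   -- sample mean of group 1 minus sample mean of group 2
    expect_positive -- True if the alternative hypothesis is mu1 > mu2,
                       False if it is mu1 < mu2
    """
    in_expected_direction = (observed_diff > 0) == expect_positive
    if in_expected_direction:
        # The observed difference supports the alternative: halve the p-value.
        return two_tailed_p / 2
    # The observed difference points the other way: the one-tailed
    # p-value is large, not small.
    return 1 - two_tailed_p / 2


# Hypothetical numbers, for illustration only.
p_same_direction = one_tailed_p(0.08, observed_diff=2.5, expect_positive=True)
p_opposite_direction = one_tailed_p(0.08, observed_diff=-2.5, expect_positive=True)
print(p_same_direction, p_opposite_direction)
```

The direction of the test should be chosen before looking at the data; the function only applies that choice to the reported value.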
Example: Two-sample t-test with equal variances
Describing the problem: A researcher wants to know whether one fertilizer (Brand 1) causes plants to grow faster than another brand of fertilizer (Brand 2). Starting with seeds, he grows plants in identical conditions and randomly assigns fertilizer “Brand 1” to 7 plants and fertilizer “Brand 2” to 6 plants. The outcome measurement is the height of each plant after 3 weeks of growth. The data are shown in the following table.
Fertilizer 1    Fertilizer 2
51.0 cm         54.0 cm
53.3            56.1
55.6            52.1
51.0            56.4
55.5            54.0
53.0            52.9
52.1
Since either fertilizer could be superior, a two-sided t-test is appropriate. The hypotheses for this test are (in words):
Null Hypothesis: The mean growth heights of the plants using the two different fertilizers are the same.
Alternative Hypothesis: The mean growth heights of the plants using the two fertilizers are different.
Arranging the data for analysis: To perform the analysis for the fertilizer data using most statistical software programs (including SPSS), you must set up the data using two variables: a classification or group code and an observed (outcome/response) variable. The data are therefore set up as follows:
 Select a grouping code to represent the two fertilizer types. This code could be numeric (e.g., 1, 2) or text (e.g., A, B or BRAND1, BRAND2). For this example, use the grouping code named Type, where 1 represents Brand1 and 2 represents Brand2.
 Name the outcome variable. The outcome (response) variable is the observed height and is designated with the variable named Height.
 The grouping codes specify which observation belongs to which type of fertilizer. Thus, to set up the data for most statistics programs, place one observation per line, with each data line containing two variables: a fertilizer code (Type) and the corresponding response variable (Height).
The values 1 and 2 in the “type” column represent the two brands of fertilizer and the “height” variable is the outcome height measurement on the plants. (The codes 1 and 2 in this data set were arbitrarily selected. You could have used 0 and 1 or any other set of binary codes.)
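As a rough sketch of this one-observation-per-line layout, the following Python builds the Type/Height rows from the two columns of the table and renders them as comma-separated text. The variable names and the CSV rendering are assumptions made for illustration; the tutorial itself uses the FERTILIZER.SAV file.

```python
import csv
import io

# Heights (cm) from the table, one list per fertilizer brand.
fertilizer_1 = [51.0, 53.3, 55.6, 51.0, 55.5, 53.0, 52.1]
fertilizer_2 = [54.0, 56.1, 52.1, 56.4, 54.0, 52.9]

# One row per plant: (Type, Height), with Type 1 = Brand 1 and 2 = Brand 2.
rows = [(1, h) for h in fertilizer_1] + [(2, h) for h in fertilizer_2]

# Render the rows as CSV text with a header line.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Type", "Height"])
writer.writerows(rows)
long_format_csv = buffer.getvalue()

print(long_format_csv)
```

Each of the 13 plants ends up on its own line, which is the arrangement the grouping-variable dialog in the steps that follow expects.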
SPSS step-by-step: Two-sample t-test with equal variances
To run the two-sample t-test on the FERTILIZER.SAV data, follow these steps:
 Open the dataset FERTILIZER.SAV and select Analyze/Compare Means/Independent Samples T Test... (Or create the data set using the data shown above.)
 Select Height as the Test Variable and Type as the Grouping Variable.
 Click on the Define Groups button and define the group values as 1 and 2.
 Click Continue and then OK; the tables shown in Table 3.4 are displayed.
 To display the boxplot, select Graphs/Boxplot, choose Simple Boxplot, and then click Define. Select Height as the Variable and Type as the Category Axis.
The boxplot is a way to examine the assumptions of normality and equality of variances.
From the listing of the data and boxplot notice that the sample sizes are small with only 7 observations in group 1 and 6 observations in group 2. Also note that the distributions of both groups are relatively symmetric and the variances appear to be fairly similar. There is no evidence of any sizeable outliers. Here is the information in the SPSS output you need to perform the t-test.
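If you want to check the group summaries by hand, a minimal Python sketch using only the standard library is shown below. The dictionary layout and names are assumptions for illustration; the values are the heights from the table.

```python
import statistics

# Heights (cm) keyed by the Type code (1 = Brand 1, 2 = Brand 2).
heights = {
    1: [51.0, 53.3, 55.6, 51.0, 55.5, 53.0, 52.1],
    2: [54.0, 56.1, 52.1, 56.4, 54.0, 52.9],
}

summary = {}
for brand, values in heights.items():
    summary[brand] = {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),      # sample SD (n - 1 denominator)
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }

for brand, s in summary.items():
    print(f"Brand {brand}: n={s['n']}, mean={s['mean']:.2f}, sd={s['sd']:.2f}, "
          f"min={s['min']}, median={s['median']}, max={s['max']}")
```

The means and standard deviations computed this way should agree, after rounding, with the group statistics SPSS prints alongside the test.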
First notice the results of the F-test (Levene’s test) for evaluating the equality of variance. The p-value is 0.79, which indicates that the variances are not significantly different. You now have two pieces of information that indicate the variances are similar (the boxplot and Levene’s test).
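For readers curious about what Levene’s test measures, here is a hedged Python sketch of the statistic. It assumes the mean-centered form of the test (absolute deviations from each group mean); the function name levene_w is hypothetical. The p-value would come from an F distribution with (1, 11) degrees of freedom, which the Python standard library does not provide, so only the statistic is computed.

```python
import statistics

groups = [
    [51.0, 53.3, 55.6, 51.0, 55.5, 53.0, 52.1],  # Brand 1
    [54.0, 56.1, 52.1, 56.4, 54.0, 52.9],        # Brand 2
]


def levene_w(groups):
    """Levene's statistic based on absolute deviations from group means."""
    # z holds |x - group mean| for every observation, group by group.
    z = []
    for g in groups:
        center = statistics.mean(g)
        z.append([abs(x - center) for x in g])

    n_total = sum(len(g) for g in z)
    k = len(z)
    z_group_means = [statistics.mean(g) for g in z]
    z_grand_mean = sum(sum(g) for g in z) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(g) * (m - z_grand_mean) ** 2
                  for g, m in zip(z, z_group_means))
    within = sum((v - m) ** 2
                 for g, m in zip(z, z_group_means) for v in g)
    return (n_total - k) / (k - 1) * between / within


w = levene_w(groups)
print(f"Levene W = {w:.3f}")
```

A statistic this close to zero means the typical spread around the group mean is nearly the same in both groups, which is in line with the large p-value reported above.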
Therefore, the appropriate t-test is the one that assumes equal variances. However, if you choose to go with the conservative approach, you will use the “Equal variances not assumed” t-test. In this case your final decision for the significance of the t-test would not be different.
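For reference, the equal-variance test is based on the standard pooled statistic below. The symbols are notation introduced here, not taken from the SPSS output: $\bar{x}_i$, $s_i^2$ and $n_i$ are the sample mean, sample variance and sample size of group $i$, and $s_p^2$ is the pooled variance estimate.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
\mathrm{df} = n_1 + n_2 - 2
```

With 7 and 6 plants in the two groups, the degrees of freedom are 7 + 6 - 2 = 11.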
The following information discusses methods of interpreting the output from the “Independent Samples Test” table.
Making a decision based on the p-value: The p-value for the equal variances t-test is p = 0.269. Since this p-value is greater than 0.05, the decision would be that there is no significant difference between the two groups. (Do not reject the null hypothesis.) Thus, there is not enough evidence to conclude that the mean heights are different.
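To see where this p-value comes from, the sketch below recomputes the equal-variance t statistic from the table data and approximates the two-tailed p-value by numerically integrating the Student’s t density. It uses only the Python standard library; the function names are hypothetical, and the Simpson’s-rule integration is a stand-in chosen for self-containment, not the method SPSS uses.

```python
import math
import statistics

brand1 = [51.0, 53.3, 55.6, 51.0, 55.5, 53.0, 52.1]
brand2 = [54.0, 56.1, 52.1, 56.4, 54.0, 52.9]


def pooled_t(x, y):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density over [0, |t|].

    steps must be even for Simpson's rule.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # P(0 < T < |t|)
    return 1 - 2 * central_half


t, df = pooled_t(brand1, brand2)
p = two_tailed_p(t, df)
print(f"t({df}) = {t:.3f}, two-tailed p = {p:.3f}")
```

The statistic is negative because the Brand 1 mean is the smaller of the two, and the p-value should land close to the 0.269 reported by SPSS.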
Making a decision based on the confidence interval: The 95% confidence intervals for the difference in means are given in the last two columns of Table 3.4. The interval associated with the assumption of equal variances is (-3.41 to 1.05) while the confidence interval when equal variances are not assumed is (-3.39 to 1.03). Since these intervals include 0 (zero) we again conclude that there is no significant difference between the means using either assumption regarding the variances. Thus, you would make the same decisions discussed in the p-value section above.
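The equal-variance interval can be reproduced by hand as a check. The sketch below is a minimal illustration using the table data; the critical value 2.201 is the tabulated 97.5th percentile of the t distribution with 11 degrees of freedom, hard-coded here as an assumption because the Python standard library has no t quantile function.

```python
import math
import statistics

brand1 = [51.0, 53.3, 55.6, 51.0, 55.5, 53.0, 52.1]
brand2 = [54.0, 56.1, 52.1, 56.4, 54.0, 52.9]

# Tabulated critical value t(0.975, df = 11); hard-coded assumption.
T_CRIT_975_DF11 = 2.201

n1, n2 = len(brand1), len(brand2)
diff = statistics.mean(brand1) - statistics.mean(brand2)

# Pooled variance and standard error of the difference in means.
pooled_var = ((n1 - 1) * statistics.variance(brand1)
              + (n2 - 1) * statistics.variance(brand2)) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

lower = diff - T_CRIT_975_DF11 * se
upper = diff + T_CRIT_975_DF11 * se
contains_zero = lower < 0 < upper

print(f"difference = {diff:.2f}, 95% CI = ({lower:.2f}, {upper:.2f}), "
      f"contains zero: {contains_zero}")
```

Because the interval straddles zero, it leads to the same decision as the p-value.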
Reporting the results of a (nonsignificant) two-sample t-test
The following sample write-ups illustrate how you might report this two-sample t-test in publication format. For purposes of illustration, we use the “Equal variance” t-test for the remainder of this example:
Narrative for the Methods Section:
“A two-sample Student’s t-test assuming equal variances using a pooled estimate of the variance was performed to test the hypothesis that the resulting mean heights of the plants for the two types of fertilizer were equal.”
Narrative for the Results Section:
“The mean heights of plants using the two brands of fertilizer were not significantly different, t(11) = -1.16, p = 0.27.”
or to be more complete,
“The mean height of plants using fertilizer Brand 1 (M = 53.07, SD = 1.91, N = 7) was not significantly different from that using fertilizer Brand 2 (M = 54.25, SD = 1.71, N = 6), t(11) = -1.16, p = 0.27.”
A description of the confidence interval would read:
“A 95% confidence interval on the difference between the two population means using a Student’s t distribution with 11 degrees of freedom is (-3.41, 1.05), which indicates that there is no significant evidence that the fertilizers produce different mean growth heights.”
For more details on the process, see Statistical Analysis Quick Reference Guidebook with SPSS Examples, Elliott and Woodward, Sage Publications.