Comparing Independent Group and Paired t-tests using SAS PROC TTEST
See www.stattutorials.com/SASDATA for files mentioned in this tutorial. © TexaSoft, 2007
These SAS statistics tutorials briefly explain the use and interpretation of standard statistical analysis techniques for Medical, Pharmaceutical, Clinical Trials, Marketing or Scientific Research. The examples include how-to instructions for SAS Software.
Comparing Independent Group and Paired t-tests
It is not uncommon for researchers to perform an incorrect t-test when comparing two “groups.” The correct t-test depends on how the data are observed, that is, on the design of the experiment.
Independent Samples: When data are collected on subjects who are divided (ideally at random) into two groups, the design is called an independent or parallel study. That is, the subjects in one group (treatment, etc.) are different from the subjects in the other group. These data may be analyzed using an independent group t-test (sometimes called an independent samples t-test or parallel test). This version of the t-test tests the following null hypothesis (two-sided):
H0: μ1 = μ2 (means of the two groups are equal)
Ha: μ1 ≠ μ2 (means are not equal)
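For reference, the pooled (equal-variance) form of the test statistic behind these hypotheses can be written as below. The symbols are notation assumed here and do not appear in the tutorial: \bar{x}_1 and \bar{x}_2 are the sample means, s_1 and s_2 the sample standard deviations, n_1 and n_2 the group sizes, and s_p the pooled standard deviation.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
\mathrm{df} = n_1 + n_2 - 2
```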
Dependent Samples: When data are collected twice on the same subjects (or on matched subjects), the proper analysis is a paired t-test (also called a dependent samples t-test). In this case, subjects may be measured in a before-and-after fashion, or in a design where one treatment is administered for a time, there is a washout period, and then another treatment is administered (in random order for each subject). Or, data might be measured on the same individual in two areas, such as one treatment in one eye and another treatment in the other eye (or leg, or arm, etc.). In these cases the measurement of interest is the difference between the first and second measure. Thus, the null hypothesis (two-sided) is:
H0: μdifference = 0 (The average difference is 0)
Ha: μdifference ≠ 0 (The average difference is not 0)
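As a sketch of what the paired test computes, the statistic is a one-sample t on the within-subject differences. The notation is assumed here, not taken from the tutorial: d_i is the difference for subject i, \bar{d} the mean of the differences, s_d their standard deviation, and n the number of pairs.

```latex
t = \frac{\bar{d}}{s_d / \sqrt{n}},
\qquad
\mathrm{df} = n - 1
```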
Why it makes a difference: Performing an incorrect t-test on your data can cause you to miss a significant difference when one exists. As an example, consider the data from a paper by Raskin and Unger (1978) in which four diabetic patients were used to compare the effects of insulin infusion regimens. One treatment was insulin and somatostatin (IS) and the other treatment was insulin, somatostatin and glucagon (ISG). Each subject was given each treatment, with a washout period between treatments. The data follow:
| Patient Number | Treatment IS | Treatment ISG | Difference |
|---|---|---|---|
| 1 | 14 | 17 | 3 |
| 2 | 6 | 8 | 2 |
| 3 | 7 | 11 | 4 |
| 4 | 6 | 9 | 3 |
| Mean | 8.25 | 11.25 | 3.0 |
| S.E.M. | 1.9 | 2 | .40 |
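The Mean and S.E.M. rows can be checked with a short SAS step. This is only a sketch: the data set name diabetic_check and the variables patient and diff are hypothetical names chosen for this illustration, and the difference is taken as ISG minus IS to match the sign used in the table. The results should agree with the summary rows approximately, allowing for rounding in the published table.

```sas
/* Hypothetical check data set: one row per patient */
data diabetic_check;
   input patient IS ISG;
   diff = ISG - IS;   /* the table reports ISG minus IS */
   datalines;
1 14 17
2 6 8
3 7 11
4 6 9
;
run;

/* N, mean and standard error of the mean for each column */
proc means data=diabetic_check n mean stderr maxdec=2;
   var IS ISG diff;
run;
```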
A paper by Thomas Louis (1984) looked at this data using both types of t-tests. The correct version of the t-test to use for this data set is the paired t-test since each patient is observed twice. However, it is all too common for researchers to compare the means 8.25 versus 11.25 using an independent group approach. To see how these approaches differ, consider how the two analyses would be performed in SAS.
Independent group analysis: The code to perform this analysis using an independent group t-test is: (PROCTTEST_IND.SAS)
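/* Long format: one row per measurement, labeled by treatment */
data diabetic;
input treatment $ urea;
datalines;
IS 14
IS 6
IS 7
IS 6
ISG 17
ISG 8
ISG 11
ISG 9
;
ODS HTML;
/* Independent group t-test comparing mean urea between treatments */
PROC TTEST;
CLASS TREATMENT;
VAR UREA;
RUN;
/* Side-by-side box plots of urea by treatment */
PROC BOXPLOT;
PLOT UREA*TREATMENT;
RUN;
ODS HTML CLOSE;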
You get the following output (only part of the output is shown here). (Remember that this is the incorrect t-test to analyze this data):
The first table shows that the two means differ by 3 (SAS reports the difference as IS minus ISG, 8.25 - 11.25 = -3), with a (pooled) standard error of 2.80.
| Variable | treatment | N | Mean | Std Dev | Std Err | Min | Max |
|---|---|---|---|---|---|---|---|
| urea | IS | 4 | 8.25 | 3.8622 | 1.93 | 6 | 14 |
| urea | ISG | 4 | 11.25 | 4.0311 | 2.02 | 8 | 17 |
| urea | Diff (1-2) | | -3 | 3.9476 | 2.80 | | |
Since the “Equality of variances” table below indicates that the variances can be assumed equal (p=.95), you perform the “Pooled/Equal” t-test, which gives a p-value of p=.32. (Not a statistically significant result.)
t-Tests

| Variable | Method | Variances | DF | t Value | Pr > \|t\| |
|---|---|---|---|---|---|
| urea | Pooled | Equal | 6 | -1.07 | 0.3238 |
| urea | Satterthwaite | Unequal | 5.99 | -1.07 | 0.3239 |
Equality of Variances

| Variable | Method | Num DF | Den DF | F Value | Pr > F |
|---|---|---|---|---|---|
| urea | Folded F | 3 | 3 | 1.09 | 0.9455 |
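The pooled t value reported above can be reproduced by hand from the summary statistics in the first output table. The arithmetic below is a sketch, with s_p standing for the pooled standard deviation (a symbol assumed here) and intermediate values rounded:

```latex
s_p = \sqrt{\frac{3\,(3.8622)^2 + 3\,(4.0311)^2}{6}} \approx 3.9476,
\qquad
t = \frac{8.25 - 11.25}{3.9476 \sqrt{\tfrac{1}{4} + \tfrac{1}{4}}}
  \approx \frac{-3}{2.8} \approx -1.07,
\qquad
\mathrm{df} = 4 + 4 - 2 = 6
```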
Furthermore, a comparative box plot shows a lot of overlap between the two groups.
This independent group analysis is NOT the correct analysis. This graph, by the way, is also misleading and not appropriate for a paired analysis.
Since the data in this example are paired you should instead do the PAIRED version of the t-test.
Paired t-test analysis: The appropriate analysis for this data is a paired t-test. The calculations for this test can be performed using the following SAS code (PROCTTEST_PAIRED.SAS):
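/* Wide format: one row per patient, one column per treatment */
data diabetic;
input IS ISG;
datalines;
14 17
6 8
7 11
6 9
;
ODS HTML;
/* Paired t-test on the within-patient differences (IS minus ISG) */
PROC TTEST;
PAIRED IS*ISG;
RUN;
ODS HTML CLOSE;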
The (partial) output is as follows. Note that the analysis is performed on the mean of the differences (-3, with 95% confidence limits -4.299 to -1.701) and that the standard error of the difference is 0.41, much less than the standard error (2.80) in the previous analysis.
| Difference | N | Lower CL Mean | Mean | Upper CL Mean | Lower CL Std Dev | Std Dev | Upper CL Std Dev | Std Err |
|---|---|---|---|---|---|---|---|---|
| IS - ISG | 4 | -4.299 | -3 | -1.701 | 0.4625 | 0.8165 | 3.0443 | 0.4082 |
The paired t-test yields p=0.005, which is statistically significant.
T-Tests

| Difference | DF | t Value | Pr > \|t\| |
|---|---|---|---|
| IS - ISG | 3 | -7.35 | 0.0052 |
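The t value in this table follows directly from the paired output above. As a worked check, with \bar{d} and s_d denoting the mean and standard deviation of the four differences (notation assumed here):

```latex
\bar{d} = -3, \quad s_d = 0.8165, \quad n = 4
\;\Rightarrow\;
t = \frac{-3}{0.8165 / \sqrt{4}} = \frac{-3}{0.4082} \approx -7.35,
\qquad
\mathrm{df} = 4 - 1 = 3
```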
The paired t-test found significance where the independent group t-test on the same data did not because the paired analysis matches the design of the study. Each patient serves as his or her own control, so the test uses the much smaller standard error of the mean difference rather than the pooled standard error.
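One way to see that the paired test works on the differences is to compute them explicitly and run a one-sample t-test against zero. The following is a minimal sketch, not part of the original tutorial files: the data set name diabetic_diff and the variable diff are hypothetical, and the difference is taken as IS minus ISG so that the sign matches the PAIRED IS*ISG output. Under these assumptions it should give the same t value and p-value as the paired analysis.

```sas
/* Same four patients as in the paired example, with the difference computed */
data diabetic_diff;
   input IS ISG;
   diff = IS - ISG;   /* same direction as PAIRED IS*ISG */
   datalines;
14 17
6 8
7 11
6 9
;
run;

/* One-sample t-test of H0: mean difference = 0 */
proc ttest data=diabetic_diff h0=0;
   var diff;
run;
```

The PAIRED statement remains the more convenient form, since it does not require creating the difference variable yourself.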
In his paper, Louis explains that to achieve the power of this paired t-test, an independent group t-test (parallel test) would require 14 times as many subjects. Thus, when the model is appropriate, the paired t-test can be a more powerful way to analyze your data. On the other hand, if you use a paired analysis on independent group data you will get incorrect and misleading results. Therefore, carefully consider how your experiment is designed before you select which t-test to perform.
References:
Louis TA, Lavori PW, Bailar JC, Polansky M (1984). Crossover and self-controlled designs in clinical research. N Engl J Med 310:24-31.
Raskin P, Unger RH (1978). Hyperglucagonemia and its suppression: importance in the metabolic control of diabetes. N Engl J Med 299:433-6.
End of tutorial
See http://www.stattutorials.com/SAS


