RELATED MEASURES t TEST
Syntax
The syntax for the t.test() function is given
here from the help page in R.
## Default S3 method:
t.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, var.equal = FALSE,
conf.level = 0.95, ...)
## S3 method for class 'formula':
t.test(formula, data, subset, na.action, ...)
"S3" refers to version 3 of the S language, whose system of generic functions
and methods R inherited. In the case of the t.test() function, there are two
alternative syntaxes, the default syntax and the "formula" syntax. Both
syntaxes are relevant to the two-sample t-tests. The default syntax requires
two data vectors, "x" and "y", to be specified. To get the dependent measures
t-test, the option "paired=" must be set to TRUE, which is not the default.
The "alternative=" option is set by default to "two.sided" but can
be set to any of the three values shown above. The default null hypothesis is
"mu = 0", which in this case should be read as "mu1-mu2=0". This is usually what
we want, but doesn't have to be, and should be changed by the user if this is
not the null hypothesis. The rest is either irrelevant to this tutorial or can
be ignored for the moment.
The t Test With Two Dependent Groups
Dependent groups can be the same subjects used again (repeated measures), or
they can be matched samples. Either way, the t-test is performed on the
difference scores and amounts to little more than a single sample t-test. A
normal distribution of difference scores is strongly encouraged, unless the
sample is large enough that you can hide behind the central limit theorem and
claim a normal sampling distribution of means of the differences, in which case
the t-test is robust to violations of the normality assumption.
If you are doing the test on the difference scores, see the
Single Sample t Test tutorial. If the groups
are represented by two vectors (or columns in a data frame), then it is just
a matter of setting the "paired=" option in the
t.test() function to TRUE. One caution: If the data are two vectors in
your workspace, you need to remember that the scores are paired, i.e., score 1
in the "x" vector is paired with score 1 in the "y" vector, score 2 with score 2,
etc. The scores must be kept in the correct "paired" order in the two vectors.
> data(anorexia, package="MASS") # weight gain (lbs.) in anorexic women
> attach(anorexia)
> str(anorexia)
'data.frame': 72 obs. of 3 variables:
$ Treat : Factor w/ 3 levels "CBT","Cont","FT": 2 2 2 2 2 2 2 2 2 2 ...
$ Prewt : num 80.7 89.4 91.8 74 78.1 88.3 87.3 75.1 80.6 78.4 ...
$ Postwt: num 80.2 80.1 86.4 86.3 76.1 78.1 75.1 86.7 73.5 84.6 ...
> ft = subset(anorexia, subset=(Treat=="FT")) # just the family therapy treatment
> ft
Treat Prewt Postwt
56 FT 83.8 95.2
57 FT 83.3 94.3
58 FT 86.0 91.5
59 FT 82.5 91.9
60 FT 86.7 100.3
61 FT 79.6 76.7
62 FT 76.9 76.8
63 FT 94.2 101.6
64 FT 73.4 94.9
65 FT 80.5 75.2
66 FT 81.6 77.8
67 FT 82.1 95.5
68 FT 77.6 90.7
69 FT 83.5 92.5
70 FT 89.9 93.8
71 FT 86.0 91.7
72 FT 87.3 98.0
> detach(anorexia)
> rm(anorexia)
The anorexia data frame has been retrieved from the "MASS" package, and the
data corresponding to the Family Therapy treatment have been extracted in a new
data frame called "ft" (and we cleaned up after ourselves). A note: in the
subset() function, "subset=" is the second argument in the default syntax, so
this command could have been written somewhat more simply as:
ft = subset(anorexia, Treat=="FT"). The data frame does not have to be attached
to do this, because subset() evaluates the condition inside the data frame. With
ordinary bracket indexing, on the other hand, you must write
anorexia$Treat=="FT" unless the data frame is attached.
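As a quick check on the subsetting step, here is a minimal sketch. It assumes the MASS package is installed (it comes with standard R distributions), and the name ft2 is made up for the comparison. Nothing is attached at any point.

```r
# Load the data as in the tutorial; nothing is attached.
data(anorexia, package = "MASS")

# subset() evaluates the condition inside the data frame.
ft <- subset(anorexia, Treat == "FT")

# Bracket indexing needs the data frame name spelled out.
ft2 <- anorexia[anorexia$Treat == "FT", ]

nrow(ft)             # 17 rows, matching the listing above
all.equal(ft, ft2)   # checks that both approaches select the same rows
```

Note the double equals sign. A single = inside subset() is not a comparison, and it will not select the rows you want.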
At this point, "ft" is set up in such a way that a test on the difference
scores would be easy.
> t.test(Postwt-Prewt, mu=0, data=ft, alternative="greater")
Error in t.test(Postwt - Prewt, mu = 0, data = ft, alternative = "greater") :
object "Postwt" not found
EXCEPT there is no "data=" option unless we are using the formula interface.
Drat!
> with(ft, t.test(Postwt-Prewt, mu=0, alternative="greater")) # we could also attach(ft)
One Sample t-test
data: Postwt - Prewt
t = 4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
4.233975 Inf
sample estimates:
mean of x
7.264706
The null hypothesis is rejected at any reasonable alpha level. The 95% CI tells
us the true mean difference is 4.23 or more with 95% confidence, and the sample mean
difference is reported as 7.26 lbs. I.e., women receiving family therapy for
anorexia gained, on average, 7.26 pounds during the treatment period. If you
want a different confidence level, set that with the "conf.level=" option.
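For the curious, the one-sided lower bound can be reproduced by hand. This is only a sketch, assuming the MASS package is installed; d, n, se, lower and lower99 are names made up here. With alternative="greater", the interval runs from the mean difference minus a t quantile times the standard error, up to infinity.

```r
data(anorexia, package = "MASS")
ft <- subset(anorexia, Treat == "FT")

d  <- ft$Postwt - ft$Prewt       # difference scores
n  <- length(d)                  # 17 pairs
se <- sd(d) / sqrt(n)            # standard error of the mean difference

# One-sided 95% bound: all 5% of the error goes in one tail.
lower <- mean(d) - qt(0.95, df = n - 1) * se
lower                            # about 4.234, as in the output above

# A different confidence level changes only the quantile.
lower99 <- mean(d) - qt(0.99, df = n - 1) * se
lower99                          # a smaller (more cautious) lower bound
```

The second calculation should agree with what conf.level=0.99 reports in t.test().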
The same result will be obtained from the dependent t-test.
> with(ft, t.test(Postwt, Prewt, paired=T, alternative="greater"))
Paired t-test
data: Postwt and Prewt
t = 4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
4.233975 Inf
sample estimates:
mean of the differences
7.264706
In the first version of the test, the single-sample version, we let R do the
subtraction to get the difference scores right inside the
t.test()
function. No use in creating a new data object that we will just have to discard
anyway. In the second version, we listed the two vectors individually with a
comma between them. This works as long as Xi in the first
vector corresponds to Yi in the second vector, and as long as
the "paired=T" option is set. (Otherwise, we will get an independent t-test.)
Notice that R subtracts the second group listed from the first group listed.
This is why the alternative was set to "greater" in this case, since we expect
the patients to gain weight during the treatment period.
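The equivalence of the two versions can be checked directly. This is a minimal sketch, assuming the MASS package is installed; the object names (t_by_hand, one_sample, paired, swapped) are made up for illustration. It computes t from the difference scores by hand, compares it with both calls, and shows what happens when the vectors are listed in the other order.

```r
data(anorexia, package = "MASS")
ft <- subset(anorexia, Treat == "FT")

d <- ft$Postwt - ft$Prewt                       # first vector minus second
t_by_hand <- mean(d) / (sd(d) / sqrt(length(d)))

one_sample <- t.test(d, mu = 0, alternative = "greater")
paired     <- t.test(ft$Postwt, ft$Prewt, paired = TRUE, alternative = "greater")

# All three agree (t is about 4.185 with df = 16).
c(by_hand    = t_by_hand,
  one_sample = unname(one_sample$statistic),
  paired     = unname(paired$statistic))

# Listing the vectors in the other order flips the sign of t,
# so the alternative has to flip along with it.
swapped <- t.test(ft$Prewt, ft$Postwt, paired = TRUE, alternative = "less")
swapped$p.value                                 # same p-value as before
```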
A formula interface is also available for instances where the data frame is
arranged in long form. Our current data frame is very simple. It contains two
columns of scores representing the two times at which patients were
measured (and a useless Treat column). Getting such a data frame into long form
is relatively easy.
> ft$Treat = NULL # delete unnecessary columns
> ft.long = stack(ft)
> ft.long
values ind
1 83.8 Prewt
2 83.3 Prewt
3 86.0 Prewt
4 82.5 Prewt
5 86.7 Prewt
6 79.6 Prewt
7 76.9 Prewt
8 94.2 Prewt
9 73.4 Prewt
10 80.5 Prewt
11 81.6 Prewt
12 82.1 Prewt
13 77.6 Prewt
14 83.5 Prewt
15 89.9 Prewt
16 86.0 Prewt
17 87.3 Prewt
18 95.2 Postwt
19 94.3 Postwt
20 91.5 Postwt
21 91.9 Postwt
22 100.3 Postwt
23 76.7 Postwt
24 76.8 Postwt
25 101.6 Postwt
26 94.9 Postwt
27 75.2 Postwt
28 77.8 Postwt
29 95.5 Postwt
30 90.7 Postwt
31 92.5 Postwt
32 93.8 Postwt
33 91.7 Postwt
34 98.0 Postwt
For the t-test, that's all we need. (If there were more than two groups, or two
measurement times, this would be more complex.) In such a data frame, however,
the subjects' scores must remain paired. I.e., the first Prewt must correspond
to--be from the same (or possibly matched) subject--as the first Postwt, the
second Prewt with the second Postwt, etc. If that's NOT the case, then we would
need a subject identifier in the data frame, AND we would need to do a
different test. We could also rename the columns if we want to, but it's not
mandatory.
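If a subject identifier is available, one way to guard against rows getting out of order is to realign the scores on it before testing. What follows is a sketch under stated assumptions, not a recipe: it assumes the MASS package is installed, and the column names (subject, time, weight) and the deliberate shuffling are made up for the demonstration.

```r
data(anorexia, package = "MASS")
ft <- subset(anorexia, Treat == "FT")
n  <- nrow(ft)

# Long form with an explicit subject identifier and explicit level order.
long <- data.frame(
  subject = factor(rep(seq_len(n), times = 2)),
  time    = factor(rep(c("Prewt", "Postwt"), each = n),
                   levels = c("Prewt", "Postwt")),
  weight  = c(ft$Prewt, ft$Postwt)
)

# Scramble the rows to mimic a data frame whose pairing order was lost.
set.seed(1)
shuffled <- long[sample(nrow(long)), ]

# Restore the pairing: sort each time point by subject.
pre  <- shuffled[shuffled$time == "Prewt", ]
post <- shuffled[shuffled$time == "Postwt", ]
pre  <- pre[order(pre$subject), ]
post <- post[order(post$subject), ]

# Make sure row i of pre and row i of post are the same subject.
stopifnot(identical(as.character(pre$subject), as.character(post$subject)))

res <- t.test(post$weight, pre$weight, paired = TRUE, alternative = "greater")
res$statistic                    # about 4.185, as with the wide data frame
```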
> t.test(values ~ ind, paired=T, alternative="less", data=ft.long) ### THIS IS WRONG!
Paired t-test
data: values by ind
t = 4.1849, df = 16, p-value = 0.9996
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf 10.29544
sample estimates:
mean of the differences
7.264706
We got a little confused about the order in which the subtraction
would be done. The formula interface takes the groups in the order of the
factor levels, and here those levels are in alphabetical order. A summary
will confirm this.
> summary(ft.long)
values ind
Min. : 73.40 Postwt:17
1st Qu.: 80.78 Prewt :17
Median : 86.35
Mean : 86.86
3rd Qu.: 93.47
Max. :101.60
And R subtracts first level minus second level, i.e., in this case, Postwt
minus Prewt. That would make the correct alternative "greater". (The order in
which the factor levels occur in the data frame is irrelevant.)
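The rule "first level minus second level" can be illustrated without the formula interface. This is a hedged sketch: it assumes the MASS package is installed, level_diff() is a hypothetical helper written for this illustration (it is not part of R), and the long-form vectors are rebuilt by hand so that the level order is known. Because newer R releases may not accept paired= together with a two-group formula, the sketch works on plain vectors.

```r
data(anorexia, package = "MASS")
ft <- subset(anorexia, Treat == "FT")

# Long form with alphabetical levels, as in the summary above.
values <- c(ft$Prewt, ft$Postwt)
ind    <- factor(rep(c("Prewt", "Postwt"), each = nrow(ft)))
levels(ind)                       # "Postwt" "Prewt"

# Hypothetical helper: mean of (first level - second level),
# pairing scores by their order of appearance within each level.
level_diff <- function(values, ind) {
  groups <- split(values, ind)    # list elements follow the level order
  mean(groups[[1]] - groups[[2]])
}

level_diff(values, ind)           # positive: Postwt minus Prewt

# Reordering the levels (not relabeling them!) reverses the subtraction.
ind2 <- factor(ind, levels = c("Prewt", "Postwt"))
level_diff(values, ind2)          # negative: Prewt minus Postwt
```

The rows of the data never moved between the two calls. Only the level order changed, and with it the sign of the mean difference.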
> t.test(values ~ ind, paired=T, alternative="greater", data=ft.long)
Paired t-test
data: values by ind
t = 4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
4.233975 Inf
sample estimates:
mean of the differences
7.264706
There we go. But, WARNING, the factor levels don't have to be seen by R in
alphabetical order! So it's always a good idea to check.
> levels(ft.long$ind)
[1] "Postwt" "Prewt"
Why might they be in some other order? Because WE, the humans, put them in
some other order. Unlike a lot of modern software, R does what it's told to
do by the humans who are using it! Well... usually it does! Don't do what I'm
about to do!
> ### WARNING: DO NOT DO THIS! THIS IS VERY BAD!! ###
> levels(ft.long$ind) = c("Prewt","Postwt")
> levels(ft.long$ind)
[1] "Prewt" "Postwt"
> summary(ft.long)
values ind
Min. : 73.40 Prewt :17
1st Qu.: 80.78 Postwt:17
Median : 86.35
Mean : 86.86
3rd Qu.: 93.47
Max. :101.60
The joke is certainly on me! It turns out that assigning to levels() does not
reorder the levels of the variable; it relabels them. I.e., it flips the labels
in the "ind" vector without flipping the "values" vector, so every Prewt score
is now labeled Postwt and vice versa! That is very bad for our purposes,
although it is the documented behavior of the assignment form of levels()
rather than a bug. So what is the "correct" way to do it? Okay, you can do this
one, but be very careful with your typing. (And remember, if you have the data
frame attached, detach it first.)
> ft.long$ind = factor(ft.long$ind, levels=c("Prewt","Postwt")) ### WARNING: DON'T MISTYPE!
> ### Now the original test is the correct one!
> t.test(values ~ ind, paired=T, alternative="less", data=ft.long)
Paired t-test
data: values by ind
t = -4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -4.233975
sample estimates:
mean of the differences
-7.264706
> ### Because...
> levels(ft.long$ind)
[1] "Prewt" "Postwt"
> ### ... the factor has been releveled without the data frame being altered!
When you are fooling with the levels of a factor, DON'T MISTYPE! A typing mistake
here could wipe out your variable in the data frame, because any value whose
label does not match one of the levels you typed becomes NA. Take my word for that! I've
done it! (Sadly, more than once!) Now the subtraction will be in the order Prewt
minus Postwt, and the original test with alternative="less" is the correct one.
Confusing! But that is the price you pay for having repeated measures
factors! Just be sure you know the order in which R is going to see your
factor levels. You can do that with levels()
WITHOUT AN ASSIGNMENT, or with summary().
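The difference between relabeling and reordering is easiest to see on a tiny made-up factor, where nothing of value can be damaged. The objects f, g, h, k and bad below are purely illustrative.

```r
# A tiny made-up factor; levels default to alphabetical order.
f <- factor(c("Prewt", "Prewt", "Postwt", "Postwt"))
levels(f)                                  # "Postwt" "Prewt"

# Assigning to levels() RENAMES the levels: the data get mislabeled.
g <- f
levels(g) <- c("Prewt", "Postwt")
as.character(g)                            # "Postwt" "Postwt" "Prewt" "Prewt"

# factor(..., levels=) REORDERS the levels and leaves the labels alone.
h <- factor(f, levels = c("Prewt", "Postwt"))
as.character(h)                            # "Prewt" "Prewt" "Postwt" "Postwt"
levels(h)                                  # "Prewt" "Postwt"

# relevel() moves one level to the front, with less typing to get wrong.
k <- relevel(f, ref = "Prewt")
levels(k)                                  # "Prewt" "Postwt"

# A typo in levels= silently turns the unmatched values into NA.
bad <- factor(f, levels = c("Prewt", "Postwtt"))
is.na(bad)                                 # FALSE FALSE TRUE TRUE
```

Trying these on a copy first, and checking with levels() and as.character() afterwards, is cheap insurance before touching a real data frame.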
Sane people would not organize their data frame as we have above. Instead,
sane people would keep the data from each subject together on contiguous lines
of the data frame, like this.
> ### If you want to follow along, type the following to reorder the data frame. ###
> ft.long$subjects = rep(LETTERS[1:17],2)
> ft.long = ft.long[order(ft.long$subjects),]
> ft.long
values ind subjects
1 83.8 Prewt A
18 95.2 Postwt A
2 83.3 Prewt B
19 94.3 Postwt B
3 86.0 Prewt C
20 91.5 Postwt C
4 82.5 Prewt D
21 91.9 Postwt D
5 86.7 Prewt E
22 100.3 Postwt E
6 79.6 Prewt F
23 76.7 Postwt F
7 76.9 Prewt G
24 76.8 Postwt G
8 94.2 Prewt H
25 101.6 Postwt H
9 73.4 Prewt I
26 94.9 Postwt I
10 80.5 Prewt J
27 75.2 Postwt J
11 81.6 Prewt K
28 77.8 Postwt K
12 82.1 Prewt L
29 95.5 Postwt L
13 77.6 Prewt M
30 90.7 Postwt M
14 83.5 Prewt N
31 92.5 Postwt N
15 89.9 Prewt O
32 93.8 Postwt O
16 86.0 Prewt P
33 91.7 Postwt P
17 87.3 Prewt Q
34 98.0 Postwt Q
It doesn't matter as far as the t.test() function
is concerned. That function will pair the first value it finds of Prewt with
the first value it finds of Postwt, the second value with the second value,
etc.
> levels(ft.long$ind)
[1] "Prewt" "Postwt"
> t.test(values ~ ind, paired=T, alternative="less", data=ft.long)
Paired t-test
data: values by ind
t = -4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -4.233975
sample estimates:
mean of the differences
-7.264706
Of course, we should be pilloried in the public square for not doing a proper
examination of our data beforehand.
> qqnorm(ft$Postwt - ft$Prewt) # output not shown
> qqline(ft$Postwt - ft$Prewt)
> plot(ft$Prewt, ft$Postwt)
> plot(ft$Prewt, Change <- ft$Postwt-ft$Prewt) # cannot use = here!
We have a problem with normality in a small data set (n=17). We also have a
problem with nonresponders (four women whose body weight didn't change or
actually fell during therapy). And even among those who did change, the
effect was not additive. Women who started off with lower body weights
tended to change the most. These are all violations of the assumptions of
the test.
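A few numbers can back up what the plots suggest. This is a rough sketch, assuming the MASS package is installed; the name change is borrowed from the plotting command above, and no particular output is claimed for the normality test or the correlation.

```r
data(anorexia, package = "MASS")
ft <- subset(anorexia, Treat == "FT")
change <- ft$Postwt - ft$Prewt

# Nonresponders: no change or a loss during therapy.
sum(change <= 0)                 # 4 of the 17 women

# A formal normality check on the difference scores. Read it with
# caution at n = 17; it supplements the Q-Q plot, it doesn't replace it.
shapiro.test(change)

# Relationship between starting weight and the amount of change.
cor(ft$Prewt, change)
```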
Matched Samples
The previous example was a repeated measures design--i.e., the same subjects
(experimental units) were being measured repeatedly. What if each subject is
measured only once, but the subjects are paired up on a matching variable?
Then we have a matched samples (or matched subjects or matched groups) design. I
want to be sure to include one of these, because I'm going to refer to it in
a later tutorial. The important thing to remember
for now is that the analysis is the same.
For decades it's been suspected that schizophrenia involves anatomical
abnormalities in the hippocampus, an area of the brain involved with memory. The
following data bearing on this issue are from Suddath et al. (1990) and were
used by (and obtained from) Ramsey and Schafer (3rd ed., 2013, p. 31, Display
2.2). The researchers obtained MRI measurements of the volume of the left
hippocampus from 15 pairs of identical twins discordant for schizophrenia. The
data are displayed in the following table. (You should be able to copy and
paste the following lines to get the data into R.)
schizophrenia = read.table(header=T, text="
pair affected unaffected
1 1.27 1.94
2 1.63 1.44
3 1.47 1.56
4 1.39 1.58
5 1.93 2.06
6 1.26 1.66
7 1.71 1.75
8 1.67 1.77
9 1.28 1.78
10 1.85 1.92
11 1.02 1.25
12 1.34 1.93
13 2.02 2.04
14 1.59 1.62
15 1.97 2.08
")
Just eyeballing it leaves little doubt that the hippocampus is smaller in the
affected cotwin--the "interocular trauma test" as my old stat prof (Tom Wickens)
called it, because the result jumps out and hits you between the eyes. Most
journal editors don't recognize this statistical technique, however, and would
prefer that we cite the result of a t-test. Here it is.
> with(schizophrenia, t.test(affected, unaffected, paired=T, alternative="less"))
Paired t-test
data: affected and unaffected
t = -3.2289, df = 14, p-value = 0.003031
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -0.09029832
sample estimates:
mean of the differences
-0.1986667
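The eyeball test can be given a number, and the paired test can once again be checked against a test on the difference scores. A minimal sketch follows, using the data exactly as listed above; d and res are names made up here.

```r
schizophrenia <- read.table(header = TRUE, text = "
pair affected unaffected
1 1.27 1.94
2 1.63 1.44
3 1.47 1.56
4 1.39 1.58
5 1.93 2.06
6 1.26 1.66
7 1.71 1.75
8 1.67 1.77
9 1.28 1.78
10 1.85 1.92
11 1.02 1.25
12 1.34 1.93
13 2.02 2.04
14 1.59 1.62
15 1.97 2.08
")

# Difference scores within each twin pair.
d <- schizophrenia$affected - schizophrenia$unaffected

sum(d < 0)     # 14 of the 15 pairs: the affected twin has the smaller volume
mean(d)        # about -0.199, the mean of the differences reported above

# Single-sample test on the differences; same t as the paired test.
res <- t.test(d, mu = 0, alternative = "less")
res$statistic  # about -3.229
```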
- Ramsey, Fred L. and Schafer, Daniel W. The Statistical Sleuth: A Course
in Methods of Data Analysis (3rd ed.). Boston: Brooks/Cole Cengage, 2013.
(Note: If anyone is interested in my opinion, this is one of the very best
statistics books I've ever had the pleasure of encountering.)
- Suddath, R. L., et al. (1990). Anatomical abnormalities in the brains of
monozygotic twins discordant for schizophrenia. New England Journal of
Medicine, 322(12), 789-793.
Alternatives to the Related Samples t Test
The primary alternative to the paired t-test when normality is in question
has historically been the Wilcoxon signed-rank test, the syntax of which (from
the help page) is very similar.
wilcox.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, exact = NULL, correct = TRUE,
conf.int = FALSE, conf.level = 0.95, ...)
It will work either without or with the formula interface and also as a single
sample test.
> ### single sample case
> with(ft, wilcox.test(Postwt-Prewt, mu=0, alternative="greater"))
Wilcoxon signed rank test
data: Postwt - Prewt
V = 142, p-value = 0.0004196
alternative hypothesis: true location is greater than 0
> ### two-sample case without the formula interface
> with(ft, wilcox.test(Postwt, Prewt, alternative="greater", paired=T))
Wilcoxon signed rank test
data: Postwt and Prewt
V = 142, p-value = 0.0004196
alternative hypothesis: true location shift is greater than 0
> ### two-sample case with the formula interface
> wilcox.test(values ~ ind, paired=T, alternative="less", data=ft.long)
Wilcoxon signed rank test
data: values by ind
V = 11, p-value = 0.0004196
alternative hypothesis: true location shift is less than 0
> ### with a confidence interval
> wilcox.test(values ~ ind, paired=T, alternative="less", data=ft.long, conf.int=T)
Wilcoxon signed rank test
data: values by ind
V = 11, p-value = 0.0004196
alternative hypothesis: true location shift is less than 0
95 percent confidence interval:
-Inf -4.05
sample estimates:
(pseudo)median
-7.65
In older versions of R, there was no "data=" option in the
wilcox.test()
function. That appears to have been corrected in newer versions.
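The V statistic is nothing mysterious: it is the sum of the ranks of the positive differences. Here is a minimal sketch of the calculation, assuming the MASS package is installed; d, r, V_pos, V_neg and w are names made up for illustration. It also shows why the test run in the opposite direction reports V = 11.

```r
data(anorexia, package = "MASS")
ft <- subset(anorexia, Treat == "FT")
d <- ft$Postwt - ft$Prewt

r <- rank(abs(d))             # rank the absolute differences, smallest = 1
V_pos <- sum(r[d > 0])        # ranks belonging to weight gains: 142
V_neg <- sum(r[d < 0])        # ranks belonging to weight losses: 11
V_pos + V_neg                 # n(n+1)/2 = 153 when there are no zeros

# The same V comes back from the single-sample call.
w <- wilcox.test(d, mu = 0, alternative = "greater")
w$statistic
```

Subtracting in the other order (Prewt minus Postwt) swaps the roles of the two sums, which is where the V = 11 in the formula-interface output comes from.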
revised 2016 January 30