| Table of Contents
| Function Reference
| Function Finder
| R Project |
FACTORIAL BETWEEN SUBJECTS ANOVA
Preliminaries
Model Formulae
You would probably profit from reading the Model
Formulae tutorial before going on.
Complexities
Anyone who had a graduate school course in analysis of variance is probably
still having flashbacks. Experimental designs can become extremely complex,
with complications like nesting, blocking, split-plot, Latin square, etc. In
this short tutorial, I will stick to the basics, which is to say I will stick to
factorial designs (which involve replication in the design cells) that are
entirely between subjects and that involve no restrictions on randomization
(e.g., such as blocking, etc.).
While we're on the topic of complexity, I should add a comment about keeping
experimental designs balanced (i.e., equal numbers of subjects in each cell of
the design). In single-factor designs, this is not so crucial, as unbalancing
the design has the primary effect of making the ANOVA more sensitive to
violation of assumptions. However, when a factorial design becomes unbalanced,
either because of subject attrition or because the study is observational in
nature and subjects were "taken as they came," then the factors lose their
independence from one another. BALANCED DESIGNS ONLY will be discussed in
this tutorial. Unbalanced Factorial Designs now
have a tutorial to themselves. If you have an unbalanced factorial design and
are not familiar with the issues involved, READ THAT TUTORIAL before using
these procedures.
Summarizing Factorial Data
For this tuturial, we will examine data on the effect of vitamin C on tooth
growth in guinea pigs.
> data(ToothGrowth)
> str(ToothGrowth)
'data.frame': 60 obs. of 3 variables:
$ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
$ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
$ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
> summary(ToothGrowth)
len supp dose
Min. : 4.20 OJ:30 Min. :0.500
1st Qu.:13.07 VC:30 1st Qu.:0.500
Median :19.25 Median :1.000
Mean :18.81 Mean :1.167
3rd Qu.:25.27 3rd Qu.:2.000
Max. :33.90 Max. :2.000
The variables are "len", which is tooth length, the response variable, the way
in which the supplement was administered, "supp", of which the levels are "OJ"
(as orange juice) and "VC" (as ascorbic acid), and "dose" of the supplement in
milligrams (0.5, 1.0, and 2.0). To begin, we should note that R considers "dose"
to be numeric. If we want dose treated as a factor (we do), we will have to be
absolutely certain that R sees it as such
when we construct the model formula. Otherwise, we will end up with an analysis
of covariance, which is not what we want.
First, let's end all doubt that we want "dose" to be a factor.
> ToothGrowth.bak = ToothGrowth
> ToothGrowth$dose = factor(ToothGrowth$dose,
+ levels=c(0.5,1.0,2.0),
+ labels=c("low","med","high"))
> summary(ToothGrowth)
len supp dose
Min. : 4.20 OJ:30 low :20
1st Qu.:13.07 VC:30 med :20
Median :19.25 high:20
Mean :18.81
3rd Qu.:25.27
Max. :33.90
The function factor() is used to declare
"ToothGrowth$dose" as a factor,
with levels currently coded as 0.5, 1.0, and 2.0, but now recoded with labels
of "low", "med", and "high". Just to be sure, I looked at a summary. Can't be
too careful. And be ESPECIALLY CAREFUL with the factor()
function. One little typo here can turn your data to garbage. I'd recommend
making a backup copy of the data frame before fooling with it like that.
Which you'll notice I did. I've learned that the hard way!
Now let's check for a balanced design by crosstabulating the factors. Note:
You could do it fancier by using
replications(len~supp*dose, data=ToothGrowth),
but crosstabling the factors is easier and gives a more straightforward answer.
> with(ToothGrowth, table(supp, dose))
dose
supp low med high
OJ 10 10 10
VC 10 10 10
We find that we have 10 subjects (teeth--actually odontoblasts) in each of
the 2x3=6 design cells. We have to assume that each of these tooth measurements
is an independent measurement. For example, it would be bad news if more than
one of these measurements came form the same subject.
There are a number of ways the data can be summarized graphically. First,
let's look at boxplots.
> boxplot(len ~ supp * dose, data=ToothGrowth,
+ ylab="Tooth Length", main="Boxplots of Tooth Growth Data")
Try dose * supp and see how the boxplots change. In the meantime, we should
definitely make a note to check our homogeneity and normality assumptions! We
might get a better idea of what our main effects and interactions look like
if we look at a traditional interaction plot, also called a profile plot.
> with(ToothGrowth, interaction.plot(x.factor=dose, trace.factor=supp,
+ response=len, fun=mean, type="b", legend=T,
+ ylab="Tooth Length", main="Interaction Plot",
+ pch=c(1,19)))
The syntax is straightforward, but there are a lot of options. First,
"x.factor=" specifies the factor to be plotted on the horizontal axis. Second,
"trace.factor=" specifies the factor to plotted as individual lines on the
graph. Third, "response=" specifies the response variable. Fourth, "fun="
specifies the function to be calculated on the response variable. "Mean" is the
default, so we could have left this one out. Fifth, "type=" specifies whether to
plot lines, points, or both. We plotted "b"oth. We also asked for a legend,
a label on the y-axis, and a main title on the graph. Finally, we specified the
point characters to use. We could also have changed the line type, line color,
and just about everything else about the graph, if we wanted to get really
fussy, er, I mean, aRtistic.
Another type of graphic we could look at is called a conditioning plot.
I shall borrow the example right off the help page for the ToothGrowth data set.
> coplot(len ~ dose | supp, data=ToothGrowth, panel=panel.smooth,
+ xlab="ToothGrowth data: length vs dose, given type of supplement")
The formula is the tricky part here and should be read as "length by dose
given supplement." The "panel=" option draws the red lines when set to
"panel.smooth". The top panel can be suppressed by setting "show.given=" to
FALSE. This gives a much more sensible looking plot in my humble opinion.
Numerical summaries of the data can be done with the tapply() function.
> with(ToothGrowth, tapply(len, list(supp,dose), mean))
low med high
OJ 13.23 22.70 26.06
VC 7.98 16.77 26.14
> with(ToothGrowth, tapply(len, list(supp,dose), var))
low med high
OJ 19.889 15.295556 7.049333
VC 7.544 6.326778 23.018222
Remember, in tapply(), the first argument is the
response variable, the second argument is a list of factors by which the
response should be partitioned (first factor given will be printed in the rows
of the resulting table), and the third argument is the function you wish to
calculate. Once again, we make note of some consternation over those cell
variances.
We could also use the by() function (same
syntax), but the output is not nearly so concisely formatted. The aggregate() function is tidier, and the function has a
much more convenient formula interface, but I still prefer
the output of tapply().
> aggregate(len ~ supp + dose, ToothGrowth, FUN=mean)
supp dose len
1 OJ low 13.23
2 VC low 7.98
3 OJ med 22.70
4 VC med 16.77
5 OJ high 26.06
6 VC high 26.14
Another way to get similar information is to use the
model.tables() function, but the ANOVA has to be done first and saved
into a model object. (WARNING: Do not do this with unbalanced data. The
answer will NOT be correct!)
> aov.out = aov(len ~ supp * dose, data=ToothGrowth)
Note: The model formula specifies "length as a function of supplement and dose
with all possible interactions between the factors. (The asterisk says "include
the interactions.") Now we can look at some model tables.
> model.tables(aov.out, type="means", se=T)
Tables of means
Grand mean
18.81333
supp
supp
OJ VC
20.663 16.963
dose
dose
low med high
10.605 19.735 26.100
supp:dose
dose
supp low med high
OJ 13.23 22.70 26.06
VC 7.98 16.77 26.14
Standard errors for differences of means
supp dose supp:dose
0.9376 1.1484 1.6240
replic. 30 20 10
We have requested the various cell and marginal means along with standard errors
for the differences between means. The standard errors are calculated in the
usual way for comparisons by protected t-tests (Fisher LSD). The options must
be explicitly specified, as they are not the defaults (see the help page).
Finally, let's test for homogeneity of variance. The function for the Bartlett
test will not properly handle a factorial design, and it won't warn you about it
either! It'll give you an answer, but that answer will be wrong. The
Fligner-Killeen test function doesn't understand factorial designs either, but
at least it warns you. A work-around is to create a factor that identifies every
one of the design cells individually.
> ToothGrowth$cells = factor(paste(ToothGrowth$supp, ToothGrowth$dose, sep="."))
> summary(ToothGrowth)
len supp dose cells
Min. : 4.20 OJ:30 low :20 OJ.high:10
1st Qu.:13.07 VC:30 med :20 OJ.low :10
Median :19.25 high:20 OJ.med :10
Mean :18.81 VC.high:10
3rd Qu.:25.27 VC.low :10
Max. :33.90 VC.med :10
> bartlett.test(len ~ cells, data=ToothGrowth)
Bartlett test of homogeneity of variances
data: len by cells
Bartlett's K-squared = 6.9273, df = 5, p-value = 0.2261
Aside: The paste() functions "pastes" together
strings. For example:
> paste("horse", "shoe", sep="")
[1] "horseshoe"
Here we used it to paste together the level names from the two factors,
separated by a period. The default separater is a space, by the way.
The leveneTest() function in the "car" package
(not installed by default) appears to handle the formula interface properly,
although the help page is unclear about this.
> library(car) # will fail if you haven't installed this package
> leveneTest(len ~ dose * supp, data=ToothGrowth)
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 5 1.7086 0.1484
54
The cell variances are not significantly different.
Factorial ANOVA
We've already run the ANOVA above when we wanted to look at the model tables,
but just to remind you, the function was as follows.
> aov.out = aov(len ~ supp * dose, data=ToothGrowth)
You do not have to execute it again if you've already done it. This creates a
model object in your workspace, which we've named "aov.out", that contains
a wealth of information. The extension to cases with more factors, for
completely between-groups designs where no Error structure needs to be specified,
would be straightforward. Just list all the factors after the tilde and separate
them by asterisks. This gives a model with all possible main effects and
interactions. To leave out interactions, separate the factor names with plus
signs rather than asterisks.
Before we print anything to the Console, I'm going to turn off the printing
of significance stars because I find them annoying.
> options(show.signif.stars=F)
Now we can look at the ANOVA table using either
summary() or summary.aov().
> summary(aov.out)
Df Sum Sq Mean Sq F value Pr(>F)
supp 1 205.35 205.35 15.572 0.0002312
dose 2 2426.43 1213.22 92.000 < 2.2e-16
supp:dose 2 108.32 54.16 4.107 0.0218603
Residuals 54 712.11 13.19
We see two significant main effects and a significant interaction. In R lingo,
two or more factor names separated by colons refers to the interaction between
those factors. (I warned you there would be another meaning for a colon. There
it is.)
For post hoc tests of main effects, we have the traditional Tukey HSD test.
We don't need a post hoc test on "supp" because it has only two levels, and
the ANOVA has already told us that those groups are significantly different.
So I'll restrict the test to just the "dose" factor. (If you don't do this,
you'll get comparisons between every cell in the table.)
> TukeyHSD(aov.out, which="dose")
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = len ~ supp * dose, data = ToothGrowth)
$dose
diff lwr upr p adj
med-low 9.130 6.362488 11.897512 0.0e+00
high-low 15.495 12.727488 18.262512 0.0e+00
high-med 6.365 3.597488 9.132512 2.7e-06
The degree of confidence required for the confidence intervals can be set using
the "conf.level=" option. The default value is 0.95. Granted, examining main
effects in the presence of a significant interaction is usually not called for,
but I wanted to demonstrate the method.
If you want pairwise t-tests on any factor, you can have them, with a variety
of different methods available for adjusting the p-value for the number of tests
being done (see the help page for details).
> with(ToothGrowth, pairwise.t.test(x=len, g=dose, p.adjust.method="bonferroni")) # x=DV, g=IV
Pairwise comparisons using t tests with pooled SD
data: len and dose
low med
med 2.0e-08 -
high 4.4e-16 4.3e-05
P value adjustment method: bonferroni
Enter the response variable as the first argument followed by a comma and then
the grouping variable or factor. Methods available for adjusting the p-values
are holm (default), hochberg, hommel, bonferonni, BH, BY, fdr, and none. The
help page will explain what those are (because I can't). An optional package
called "multcomp" is available that implements additional post hoc multiple
comparison tests. Also, see the Multiple Comparisons
tutorial for more information.
Examining Assumptions Graphically
There are other plots available.
> plot(aov.out, 1) # visual check of homogeneity of variance
> plot(aov.out, 2) # visual check of normality
These are regression diagnostic plots, and there are six of them available
altogether. For ANOVA, the first two are the most useful. The first one plots
residuals against fitted values (which are just the cell means). We've got a
bit of a problem with this one as two of the cell means are so close that
residuals from those two cells are being overplotted. That's why we only see
five groups represented. Nevertheless, we can see that there is no obvious
relationship between the size of the cell means and the variability, which is
good. The second plot is a normal probability plot of the residuals, and it
shows that the residuals are not too badly off from being normally
distributed.
Contrasts
You should see the Multiple Comparisons tutorial
for a good many of the gory details on contrasts. This section will be a brief
demonstration of how they can be obtained.
First, let's look to see what contrasts we already have.
> contrasts(ToothGrowth$supp)
VC
OJ 0
VC 1
> contrasts(ToothGrowth$dose)
med high
low 0 0
med 1 0
high 0 1
These are what the R folks call "treatment contrasts." I'm going to reset them
to something a bit more interesting (although perhaps not in this particular
case).
> options(contrasts = c("contr.helmert","contr.poly"))
> contrasts(ToothGrowth$supp)
[,1]
OJ -1
VC 1
> contrasts(ToothGrowth$dose)
[,1] [,2]
low -1 -1
med 1 -1
high 0 2
Now we have to redo the ANOVA.
> aov.out2 = aov(len ~ supp * dose, data=ToothGrowth)
And finally we can examine our contrasts.
> summary(aov.out2, split=list(
+ supp=list("OJ vs VC"=1),
+ dose=list("low vs med"=1,"low.med vs high"=2)),
+ expand.split=F
+ )
Df Sum Sq Mean Sq F value Pr(>F)
supp 1 205.4 205.4 15.572 0.000231 ***
supp: OJ vs VC 1 205.4 205.4 15.572 0.000231 ***
dose 2 2426.4 1213.2 92.000 < 2e-16 ***
dose: low vs med 1 833.6 833.6 63.211 1.19e-10 ***
dose: low.med vs high 1 1592.9 1592.9 120.789 2.17e-15 ***
supp:dose 2 108.3 54.2 4.107 0.021860 *
Residuals 54 712.1 13.2
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Oops! Got significance stars turned back on somehow. The first contrast,
OJ vs VC is just silly, because the ANOVA already gives us this result. The
contrasts for dose are a bit more interesting, although perhaps it would make
more sense to look at linear and quadratic trends here. I leave you at the
mercy of the help page (help(summary.aov)) for
more information.
A note on the expand.split option: When set to FALSE it prevents the
interaction from being split into a bunch of nonsensical contrasts. You'd
think this would be the default. If you don't set it, you'll get a longer
and messier table, part of which you can just ignore.
We should probably reset our contrasts.
> options(contrasts = c("contr.treatment","contr.poly"))
Simple Effects
As we have a significant interaction in this problem, looking at the main
effects post hoc is of questionable value. We should be dissecting the interaction,
and one way to do that is by looking at simple effects. I'm sure there must be
an optional package for this, but I don't know what it is, so I'm going to show
you an easy way to find simple effects by doing a few calculations at the command
line.
We need three things: a table of means, the group size (from the balanced
design), and a function to calculate sums of squares.
> means = with(ToothGrowth, tapply(len, list(supp,dose), mean))
> n.per.group = 10
> SS = function(x) sum(x^2) - sum(x)^2 / length(x)
See the Functions and Scripts tutorial for an
explanation of the SS function. Or just run the command, which puts the
SS() function in your workspace. Now we're going
to use the apply() function on the means table.
> apply(means, MARGIN=1, FUN=SS) * n.per.group # simple effects sums of squares for dose
OJ VC
885.2647 1649.4887
> apply(means, MARGIN=1, FUN=SS) * n.per.group / 2 # mean squares
OJ VC
442.6323 824.7443
> apply(means, MARGIN=1, FUN=SS) * n.per.group / 2 / 13.18715 # F values
OJ VC
33.56543 62.54151
> pf(apply(means, MARGIN=1, FUN=SS) * n.per.group / 2 / 13.18715, 2, 54, lower=F) # p-values
OJ VC
3.363197e-10 8.753099e-15
These are the simple main effects of dose by supp levels. Now we can do the same
thing in the columns to get the simple main effects of supp by dose levels.
(Note: This may look like the "hard way," typing these long commands every time,
but in fact I am pressing the up arrow key and just adding the part I need for
the next calculation. What looks like the hard way is actually the easy way!)
> apply(means, MARGIN=2, FUN=SS) * n.per.group # SSes and MSes as df=1
low med high
137.8125 175.8245 0.0320
> apply(means, MARGIN=2, FUN=SS) * n.per.group / 13.18715 # F values
low med high
10.450514326 13.333017369 0.002426605
> pf(apply(means, MARGIN=2, FUN=SS) * n.per.group / 13.18715, 1, 54, lower=F)
low med high
0.0020924712 0.0005897011 0.9608933868
Finally, following an example in Howell's textbook (see
Refs), we could present these results in a summary table as follows.
Source df SS MS F p
------------------------------------------------------------
Supplement
supp at dose=low 1 137.81 137.81 10.45 .002
supp at dose=mod 1 175.82 175.82 13.33 <.001
supp at dose=high 1 0.032 0.032 0.002 .961
Dose
dose at supp=OJ 2 885.26 442.63 33.57 <.001
dose at supp=VC 2 1649.49 824.74 62.54 <.001
Error 54 712.106 13.18715
------------------------------------------------------------
Note: The same exact thing could be done with a unbalanced design, I believe.
See the Unbalanced Factorial Designs tutorial
for an example.
(Final Note: This method does not exactly partition the total sum of squares,
because the interaction sum of squares appears twice, once in the supp simple
effects and once in the dose simple effects.
> 137.81+175.82+0.032+885.26+1649.49+712.106
[1] 3560.518
> SS(ToothGrowth$len)
[1] 3452.209
> sum(c(205.350, 2426.434, 108.319, 712.106))
[1] 3452.209
> 137.81+175.82+0.032+885.26+1649.49+712.106-108.319
[1] 3452.199
With a touch of rounding error at the 6th significant figure. That explains why
the interaction is not shown in the summary table. It's already there. Twice!)
MANOVA
There is also a function available for multivariate ANOVA when there are
multiple response variables: manova(). The basic
syntax is...
model = manova(resp.tabl ~ IV1 * IV2 * ...)
summary(model)
summary.manova(model)
summary.aov(model)
The "resp.tabl" is a table in which the response variables are arranged in
columns of the table. I have rarely had need to use this function and will
not discuss it further.
revised 2016 February 5; updated 2016 March 21
| Table of Contents
| Function Reference
| Function Finder
| R Project |
|