PSYC 480 -- Spring 2021 -- Dr. King Factorial ANOVA Practice Problems (mostly balanced) 1) butterfat file = "http://ww2.coastal.edu/kingw/psyc480/data/butterfat.txt" X = read.table(file=file, header=T, stringsAsFactors=T) Butterfat in the milk of diary cows by breed (5 breeds) and age, 2-year-old cows and mature (5-year-old or older) cows. How do these factors relate to butterfat content of milk? Is there an interaction? If so, describe it. (These data are from Hand et al. (1994). A Handbook of Small Data Sets, and were taken from Sokal and Rohlf (1981) Biometry (2nd ed.). They come originally from Canadian records of pure-bred dairy cattle. Butterfat content is in percent butterfat. FYI, 4% butterfat is considered whole milk. Be sure to check the cell variances! (It's not psychology, but I like this problem, having spent a fair amount of my youth around dairy cows, and an ANOVA is an ANOVA whether the subjects are people or cows!) > aov.butterfat = aov(butterfat ~ age * breed, data=X) > summary(aov.butterfat) Df Sum Sq Mean Sq F value Pr(>F) age 1 0.21 0.207 1.172 0.282 breed 4 34.32 8.580 48.595 <2e-16 *** age:breed 4 0.27 0.067 0.381 0.821 Residuals 90 15.89 0.177 --- > with(X, tapply(butterfat, breed, mean)) Ayrshire Canadian Guernsey Holstein Jersey 4.0600 4.4385 4.9500 3.6695 5.2925 > TukeyHSD(aov.out, which="breed") # be sure to specify which effect you want Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = butterfat ~ breed * age, data = X) $breed diff lwr upr p adj Canadian-Ayrshire 0.3785 0.00858404 0.74841596 0.0422752 Guernsey-Ayrshire 0.8900 0.52008404 1.25991596 0.0000000 Holstein-Ayrshire -0.3905 -0.76041596 -0.02058404 0.0332224 Jersey-Ayrshire 1.2325 0.86258404 1.60241596 0.0000000 Guernsey-Canadian 0.5115 0.14158404 0.88141596 0.0020194 Holstein-Canadian -0.7690 -1.13891596 -0.39908404 0.0000010 Jersey-Canadian 0.8540 0.48408404 1.22391596 0.0000001 Holstein-Guernsey -1.2805 -1.65041596 -0.91058404 0.0000000 Jersey-Guernsey 0.3425 -0.02741596 0.71241596 0.0832286 Jersey-Holstein 1.6230 1.25308404 1.99291596 0.0000000 2) writing file = "http://ww2.coastal.edu/kingw/psyc480/data/writing.txt" X = read.table(file=file, header=T, stringsAsFactors=T) For her senior research project (Psyc 497, Sp 2002), Susan McClean presented 180 subjects with a short technical article on linquistics and asked them to rate the competency of the writing on a 1-to-5 Likert-type rating scale, with a rating of 1=highly unfavorable, and a rating of 5=highly favorable. Half the raters were female and half were male. All were freshman at CCU. In each gender group, one-third of the subjects were told the author of the article was female, one-third were told the author was male, and one-third were given no information about the author. I do not have the original data, but these data were reconstructed to match Susan's cell and marginal means and SDs as closely as possible, and thus to give the same results in the ANOVA table. Variables: rating - the student's (subject's) rating of the writing competency gender - the student's (subject's) gender author - what the student/subject was told about the article's author How does the subject's gender and what the subject was told about the gender of the author impact the subject's rating? > aov.writing = aov(rating ~ gender * author, data=X) > summary(aov.writing) Df Sum Sq Mean Sq F value Pr(>F) gender 1 0.67 0.6722 1.155 0.284 author 2 0.48 0.2389 0.411 0.664 gender:author 2 2.34 1.1722 2.015 0.136 Residuals 174 101.23 0.5818 Question: If we graph the cell means in an interaction plot, it looks like there is an interaction, but we didn't find one. What's up with that? Answer: Expand the limits on the vertical axis to the full range of the DV and see what happens. Note: Balanced designs are hard to come by in Psyc 497 projects. I've been collecting 497 data for 25 years, and this is one of the very few I have. That's why we're going to have to talk about unbalanced designs, even though that might be a little painful. Unbalanced factorial designs are common in social science research. 3) audit AUDIT stands for Alcohol Use Disorders Identification Test. Scores on this test greater than 8 supposedly indicate the possible existence of "problem drinking." Kellie Dunlap used the AUDIT as the dependent measure in her Psyc 497 project (Fall 2008). 70 CCU students were scored on the test. Other information she collected from her subjects was gender and home state. Here are the scores sorted by gender. Does there appear to be a gender difference in AUDIT scores? If so, is it a big difference? (You've already done this problem. Check your notes.) females: 17 3 19 9 14 9 9 5 16 10 4 3 10 3 2 10 20 5 2 6 4 8 4 4 1 2 1 21 14 1 4 11 10 8 7 0 18 1 9 6 0 9 8 4 14 4 11 22 0 17 males: 11 7 9 10 6 13 1 8 13 11 26 13 19 6 1 20 17 12 15 24 Here are the scores sorted by whether the student's home state was north or south of the Mason-Dixon line. Does there appear to be a difference in AUDIT scores between students from northern states and students from southern states? If so, is it a big difference? (You've done this one, too.) north: 17 11 19 14 6 9 10 20 2 4 8 4 1 8 21 14 4 26 11 0 19 1 4 11 22 17 24 south: 3 7 9 9 10 13 9 5 16 4 3 10 3 2 10 5 6 4 1 2 1 1 13 11 13 10 8 7 18 6 1 9 6 0 9 8 4 14 20 17 12 0 15 These are t-tests. Now we will cross-classify these two variables. From the cell means, can you tell what's going on here? Is there a Gender x Region interaction? Why can't we analyze these data the way we have been doing? MSW = 36.677. Can that be used to help our cause? Region marginal means north south wgted. unwgted. --------------------------- M = 10.600 M = 6.233 7.98 8.42 females SD = 7.126 SD = 4.621 n = 20 n = 30 Gender --------------------------- M = 13.571 M = 11.308 12.10 12.44 males SD = 9.537 SD = 4.922 n = 7 n = 13 --------------------------- wgted. 11.37 7.77 unwgted. 12.09 8.77 Because the design is unbalanced, there are several possible ANOVAs here. Let me explain why. The problem is with the main effects. All generally accepted methods of doing this will give the same result for the interaction. (This is true ONLY for two factor ANOVAs.) In this case, the interaction is not signifi- cant. That means we move on and look at the main effects. The main effects are seen in the marginal means. So how do we get those? Aha! There are two methods for calculating the marginal means, and therein lies the problem. If the design is balanced, they both give the same answer. If the design is unbalanced, they do not. Weighted Marginal Means Look at the first row in the table above (females). The weighted marginal mean is what you would get if you ignored region and just dumped all 50 of those subjects into one big hat and calculated the mean. It's also the mean you would get in R if you used tapply(AUDIT, Gender, mean). It can also be obtained as follows from the information in the table. 10.600 * 20 + 6.233 * 30 = 212 + 187 = 399 Do you see what we've done here? When I teach this in Psyc 225, I tell the students that we are "going back to the sums." If the mean is sum(X)/n, then the sum is the mean times n. So the sum of the scores in the first row of the table (females) is 399. Then to get the mean, we divide by n in the row. 399 / (20 + 30) = 7.98 This is called a weighted mean, or weighted marginal mean, because the means in the individual cells are not just added together and divided by the number of means. (That would work if the design were balanced!) Rather, the means in the individual cells are multiplied by a weight, then added, and then the sum is divided by the sum of the weights. (Yes, you were taught that in your first stat course!) In this case, the weights are n in the cells. Now you do the males row and the two columns. The answers are (don't look until you've done the calculations!) 12.100, 11.370, and 7.767. Hint: the sums in the cells have to be whole numbers. Why is that? Unweighted Marginal Means The unweighted marginal mean is the mean of the means. That is, we add the cell means across a row (or down a column) and we divide by the number of means. For the first row (females), the unweighted marginal mean is... (10.600 + 6.233) / 2 = 8.417 Now you do the rest of them. The answers are 12.440, 12.086, and 8.771. Do you see the problem? With an unbalanced design, the marginal means are different depending upon how they are calculated. So which ones should we use to calculate the main effects? You ask hard questions! That depends upon how you want to calculate the ANOVA, and that depends upon what questions you are asking of the data. These are the kinds of problems we are going to face when it comes to unbalanced designs. 4) BONUS - Here is an alternate analysis of the audit results. Can you explain what I'm doing? Send me an email with your answer. Generous bonus points for anyone who gets it right! north south marginal mean --------------------------- M = 10.600 M = 6.233 7.980 females SD = 7.126 SD = 4.621 n = 20 n = 30 n = 50 --------------------------- M = 13.571 M = 11.308 12.100 males SD = 9.537 SD = 4.922 n = 7 n = 13 n = 20 --------------------------- marginal 11.370 7.767 mean n = 27 n = 43 SS.error = 7.126^2 * 19 + 4.621^2 * 29 + 9.537^2 * 6 + 4.922^2 * 12 = 2420.512 MS.error = 2420.512 / (19 + 29 + 6 + 12) = 36.67442 sd.pooled = sqrt(36.67442) = 6.056 test on Gender: t = (12.1 - 7.98) / (6.056 * sqrt(1/50 + 1/20)) = 2.571357, df = 66 p.value (from R) = 0.0124 adjusted p.value = 0.0124 * 2 = 0.0248 (reject the null) test on Region: t = (11.37 - 7.767) / (6.056 * sqrt(1/27 + 1/43)) = 2.422956, df = 66 p.value (from R) = 0.0181 adjusted p.value = 0.0181 * 2 = 0.0362 (reject the null) What assumption have we made in doing these tests? 5) Here are the cell means from problem 2 above. There were n=30 subjects per cell. Calculate the weighted and unweighted marginal means and confirm that they are the same. Not just similar, EXACTLY the same! > with(X, tapply(rating,list(gender,author),mean)) female male none female 3.700000 3.733333 3.366667 male 3.433333 3.433333 3.566667 The marginal means are: > with(writing, tapply(rating,list(gender),mean)) female male 3.600000 3.477778 > with(writing, tapply(rating,list(author),mean)) female male none 3.566667 3.583333 3.466667 6) meningitis file = "http://ww2.coastal.edu/kingw/psyc480/data/meningitis.txt" X = read.table(file=file, header=T, stringsAsFactors=T) These data are assessments of recovery from brain damage. The DV (scores) is an assessment from a battery of tests designed to detect abnormal brain function. The IVs are type of illness that caused the brain damage (illness) and time post-recovery from the illness (time). If there is an interaction, graph it and describe it in words. Sorry, but I don't have a reference for the source of these data. I think I took them out of an old stat book. > aov.meningitis = aov(scores ~ illness * time, data=X) > summary(aov.meningitis) Df Sum Sq Mean Sq F value Pr(>F) illness 2 2413.6 1206.8 14.224 6.04e-05 *** time 2 536.2 268.1 3.160 0.058460 . illness:time 4 2194.4 548.6 6.466 0.000874 *** Residuals 27 2290.8 84.8 --- The interaction plot can be drawn this way: > with(X, interaction.plot(time, illness, scores, type="b", pch=c(1,2,19))) Scores rise over time postoperatively but fall after meningitis or encephalitis. 7) odd monkeys file = "http://ww2.coastal.edu/kingw/psyc480/data/oddmonkeys.txt" X = read.table(file=file, header=T, stringsAsFactors=T) Data are from monkeys performing an oddity task in which they are exposed to three objects, two of which are the same. Their task is to learn that a reward is hidden under the odd object. Reward consists of one grape (low), three grapes (moderate), or five grapes (high). Monkeys were randomly assigned to these conditions. In addition, the motivation of the monkeys to learn the task was manipulated by food depriving them for either one hour before doing the task (weak) or 24 hrs (strong). Monkeys were also randomly assigned to these conditions. A total of 30 learning trials were administered. The DV, successes, is the number of times out of the 30 that a monkey made the correct choice. Source: W.L. Hays (1963). Statistics for Psychologists. > aov.oddmonkeys = aov(successes ~ reward * motivation, data=X) > summary(aov.oddmonkeys) Df Sum Sq Mean Sq F value Pr(>F) reward 2 112 56.00 3.055 0.0721 . motivation 1 24 24.00 1.309 0.2675 reward:motivation 2 144 72.00 3.927 0.0384 * Residuals 18 330 18.33 --- The interaction is described in the interactions lecture. 8) sex file = "http://ww2.coastal.edu/kingw/psyc480/data/sexbalanced.txt" X = read.table(file=file, header=T, stringsAsFactors=T) I thought that might get your attention! Data are from Lindsey Chopyk's Psyc 497 project (Fall 2009). Lindsey administered a survey asking participants about their sexual history, practices, and opinions. Part of the survey involved a fictional person who the participant was asked for opinions about. Lindsey's design was mildly unbalanced. I have made it balanced by randomly throwing out a few subjects in some of the cells. This did not change the effects that are seen in the ANOVA. Notice that this is a three-factor design (2x2x2 factorial). Can you name the effects that will be evaluated before R gives you the answer. The variables are: Partic.Gend: gender of the participant in this study Story.Gend: gender of fictional person participant was asked for opinion about Age.Grp: age group of participant (young vs. old) Slut: how many sex partners may the fictional person have before being considered to be a slut? (Her term, not mine!) The survey was administered in a local mall near CCU. The participants were not CCU students. However, the young age group consisted of people of roughly college age, while the old age group consisted of people in their 50s or older. The relationhship between Age.Grp, Partic.Gend, Story.Gend, and Slut is fascinating! Can you summarize that relationship in words? > aov.sex = aov(Slut ~ Partic.Gend * Story.Gend * Age.Grp, data=X) > summary(aov.sex) Df Sum Sq Mean Sq F value Pr(>F) Partic.Gend 1 49.8 49.8 2.738 0.10125 Story.Gend 1 193.9 193.9 10.650 0.00153 ** Age.Grp 1 620.3 620.3 34.075 7.19e-08 *** Partic.Gend:Story.Gend 1 193.9 193.9 10.650 0.00153 ** Partic.Gend:Age.Grp 1 41.9 41.9 2.301 0.13260 Story.Gend:Age.Grp 1 129.4 129.4 7.107 0.00901 ** Partic.Gend:Story.Gend:Age.Grp 1 311.5 311.5 17.113 7.55e-05 *** Residuals 96 1747.7 18.2 --- The three-way interaction is the highest order interaction that is significant and so is the effect of interest. Graph it this way. Begin by getting an idea of what the means are like. > with(X,tapply(Slut,list(Partic.Gend,Story.Gend,Age.Grp),mean)) , , old men women Female 4.384615 3.153846 Male 3.769231 4.000000 , , young men women Female 6.769231 8.000000 Male 15.615385 4.461538 Then partition the graphics window to hold two graphs. > par(mfrow=c(1,2)) Then draw two graphs with the same range on the y-axis. I've chosen to draw a graph for each age group, but that is arbitrary. I think age group is the choice that tells the best story, however. (You can also draw these by hand from the means above and get a pretty good idea what's going on.) > with(X[X$Age.Grp=="old",],interaction.plot(Partic.Gend,Story.Gend,Slut,type="b",pch=c(2,16),ylim=c(2,16))) > title(main="Older") > with(X[X$Age.Grp=="young",],interaction.plot(Partic.Gend,Story.Gend,Slut,type="b",pch=c(2,16),ylim=c(2,16))) > title(main="Younger") The legends need some work, but you can get the idea. The line for the younger males tell the story.