
Psyc 480 -- Dr. King

Practice Exercise for Unbalanced Factorial ANOVA

Clear your workspace using either the menus or rm(list=ls()).

Here is the URL of the data file that we will be using.

http://ww2.coastal.edu/kingw/psyc480/data/depressMD.txt

If you are in RStudio Cloud or your own installed version of R, get the data this way. You can copy and paste these commands if accurate typing is not your thing.

file = "http://ww2.coastal.edu/kingw/psyc480/data/depressMD.txt"
X = read.table(file=file, header=T, stringsAsFactors=T)

Why are we calling the data frame X? Because it's a nice short name. I would generally discourage the use of single letters for the names of things, but it will do no harm here provided you've cleared your workspace and are careful not to overwrite it with something else called X. For example, if you were to do this (DON'T!): X=6, your data frame is gone.

Now fetch the aovIII() function if you don't already have it. Is that enough of a hint about how we are going to do this problem?

source("http://ww2.coastal.edu/kingw/psyc480/functions/aovIII.R")

You should also create this function for calculating sums of squares. Use a lower case x to avoid confusion with the name of your data frame. (See what I mean about single letters?)

SS = function(x)  sum(x^2) - sum(x)^2 / length(x)
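If you want to convince yourself that SS() really does compute a sum of squared deviations around the mean, here is a quick check on a few made-up numbers. (The vector scores is invented for illustration and has nothing to do with the data file.)

```r
# the computational formula from above
SS = function(x)  sum(x^2) - sum(x)^2 / length(x)

scores = c(2, 4, 6)                   # made-up numbers, mean = 4
SS(scores)                            # computational formula
sum((scores - mean(scores))^2)        # definitional formula, same value
var(scores) * (length(scores) - 1)    # variance times its df, same value again
```

All three lines should print the same number, because they are three ways of writing the same quantity.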

Your workspace should now look like this.

> ls()
[1] "aovIII" "file"   "SS"     "X"

When you do summary(X), you should get this.

> summary(X)
   deprscore     therapy      severity 
 Min.   :41.00   AT :15   mild    :13  
 1st Qu.:47.00   CB :15   moderate:15  
 Median :55.00   Rog:15   severe  :17  
 Mean   :54.04                         
 3rd Qu.:59.00                         
 Max.   :69.00

These are Maxwell and Delaney's depression data. Briefly, deprscore is the score on a depression inventory administered after the subject underwent a period of therapy for depression. Higher scores indicate more depression. The IVs are therapy (three types: AT=assertiveness training, CB=cognitive-behavioral therapy, Rog=Rogerian therapy), and severity (the severity of the subject's depression at the beginning of the therapy). Patients were randomly assigned to therapy type, but of course could not be randomly assigned to severity.

The design is unbalanced, as we can already tell from the summary. To see how much it is unbalanced, cross-tabulate the IVs.

> with(X, table(therapy, severity))
       severity
therapy mild moderate severe
    AT     5        4      6
    CB     3        5      7
    Rog    5        6      4
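If you'd rather have R tell you whether a design is balanced than eyeball the table, something like the following sketch would do it. The data frame toy and its factors A and B are made up for illustration; they are not the depression data.

```r
# made-up factors, NOT the depression data
toy = data.frame(
  A = factor(c("a1","a1","a1","a2","a2","a2","a2")),
  B = factor(c("b1","b1","b2","b1","b2","b2","b2"))
)

cell.n = with(toy, table(A, B))   # cell frequencies
cell.n

# a design is balanced when every cell has the same n
balanced = length(unique(as.vector(cell.n))) == 1
balanced
```

The same two lines would work on X by swapping in therapy and severity for A and B.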

If you were asked to describe the confound here, could you? We're interested in seeing the differences in effectiveness of the three therapy types in the treatment of depression. How else do those groups differ? They also differ by how severely depressed the patients are that they have to treat. The CB therapists got stuck with the fewest mild cases and the most severe cases. On the other hand, the Rog therapists have more mild cases and the fewest severe cases. Therapy type is confounded with severity, and we need to find a way to remove that confound in our analysis. How should we proceed?

We don't really care about the effect of severity. The severity variable is only there because it is a confound that we must remove. Who doesn't know what the effect of severity is going to be even without a statistical analysis? Severely depressed patients are going to be more resistant to treatment than mildly depressed ones.

Question 1) Can the confound be removed using a Type I analysis?
(yes / no)

Question 2) If yes, then how?
Enter severity (first / second)

Question 3) This makes the test on therapy type equivalent to:
(Type II / Type III)

Question 4) And that means we are looking at the effect of:
(therapy with severity removed / therapy ignoring severity / severity with therapy removed)

Question 5) Can the confound be removed using a Type II analysis?
(yes / no)

Here is the Type II analysis.

> options(show.signif.stars=F)   # bad me, I've suppressed significance stars!

                 Df Sum Sq Mean Sq F value   Pr(>F)
severity          2 1253.2   626.6  22.436 4.71e-07
therapy           2  238.5   119.2   4.270   0.0217
severity:therapy  4   14.2     3.5   0.127   0.9717
Residuals        36 1005.4    27.9
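A table with this layout is what summary() prints for an aov() fit, although the exact command used here is not shown. As a sketch of how sequential sums of squares behave in an unbalanced design, here is a small made-up example. The data frame toy, its DV y, and its factors A and B are all invented for illustration.

```r
# made-up unbalanced 2x2 data, NOT the depression data
toy = data.frame(
  y = c(3, 5, 4, 8, 9, 7, 6, 10, 12, 11, 2, 6),
  A = factor(rep(c("a1","a2"), times=c(5,7))),
  B = factor(c("b1","b1","b1","b2","b2","b1","b1","b2","b2","b2","b2","b2"))
)

fit.AB = aov(y ~ A * B, data=toy)   # A entered first
fit.BA = aov(y ~ B * A, data=toy)   # A entered second
summary(fit.AB)
summary(fit.BA)

# pull the sum of squares for A out of each table (by row position)
tab.AB = summary(fit.AB)[[1]]
tab.BA = summary(fit.BA)[[1]]
SS.A.first  = tab.AB[["Sum Sq"]][1]   # A is row 1 when entered first
SS.A.second = tab.BA[["Sum Sq"]][2]   # A is row 2 when entered second
c(SS.A.first, SS.A.second)
```

In these made-up data the two values for A are quite different, while the interaction and residual lines are the same in both tables.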

Question 6) First, of course, we'll look at the severity-by-therapy interaction. Is the interaction statistically significant?
(yes / no)

Question 7) Moving on, is the main effect of therapy statistically significant?
(yes / no)

Questions 8-11) Summarize the ANOVA result for the main effect of therapy. (Enter answers EXACTLY AS THEY APPEAR IN THE ANOVA OUTPUT. DON'T ROUND!)
F(, ) = , p =

Questions 12-14) What are the unweighted marginal means for therapy? (Round to three decimal places please.)

Okay, unweighted marginal means--there are two ways to do it, by hand and in R. By hand, you would get a table of cell means, which could be done with tapply(), and then you would calculate means of means for the therapy factor.

> with(X, tapply(deprscore, list(severity,therapy),mean))
            AT       CB   Rog
mild     48.00 44.66667 48.60
moderate 54.25 49.40000 56.00
severe   62.00 56.85714 61.25

AT.mean = (48.00 + 54.25 + 62.00) / 3 = 54.750, etc.

Or you could do this all in R. (This is new. Of course, calculating the marginal means directly from the raw scores with tapply() would give you weighted means. But then you knew that, didn't you?)

> cell.means = with(X, tapply(deprscore, list(severity,therapy), mean))
> apply(cell.means, 2, mean)   # calcs means down the columns of means table

Of course, I'm not going to show you the answers. Btw, apply(cell.means, 1, mean) would calculate unweighted means across the rows of the means table. 1 = across the rows. 2 = down the columns. In R, rows always come before columns!

	AT	CB	Rog
Means
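To see the difference between weighted and unweighted marginal means without giving away any answers, here is a sketch on a tiny made-up data set. The data frame toy, its DV score, and its factors A and B are invented for illustration.

```r
# made-up unbalanced data, NOT the depression data
toy = data.frame(
  score = c(10, 12, 20, 11, 19, 21, 23),
  A = factor(c("a1","a1","a1","a2","a2","a2","a2")),
  B = factor(c("b1","b1","b2","b1","b2","b2","b2"))
)

# weighted marginal means: every score counts equally
weighted = with(toy, tapply(score, A, mean))

# unweighted marginal means: every cell mean counts equally
cell.means = with(toy, tapply(score, list(B, A), mean))
unweighted = apply(cell.means, 2, mean)   # means down the columns

weighted
unweighted
```

The two sets of means disagree because the cells do not all have the same n. In a balanced design they would be identical.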

Question 15) Which group had the best result from the therapy?
(AT / CB / Rog)

Did you pick Rog? High numbers are not always good! Here, the higher the deprscore, the more depression the subject has. For a good response, we want the deprscores to be low.

The Type II sums of squares might be considered correct here, especially since there is no hint of an interaction. Maxwell and Delaney recommended Type III sums of squares on the justification that they always recommend Type III for unbalanced designs. However, I think in this case they got it right. Patients were randomly assigned to therapy types resulting in 15 patients in each type of therapy. Obviously, no attempt was made to balance the design on severity as well, so the cell frequencies for severity would have to be considered "accidental." I don't see how anyone could claim that these cell frequencies represent something about the real world. I don't see a problem with using Type III sums of squares in this case. Since the interaction is minuscule, I'm betting they will give a result very similar to Type II. Let's find out.

> aovIII(deprscore ~ severity * therapy, data=X)
$contrasts
[1] "contr.sum"  "contr.poly"

Single term deletions

Model:
deprscore ~ severity * therapy
                 Df Sum of Sq    RSS    AIC F value    Pr(>F)    
                        1005.4 157.79                      
severity          2   1181.11 2186.5 188.75 21.1452 8.447e-07 ***
therapy           2    204.76 1210.2 162.13  3.6658   0.03556 *  
severity:therapy  4     14.19 1019.6 150.42  0.1270   0.97170    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$contrasts
[1] "contr.treatment" "contr.poly"

Question 16) Would this result have been different if the IVs had been entered in the opposite order (therapy * severity)?
(yes / no)

What is all that business about contrasts? It's a technical point, but you might as well learn it now, because you'll have to learn it eventually. When statistical software does an analysis such as an ANOVA, it codes factors internally as numbers. "You mean like dummy coding?" Yes, very much like dummy coding. If you dummy code a variable like gender as 0 and 1, that is essentially a treatment contrast: 0 is the control or baseline group and 1 is the treatment group. Treatment contrasts are the R default, but you cannot do a Type III ANOVA with treatment contrasts (which is one of the reasons the R people mistrust Type III tests). Type III ANOVA requires a type of contrasts called sum to zero, which means your genders would be coded -1 and +1. (Type I and Type II will work either way.) So before the aovIII calculations can be done, the function has to change the contrasts to contr.sum (sum to zero). Then at the end, the function changes the contrasts back to contr.treatment. I have them printed out to remind me that the function is changing how contrasts are set. You can ignore it. For now.
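If you'd like to see the two coding schemes side by side, base R will show them to you. This is only a sketch of the idea using a made-up two-level factor called sex; it is not the code inside aovIII(), which I am not showing here.

```r
sex = factor(c("female", "male", "female", "male"))   # made-up factor

contr.treatment(levels(sex))   # female = 0 (baseline), male = 1
contr.sum(levels(sex))         # female = +1, male = -1; the column sums to zero

# contrasts are set through options(); the first entry applies to ordinary
# (unordered) factors, the second to ordered factors
old = options(contrasts = c("contr.sum", "contr.poly"))
getOption("contrasts")
options(old)                   # put things back the way they were
getOption("contrasts")
```

Saving the old setting and restoring it afterward is the polite way to change an option, and it mirrors the before-and-after $contrasts lines in the aovIII() output.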

Questions 17-34) Using the information in the output above, construct a conventional ANOVA summary table for this analysis. (Copy the numbers into your new table EXACTLY as they appear above in the aovIII() output. Exceptions: give anything that has to be calculated and is not a whole number to 3 decimal places, and give any very small p-values as p < .001.)

df SS MS F p-value

severity

therapy

severity:therapy

Residuals

MSes are not in the aovIII output. I hope you remembered how to get MS from SS and df!
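In case a reminder helps, here is the arithmetic as a sketch in R. All four starting values are made up and are NOT taken from the output above.

```r
# made-up values, NOT taken from the aovIII() output
SS.effect = 120;  df.effect = 3
SS.error  = 400;  df.error  = 20

MS.effect = SS.effect / df.effect   # mean square = SS / df
MS.error  = SS.error / df.error
F.value   = MS.effect / MS.error    # F = MS.effect / MS.error
p.value   = pf(F.value, df.effect, df.error, lower.tail=FALSE)

round(c(MS.effect, MS.error, F.value, p.value), 3)
```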

Question 35) What is SS.total (3 accurate decimal places please)?

There's a catch! With Type III SSes, the effect and error SSes do NOT add to SS.total. You have to get SS.total by finding the sum of squares of the DV. That's what the SS() function is for. Type II SSes also don't add to SS.total. Only the Type I SSes add correctly to SS.total.

Confirm that the SSes in the table do not add to this value. Moral of the story: You cannot get SS.total by adding the SSes in a Type III ANOVA. They do not add correctly!
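Here is a sketch of the same point on made-up data. The data frame toy is invented for illustration. The Type III part uses sum-to-zero contrasts followed by drop1(), which is my assumption about a reasonable way to get Type III style tests in base R; it is suggested by the "Single term deletions" header in the output above but is not necessarily what aovIII() does internally.

```r
SS = function(x)  sum(x^2) - sum(x)^2 / length(x)

# made-up unbalanced 2x2 data, NOT the depression data
toy = data.frame(
  y = c(3, 5, 4, 8, 9, 7, 6, 10, 12, 11, 2, 6),
  A = factor(rep(c("a1","a2"), times=c(5,7))),
  B = factor(c("b1","b1","b1","b2","b2","b1","b1","b2","b2","b2","b2","b2"))
)
SS.total = SS(toy$y)

# Type I (sequential): effects plus residuals add up to SS.total
typeI = summary(aov(y ~ A * B, data=toy))[[1]]
sum.typeI = sum(typeI[["Sum Sq"]])

# Type III style (assumed approach): sum-to-zero contrasts, then drop each term
fit = aov(y ~ A * B, data=toy, contrasts=list(A="contr.sum", B="contr.sum"))
typeIII = drop1(fit, scope = . ~ ., test="F")
# first row is <none>: its Sum of Sq is NA and its RSS is the residual SS
sum.typeIII = sum(typeIII[["Sum of Sq"]], na.rm=TRUE) + typeIII[["RSS"]][1]

c(SS.total, sum.typeI, sum.typeIII)
```

The first two numbers agree. The third does not, which is the whole point.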

Questions 36-41) Assuming the correct value of SS.total is exactly 2374 (it's not), calculate the following values for eta-squared to 3 decimal places and evaluate the effect sizes. Appropriate words for the evaluation are trivial, small, moderate, large, and very large.

effect eta-sqr evaluation

severity

therapy

severity:therapy
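As a reminder of the arithmetic, eta-squared is an effect's sum of squares divided by SS.total. A sketch with made-up values, which are NOT the answers to the questions above:

```r
# made-up values, NOT the answers to the questions above
SS.effect = 150
SS.total  = 1000

eta.sq = SS.effect / SS.total   # proportion of total variability due to the effect
round(eta.sq, 3)
```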

For questions 42 through 50, possible correct answers are Type I, Type II, Type III, and none, all, or any.

Question 42) These sums of squares are sometimes referred to as simultaneous.

Question 43) These sums of squares are sometimes referred to as sequential.

Question 44) These sums of squares should be used when you want to see the total effects of your IVs.

Question 45) These sums of squares should be used when you have true experimental data and you want to remove all confounds between the IVs due to the unbalanced design.

Question 46) If the p-value for the interaction comes out to be something like p = 0.075, which sums of squares should you feel uncomfortable about using, remembering the rule that nonsignificant does not mean nonexistent?

Question 47) If the interaction effect is the ONLY effect that you are interested in (in a two-factor ANOVA), which type SSes should you use?

Question 48) If you are interested in what's true in the population, and your sample from that population is biased, which would be the best SSes to use?

Question 49) If the design is balanced, which SSes should be used?

Question 50) These sums of squares use weighted means to evaluate the first factor.