Psyc 480 Graded Exercise 7

PSYC 480 -- Dr. King

The Next To Last Graded Exercise (No. 7): ANCOVA

Read the Instructions! Type your first and last name and email address in the boxes indicated. No name, no credit! Enter your correct CCU email address. Without this I cannot return your exam to you.

First Name: Last Name:

Your Coastal e-mail address: CCU emails ONLY!

STOP! Seriously, stop and put your name and email address in the boxes. I may not get your submission if you don't.

Further Instructions. When you're asked to type a number in a box, type it EXACTLY as R has printed it out. If it's a number you have to calculate, you can round to two decimal places. Multiple choice answers are worth 1 point each. Numeric answers are worth 2 points each.

First analysis: The dataset is ptsd.txt.

# These data are those obtained by Rodriguez, N., Ryan, S. W., Vande
# Kemp, H., & Foy, D. W. (1997). Posttraumatic stress disorder in adult female
# survivors of childhood sexual abuse: A comparison study. Journal of Consulting
# and Clinical Psychology, 65, 53-59.
# Variables:
#    cpa - childhood physical abuse measured by a standardized assessment device
#    ptsd - adult posttraumatic stress disorder measured by a standardized
#           assessment device
#    csa - childhood sexual abuse (Abused/NotAbused)
# Question under investigation: In this sample, is PTSD a function of childhood
# sexual abuse, even after the effect of childhood physical abuse is removed?

> rm(list=ls())   # clears your workspace
> file = "http://ww2.coastal.edu/kingw/psyc480/data/ptsd.txt"
> PTSD = read.table(file=file, header=T, stringsAsFactors=T)
> summary(PTSD)
      cpa              ptsd               csa    
 Min.   :-3.120   Min.   :-3.350   Abused   :45  
 1st Qu.: 0.825   1st Qu.: 6.170   NotAbused:31  
 Median : 2.065   Median : 8.910                 
 Mean   : 2.354   Mean   : 8.985                 
 3rd Qu.: 3.735   3rd Qu.:12.238                 
 Max.   : 8.650   Max.   :18.990                 
> 
> # First, I'm going to recode csa as a dummy variable with Abused = 1.
> 
> PTSD$csa = as.numeric(PTSD$csa)   # codes Abused = 1, NotAbused = 2.
> PTSD$csa = PTSD$csa - 2   # now Abused = -1, NotAbused = 0.
> PTSD$csa = abs(PTSD$csa)   # finally Abused = 1, NotAbused = 0.
> # This could have been done in one step, but I'm not that clever.
> summary(PTSD)
      cpa              ptsd             csa        
 Min.   :-3.120   Min.   :-3.350   Min.   :0.0000  
 1st Qu.: 0.825   1st Qu.: 6.170   1st Qu.:0.0000  
 Median : 2.065   Median : 8.910   Median :1.0000  
 Mean   : 2.354   Mean   : 8.985   Mean   :0.5921  
 3rd Qu.: 3.735   3rd Qu.:12.238   3rd Qu.:1.0000  
 Max.   : 8.650   Max.   :18.990   Max.   :1.0000  
> 
> dim(PTSD)
[1] 76  3
> 
> cor(PTSD)
           cpa      ptsd       csa
cpa  1.0000000 0.4924450 0.3688162
ptsd 0.4924450 1.0000000 0.7205715
csa  0.3688162 0.7205715 1.0000000
> 
> # First, we establish that there is a relationship between ptsd and csa.
> # There are several ways we can do this. What are some of the others?
> 
> lm.csa_alone = lm(ptsd ~ csa, data=PTSD)
> summary(lm.csa_alone)

Call:
lm(formula = ptsd ~ csa, data = PTSD)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.0448 -2.3157  0.0922  2.1627  7.0493 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   4.6948     0.6237   7.528 1.01e-10 ***
csa           7.2458     0.8105   8.940 2.16e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.473 on 74 degrees of freedom
Multiple R-squared:  0.5192,	Adjusted R-squared:  0.5127 
F-statistic: 79.92 on 1 and 74 DF,  p-value: 2.162e-13

> # Now we'll do the full ANCOVA.
> 
> lm.ptsd = lm(ptsd ~ cpa + csa, data=PTSD)
> summary(lm.ptsd)

Call:
lm(formula = ptsd ~ cpa + csa, data = PTSD)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.1562 -2.3659 -0.1558  2.1461  7.1461 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.9746     0.6292   6.317 1.87e-08 ***
cpa           0.5507     0.1715   3.210  0.00197 ** 
csa           6.2728     0.8219   7.632 6.89e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.273 on 73 degrees of freedom
Multiple R-squared:  0.5787,	Adjusted R-squared:  0.5672 
F-statistic: 50.14 on 2 and 73 DF,  p-value: 1.984e-14

> # Finally, we'll get some confidence intervals for the regression coefficients.
> 
> confint(lm.ptsd)
                2.5 %    97.5 %
(Intercept) 2.7207003 5.2285886
cpa         0.2088206 0.8926112
csa         4.6348026 7.9107063
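
A side note on the recoding step near the top of the transcript: the three lines that turn csa into a 0/1 dummy can probably be collapsed into one. The sketch below uses a short made-up factor (not the ptsd.txt data) to compare the two approaches; the object names csa, three_step, and one_step are for illustration only.

```r
# Hypothetical stand-in for the csa column; the real data come from ptsd.txt.
csa <- factor(c("Abused", "NotAbused", "Abused", "NotAbused", "Abused"))

# Three-step version from the transcript. Factor levels are alphabetical,
# so as.numeric() gives Abused = 1 and NotAbused = 2.
three_step <- abs(as.numeric(csa) - 2)

# One-step version: the comparison yields TRUE/FALSE,
# and as.numeric() turns that into 1/0.
one_step <- as.numeric(csa == "Abused")

print(data.frame(csa, three_step, one_step))
```

Either version should be run on the factor as originally read in, before any other recoding has been applied to it.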

Answer the following questions based on this analysis.

1) What kind of analysis is this? (Seriously? You're asking this? Is this a trick? Yes, seriously, and it is not a trick. I will only accept the very best and most specific answer, however.)
A) ANCOVA
B) standard multiple regression
C) simple linear regression
D) curvilinear regression

2) How many women participated as subjects in this study? (Hint: two ways to get this answer. Try them both and see if they agree.)

3) What proportion of these women were sexually abused as children? (Hint: two ways to get this answer.)

4) What is the correlation in this sample between csa and cpa?

5) What kind of a correlation is this (in question 4)?
A) Pearson r
B) point-biserial
C) phi coefficient
D) none of the above

6) We might expect childhood physical abuse and childhood sexual abuse to go together to some extent, i.e., if there is one, then there is the other. Is that what this correlation (question 4) is telling us?
A) no
B) yes

7) Before adjusting for (removing) cpa, what was the difference between the mean ptsd for women who were sexually abused as children and women who were not?

8) What was this difference after adjusting for (removing) cpa?

9) The first part of the main analysis was to establish that there is a significant relationship between csa and ptsd in this sample. I did that using lm() or regression. How else could I have done it? (Hint: If you're uncertain, you can always try these alternatives and see if they work.)
A) by using t.test()
B) by using cor.test()
C) by using aov()
D) any of the above would have worked

10-11) In the final model (with both cpa and csa), what is the regression equation for women who were sexually abused as children? (Hint: csa=1.)
ptsd.hat = ____ + ____ * cpa

12-13) In the final model, what is the regression equation for women who were not sexually abused as children? (Hint: csa=0.)
ptsd.hat = ____ + ____ * cpa

14) What proportion of the total variability in ptsd scores is accounted for by the two predictors in this model?

15) Of that amount (question 14), how much was accounted for by adding csa into the model in addition to cpa?
A) none
B) 50.14%
C) all of it
D) can't say from the information given

16) In the final model, was the adjusted mean difference significantly different from zero? If so, give the p-value.
A) no
B) yes, p = 0.00197
C) yes, p < 0.001 or p = 0.000
D) yes, but the p-value isn't given in the output of the analysis

17-18) From this analysis, we can be 95% confident that the true adjusted mean difference in the population is between what two values?
LL = ____   UL = ____

Second analysis: The dataset is still ptsd.txt.

Now we'll do the same analysis using aov().

> aov.out = aov(ptsd ~ cpa + csa, data=PTSD)
> summary(aov.out)
            Df Sum Sq Mean Sq F value   Pr(>F)    
cpa          1  450.1   450.1   42.02 9.36e-09 ***
csa          1  624.0   624.0   58.25 6.89e-11 ***
Residuals   73  781.9    10.7                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
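
Notice that the csa line shows the same p-value here (6.89e-11) as it did in the lm() summary. For the last term entered into the model, the F from aov() should equal the square of the t from lm(). The sketch below checks that on simulated data; the variables x, g, and y and the seed are made up and have nothing to do with ptsd.txt.

```r
set.seed(480)   # arbitrary seed so the made-up data are reproducible
n <- 60
x <- rnorm(n)                          # hypothetical numeric covariate
g <- rbinom(n, 1, plogis(x))           # hypothetical 0/1 group, related to x
y <- 4 + 0.5 * x + 3 * g + rnorm(n)    # hypothetical response

fit.lm  <- lm(y ~ x + g)
fit.aov <- aov(y ~ x + g)

# t value for g from the regression coefficient table
t.g <- summary(fit.lm)$coefficients["g", "t value"]

# F value for g from the ANOVA table; g is the second row.
# Rows are indexed by position because summary.aov() pads the row names.
F.g <- summary(fit.aov)[[1]][["F value"]][2]

print(c(t.squared = t.g^2, F = F.g))
```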

19) What kind of analysis is this? (Seriously? Yes, seriously!)
A) ANCOVA
B) standard multiple regression
C) simple linear regression
D) curvilinear regression

20) I checked the interaction. It's not significant, p = 0.4. Is that important?
A) no
B) yes, it means we are justified in leaving out the interaction effect
C) yes, it means we are doing the wrong analysis
D) yes, it means we can't use aov() for this analysis

21) When we leave out the interaction term, where does its variability (SS=7.7) go?
A) it just vanishes
B) it is divided up amongst the other two effects
C) it goes into Residuals (error)
D) nobody knows (oooh, spooky!)

22) Why is the effect of cpa different here (p=9.36e-09) than it was in the lm() analysis (p=0.00197)?
A) lm() tests cpa with csa removed; aov() tests cpa ignoring csa
B) lm() tests cpa ignoring csa; aov() tests cpa with csa removed
C) gee, that is a mystery!
D) it's not; those are the same p-values

23) Calculate an eta-squared for cpa. (You will learn in the next unit of this course that this is called delta-R-squared in this kind of analysis.)

24) Calculate an eta-squared (delta-R-squared) for csa.

25) If you add these two eta-squareds together, do they equal the multiple R-squared (to within rounding error) that we got in the lm() analysis?
A) yes
B) no

26) Is it possible to tell, just from this second analysis, how many subjects were used in this study?
A) yes
B) no
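
The interaction check mentioned in question 20 is not shown in the output. One way it could be done is to fit the model with and without the interaction term and compare the two. The sketch below does that on simulated data; x, g, y, and the seed are made up, and this is not necessarily how the check was run for this exercise.

```r
set.seed(7)   # arbitrary seed so the made-up data are reproducible
n <- 60
x <- rnorm(n)                          # hypothetical numeric covariate
g <- rep(c(0, 1), each = n / 2)        # hypothetical 0/1 group
y <- 4 + 0.5 * x + 3 * g + rnorm(n)    # generated WITHOUT an interaction

fit.add <- lm(y ~ x + g)   # additive model (no interaction)
fit.int <- lm(y ~ x * g)   # same model plus the x:g interaction

# Model comparison: tests whether adding the interaction improves the fit.
cmp <- anova(fit.add, fit.int)
print(cmp)

p.int <- cmp[2, "Pr(>F)"]  # p-value for the interaction term
print(p.int)
```

With a single interaction term, this p-value should match the one on the x:g line of summary(fit.int).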

Use whichever part of the analysis is necessary to get these answers. They are worth 2 points each.

27) What is the mean of the ptsd variable?

28) What is the variance of the ptsd variable?

29) Is there any reason to believe that the ptsd variable is seriously skewed?
(yes/no)

30) If csa and cpa are both zero, what is the predicted value of ptsd?
ptsd.hat = ____

Have a nice day! The last two points are free PROVIDED you put your name and email address on the exam before submitting it!!

Message or Note (if any; keep it short!):

FINISHED: no yes

CAUTION: If you click the submit button before indicating you are finished,
your answers will be reset and you'll have to do the whole thing over again!
