Read the Instructions! Type your first and last name and email address in the boxes indicated. No name, no credit! Enter your correct CCU email address. Without this I cannot return your exam to you.
First Name: Last Name:
Your Coastal e-mail address: CCU emails ONLY!
STOP! Seriously, stop and put your name and email address in the boxes. I may not get your submission if you don't.
Further Instructions. When you're asked to type a number in a box, type it EXACTLY as R has printed it out. If it's a number you have to calculate, you can round to two decimal places. Multiple choice answers are worth 1 point each. Numeric answers are worth 2 points each.
First analysis: The dataset is ptsd.txt.
# These data are those obtained by Rodriguez, N., Ryan, S. W., Vande
# Kemp, H., & Foy, D. W. (1997). Posttraumatic stress disorder in adult female
# survivors of childhood sexual abuse: A comparison study. Journal of Consulting
# and Clinical Psychology, 65, 53-59.
# Variables:
#   cpa  - childhood physical abuse measured by a standardized assessment device
#   ptsd - adult posttraumatic stress disorder measured by a standardized
#          assessment device
#   csa  - childhood sexual abuse (Abused/NotAbused)
# Question under investigation: In this sample, is PTSD a function of childhood
# sexual abuse, even after the effect of childhood physical abuse is removed?

> rm(list=ls())      # clears your workspace
> file = "http://ww2.coastal.edu/kingw/psyc480/data/ptsd.txt"
> PTSD = read.table(file=file, header=T, stringsAsFactors=T)
> summary(PTSD)
      cpa              ptsd               csa
 Min.   :-3.120   Min.   :-3.350   Abused   :45
 1st Qu.: 0.825   1st Qu.: 6.170   NotAbused:31
 Median : 2.065   Median : 8.910
 Mean   : 2.354   Mean   : 8.985
 3rd Qu.: 3.735   3rd Qu.:12.238
 Max.   : 8.650   Max.   :18.990
>
> # First, I'm going to recode csa as a dummy variable with Abused = 1.
>
> PTSD$csa = as.numeric(PTSD$csa)   # codes Abused = 1, NotAbused = 2.
> PTSD$csa = PTSD$csa - 2           # now Abused = -1, NotAbused = 0.
> PTSD$csa = abs(PTSD$csa)          # finally Abused = 1, NotAbused = 0.
> # This could have been done in one step, but I'm not that clever.
> summary(PTSD)
      cpa              ptsd             csa
 Min.   :-3.120   Min.   :-3.350   Min.   :0.0000
 1st Qu.: 0.825   1st Qu.: 6.170   1st Qu.:0.0000
 Median : 2.065   Median : 8.910   Median :1.0000
 Mean   : 2.354   Mean   : 8.985   Mean   :0.5921
 3rd Qu.: 3.735   3rd Qu.:12.238   3rd Qu.:1.0000
 Max.   : 8.650   Max.   :18.990   Max.   :1.0000
>
> dim(PTSD)
[1] 76  3
>
> cor(PTSD)
           cpa      ptsd       csa
cpa  1.0000000 0.4924450 0.3688162
ptsd 0.4924450 1.0000000 0.7205715
csa  0.3688162 0.7205715 1.0000000
>
> # First, we establish that there is a relationship between ptsd and csa.
> # There are several ways we can do this. What are some of the others?
>
> lm.csa_alone = lm(ptsd ~ csa, data=PTSD)
> summary(lm.csa_alone)

Call:
lm(formula = ptsd ~ csa, data = PTSD)

Residuals:
    Min      1Q  Median      3Q     Max
-8.0448 -2.3157  0.0922  2.1627  7.0493

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.6948     0.6237   7.528 1.01e-10 ***
csa           7.2458     0.8105   8.940 2.16e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.473 on 74 degrees of freedom
Multiple R-squared:  0.5192,    Adjusted R-squared:  0.5127
F-statistic: 79.92 on 1 and 74 DF,  p-value: 2.162e-13

> # Now we'll do the full ANCOVA.
>
> lm.ptsd = lm(ptsd ~ cpa + csa, data=PTSD)
> summary(lm.ptsd)

Call:
lm(formula = ptsd ~ cpa + csa, data = PTSD)

Residuals:
    Min      1Q  Median      3Q     Max
-8.1562 -2.3659 -0.1558  2.1461  7.1461

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.9746     0.6292   6.317 1.87e-08 ***
cpa           0.5507     0.1715   3.210  0.00197 **
csa           6.2728     0.8219   7.632 6.89e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.273 on 73 degrees of freedom
Multiple R-squared:  0.5787,    Adjusted R-squared:  0.5672
F-statistic: 50.14 on 2 and 73 DF,  p-value: 1.984e-14

> # Finally, we'll get some confidence intervals for the regression coefficients.
>
> confint(lm.ptsd)
                2.5 %    97.5 %
(Intercept) 2.7207003 5.2285886
cpa         0.2088206 0.8926112
csa         4.6348026 7.9107063
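A comment in the transcript says the dummy recoding of csa "could have been done in one step." Here is a minimal sketch of one way that might look. It uses a small made-up factor (the name `csa` and its five values are hypothetical stand-ins, not the real ptsd.txt data) and checks the result against the three-step version used in the analysis.

```r
# Hypothetical toy factor standing in for PTSD$csa (NOT the real data).
csa <- factor(c("Abused", "NotAbused", "Abused", "NotAbused", "Abused"))

# One-step dummy coding: the comparison gives TRUE/FALSE,
# and as.numeric() converts that to 1/0 (Abused = 1, NotAbused = 0).
csa_dummy <- as.numeric(csa == "Abused")

# The three steps from the transcript, collapsed into one expression,
# for comparison: factor codes 1/2, minus 2, then absolute value.
csa_steps <- abs(as.numeric(csa) - 2)

print(csa_dummy)
print(identical(csa_dummy, csa_steps))
```

Comparing against the level name rather than relying on the internal factor codes has the small advantage that the result does not depend on the alphabetical ordering of the levels.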
Answer the following questions based on this analysis.
1) What kind of analysis is this? (Seriously? You're asking this? Is this a trick? Yes, seriously, and it is not a trick. I will only accept the very best and most specific answer, however.)
A) ANCOVA
B) standard multiple regression
C) simple linear regression
D) curvilinear regression
2) How many women participated as subjects in this study? (Hint: two ways to get this answer. Try them both and see if they agree.)
3) What proportion of these women were sexually abused as children? (Hint: two ways to get this answer.)
4) What is the correlation in this sample between csa and cpa?
5) What kind of a correlation is this (in question 4)?
A) Pearson r
B) point-biserial
C) phi coefficient
D) none of the above
6) We might expect childhood physical abuse and childhood sexual abuse to go together to some extent, i.e., if there is one, then there is the other. Is that what this correlation (question 4) is telling us?
A) no
B) yes
7) Before adjusting for (removing) cpa, what was the difference between the mean ptsd for women who were sexually abused as children and women who were not?
8) What was this difference after adjusting for (removing) cpa?
9) The first part of the main analysis was to establish that there is a significant relationship between csa and ptsd in this sample. I did that using lm() or regression. How else could I have done it? (Hint: If you're uncertain, you can always try these alternatives and see if they work.)
A) by using t.test()
B) by using cor.test()
C) by using aov()
D) any of the above would have worked
10-11) In the final model (with both cpa and csa), what is the regression equation for women who were sexually abused as children? (Hint: csa=1.)
ptsd.hat = ____ + ____ * cpa
12-13) In the final model, what is the regression equation for women who were not sexually abused as children? (Hint: csa=0.)
ptsd.hat = ____ + ____ * cpa
14) What proportion of the total variability in ptsd scores is accounted for by the two predictors in this model?
15) Of that amount (question 14), how much was accounted for by adding csa into the model in addition to cpa?
A) none
B) 50.14%
C) all of it
D) can't say from the information given
16) In the final model, was the adjusted mean difference significantly different from zero? If so, give the p-value.
A) no
B) yes, p = 0.00197
C) yes, p < 0.001 or p = 0.000
D) yes, but the p-value isn't given in the output of the analysis
17-18) From this analysis, we can be 95% confident that the true adjusted mean difference in the population is between what two values?
LL = ____ UL = ____
Second analysis: The dataset is still ptsd.txt.
Now we'll do the same analysis using aov().
> aov.out = aov(ptsd ~ cpa + csa, data=PTSD)
> summary(aov.out)
            Df Sum Sq Mean Sq F value   Pr(>F)
cpa          1  450.1   450.1   42.02 9.36e-09 ***
csa          1  624.0   624.0   58.25 6.89e-11 ***
Residuals   73  781.9    10.7
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
19) What kind of analysis is this? (Seriously? Yes, seriously!)
A) ANCOVA
B) standard multiple regression
C) simple linear regression
D) curvilinear regression
20) I checked the interaction. It's not significant, p = 0.4. Is that important?
A) no
B) yes, it means we are justified in leaving out the interaction effect
C) yes, it means we are doing the wrong analysis
D) yes, it means we can't use aov() for this analysis
21) When we leave out the interaction term, where does its variability (SS=7.7) go?
A) it just vanishes
B) it is divided up amongst the other two effects
C) it goes into Residuals (error)
D) nobody knows (oooh, spooky!)
22) Why is the effect of cpa different here (p=9.36e-09) than it was in the lm() analysis (p=0.00197)?
A) lm() tests cpa with csa removed; aov() tests cpa ignoring csa
B) lm() tests cpa ignoring csa; aov() tests cpa with csa removed
C) gee, that is a mystery!
D) it's not; those are the same p-values
23) Calculate an eta-squared for cpa. (You will learn in the next unit of this course that this is called delta-R-squared in this kind of analysis.)
24) Calculate an eta-squared (delta-R-squared) for csa.
25) If you add these two eta-squareds together, do they equal the multiple R-squared (to within rounding error) that we got in the lm() analysis?
A) yes
B) no
26) Is it possible to tell, just from this second analysis, how many subjects were used in this study?
A) yes
B) no
Use whichever part of the analysis is necessary to get these answers. They are worth 2 points each.
27) What is the mean of the ptsd variable?
28) What is the variance of the ptsd variable?
29) Is there any reason to believe that the ptsd variable is seriously skewed? (yes/no)
30) If csa and cpa are both zero, what is the predicted value of ptsd? ptsd.hat =
Have a nice day! The last two points are free PROVIDED you put your name and email address on the exam before submitting it!!
CAUTION: If you click the submit button before indicating you are finished, your answers will be reset and you'll have to do the whole thing over again!