PSYC 480 -- Dr. King Some 2x2 Contingency Tables 1) hiring by visible tattoo status (Brallier, et al., 2011) > Freq = scan() 1: 39 36 19 7 44 29 16 2 9: Read 8 items > gender = gl(2, 4, 8, labels=c("male","female")) > gender [1] male male male male female female female female Levels: male female > hire = gl(2, 2, 8, labels=c("yes","no")) > hire [1] yes yes no no yes yes no no Levels: yes no > tattoo = gl(2, 1, 8, labels=c("tattooed","not.tattooed")) > tattoo [1] tattooed not.tattooed tattooed not.tattooed tattooed [6] not.tattooed tattooed not.tattooed Levels: tattooed not.tattooed > tatt = data.frame(tattoo, hire, gender, Freq) > tatt tattoo hire gender Freq 1 tattooed yes male 39 2 not.tattooed yes male 36 3 tattooed no male 19 4 not.tattooed no male 7 5 tattooed yes female 44 6 not.tattooed yes female 29 7 tattooed no female 16 8 not.tattooed no female 2 > conting.tab = xtabs(Freq ~ tattoo + hire + gender, data=tatt) > conting.tab , , gender = male hire tattoo yes no tattooed 39 19 not.tattooed 36 7 , , gender = female hire tattoo yes no tattooed 44 16 not.tattooed 29 2 > rm(Freq,gender,hire,tattoo) # why? > ls() [1] "conting.tab" "tatt" Notes: gl() is the "generate levels" function for creating categorical variables (i.e., factors) when the levels of the factor occur in a repetitive order. The first number is the number of levels, the second number is how often they occur in sequence or at a time (for example, male, male, female, female, would be 2,2), and the third number is the length of vector containing the factor (so 2,2,4 would be the complete specs in the above example). =============================================================================== 2) survival by sex aboard Titanic (R built-in dataset) > surv.by.sex = margin.table(Titanic, margin=c(2,4)) > surv.by.sex Survived Sex No Yes Male 1364 367 Female 126 344 Note: creating a custom function in R. > odds = function(tabl) (tabl[1,1]*tabl[2,2])/(tabl[2,1]*tabl[1,2]) > odds(surv.by.sex) [1] 10.14697 This is called the "sample odds ratio" and is known to be a somewhat biased estimator of odds ratio in the population, especially when cell frequencies are small (not the case here!). R will calculate "conditional maximum likelihood odds ratio," an unbiased estimate of population odds ratio, as part of its calculation of the Fisher Exact Test. > fisher.test(surv.by.sex) Fisher's Exact Test for Count Data data: surv.by.sex p-value < 2.2e-16 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 7.97665 12.92916 sample estimates: odds ratio 10.13190 If you give an odds ratio in a paper, it is assumed to be the sample odds ratio unless you say otherwise. So be sure to say so if you are giving odds calculated in any other way! =============================================================================== 3) horn honking by status of frustrator (Doob & Gross, 1968) > honk.table = array(data=c(18,32,18,6), dim=c(2,2), + dimnames=list("frustrator"=c("high.status","low.status"), + "result"=c("honk","no.honk")) + ) > honk.table result frustrator honk no.honk high.status 18 18 low.status 32 6 > odds(honk.table) [1] 0.1875 > 1/odds(honk.table) [1] 5.333333 > fisher.test(honk.table) Fisher's Exact Test for Count Data data: honk.table p-value = 0.002595 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 0.05239344 0.61835961 sample estimates: odds ratio 0.1921131 =============================================================================== A brief intermission during which we will do some repair work on the workspace. > tatt.df=tatt > tattoo.table=conting.tab > rm(conting.tab,tatt) > ls() [1] "honk.table" "odds" "surv.by.sex" "tatt.df" "tattoo.table" =============================================================================== 4) belief in afterlife by gender (GSS 1991, Agresti, p.17) > afterlife.table = array(data=c(435,375,147,134), dim=c(2,2), + dimnames=list("gender"=c("females","males"), + "belief"=c("yes","no.undecided"))) > afterlife.table belief gender yes no.undecided females 435 147 males 375 134 =============================================================================== 5) heart attack by aspirin use (Agresti, p.20, for ref) > aspirin.table = array(data=c(189,104,10845,10933), dim=c(2,2), + dimnames=list("group"=c("placebo","aspirin"),"heart.attack"=c("yes","no"))) > aspirin.table heart.attack group yes no placebo 189 10845 aspirin 104 10933 =============================================================================== 6) colds by vitamin C (1000 mg/day) use (Ramsey, p.517, for ref) this is an example of a "prospective study" > colds.table = array(data=c(335,302,76,105), dim=c(2,2), + dimnames=list("group"=c("placebo","vit.C"),"outcome"=c("cold","no.cold"))) > colds.table outcome group cold no.cold placebo 335 76 vit.C 302 105 =============================================================================== 7) lung cancer by cigarette smoking (Ramsey, p.518, for ref) this is an example of a "retrospective study") question: Why is the odds ratio better than a likelihood ratio here? > smoke.table = array(data=c(83,3,72,14), dim=c(2,2), + dimnames=list("group"=c("smokers","nonsmokers"), + "outcome"=c("cancer","control"))) > smoke.table outcome group cancer control smokers 83 72 nonsmokers 3 14 =============================================================================== 8) injury by use of seatbelts (Florida DMV 1988, Agresti, p.47, for ref) > seatbelts.table = array(data=c(1601,510,162527,412368), dim=c(2,2), + dimnames=list("group"=c("no.seatbelt","seatbelt"), + "injury"=c("fatal","nonfatal"))) > seatbelts.table injury group fatal nonfatal no.seatbelt 1601 162527 seatbelt 510 412368 questions: Retrospective or prospective? Likelihood okay? =============================================================================== 9) living with parents by gender (young adults age 19-25, see Moore 5e, p.524) > parents.table = array(data=c(986,923,1267,1706), dim=c(2,2), + dimnames=list("gender"=c("male","female"),"live.w.parents"=c("yes","no"))) > parents.table live.w.parents gender yes no male 986 1267 female 923 1706 =============================================================================== 10) survival of patients with coronary heart disease by pet ownership (survival after one year, Moore & McCabe 3e, p.646, for ref) > CHD.table = array(data=c(50,28,3,11), dim=c(2,2), + dimnames=list("pet.ownership"=c("yes","no"), + "patient.status"=c("alive","dead"))) > CHD.table patient.status pet.ownership alive dead yes 50 3 no 28 11 =============================================================================== 11) sentencing by race of defendant on trial for murder of white victim (data from NC 1993-1997, Howell 7e, p.145, for ref) > sentencing.table = array(data=c(33,33,251,508), dim=c(2,2), + dimnames=list("race.of.defendant"=c("nonwhite","white"), + "death.sentence"=c("yes","no"))) > sentencing.table death.sentence race.of.defendant yes no nonwhite 33 251 white 33 508 ===============================================================================