| Table of Contents | Function Reference | Function Finder | R Project |

RELATED MEASURES t TEST

Syntax

The syntax for the t.test() function is given here from the help page in R.

```
## Default S3 method:
t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 0.95, ...)

## S3 method for class 'formula':
t.test(formula, data, subset, na.action, ...)
```

"S3" refers to the S language (version 3), which is often the same as the methods and syntax used by R. In the case of the t.test() function, there are two alternative syntaxes, the default, and the "formula" syntax. Both syntaxes are relevant to the two-sample t-tests. The default syntax requires two data vectors, "x" and "y", to be specified. To get the dependent measures t-test, the option "paired=" must be set to TRUE, which is not the default. The "alternative=" option is set by default to "two.sided" but can be set to any of the three values shown above. The default null hypothesis is "mu = 0", which in this case should be read as "mu1-mu2=0". This is usually what we want, but doesn't have to be, and should be changed by the user if this is not the null hypothesis. The rest is either irrelevant to this tutorial or can be ignored for the moment.

The t Test With Two Dependent Groups

Dependent groups can be the same subjects used again (repeated measures), or they can be matched samples. Either way, the t-test is performed on the difference scores and amounts to little more than a single sample t-test. A normal distribution of difference scores is strongly encouraged, unless the sample is large enough that you can hide behind the central limit theorem and claim a normal sampling distribution of means of the differences, in which case the t-test is robust to violations of the normality assumption. If you are doing the test on the difference scores, see the Single Sample t Test tutorial.
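The claim that the dependent-groups test "amounts to little more than a single sample t-test" can be checked directly. The following is only a sketch: the scores are made up for five hypothetical subjects, and the object names (before, after, paired.res, single.res, t.by.hand) are mine, not from any data set used in this tutorial.

```r
# Made-up scores for five hypothetical subjects (illustration only)
before <- c(10, 12, 9, 14, 11)
after  <- c(13, 15, 9, 18, 14)

# Dependent-groups t-test through the default interface
paired.res <- t.test(after, before, paired = TRUE)

# Single-sample t-test on the difference scores
single.res <- t.test(after - before, mu = 0)

# The same t statistic computed by hand: mean(d) / (sd(d) / sqrt(n))
d <- after - before
t.by.hand <- mean(d) / (sd(d) / sqrt(length(d)))

# All three of these should agree
unname(paired.res$statistic)
unname(single.res$statistic)
t.by.hand
```

The degrees of freedom are the number of pairs minus one in both versions, which is why the two tests report identical p-values.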
If the groups are represented by two vectors (or columns in a data frame), then it is just a matter of setting the "paired=" option in the t.test() function to TRUE. One caution: If the data are two vectors in your workspace, you need to remember that the scores are paired, i.e., score 1 in the "x" vector is paired with score 1 in the "y" vector, score 2 with score 2, etc. The scores must be kept in the correct "paired" order in the two vectors.

```
> data(anorexia, package="MASS")       # weight gain (lbs.) in anorexic women
> attach(anorexia)
> str(anorexia)
'data.frame':   72 obs. of  3 variables:
 $ Treat : Factor w/ 3 levels "CBT","Cont","FT": 2 2 2 2 2 2 2 2 2 2 ...
 $ Prewt : num  80.7 89.4 91.8 74 78.1 88.3 87.3 75.1 80.6 78.4 ...
 $ Postwt: num  80.2 80.1 86.4 86.3 76.1 78.1 75.1 86.7 73.5 84.6 ...
> ft = subset(anorexia, subset=(Treat=="FT"))    # just the family therapy treatment
> ft
   Treat Prewt Postwt
56    FT  83.8   95.2
57    FT  83.3   94.3
58    FT  86.0   91.5
59    FT  82.5   91.9
60    FT  86.7  100.3
61    FT  79.6   76.7
62    FT  76.9   76.8
63    FT  94.2  101.6
64    FT  73.4   94.9
65    FT  80.5   75.2
66    FT  81.6   77.8
67    FT  82.1   95.5
68    FT  77.6   90.7
69    FT  83.5   92.5
70    FT  89.9   93.8
71    FT  86.0   91.7
72    FT  87.3   98.0
> detach(anorexia)
> rm(anorexia)
```

The anorexia data frame has been retrieved from the "MASS" package, and the data corresponding to the Family Therapy treatment have been extracted in a new data frame called "ft" (and we cleaned up after ourselves). A note: in the subset() function, "subset=" is the second argument in the default syntax, so this command could have been written somewhat more simply and logically as: ft=subset(anorexia,Treat=="FT"). Note the double equals sign, which tests for equality; a single equals sign here would not do the comparison. The data frame does not have to be attached for this to work, because subset() looks up Treat inside the data frame it is given. If you index with square brackets instead, you must write out anorexia$Treat=="FT". At this point, "ft" is set up in such a way that a test on the difference scores would be easy.
```
> t.test(Postwt-Prewt, mu=0, data=ft, alternative="greater")
Error in t.test(Postwt - Prewt, mu = 0, data = ft, alternative = "greater") : 
  object "Postwt" not found
```

EXCEPT there is no "data=" option unless we are using the formula interface. Drat!

```
> with(ft, t.test(Postwt-Prewt, mu=0, alternative="greater"))   # we could also attach(ft)

        One Sample t-test

data:  Postwt - Prewt
t = 4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 4.233975      Inf
sample estimates:
mean of x 
 7.264706
```

The null hypothesis is rejected at any reasonable alpha level. The 95% CI tells us the true mean difference is 4.23 or more with 95% confidence, and the sample mean difference is reported as 7.26 lbs. I.e., women receiving family therapy for anorexia gained, on average, 7.26 pounds during the treatment period. If you want a different confidence level, set that with the "conf.level=" option. The same result will be obtained from the dependent t-test.

```
> with(ft, t.test(Postwt, Prewt, paired=T, alternative="greater"))

        Paired t-test

data:  Postwt and Prewt
t = 4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 4.233975      Inf
sample estimates:
mean of the differences 
               7.264706
```

In the first version of the test, the single-sample version, we let R do the subtraction to get the difference scores right inside the t.test() function. No use in creating a new data object that we will just have to discard anyway. In the second version, we listed the two vectors individually with a comma between them. This works as long as Xi in the first vector corresponds to Yi in the second vector, and as long as the "paired=T" option is set. (Otherwise, we will get an independent t-test.) Notice R subtracts first group listed minus second group listed.
This determined that the alternative should be set as "greater" in this case, since we expect the patients to gain weight during the treatment period.

A formula interface is also available for instances where the data frame is arranged in long form. Our current data frame is very simple. It contains two columns of scores representing the two times at which patients were measured (and a useless Treat column). Getting such a data frame into long form is relatively easy.

```
> ft$Treat = NULL                      # delete unnecessary columns
> ft.long = stack(ft)
> ft.long
   values    ind
1    83.8  Prewt
2    83.3  Prewt
3    86.0  Prewt
4    82.5  Prewt
5    86.7  Prewt
6    79.6  Prewt
7    76.9  Prewt
8    94.2  Prewt
9    73.4  Prewt
10   80.5  Prewt
11   81.6  Prewt
12   82.1  Prewt
13   77.6  Prewt
14   83.5  Prewt
15   89.9  Prewt
16   86.0  Prewt
17   87.3  Prewt
18   95.2 Postwt
19   94.3 Postwt
20   91.5 Postwt
21   91.9 Postwt
22  100.3 Postwt
23   76.7 Postwt
24   76.8 Postwt
25  101.6 Postwt
26   94.9 Postwt
27   75.2 Postwt
28   77.8 Postwt
29   95.5 Postwt
30   90.7 Postwt
31   92.5 Postwt
32   93.8 Postwt
33   91.7 Postwt
34   98.0 Postwt
```

For the t-test, that's all we need. (If there were more than two groups, or two measurement times, this would be more complex.) In such a data frame, however, the subjects' scores must remain paired. I.e., the first Prewt must correspond to--be from the same (or possibly matched) subject--as the first Postwt, the second Prewt with the second Postwt, etc. If that's NOT the case, then we would need a subject identifier in the data frame, AND we would need to do a different test. We could also rename the columns if we want to, but it's not mandatory.

```
> t.test(values ~ ind, paired=T, alternative="less", data=ft.long)   ### THIS IS WRONG!

        Paired t-test

data:  values by ind
t = 4.1849, df = 16, p-value = 0.9996
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
     -Inf 10.29544
sample estimates:
mean of the differences 
               7.264706
```

We got a little confused about the order in which the subtraction would be done. When R is fed a factor as part of a formula, it arranges the factor levels in alphabetical order. A summary will confirm this.

```
> summary(ft.long)
     values           ind    
 Min.   : 73.40   Postwt:17  
 1st Qu.: 80.78   Prewt :17  
 Median : 86.35              
 Mean   : 86.86              
 3rd Qu.: 93.47              
 Max.   :101.60              
```

And R subtracts first level minus second level, i.e., in this case, Postwt minus Prewt. That would make the correct alternative "greater". (The order in which the factor levels occur in the data frame is irrelevant.)

```
> t.test(values ~ ind, paired=T, alternative="greater", data=ft.long)

        Paired t-test

data:  values by ind
t = 4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 4.233975      Inf
sample estimates:
mean of the differences 
               7.264706
```

There we go. But, WARNING, the factor levels don't have to be seen by R in alphabetical order! So it's always a good idea to check.

```
> levels(ft.long$ind)
[1] "Postwt" "Prewt"
```

Why might they be in some other order? Because WE, the humans, put them in some other order. Unlike a lot of modern software, R does what it's told to do by the humans who are using it! Well... usually it does! Don't do what I'm about to do!

```
> ### WARNING: DO NOT DO THIS! THIS IS VERY BAD!! ###
> levels(ft.long$ind) = c("Prewt","Postwt")
> levels(ft.long$ind)
[1] "Prewt"  "Postwt"
> summary(ft.long)
     values           ind    
 Min.   : 73.40   Prewt :17  
 1st Qu.: 80.78   Postwt:17  
 Median : 86.35              
 Mean   : 86.86              
 3rd Qu.: 93.47              
 Max.   :101.60              
```

The joke is certainly on me! Turns out that assigning to levels() not only relevels the variable, it also relabels it. I.e., every score that was labeled Postwt is now labeled Prewt and vice versa, so it flips the "ind" vector without flipping the "values" vector!
That is very bad. (It is not actually a bug, much as I would like it to be one. Assigning to levels() is documented to rename the existing levels in place, not to reorder them.) So what is the "correct" way to do it? Okay, you can do this one, but be very careful with your typing. (And remember, if you have the data frame attached, detach it first.)

```
> ft.long$ind = factor(ft.long$ind, levels=c("Prewt","Postwt"))   ### WARNING: DON'T MISTYPE!
> ### Now the original test is the correct one!
> t.test(values ~ ind, paired=T, alternative="less", data=ft.long)

        Paired t-test

data:  values by ind
t = -4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
      -Inf -4.233975
sample estimates:
mean of the differences 
              -7.264706 

> ### Because...
> levels(ft.long$ind)
[1] "Prewt"  "Postwt"
> ### ... the factor has been releveled without the data frame being altered!
```

When you are fooling with the levels of a factor, DON'T MISTYPE! A level name in "levels=" that does not exactly match a value in the data turns those values into NA, so a typing mistake here could wipe out your variable in the data frame. Take my word for that! I've done it! (Sadly, more than once!)

Now the subtraction will be in the order Prewt minus Postwt, and the original test with alternative="less" is the correct one. Confusing! But that is the price you pay for having repeated measures factors! Just be sure you know the order in which R is going to see your factor levels. You can do that with levels() WITHOUT AN ASSIGNMENT, or with summary().

Sane people would not organize their data frame as we have above. Instead, sane people would keep the data from each subject together on contiguous lines of the data frame, like this.

```
> ### If you want to follow along, type the following to reorder the data frame.
###
> ft.long$subjects = rep(LETTERS[1:17],2)
> ft.long = ft.long[order(ft.long$subjects),]
> ft.long
   values    ind subjects
1    83.8  Prewt        A
18   95.2 Postwt        A
2    83.3  Prewt        B
19   94.3 Postwt        B
3    86.0  Prewt        C
20   91.5 Postwt        C
4    82.5  Prewt        D
21   91.9 Postwt        D
5    86.7  Prewt        E
22  100.3 Postwt        E
6    79.6  Prewt        F
23   76.7 Postwt        F
7    76.9  Prewt        G
24   76.8 Postwt        G
8    94.2  Prewt        H
25  101.6 Postwt        H
9    73.4  Prewt        I
26   94.9 Postwt        I
10   80.5  Prewt        J
27   75.2 Postwt        J
11   81.6  Prewt        K
28   77.8 Postwt        K
12   82.1  Prewt        L
29   95.5 Postwt        L
13   77.6  Prewt        M
30   90.7 Postwt        M
14   83.5  Prewt        N
31   92.5 Postwt        N
15   89.9  Prewt        O
32   93.8 Postwt        O
16   86.0  Prewt        P
33   91.7 Postwt        P
17   87.3  Prewt        Q
34   98.0 Postwt        Q
```

It doesn't matter as far as the t.test() function is concerned. That function will pair the first value it finds of Prewt with the first value it finds of Postwt, the second value with the second value, etc.

```
> levels(ft.long$ind)
[1] "Prewt"  "Postwt"
> t.test(values ~ ind, paired=T, alternative="less", data=ft.long)

        Paired t-test

data:  values by ind
t = -4.1849, df = 16, p-value = 0.0003501
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
      -Inf -4.233975
sample estimates:
mean of the differences 
              -7.264706
```

Of course, we should be pilloried in the public square for not doing a proper examination of our data beforehand.

```
> qqnorm(ft$Postwt - ft$Prewt)         # output not shown
> qqline(ft$Postwt - ft$Prewt)
> plot(ft$Prewt, ft$Postwt)
> plot(ft$Prewt, Change <- ft$Postwt-ft$Prewt)   # cannot use = here!
```

We have a problem with normality in a small data set (n=17). We also have a problem with nonresponders (four women whose body weight didn't change or actually fell during therapy). And even among those who did change, the effect was not additive. Women who started off with lower body weights tended to change the most. These are all violations of the assumptions of the test.
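The concerns just listed can also be looked at as numbers rather than plots. The following is a minimal sketch, not part of the original analysis: it retypes the 17 Prewt/Postwt pairs from the "ft" listing above so that it runs on its own, the object names (prewt, postwt, change, n.nonresponders) are my own, and shapiro.test() is simply one common normality check that I am assuming is acceptable here.

```r
# The 17 family therapy pairs, retyped from the ft listing above
prewt  <- c(83.8, 83.3, 86.0, 82.5, 86.7, 79.6, 76.9, 94.2, 73.4,
            80.5, 81.6, 82.1, 77.6, 83.5, 89.9, 86.0, 87.3)
postwt <- c(95.2, 94.3, 91.5, 91.9, 100.3, 76.7, 76.8, 101.6, 94.9,
            75.2, 77.8, 95.5, 90.7, 92.5, 93.8, 91.7, 98.0)

# Difference scores, in the same order of subtraction as the tests above
change <- postwt - prewt
mean(change)                     # should match "mean of the differences"

# Nonresponders: women whose weight did not increase
n.nonresponders <- sum(change <= 0)
n.nonresponders

# One formal check of normality of the difference scores
shapiro.test(change)

# Is the size of the change related to starting weight?
# A negative correlation would agree with the scatterplot impression
# that lighter women tended to change the most.
cor(prewt, change)
```

None of this replaces the plots. A count and a correlation are just a quick way to confirm what the graphs appear to show.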
Matched Samples

The previous example was a repeated measures design--i.e., the same subjects (experimental units) were being measured repeatedly. What if each subject is measured only once, but the subjects are paired up on a matching variable? Then we have a matched samples (or matched subjects or matched groups) design. I want to be sure to include one of these, because I'm going to refer to it in a later tutorial. The important thing to remember for now is that the analysis is the same.

For decades it's been suspected that schizophrenia involves anatomical abnormalities in the hippocampus, an area of the brain involved with memory. The following data bearing on this issue are from Suddath et al. (1990) and were used by (and obtained from) Ramsey and Schafer (3rd ed., 2013, p. 31, Display 2.2). The researchers obtained MRI measurements of the volume of the left hippocampus from 15 pairs of identical twins discordant for schizophrenia. The data are displayed in the following table. (You should be able to copy and paste the following lines to get the data into R.)

```
schizophrenia = read.table(header=T, text="
pair affected unaffected
   1     1.27       1.94
   2     1.63       1.44
   3     1.47       1.56
   4     1.39       1.58
   5     1.93       2.06
   6     1.26       1.66
   7     1.71       1.75
   8     1.67       1.77
   9     1.28       1.78
  10     1.85       1.92
  11     1.02       1.25
  12     1.34       1.93
  13     2.02       2.04
  14     1.59       1.62
  15     1.97       2.08
")
```

Just eyeballing it leaves little doubt that the hippocampus is smaller in the affected cotwin--the "interocular trauma test" as my old stat prof (Tom Wickens) called it, because the result jumps out and hits you between the eyes. Most journal editors don't recognize this statistical technique, however, and would prefer that we cite the result of a t-test. Here it is.
```
> with(schizophrenia, t.test(affected, unaffected, paired=T, alternative="less"))

        Paired t-test

data:  affected and unaffected
t = -3.2289, df = 14, p-value = 0.003031
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
        -Inf -0.09029832
sample estimates:
mean of the differences 
             -0.1986667
```

Ramsey, Fred L. and Schafer, Daniel W. The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed.). Boston: Brooks/Cole Cengage, 2013. (Note: If anyone is interested in my opinion, this is one of the very best statistics books I've ever had the pleasure of encountering.)

Suddath, R. L., et al. (1990). Anatomical abnormalities in the brains of monozygotic twins discordant for schizophrenia. New England Journal of Medicine, 322(12), 789-793.

Alternatives to the Related Samples t Test

The primary alternative to the paired t-test when normality is in question has historically been the Wilcoxon signed ranks test, the syntax of which (from the help page) is very similar.

```
wilcox.test(x, y = NULL,
            alternative = c("two.sided", "less", "greater"),
            mu = 0, paired = FALSE, exact = NULL, correct = TRUE,
            conf.int = FALSE, conf.level = 0.95, ...)
```

It will work either without or with the formula interface and also as a single sample test.
```
> ### single sample case
> with(ft, wilcox.test(Postwt-Prewt, mu=0, alternative="greater"))

        Wilcoxon signed rank test

data:  Postwt - Prewt
V = 142, p-value = 0.0004196
alternative hypothesis: true location is greater than 0

> ### two-sample case without the formula interface
> with(ft, wilcox.test(Postwt, Prewt, alternative="greater", paired=T))

        Wilcoxon signed rank test

data:  Postwt and Prewt
V = 142, p-value = 0.0004196
alternative hypothesis: true location shift is greater than 0

> ### two-sample case with the formula interface
> wilcox.test(values ~ ind, paired=T, alternative="less", data=ft.long)

        Wilcoxon signed rank test

data:  values by ind
V = 11, p-value = 0.0004196
alternative hypothesis: true location shift is less than 0

> ### with a confidence interval
> wilcox.test(values ~ ind, paired=T, alternative="less", data=ft.long, conf.int=T)

        Wilcoxon signed rank test

data:  values by ind
V = 11, p-value = 0.0004196
alternative hypothesis: true location shift is less than 0
95 percent confidence interval:
  -Inf -4.05
sample estimates:
(pseudo)median 
         -7.65
```

In older versions of R, there was no "data=" option in the wilcox.test() function. That appears to have been corrected in newer versions.

revised 2016 January 30