Psyc 480 -- Dr. King

Some Tips For Reading the Ashby Article

Note where the lead author, Jean Ashby, is from. When they refer to a "large Mid-Atlantic Community College," gee, I wonder which one it could be!

They talk a lot about an "attrition adjusted sample." In other words, some subjects dropped out of the study, either by dropping the course or by just not showing up after a certain point. Why might this be a problem?

Answer: This is a problem we will have to face again. When subject attrition is not random with respect to treatment groups, it creates a potential confound. These authors were aware of that and did their best to adjust for any confound that might have been created by sample attrition. We're not going to look at the attrition adjusted sample. So make what you can out of that part of the article, but eventually it gets a little confusing to figure out exactly what they were doing. (It was confusing even to me!)

What was the IV in this study?

Answer: Method of course delivery.

What were the levels of the IV?

Answer: Face-to-face, blended (what we call hybrid here at CCU), and online.

What was the DV?

Answer: There were many. Here's a tip for when you're doing your dissertation (Ph.D. research): collect data on EVERYTHING! Nobody cares if you get a significant result in your 497 project, but if you don't in your dissertation research, you start again. The more DVs you have, the better your chances of finding a significant effect somewhere. In the Ashby article, we will be looking at scores only on the final standardized exam (the IACE).

Was this a true experiment?

Answer: Nope! They even said so in the correct terminology. Subjects self-selected their method of course delivery.

What was the justification for this study?

Answer: Online course delivery is becoming increasingly popular, and while there are several studies looking at student success in this format at 4-year colleges, there was not much at community colleges, where the student population tends to be different in many respects. Also, they were looking at success in a particularly important area, remedial math.

"While the number of students needing developmental coursework continues to grow, research on this population and their success rate is very limited (Barnett, 2008; Esch, 2009). Moreover, community colleges continue to create online courses and enroll students in these courses who may or may not be technically and educationally experienced enough to succeed. Growing community college enrollment, specifically in online and developmental courses, invites the need for research with this population; sadly, very little research focuses on online students in community colleges." (p. 129)

How did the researchers attempt to control for differences that might arise in the courses being offered by different instructors and different delivery methods?

Answer: Many ways. It's discussed under "Research Setting" on p. 130. They didn't want the courses to differ in two ways (a confound), so they took measures to be as sure as possible that the courses were the same except for the method of course delivery.

"There was a significant difference in age by learning environment, F(2,164) = 8.19, p < 0.001, with the online students (M = 28.75, SD = 8.19) being the oldest group." (pp. 131-132)

What kind of problem might this create?

Answer: Once again, it creates a possible confound, but this is the kind of problem you often face when subjects self-select their treatment condition (i.e., are not randomly assigned to groups).
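If you're curious how a check like that is done, here is a minimal sketch in R. It assumes the Ashby data frame we load below, plus a hypothetical age column, which is NOT in our simulated data file, so this is illustration only:

> # hypothetical: 'age' is not in our simulated Ashby.txt
> summary(aov(age ~ class.type, data=Ashby))   # one-way ANOVA of age by delivery method

Something like this could reproduce the authors' F(2,164) = 8.19 for age, if we had their raw data.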
"There was also a significant difference in gender, chi.square(2, N = 167) = 8.04, p = 0.018, with the online class having the largest percentage of females (71%), over both the face-to-face (47%) and blended(54%) environments." (p.132) Same problem? Answer: Yup! We are going to try to duplicate the results in Table 2 (p.134) and one of the results in Table 3 (p.135). You can get simulated data for the IACE scores from the website as follows: > file = "http://ww2.coastal.edu/kingw/psyc480/data/Ashby.txt" > Ashby = read.table(file=file, header=T, stringsAsFactors=T) --OR-- > getData("Ashby.txt") # if you still have getData in your workspace. Notice in Table 2 they give both number correct and percentage correct on the IACE, along with the corresponding SDs. Does it matter which one we analyze? Answer: No. The authors analyzed the percentages. We have number correct in our data frame. It is a simple transformation to get from one to the other. There were 48 items on the IACE, so multiply number correct by 100/48 to get the percentages. (Or 2.0833. Divide by 48 and multiply by 100.) Do the standard deviations transform the same way? I.e., does multiplying the SD of number correct by 2.0833 give the SD of percent correct? Answer: Try it and see. (Yes.) It doesn't matter which one we do the analysis on. We'll get the same result. Using tapply(), get the means and SDs of number correct (IACE.score) by group and compare them to the statistics reported in Table 2. Are they the same? Answer: Very nearly. Remember, these are simulated data, not the authors' actual data, so they won't be exact. Refer to Table 3, next to last line, the results of the analysis on IACE, the standardized final exam. They found F(2,164) = 3.13, p = 0.046. Did you get the same result when you did the ANOVA? Answer: I hope your answer was very close to that. I got F(2,164) = 3.097, p = 0.0478. Should we reject the null hypothesis of equal means for these populations? Answer: Like a moldy canteloupe! (Yes.) Just barely, but yes. They did a Tukey HSD test as the post hoc test and found the face.to.face group had a significantly higher mean score than the blended group. Is that what you got when you did the post hoc test? Answer: I hope so. Were there any other significant differences in the pairwise comparisons? Answer: Nope. Is this a clear-cut result that allows clear-cut conclusions about the relative value of these methods of course delivery? Answer: No. Get the Fisher LSD p-values as follows: > with(Ashby, pairwise.t.test(IACE.score, class.type, "none")) Are the conclusions the same (same groups different)? Answer: Yes. How many significant differences would you have found with the Bonferroni-Dunn test? Answer: > with(Ashby, pairwise.t.test(IACE.score, class.type, "bonf")) # none What about the Holm-Bonferroni test? Answer: > with(Ashby, pairwise.t.test(IACE.score, class.type, "holm")) # none How can you find a significant effect with the overall ANOVA but not find any sigbificant differences in the pairwise comparisons? Answer: When you're using a conservative post hoc test, especially when the p-value in the ANOVA is close to .05, that often happens. Bonferroni-Dunn is about as conservative as it gets, and Holm-Bonferroni makes the same adjustment to the lowest p-value as Bonferroni-Dunn does. Would it be fair to do all of these tests and then pick the one you like best? Answer: NOOOOOOOOO! That's called fishing for statistics. (Seriously, that's what it's called.) 
I don't really care about the rest of the article, which deals with the attrition adjusted sample. Read it or don't read it; that's up to you. You might find the Conclusion (p. 138) interesting, though.