Whenever I have a few seconds of spare time and feel like torturing myself, I go back to reading a paper I have blogged about previously (see here and here). Each reading reveals more errors, and my remarks across the previous blog posts reflect that. Initially I thought Study 1 was probably okay, or at least less problematic than Study 2. However, Study 1 is every bit as problematic as Study 2. I think I was so overwhelmed by the insane number of errors in Study 2 that I had no energy left to devote to Study 1. I do want to circle back to Study 1, but first I want to add one more remark about Study 2.
With regard to Study 2, I focused on the very odd reporting of degrees of freedom (df) for each statistical analysis, given that the experiment had 240 participants. I showed that if we were to take those df at face value (hint: we shouldn't), there were several decision errors. And to top it off, the authors try to write off what appears to be a statistically significant 3-way interaction as non-significant, an interaction that would remain significant even if the appropriate df were reported. The analysis of the so-called main effect of violent video games on reaction time to aggressive versus non-aggressive goal words was also inadequate. As noted before, not only were the df undoubtedly wrong, but the analysis does not actually compare the difference in reaction times between the treatment and control conditions. I would have expected either a 2x2 mixed ANOVA testing that interaction, or for the authors to compute the difference (in milliseconds) between reaction times to aggressive and non-aggressive goal words for each participant in the treatment and control groups and then run the appropriate one-way ANOVA or t-test; a sketch of that difference-score approach appears below. Anderson et al. (1998) took this latter approach and were quite successful.

At least the authors offered means for that main analysis. In subsequent analyses, they quickly dispense with reporting means at all, and nowhere do they report standard deviations. That's the capsule summary of my critique up to this point. Now for the proverbial cherry on top: the one time the authors do report a mean and a standard deviation together is when describing the age of the participants, and even there they manage to make a mess of things.
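First, though, to make that difference-score suggestion concrete: below is a minimal sketch of the analysis I have in mind, written in Python with entirely made-up reaction times (the paper's raw data are not available, so every number here is hypothetical, including the assumption of 120 participants per condition). Each participant contributes one score, mean reaction time to aggressive goal words minus mean reaction time to non-aggressive goal words, and the two groups are compared with an independent-samples t-test, which is equivalent to testing the group-by-word-type interaction in a 2x2 mixed ANOVA.

```python
# Sketch of the difference-score analysis described above.
# All reaction times are simulated; none of these numbers come from the paper.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group = 120  # assumed even split of the 240 participants

# One difference score per participant (ms): RT to aggressive goal words
# minus RT to non-aggressive goal words. Negative = faster to aggressive words.
diff_violent = rng.normal(loc=-25, scale=60, size=n_per_group)  # hypothetical priming effect
diff_control = rng.normal(loc=0, scale=60, size=n_per_group)    # hypothetical null effect

t, p = stats.ttest_ind(diff_violent, diff_control)
print(f"t({2 * n_per_group - 2}) = {t:.2f}, p = {p:.4f}")
print(f"Violent-game group: M = {diff_violent.mean():.1f} ms, SD = {diff_violent.std(ddof=1):.1f}")
print(f"Control group:      M = {diff_control.mean():.1f} ms, SD = {diff_control.std(ddof=1):.1f}")
```

Note that reporting the analysis this way forces the means and standard deviations out into the open, which is exactly what the paper fails to do.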
Recall that the authors had a sample of 240 children ranging in age from 9 to 12 years for Study 2. The mean age of the participants was reported as 11.66 with a standard deviation of 1.23. Since age can be treated as integer data, I used a post-peer-review tool called SPRITE to check whether that mean and standard deviation were even mathematically possible. To do so, I entered the range of possible ages (as provided by the authors), the target mean and standard deviation, and the maximum number of distributions to generate. To my chagrin, I got an error message. Specifically, SPRITE informed me that the target standard deviation I had entered, taken straight from what the authors reported, was too large: the largest mathematically possible standard deviation was 1.17. So even something as elementary as the mean and standard deviation of participants' ages gets messed up. You can try SPRITE for yourself and determine whether what I am finding is correct. My guess is that you will. Below is the result I obtained; I prefer to show my work.
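For anyone who wants to go a step further than pasting numbers into SPRITE, here is a rough brute-force version of the same feasibility check, written in Python. It is a sketch in the spirit of SPRITE rather than the tool itself (so the exact maximum it reports may differ somewhat from SPRITE's, depending on search and rounding conventions): it enumerates every possible way to distribute 240 integer ages across 9, 10, 11, and 12, keeps the distributions whose mean rounds to the reported 11.66, and records the largest sample standard deviation any of them can produce, to be compared against the reported 1.23.

```python
import math

N = 240
REPORTED_MEAN = 11.66   # as reported by the authors
REPORTED_SD = 1.23      # as reported by the authors

max_sd = 0.0
best_counts = None

# n9, n10, n11 = how many participants are aged 9, 10, and 11; the rest are 12.
# Brute force over all count combinations (~2.4 million); runs in a few seconds.
for n9 in range(N + 1):
    for n10 in range(N + 1 - n9):
        for n11 in range(N + 1 - n9 - n10):
            n12 = N - n9 - n10 - n11
            total = 9 * n9 + 10 * n10 + 11 * n11 + 12 * n12
            mean = total / N
            # keep only distributions whose mean rounds to the reported 11.66
            if not (REPORTED_MEAN - 0.005 <= mean <= REPORTED_MEAN + 0.005):
                continue
            # sample standard deviation (n - 1 denominator)
            ss = (n9 * (9 - mean) ** 2 + n10 * (10 - mean) ** 2 +
                  n11 * (11 - mean) ** 2 + n12 * (12 - mean) ** 2)
            sd = math.sqrt(ss / (N - 1))
            if sd > max_sd:
                max_sd, best_counts = sd, (n9, n10, n11, n12)

print(f"Largest achievable SD for a reported mean of {REPORTED_MEAN}: {max_sd:.3f}")
print(f"Counts of ages 9/10/11/12 that achieve it: {best_counts}")
# the reported SD is itself rounded to two decimals, so allow a 0.005 margin
print("Reported SD of", REPORTED_SD, "is",
      "possible" if REPORTED_SD - 0.005 <= max_sd else "mathematically impossible")
```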
So Study 2 is not to be trusted at all. What about Study 1? It's a mess for its own reasons. I'll circle back to that in a future post.