Saturday, April 20, 2019

A reminder that peer review is a very porous filter: Zheng & Zhang (2016)

This particular article is of interest to me because I was a peer reviewer on an earlier draft of the manuscript. I recommended rejection at the time, and was disappointed to see the article ultimately published with minimal changes. My initial point of contention was fairly basic: the manuscript read like a rough draft rather than one that had been finalized and readied for review. I noticed enough grammatical errors to have concerns about its appropriateness for the journal I was reviewing for, or really any journal for that matter. Had this been an undergraduate project, I would likely have marked up the draft and made gentle suggestions for how to improve the paper before the final draft was due. Instead, I had to respond to what was presumably the work of seasoned professionals. Then there were some glaring questions about the research design and some of the data analyses; something seemed just a little off. Anyway, I recommended rejection and moved on with my life.

Then, a few years later, I saw this same paper in a peer-reviewed journal. Look. Things happen. Peer reviewers are often under enormous stress, and even the best editors are going to let some real dross pollute the literature from time to time. We're all only human, after all. So I want to cut everyone on that side of the equation some slack.

The article itself is divided into two studies. I will focus primarily on Study 2, as that is where the most obvious problems can be detected. Study 1 was intended to identify two video games that were as equivalent as possible on a number of dimensions, except of course for the presence or absence of violent content. The reader can be the judge of whether or not the authors succeeded in that endeavor. At bare minimum, the reporting of those findings appears to be clean.

Study 2 is where things get interesting. Remember, the authors had three objectives when testing this sample of youths: 1) determining whether there is a main effect of violent video game content on the accessibility of aggressive cognition (although the authors do not quite word it that way), 2) determining whether there is a gender by video game interaction, and 3) determining whether there is a trait aggressiveness by video game interaction. So far, so good.

It is clear that things start to fall apart in the method section. The authors selected 60 goal words for their reaction time task: 30 aggressive and 30 non-aggressive. These goal words are presented individually in four blocks of trials. The authors claim that their participants completed 120 trials total, when the actual total would appear to be 240: 60 words presented once in each of four blocks. I had fewer trials for adult participants in an experiment I ran a couple of decades ago, and that was a nearly hour-long ordeal for my participants. I can only imagine the heroic level of attention and perseverance required of these children to complete this particular experiment. I do have to wonder if the authors tested for potential fatigue or practice effects that might have been detectable across blocks of trials. Doing so was standard operating procedure in the Aggression Lab at Mizzou back in the 1990s, and we would have reported those findings - at least in a footnote - when submitting for publication.
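Assuming every goal word appears once per block - which is how I read the method section - the trial count works out as follows (a trivial sketch, but it makes the discrepancy plain):

```python
aggressive_words = 30
non_aggressive_words = 30
blocks = 4  # four blocks of trials, per the method section

goal_words = aggressive_words + non_aggressive_words  # 60 goal words
total_trials = goal_words * blocks  # 240, not the 120 the authors claim

print(total_trials)  # → 240
```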

The results section is, to phrase this as nicely as possible, a mess. First, the way the authors go about reporting a main effect of video game violence on aggressive cognition is all wrong. The authors look only at the reaction times on aggressive words. What the authors should have done is compare difference scores between aggressive and neutral words across conditions - in other words, did the treatment condition produce a larger change from baseline than the control condition? The answer appears to be no when we look at the subsequent analyses, in which the DV is a difference score. On that basis alone, we can rule out a main effect of level of video game violence on aggressive cognition. As that sort of finding is a cornerstone of the cognitive models of aggression used to buttress arguments about the dangers of violent video games, the lack of a main effect on mere aggressive cognition should raise eyebrows.
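To make the point concrete, here is a minimal sketch of the difference-score analysis I am describing, using made-up reaction times rather than the authors' data (all names and numbers below are hypothetical, purely for illustration):

```python
def priming_index(pairs):
    """Per-participant difference score: neutral-word RT minus aggressive-word
    RT (in ms). Larger values mean aggressive words were relatively faster,
    i.e., aggressive cognition was more accessible."""
    return [neutral - aggressive for aggressive, neutral in pairs]

def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical (aggressive RT, neutral RT) pairs for each child.
violent_game = [(612, 630), (598, 640), (575, 580), (660, 672)]
nonviolent_game = [(605, 610), (590, 588), (622, 631), (570, 575)]

# The main-effect question is whether the *difference scores* diverge between
# conditions - not whether raw aggressive-word RTs do.
violent_effect = mean(priming_index(violent_game))        # 19.25
nonviolent_effect = mean(priming_index(nonviolent_game))  #  4.25
```

The inferential step would then be an ANOVA (or t test) on those difference scores across conditions, which is exactly the DV the authors use in their later analyses but not in their claimed main effect.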

What happens when we look at potential interaction effects? Do subsamples save the day? Given the reporting errors that I could detect from a simple Statcheck run, that, too, may be questionable, depending on what we are looking at. For example, the authors manage to misreport the actual main effect of video game violence on aggressive cognition: F(1, 54) = 3.58 is reported as p < .05 instead of p = .064 as computed by Statcheck. Oops. So much for that. The game by gender interaction was actually statistically significant, although not quite to the extent the authors reported: F(1, 62) = 4.89 is reported as p < .01 instead of p = .031 as computed by Statcheck. Maybe subsamples will save the day after all. The aggressive cognition of boys seemed to be influenced by the level of violence in the game they played; the same effect was not found for girls. There were no obvious errors in the analyses of the interaction between video game and trait aggressiveness. Subsample analyses seem to show that the effect is found among highly aggressive children but not among those with moderate or low levels of trait aggressiveness.

There is of course a three-way interaction to contend with. The authors claim there was none, but they were wrong: they report F(1, 63) = 5.27 as p > .05, whereas Statcheck computes p = .025. That is a pretty serious decision error. Hence, the requisite follow-up analyses for this three-way interaction were apparently never performed or reported. There are also some questions about why the degrees of freedom vary so much from analysis to analysis. Although I am sure there is a simple explanation for those anomalies, the authors never offer one. We as readers are left in the dark.
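These recomputed p-values are easy to verify for yourself. Below is a rough sketch of the check Statcheck performs for these F tests - a pure-Python upper-tail p for an F statistic, built on the standard continued-fraction expansion of the regularized incomplete beta function. The function names are mine, this is not Statcheck's actual code, and for serious work I would reach for scipy.stats.f.sf instead:

```python
import math

def _betacf(a, b, x, max_iter=200, eps=3e-12, tiny=1e-30):
    # Lentz's continued fraction for the regularized incomplete beta function.
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = tiny if abs(d) < tiny else d
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Odd and even continued-fraction coefficients for step m.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = tiny if abs(d) < tiny else d
            c = 1.0 + aa / c
            c = tiny if abs(c) < tiny else c
            d = 1.0 / d
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betai(a, b, x):
    # Regularized incomplete beta function I_x(a, b).
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(ln_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def f_pvalue(f_stat, df1, df2):
    # Upper-tail probability P(F > f_stat) for an F(df1, df2) distribution.
    return _betai(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f_stat))

# The three tests discussed above, rounded as Statcheck reports them:
print(round(f_pvalue(3.58, 1, 54), 3))  # 0.064 - not < .05
print(round(f_pvalue(4.89, 1, 62), 3))  # 0.031 - significant, but not < .01
print(round(f_pvalue(5.27, 1, 63), 3))  # 0.025 - not > .05: a decision error
```

All three recomputed values match the Statcheck output quoted above, and none match the significance thresholds the authors reported.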

The reported findings are just error-prone enough to call the authors' conclusions into question. My guess is that if the three-way interaction were truly significant, the authors would have a much more nuanced explanation of what the children in their experiment were experiencing. Regrettably, that opportunity is lost. At bare minimum, we as readers can say that the authors do not have the evidence to back up their claim that violent video games prime aggressive cognition. Those of us with even a minimal background in media violence research can certainly question whether this experiment added to our understanding of the processes presumed to link exposure to an aggression-inducing stimulus (in this case, violent video games) and real-life aggression. I have no idea if a correction or retraction will ever be forthcoming, but I can only hope that something is done to clarify the scientific record. In the meantime, I would recommend against citing this particular article at face value. Those conducting meta-analyses, and those looking for cautionary tales of how not to report one's findings, will probably be citing this article for the foreseeable future.

Zheng, J., & Zhang, Q. (2016). Priming effect of computer game violence on children’s aggression levels. Social Behavior and Personality: An International Journal, 44(10), 1747–1759. doi:10.2224/sbp.2016.44.10.1747
