Saturday, November 30, 2019

Revisiting the weapons effect database: The allegiance effect redux

A little over a year ago, I blogged about some of the missed opportunities from the Benjamin et al. (2018) meta-analysis. One of those missed opportunities was to examine something known as an investigator allegiance effect (Luborsky et al., 2006). As I noted at the time, I gave credit where credit was due (Sanjay Srivastava, personal communication) for the idea. I simply found a pattern as I was going back over the old database and Dr. Srivastava quite aptly told me what I was probably seeing. It wasn't too difficult to run some basic meta-analytic results through CMA software and tentatively demonstrate that there appears to be something of an allegiance effect.

So, just to break it all down, let's recall when the weapons effect appears to occur. Based on the old Berkowitz and LePage (1967) experiment, the weapons effect appears to occur when individuals are exposed to a weapon and are highly provoked. Under those circumstances, the short-term exposure to weapons instigates an increase in aggressive behavior. Note that this is presumably what Carlson et al. (1990) found in their early meta-analysis. So far, so good. Now, let's see where things get interesting.

I have been occasionally updating the database. Recently I have added some behavioral research, although it is focused on low provocation conditions. I am aware of some work recently conducted under conditions of high provocation, but have yet to procure those analyses. That said, I have been going over the computations, double and triple checking them once more, cross-validating them and so on. Not glamorous work, but necessary. I can provide basic analyses along with funnel plots. If there is what Luborsky et al. (2006) define as an allegiance effect, the overall mean effect size for work conducted by researchers associated with Berkowitz should be considerably different from that for work conducted by non-affiliated researchers. The fairest test I could think of was to concentrate on studies in which there was a specific measure of provocation, a specific behavioral measure of aggression, and - more to the point - to concentrate on high provocation subsamples, based on the rationale provided by Berkowitz and LePage (1967) and Carlson et al. (1990). I coded these studies based on whether the authors were in some way affiliated with Berkowitz (e.g., former grad students, post-docs, or coauthors) or were independent. That was fairly easy to do. Just took a minimal amount of detective work. I then ran the analyses.

Here is the mixed-effects analysis:



The funnel plot also looks pretty asymmetrical for those in the allegiance group (i.e., labelled "yes"). The funnel plot for those studies in the non-allegiance group appears more symmetrical. Studies in the allegiance group may be showing considerable publication bias, which should be of concern. Null studies, if they exist, are not included.


Above is the funnel plot for studies from the allegiance group.


Above is the funnel plot for studies from the non-allegiance group.
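Funnel plot asymmetry is usually judged by eye, but it can also be put into a number. One common option is Egger's regression: regress each study's standardized effect (effect divided by its standard error) on its precision (one over the standard error) and look at the intercept. An intercept near zero is what a symmetrical funnel would give; an intercept well away from zero suggests small-study effects. The sketch below is only an illustration of that idea. The function name and every number in it are hypothetical, none of it comes from the weapons effect database, and a real test would also need the standard error of the intercept, which is left out here.

```python
def egger_intercept(effects, std_errors):
    """Return (intercept, slope) from Egger's regression.

    Regresses effect/SE on 1/SE with ordinary least squares.
    The intercept is the asymmetry indicator; the slope estimates
    the underlying effect once small-study effects are set aside.
    """
    precision = [1.0 / se for se in std_errors]
    standardized = [d / se for d, se in zip(effects, std_errors)]
    n = len(precision)
    x_bar = sum(precision) / n
    y_bar = sum(standardized) / n
    sxx = sum((x - x_bar) ** 2 for x in precision)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(precision, standardized))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical effect sizes and standard errors, for illustration only.
example_effects = [0.62, 0.48, 0.35, 0.21, 0.18]
example_ses = [0.40, 0.30, 0.22, 0.12, 0.10]
intercept, slope = egger_intercept(example_effects, example_ses)
print(f"Egger intercept: {intercept:.2f}, slope: {slope:.2f}")
```

In the made-up example the smaller studies (larger standard errors) report larger effects, which is the pattern that pushes the intercept away from zero.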

I can slice these analyses any of a number of ways. For example, I could simply examine subsamples that are intended to be direct replications of the Berkowitz and LePage (1967) paper. Or I can simply collapse across all subsamples, which is what I did here. Either way, the mean effect size will trend higher when the authors are interconnected. I can also document, from the funnel plots, that publication bias is a more serious concern for papers in which the authors have some allegiance to Berkowitz than for papers in which they do not. That should be concerning.
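For anyone curious about the logic behind a subgroup comparison of this kind, here is a minimal sketch. It pools each subgroup with a DerSimonian-Laird random-effects model, then compares the two pooled means with a between-groups Q statistic (one degree of freedom for two groups). This is an assumption-laden illustration, not a reproduction of the CMA output: it estimates a separate tau-squared for each subgroup, CMA offers other options, the function names are made up, and the effect sizes and variances are hypothetical rather than taken from the database.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooled mean and its variance for one subgroup."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q_within = sum(w * (y - fixed_mean) ** 2
                   for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    df = len(effects) - 1
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q_within - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    mean = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return mean, 1.0 / sum_re


def compare_two_subgroups(group_a, group_b):
    """Return (mean_a, mean_b, Q_between, p) for two subgroups.

    Each group is an (effects, variances) pair. The p-value uses the
    chi-square distribution with 1 df, so this only covers two groups.
    """
    mean_a, var_a = pool_random_effects(*group_a)
    mean_b, var_b = pool_random_effects(*group_b)
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    grand = (w_a * mean_a + w_b * mean_b) / (w_a + w_b)
    q_between = w_a * (mean_a - grand) ** 2 + w_b * (mean_b - grand) ** 2
    p_value = math.erfc(math.sqrt(q_between / 2.0))
    return mean_a, mean_b, q_between, p_value


# Hypothetical data: (effect sizes, sampling variances) per subgroup.
allegiance = ([0.70, 0.55, 0.40, 0.62], [0.09, 0.06, 0.05, 0.08])
independent = ([0.05, 0.20, -0.10, 0.12], [0.04, 0.05, 0.06, 0.03])
m_a, m_b, q, p = compare_two_subgroups(allegiance, independent)
print(f"allegiance mean={m_a:.2f}, independent mean={m_b:.2f}, "
      f"Q_between={q:.2f}, p={p:.3f}")
```

A large Q_between relative to a chi-square distribution with one degree of freedom is what "the two groups of studies differ" looks like in this framework.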

I want to play with these data further as time permits. I am hoping to persuade a peer to share some unpublished data with me so that I can update the database. My guess is that the findings will be even more damning. I say so relying only on the basic analyses that CMA provides along with the funnel plots.

For better or for worse I am arguably one of the primary weapons effect experts - to the extent that we define the weapons effect as the influence of short-term exposure to weapons on aggressive behavioral outcomes as measured in lab or field experiments. That expertise is documented in some published empirical work - notably Anderson et al. (1998) in which I was responsible for Experiment 2, and Bartholow et al. (2005) in which I was also primarily responsible for Experiment 2, as well as the meta-analysis on which I was the primary author (Benjamin et al., 2018). I do know this area of research very well, am quite capable of looking at the data available, and changing my mind if the analyses dictate - in other words, I am as close to objective as one can get when examining this particular topic. I am also a reluctant expert given that the data dictate the necessity of adopting a considerably more skeptical stance after years of believing the phenomenon to be unquestionably real. I do have an obligation to report the truth as it appears in the literature.

As it stands, we should be concerned not only that the aggressive behavioral outcomes reported in Berkowitz and LePage (1967) represent something of an urban myth, but also that in general the mythology appears to be largely due to mostly published reports by a particular group of highly affiliated authors. There appears to be an allegiance effect in this literature. Whether a similar effect exists in the broader body of media violence studies remains to be seen, but I would not be surprised if such an allegiance effect existed.

Friday, November 8, 2019

The New Academia

Rebecca Willen has an interesting post up on Medium about some alternatives to the traditional academic model of conducting research. I doubt traditional academia is going anywhere, but it is quite clear that independent institutions can fill in some gaps and provide some freedoms that might not be afforded elsewhere. In the meantime, this article offers an overview of the potential challenges independent researchers might face and how those challenges can be successfully handled. Worth a read.

Monday, November 4, 2019

Another resource for sleuths

This tweet by Elizabeth Bik is very useful:

The site she used to detect a publication that was self-plagiarized not only in terms of data and analyses but also in terms of text can be found here: Similarity Texter. I will be adding that site to this blog's links. I think as a peer reviewer it will help in detecting potential problem documents. Obviously I see the utility for post-peer review. Finally, any of us as authors who publish multiple articles and chapters on the same topic would do well to run our manuscripts through this particular website prior to submission to any publishing portal. Let's be real and accept that the major publishing houses are very lax when it comes to screening for potential duplicate publication, in spite of the enormous profits that they make from taxpayers across the planet. We should also be real about the quality of peer review. As someone who has been horrified to receive feedback on a manuscript from a supposedly reputable journal in less than 48 hours, I think a good case can be made as an author for taking things into your own hands as much as possible. That along with statcheck can save some embarrassment as well as ensure that we as researchers and authors do due diligence to serve the public good.

Friday, November 1, 2019

To summarize, for the moment, my series on the Zhang lab's strange media violence research

It never hurts to keep something of a cumulative record of one's activities when investigating any phenomenon, including secondary analyses.

In the case of the work produced in the lab of Qian Zhang, I have been trying to understand their work, and what appears to have gone wrong with their reporting, for some time. Unbeknownst to me at the time in 2014, I was already encountering one of the lab's papers when by the luck of the draw I was asked to review a manuscript that I would later find was coauthored by Zhang. As I have previously noted, that paper had a lot of problems and I recommended as constructively as I could that the paper not be published. It was published anyway.

More explicitly, I found a weapons priming article published in Personality and Individual Differences at the start of 2016. It was an empirical study and one that fit the inclusion criteria for a meta-analysis that I was working on at the time. However, I ran into some really odd statistical reporting, leaving me unsure as to what I should use to estimate an effect size. So I sent what I thought was a very polite email to the corresponding author and heard nothing. After a lot of head-scratching, I figured out a way to extract effect size estimates that I felt semi-comfortable with. In essence the authors had no main effect for weapon primes on aggressive thoughts - and it showed in the effect size estimate and confidence intervals. That study really had a minimal impact on the overall mean effect size for weapon primes on aggressive cognitive outcomes in my meta-analysis. I ran analyses and later re-ran analyses and went on with my life.

I probably saw a tweet by Joe Hilgard who was reporting some oddities in another Zhang et al. paper sometime in the spring of 2018. That got me wondering what all I was missing. I made a few notes, bookmarked what I needed to bookmark, and came back to that question a bit later in the summer of 2018 when I had a bit of time and breathing room. By this point I could comb through the usual archives, EBSCO databases, ResearchGate, and Google Scholar, and was able to home in on a fairly small set of English-language empirical articles coauthored by Qian Zhang of Southwest University. I saved all the PDF files, and did something that I am unsure if anyone had done already: I ran the articles through statcheck. With one exception at the time, all the papers I ran through statcheck that had the necessary elements reported (test stat value, p-value, degrees of freedom) showed serious decision errors. In other words, the conclusions the authors were drawing in these articles were patently false based on what they had reported. I was also able to document that the reported degrees of freedom were inconsistent within articles, and often much smaller than the reported sample sizes. There were some very strange tables in many of these articles that presumably reported means and standard deviations but looked more like poorly constructed ANOVA summary tables.
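For readers who have not used statcheck, the core idea is easy to illustrate. Take a reported test statistic and its degrees of freedom, recompute the p-value, and see whether the recomputed value lands on the same side of .05 as the reported one. If it does not, that is a decision error. The sketch below does this for a two-tailed t test using only Python's standard library. To be clear about what is assumed: this is not statcheck itself (which is an R package, extracts results from text automatically, covers more tests, and makes allowances for rounding), the function names are made up, and the example numbers are hypothetical rather than taken from any of the articles discussed here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2.0 * math.log1p(x * x / df))


def two_tailed_p(t_value, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the t density.

    Integrates the density from 0 to |t| and subtracts twice that
    area from 1. `steps` must be even.
    """
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3.0
    return max(0.0, 1.0 - 2.0 * area_zero_to_t)


def check_reported_t(t_value, df, reported_p, alpha=0.05):
    """Return (recomputed_p, decision_error).

    A decision error means the reported and recomputed p-values fall
    on opposite sides of alpha. No allowance is made for rounding.
    """
    recomputed = two_tailed_p(t_value, df)
    decision_error = (reported_p < alpha) != (recomputed < alpha)
    return recomputed, decision_error


# Hypothetical reported result: t(28) = 1.20, p = .01
p, error = check_reported_t(1.20, 28, 0.01)
print(f"recomputed p = {p:.3f}, decision error: {error}")
```

A check like this only works when the article reports all three pieces: the test statistic, the degrees of freedom, and the p-value.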

I first began tweeting about what I was finding in mid-to-late September 2018. I think between some conversations via Twitter and email, I at least was convinced that I had spotted something odd, and that my conclusions so far as they went were accurate. Joe Hilgard was especially helpful in confirming what I had found, and then going well beyond that. Someone else homed in on inaccuracies in the number of reaction time trials reported in this body of articles. So that went on throughout the fall of 2018. By this juncture, there were a few folks tweeting and retweeting about this lab's troubling body of work, some of these issues were documented by individuals in PubPeer, and editors were being contacted, with varying degrees of success.

By spring of this year, the first corrections were published - one in Youth and Society and a corrigendum in Personality and Individual Differences. To what extent those corrections can be trusted is still an open question. At that point, I began blogging my findings and concerns here, in addition to the occasional tweet.

This summer, a new batch of errata was made public concerning articles published in journals hosted by a publisher called Scientific Research. Needless to say, once I became aware of these errata, I downloaded and examined them. That has consumed a lot of space on this blog since. As you are now well aware, these errata themselves require errata.

I think I have been clear about my motivation throughout. Something looked wrong. I used some tools now at my disposal to test my hunch and found that my hunch appeared to be correct. I then communicated with others who are stakeholders in aggression research, as we depend on the accuracy of the work of our fellow researchers in order to get to as close an approximation of the truth as is humanly possible. At the end of the day, that is the bottom line - to be able to trust that the results in front of me are a close approximation of the truth. If they are not, then something has to be done. If authors won't cooperate, maybe editors will. If editors don't cooperate, then there is always a bit of public agitation to try to shake things up. In a sense, maybe my role in this unfolding series of events is to have started a conversation by documenting what I could about some articles that appeared to be problematic. If the published record is made more accurate - however that must occur - I will be satisfied with the small part I was able to play in the process. Data sleuthing, and the follow-up work required in the process, is time-consuming and really cannot be done alone.

One other thing to note - I have only searched for English-language articles published by Qian Zhang's lab. I do not read or speak Mandarin, so I may well be missing out on a number of potentially problematic articles in Chinese-language psychological journals. If someone who does know of such articles wishes to contact me, please do. I leave my DMs open on Twitter for a reason. I would especially be curious to know if there are any duplicate publications of data that we are not detecting.

As noted before, how all this landed on my radar was really just the luck of the draw. A simple peer review roughly five years ago, and a weird weapons priming article that I read almost four years ago were what set these events in motion. Maybe I would have noticed something was off regardless. After all, this lab's work is in my particular wheelhouse. Maybe I would not have. Hard to say. All water under the bridge now. What is left is what I suspect will be a collective effort to get these articles properly corrected or retracted.