Sunday, April 7, 2019

Do weapons prime aggressive thoughts? Not always!

This is probably as good a time as any to mention that Zhang et al. (2016) is not the only experiment showing no apparent priming effect of weapons on aggressive thoughts. Arguably, anyone who follows this particular literature closely - and admittedly that is only a small handful of personality and social psychology researchers - is well aware of the two experiments in William Deuser's 1994 dissertation. To my knowledge, both experiments were well designed and executed. Mendoza's (1972) dissertation also deserves mention. In that experiment, children were exposed either to toy weapons or to neutral toys, and the cognitive dependent variable was the content of participants' responses to a projective test. The findings were non-significant for both boys and girls across multiple sessions.

Those are the null findings that are at least in the public record. When it comes to unpublished results that found no link between weapon primes and aggressive thoughts (however manipulated or measured), there is much that is unknown. What little I do know is mostly a matter of personal communication and hearsay. Unfortunately, those do not exactly lend themselves to effect sizes that can be computed and integrated. For example, I am aware of one effort at the end of the 1990s to run an experiment similar to the sort that Bartholow and I had run, except that the weapon versus neutral objects were primed subliminally. That experiment was a non-replication. Given what we know now about the subliminal priming literature (which is littered with non-replications), this is not surprising. How I would love to get hold of those data and protocols, to the extent they were ever written up. I am also aware of another weapons priming experiment from this decade that was designed to have a bit more ecological validity. As I understand it, the undergraduate student attached to that particular project bailed part-way through data collection, and the project was dropped. There is no way of knowing whether that study would have replicated the effect. There are quite likely other non-replications and half-completed projects stored on hard drives somewhere that no one knows about. From the perspective of a meta-analyst, this is especially frustrating, as I am left hoping that publication bias assessments adequately reflect reality.
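For readers who have never run one, here is a minimal sketch of what one common publication bias assessment looks like - Egger's regression test for funnel-plot asymmetry. The effect sizes, standard errors, and the use of statsmodels are purely illustrative assumptions on my part, not numbers drawn from any actual meta-analysis.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical study-level effect sizes (standardized mean differences)
# and their standard errors -- illustrative values only.
g = np.array([0.41, 0.25, 0.60, 0.05, 0.33, 0.52, -0.02, 0.18])
se = np.array([0.21, 0.15, 0.28, 0.09, 0.19, 0.25, 0.08, 0.12])

# Egger's test: regress the standardized effect (g / se) on precision
# (1 / se). An intercept reliably different from zero suggests
# funnel-plot asymmetry, one common symptom of publication bias.
z = g / se
precision = 1.0 / se
fit = sm.OLS(z, sm.add_constant(precision)).fit()

print(fit.params)   # [intercept, slope]
print(fit.pvalues)  # the intercept's p-value is the Egger test
```

Unpublished null results like the ones described above are exactly the studies such a test is trying to infer the existence of, which is why it is cold comfort when they sit on hard drives instead of in the record.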

The cornerstone of the weapons effect narrative is that weapons reliably prime aggressive thoughts, which presumably sets off a chain of psychological events culminating in an increase in aggressive behavior. What would happen if we had to remove that cornerstone? My guess is that what remains of an already shaky narrative would crumble. As the literature currently stands, the question of whether weapon primes influence aggressive behavioral outcomes is inconclusive at best. That may or may not change in the near future.

So, what can we do? I am at least going to try to move the conversation a little. One way you can help: if you have collected data with a weapon prime (weapon versus neutral images or words) and some cognitive outcome variable (reaction times on a pronunciation task, lexical decision task, Stroop task, etc., or scores from a word completion task), talk to me. Better yet, make your protocols, data, and analyses available. If it turns out that aggressive cognitive outcomes are reliably predicted by weapon primes, that's fantastic. But if they are not, I think the scientific community and the public have a right to know. Seems fair enough, right? So if you have something that needs to be brought to light, talk to me. I'm easy to find on Twitter (I answer DMs regularly) or email (that's public record), and I have links to all my social media. Contact me at any time. I will get back to you.

Closing the books on a correction (Zhang et al., 2016)

When I was updating the weapons effect database for a then-in-progress meta-analysis a little over three years ago, I ran across a paper by Zhang, Tian, Cao, Zhang, and Rodkin (2016). You can read the original here; it required significant corrections, and the corrigendum can be found here.

Initially, I was excited, as it is not often that one finds a published weapons effect paper based on non-American or non-European samples. But there were obvious problems from the start. First, although the authors purported to measure aggression in adolescents, the sample actually consisted of pre-adolescent children, and the dependent variable was a difference in reaction time between aggressive and non-aggressive words. To put it another way, the authors were merely measuring the accessibility of aggressive thoughts that presumably would be primed by mere exposure to weapons.
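To make that distinction concrete, here is a minimal sketch of how an accessibility index of that sort is typically computed from raw reaction times. The data, column names, and the use of pandas are entirely hypothetical on my part; this is not the authors' actual procedure or data.

```python
import pandas as pd

# Hypothetical per-trial data: participant, prime type, target word type,
# and reaction time in milliseconds.
trials = pd.DataFrame({
    "participant": [1, 1, 1, 1, 2, 2, 2, 2],
    "prime": ["weapon", "weapon", "neutral", "neutral"] * 2,
    "word": ["aggressive", "nonaggressive"] * 4,
    "rt_ms": [512, 548, 555, 550, 498, 530, 540, 536],
})

# Mean RT per participant, prime condition, and word type
means = (trials.groupby(["participant", "prime", "word"])["rt_ms"]
               .mean()
               .unstack("word"))

# Accessibility index: RT to aggressive words minus RT to non-aggressive
# words. A more negative value means faster responses to aggressive words,
# i.e., greater accessibility of aggressive thoughts.
means["accessibility"] = means["aggressive"] - means["nonaggressive"]
print(means)
```

If weapon primes really do increase the accessibility of aggressive thoughts, that index should be reliably more negative in the weapon condition than in the neutral condition - a cognitive outcome, not an aggressive behavior.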

The analyses themselves never quite added up, which made determining an accurate effect size estimate from their work, shall we say, a wee bit challenging. I contacted the corresponding author to ask for the data and any code or syntax used, in the hope of reproducing the analyses and obtaining the effect size estimate that would most closely approximate the truth. That email was sent on January 26, 2016. I never heard from Qian Zhang. I figured out a work-around to obtain a satisfactory-enough effect size estimate and moved on.

But that paper always bothered me once the initial excitement wore off. I am well aware that I am far from alone in having serious questions about the Zhang et al. (2016) article. Some of those could be written off as potential typos: there were some weird discrepancies in degrees of freedom across the analyses. The authors contended that they had replicated work I was involved in conducting (Anderson, Benjamin, & Bartholow, 1998) simply by examining whether reaction times to aggressive words were faster when primed with weapon images rather than neutral images. In our experiments, we used the difference in reaction times between aggressive and non-aggressive words as our dependent variable. And based on the degrees of freedom reported, it appeared that the analysis was based on one subsample rather than the complete sample. So obviously there were some red flags.

The various subsample analyses using a proper difference score (they call it the AAS) also looked a bit off. And of course the MANOVA table seemed unusual, especially since the unit of analysis appeared to be their difference score (reaction times for aggressive words minus non-aggressive words) - a single dependent variable - as opposed to multiple dependent variables. Although I have rarely used MANOVA and am unlikely to use it in my own research, I certainly had enough training to know what such analyses should look like: one would report MS, df, and F values for each IV-DV relationship, with at least two DVs for every IV. A cursory glance at the most recent edition I had of the classic textbook on multivariate statistics by Tabachnick and Fidell (2012) convinced me that the summary table reported in the article was inappropriate, and would confuse readers rather than enlighten them. There were also questions about the extent to which the authors more or less copied and pasted content from the Buss and Perry (1992) article presenting the Aggression Questionnaire. Those have not yet been adequately addressed, and I suspect they never will be.

So, I ran the analyses the authors reported through statcheck.io, and I came away with even more questions. There were numerous errors, including decision errors, even if one assumes that the test statistics and their respective degrees of freedom were accurate. To give a flavor of my initial statcheck results: the authors report F(1, 155) = 1.75, p < .05 (actual p = .188); F(1, 288) = 3.76, p < .01 (actual p = .054); and F(1, 244) = 1.67, p < .05 (actual p = .197). They also appear to report as non-significant a three-way interaction that was clearly statistically significant; statcheck could not flag that one because the authors omitted the degrees of freedom entirely. At that point there was no good reason to trust the analyses. Keep in mind that what I did here is something anyone with a basic graduate-level grounding in data analysis and access to statcheck could reproduce. That said, communicating with others about my findings was comforting: I was not alone in seeing that something was clearly wrong.
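For anyone who wants to check those numbers themselves without statcheck, here is a minimal sketch that recomputes the p-values directly from the F statistics and degrees of freedom as reported in the article. The use of Python and scipy is my own choice; statcheck itself is an R package and web tool.

```python
from scipy import stats

# F tests as reported in the original article: (df1, df2, F, reported claim)
reported = [
    (1, 155, 1.75, "p < .05"),
    (1, 288, 3.76, "p < .01"),
    (1, 244, 1.67, "p < .05"),
]

for df1, df2, F, claim in reported:
    p = stats.f.sf(F, df1, df2)  # upper-tail probability of the F statistic
    print(f"F({df1}, {df2}) = {F}: reported {claim}, recomputed p = {p:.3f}")
```

Running that reproduces the discrepancies noted above: p = .188, .054, and .197, none of which support the significance claims attached to them.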

In consultation with some of my peers, I noticed something else: the authors reported an incorrect number of trials. The authors described 36 primes and 50 goal words, each randomly paired, yet reported the total number of trials as 900. If you do the math, the actual number of trials should have been 1,800. As someone who was once involved in conducting reaction time experiments, I know the importance not only of working out the necessary number of trials given the stimuli and target words to be randomly paired, but also of accurately reporting the number of trials required of participants. It is possible that, given the description in the article itself, the authors took the number 18 (for weapons, for example) and multiplied it by 50. In itself, that seems like a probable and honest error. It happens, although it would have been helpful for this sort of thing to have been caught during peer review.
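Here is the arithmetic spelled out, assuming (as the article's description implies) that the 36 primes split evenly into 18 weapon and 18 neutral images; the variable names are mine.

```python
# Stimulus counts as described in the article
weapon_primes = 18    # assumed even split of the 36 primes
neutral_primes = 18
goal_words = 50

total_primes = weapon_primes + neutral_primes    # 36
total_trials = total_primes * goal_words         # 36 * 50 = 1800, not 900

# One plausible source of the reported 900: a single prime category
# (18 images) crossed with the 50 goal words.
per_category_trials = weapon_primes * goal_words  # 18 * 50 = 900

print(total_trials, per_category_trials)
```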

The corrigendum amounts to a rather massive correction to the article. The presumed MANOVA table never quite gets resolved to satisfaction, and a lingering decision error remains. The authors also start using the term "marginally significant" to refer to a subsample analysis, which made me cringe. The concept of marginal significance was supposed to have been swept into the dustbin of history a long time ago; we are far enough into the 21st century to avoid that vain attempt to rescue a finding altogether. Whether the corrections noted in the corrigendum are sufficient to save the conclusions the authors wished to draw is questionable. At minimum, we can conclude that Zhang et al. (2016) did not find evidence of weapon pictures priming aggressive thoughts, and that even their effort to base a partial replication on subsample analyses fell short. It is a non-replication, plain and simple.

My recommendation is not to cite Zhang et al. (2016) unless absolutely necessary. If one is conducting a relevant meta-analysis, citation is probably unavoidable. Otherwise, the article is probably worth citing if one is writing about questionable reporting of research, or perhaps as an example of research that fails to replicate a weapons priming effect.

Please note that the intention is not to attack this set of researchers. My concern is strictly with the research report itself and the apparent inaccuracies it contained. I am quite pleased that, however it transpired, the editor and authors were able to make corrections quickly in this instance. Mistakes get made. The point is to make the effort to fix them when they are noticed. That should simply be normal science. So kudos to those involved in doing the right thing here.

References

Anderson, C. A., Benjamin, A. J., Jr., & Bartholow, B. D. (1998). Does the gun pull the trigger? Automatic priming effects of weapon pictures and weapon names. Psychological Science, 9, 308-314. doi:10.1111/1467-9280.00061

Buss, A. H., & Perry, M. (1992). The Aggression Questionnaire. Journal of Personality and Social Psychology, 63, 452-459. doi:10.1037/0022-3514.63.3.452

Tabachnick, B. G., & Fidell, L. S. (2012). Using multivariate statistics. New York: Pearson.

Zhang, Q., Tian, J., Cao, J., Zhang, D., & Rodkin, P. (2016). Exposure to weapon pictures and subsequent aggression in adolescence. Personality and Individual Differences, 90, 113-118. doi:10.1016/j.paid.2015.09.017

Thursday, April 4, 2019

What attracted me to those who are trying to reform the psychological sciences?

That is a question I ask myself quite a bit. Actually, the answer is fairly mundane. As the saying goes: "I've seen stuff. I don't recommend it."

Part of the lived reality of working on a meta-analysis (or the three I have worked on) is that you end up going down many of your research area's dark alleys. You see things that frankly cannot be unseen. What is even more jarring is how recently some of those dark alleys were constructed. I've seen it all: overgeneralization from small samples, weak operational definitions of the variables under consideration, poorly validated measures, the simple fact that many studies are inadequately powered, and so on.

If you ever wonder why phenomena do not replicate, just wander down a few of our own dark alleys and you will understand rather quickly. The meta-analysis on the weapons effect, for which I was the lead author, was a huge turning point for me. Between the allegiance effects, the underpowered research, and some questions about how any of the measures employed were actually validated, I ended up with questions that had no satisfactory answer. I have been able to show that the decline effect that others had found when examining the Type A Behavior Pattern and health outcomes also applied to aggressive behavioral outcomes. I was not surprised - only disappointed in the quality of the research conducted. That much of that research was conducted at a time when there were already serious questions about the validity of what we refer to as Type A personality is itself rather disappointing. And yet that work persisted for a while, often with small samples. In my last two meta-analyses I have also documented duplicate publications - yes, the same data sets managing to appear in at least two different journals. I have questions. Regrettably, those who could answer them are long since retired, if not deceased. Conduct a meta-analysis, and expect to find ethical breaches ranging from potential questionable research practices to outright fraud.

That's a long way of saying that I get the need to do whatever can be done to make what we do in the psychological sciences better: validated instruments, registered protocols and analysis plans, proper power analyses, and so on. There are many who are counting on us to get it as close to right as is humanly possible. Those include not only students, but the citizens who fund our work. There is no point in "giving away the science of psychology in the public interest" (as George Miller would have put it) if we are not doing due diligence at the planning phase of our work.

Asking early career research professionals to shoulder the burden is unfair. Those who are in more privileged positions need to step up. We need to be willing to speak truth to power; otherwise there is no point in continuing, as all we would have is pretense with little substance. I wish I could say doing so would make one more marketable and so on. The reality is far more stark: at minimum, we need to go to work knowing we have a clean conscience, and doing so will help maintain public trust in our work. Failure is not something I even want to contemplate.

So I am a reformer. However long I am around in an academic environment, that is my primary role. Wherever I can support those who do the heavy lifting, I must do so. I have undergraduate students and members of the public in my community counting on it. In reality, we all do.

About those "worthless" Humanities degrees?

Well, they are not so "worthless" after all.

A clip:

Take a look at the skills employers say they’re after. LinkedIn’s research on the most sought-after job skills by employers for 2019 found that the three most-wanted “soft skills” were creativity, persuasion and collaboration, while one of the five top “hard skills” was people management. A full 56% of UK employers surveyed said their staff lacked essential teamwork skills and 46% thought it was a problem that their employees struggled with handling feelings, whether theirs or others’. It’s not just UK employers: one 2017 study found that the fastest-growing jobs in the US in the last 30 years have almost all specifically required a high level of social skills.

Or take it directly from two top executives at tech giant Microsoft, who wrote recently: "As computers behave more like humans, the social sciences and humanities will become even more important. Languages, art, history, economics, ethics, philosophy, psychology and human development courses can teach critical, philosophical and ethics-based skills that will be instrumental in the development and management of AI solutions."

Worth noting: during our December graduation ceremony, fully half the names read aloud, and a bit over half the graduates listed in the program, were individuals who pursued degrees in the Humanities (we'll include the Social Sciences under that umbrella, given how my university is organized). These degrees carry soft skills that can make one flexible for any of a number of opportunities after graduation. In our culture's obsession with workforce development, the Humanities (broadly defined) are often overlooked in conversations among policymakers. That's not to denigrate the necessity of preparing students for life after graduation, but to bear in mind that the lessons and skills acquired in the various Humanities majors can prove valuable in fields that, on the surface, have nothing directly to do with the degree earned. And yet, those graduates may well be the people you want selling your next house or helping you the next time you need orthopedic surgery.

Wednesday, April 3, 2019

What Would Buffy Do?

I posted a somewhat whimsical tweet, and followed it up with the following statement:

There are times when I think back to Buffy's conflicts with the Watchers' Council, and I notice how relevant that set of conflicts is to how reformers in the psychological sciences deal with an established hierarchy and set of rules that remain in place.

I am a fan of the TV series (and the comic books that followed the series) Buffy the Vampire Slayer. I sometimes like to remark that the first three seasons were part of what got me through grad school. Perhaps that overstates things slightly, but it was a series that was in the right place at the right time.

A number of facets of that series fascinated me then, and continue to fascinate me now. One is the ongoing conflict that Buffy Summers had initially with Rupert Giles (her Watcher) and, by extension, the Watchers' Council.

Buffy was never particularly keen on the mythology perpetuated by the Watchers' Council. You know that whole bit about how "unto each generation a Slayer is born", right? That never sat well with her, and to a certain degree the "One Slayer Theory" was effectively debunked the moment Buffy briefly (as in for a few seconds) died, before being revived. After that, there were effectively two slayers - the more notorious of those being Faith. Another story for another time.

She also was not too keen on the rituals and rites, nor on the secretiveness, that came with being a Slayer. Buffy let several non-slayers into her circle of friends and allies over the course of the series, initially to the chagrin of Giles. Maybe a bit more openness would help with slayage, I can imagine Buffy reasoning. Buffy also proved adept at uncovering some of the ethically questionable practices of the Watchers' Council, including its use of torture and kidnapping (as experienced by Faith), and in doing so eventually severed her ties with the organization. Indeed, the Council gets exposed over the course of the series as moribund and out of touch with changing realities that require action. Ultimately, Buffy set out on a course that empowered potential slayers everywhere to become involved in the work she had initially been told she alone must do. A certain amount of openness and cooperation, and a willingness to keep an open mind toward those who might not seem like allies on the surface, proved beneficial by the end of the television series.

I doubt I am the first to connect the ethos of Buffy the Vampire Slayer to the struggles we are going through within the psychological sciences (replication, measurement, and theoretical crises), to the rather lackadaisical approach taken by those best positioned to effect change, and to the sense that reformers are contending with a hierarchy that rewards maintaining an increasingly untenable status quo. That said, I do not yet know of anyone else who has made that connection. It is a series I have been rewatching lately, and I am finding renewed inspiration in it.

Perhaps I will write this up into something a bit more formal. There is an actual journal devoted to Whedon Studies, which does cover Buffy the Vampire Slayer in quite a bit of detail. Pop culture and fandom are of some interest to me, even if I have not had much of an excuse to really explore that avenue in greater detail.

In the meantime, as I teach and as I deal with research projects, I ask myself the question, what would Buffy do? The answer to that question is often my guide for action.

Postscript to the preceding

What do you do when something you wrote is discredited? I had to confront that question around this time last year. In my case, it meant coming to terms with the fact that an article on which I was the lead author turned out to have too much duplicate content. That's a nice way of saying it was self-plagiarized. The one thing that saved me is that I kept all my emails with my coauthor on that one. Retraction Watch got my account correct. Regrettably, the site could not get the other players in that particular tragedy of errors to comment. As I said, I have all the emails. I also have evidence that the only original material was whatever I added to that manuscript. You take cold comfort where you can get it.

As time has passed, I have become more thankful for that retraction. The article was hopelessly out of date by the time it got published. The whole quality control and peer review process was so hopelessly messed up that I will likely never trust that journal again. I also don't quite trust myself the way I once did. Writing turned out to be enormously difficult for a while. I now subscribe to plagiarism detection software on my own dime to verify that any new work I write is genuinely original. I even re-ran prior articles I authored to make sure they were sufficiently original. For a while, my whole thought process was paralyzed.

So, I have been putting safeguards in place. I have also used the fallout from that set of events as a way of reassessing my priorities in life. One of those had to do with where the weapons effect fits in. I am tied to that line of work, so it will never quite leave me, but I am realizing that I am at a point where I can start tying up loose ends. Writing an article on the weapons effect as a solo author, and finding an outlet that would give it an honest and thorough peer review, was a start. Aside from a couple of minor typos that never quite went away, I can now say I came out the other side as someone who can still write. More importantly, I decided to shift my focus from trying to establish or confirm media effects to making sure that the dependent variables I rely on - and more specifically their operational definitions - make sense and actually do the job they were intended to do. That work - namely, the question of validity - will keep me going for a while.

The real lesson is to acknowledge where one is in error, work earnestly to correct that error, and find a way to move on. I am back to working largely in the shadows, like any number of literary and cinematic characters I admire, continuing the necessary work of improving the psychological sciences. There really is something to be said for obscurity, and I have come to appreciate my relative lack of prominence more and more. In the meantime, there are assignments to be graded, undergraduates to be mentored, and engagements in my local community that will not win me any prizes but may serve those looking for answers. That is enough to occupy my time. I can live with that. Can you?

Tuesday, April 2, 2019

A brief outline of my road to weapons effect skepticism

I have a much longer post in the works. At the moment it is nowhere near ready for public consumption. I do think that in the meantime it might be helpful to lay down some talking points regarding my particular journey:

1. I was involved in some of the early research that showed a link between mere exposure to various weapon images and the accessibility of aggressive thoughts. That's part of my history, along with my general acceptance that the Carlson et al. (1990) meta-analysis had essentially closed the case regarding the link between mere exposure to weapons and aggressive behavior.

2. Post-graduate school, as a professor who primarily focuses on instruction, I continued to find interest in the weapons effect. Research opportunities were few and far between, of course. I continued to share what I knew about the available research.

3. At some point around the start of this decade, I got serious about updating the old Carlson et al. (1990) meta-analysis. I began by reproducing the original effect sizes using a now very antiquated program called D-Stat (way too limiting - I would never use it again). That was successful as far as it went. Then I got an offer to do something considerably more sophisticated, with the promise of access to better software and expertise. I could not refuse.

4. I would end up coauthoring the occasional narrative literature review that gave a glowing portrayal of the weapons effect literature. I believed what I wrote at the time as it was consistent with the findings I was involved in generating from the new meta-analysis, and generally supportive of the original Carlson et al. (1990) meta-analysis. In hindsight, I came to realize I was wrong.

5. Eventually an updated meta-analysis I coauthored was accepted for publication. Yay. Then boo. There turned out to be a database error that neither I nor the third author on the article ever caught. The fact that we never caught it still bugs me to this day. I have emails documenting my requests to said coauthor to verify that the database was accurate during the period leading up to publication.

6. Reanalyses required a rethink. Publication bias is a serious problem with this literature. Establishing a link between exposure to weapons and aggressive behavioral outcomes is difficult at best, and probably not doable. Moderators that appeared to be interesting were not so interesting.

7. How do you go about redrafting an article when each of the coauthors is working at cross-purposes? Hint: it ain't easy.

8. Bottom line: I cannot speak for my coauthors, but I cannot unsee what I saw. Based on the available evidence, I can no longer have confidence that the weapons effect is a legitimate phenomenon. That is not a knock on Berkowitz, but rather the cold hard truth as I see it after looking at the evidence that was available.

9. Initial analyses I ran last fall after the revised manuscript was accepted show that there is also a potential allegiance effect. That really needs further exploration.

10. Although there are analyses that I certainly would love to run or wish I had run, the bottom line remains: there are likely issues with sample size and research design in these experiments that make assessing behavioral outcomes darned difficult at best (see the brief power sketch after this list). As a social prime, short-term exposure to weapons may not be particularly interesting.

11. I honestly believed an effect was real that apparently was not. Once the evidence to the contrary socked me in the jaw (to borrow a phrase from Wittgenstein), I had to change my perspective. Doing so was not easy, but I have no regrets.
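As a quick illustration of the sample size problem, here is a minimal sketch using statsmodels, assuming a purely hypothetical small standardized effect of d = 0.20; the numbers are illustrative, not drawn from the meta-analysis itself.

```python
# Per-group sample size needed to detect a small effect (Cohen's d = 0.20)
# with 80% power in a two-group design at alpha = .05.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.20, power=0.80, alpha=0.05)
print(round(n_per_group))  # roughly 394 participants per group
```

Samples anywhere near that size are rare in this literature, which is part of why I find behavioral conclusions so hard to draw.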

I'll lay out the longer version of this later.