Saturday, September 8, 2018

Paywall: The Business of Scholarship

Paywall: The Business of Scholarship (Full Movie) CC BY 4.0 from Paywall The Movie on Vimeo. I thought this was an interesting documentary. It does a decent enough job of describing a legitimate problem and some of the possible solutions, and the presentation is fairly nuanced. I finished watching it with the feeling that open access journals as a standard could potentially work, but that there would be some unintended consequences. I note this simply because we do need to keep some nuance as we try to figure out better ways of conveying our work to our peers and to the public. There are solutions that might sound appealing until you realize that individual scholars would have to shell out thousands of dollars to publish, which would preserve much of the status quo: researchers who have little in the way of a personal or institutional budget will get shut out, and the problem of taxpayer money flowing to publishers for publicly funded research will remain in place. I don't know of any easy answers, but I do know that the current status quo cannot sustain itself indefinitely.

Friday, September 7, 2018

Research Confidential: The Hidden Data

For now I will be somewhat vague, but perhaps I can get more specific later if I am confident I will not be breaching some ethical boundary.

One of the ongoing problems in the social sciences is that although we conduct plenty of research, only a fraction of the findings ever gets reported. What happens to the rest? That is a good question. I suspect many of those findings just end up buried.

It turns out I do have some experiences that are relevant. One of the problems I wrestle with is what to do about what I know. Although the data are ultimately the property of the taxpayers, the respective principal investigators control their dissemination. In one case, in graduate school, the findings were statistically significant: we found significant main effects and an interaction effect. The problem was that the pattern of results did not lend itself to an easy theoretical explanation. The PI and I puzzled over the findings for a bit before we reached a compromise of sorts: I would get to use the findings for a poster presentation, and then we would just forget that the experiment had ever happened. It was a shame. I thought then, and still think now, that the findings may have shed some light on how a mostly young-adult sample was interacting with and interpreting the stimulus materials we were using. An identical experiment run by one of my fellow grad students in the department produced data that squared with my PI's theoretical perspective, and those were the findings that got published.

The other set of findings is a bit more recent. Here, the PI had run a couple of online studies intended to replicate a weapons effect phenomenon that a student and I had stumbled upon. The first experiment failed. The use of an internet-administered lexical decision task was likely the problem; the amount of control that would have existed in the lab was simply not available in that particular research context. The other study was also administered online and used a word completion task as the DV. It also failed to yield statistical significance. This one was interesting, because I could get some convenient information on that particular DV's internal consistency. Why would I do that? I wondered if our problem was an often overlooked issue in my field: our failure to attend to the fundamentals of test construction and to establish that our measures are psychometrically sound (good internal consistency, test-retest reliability, etc.), which leads to problems with measurement error. The idea is hardly a novel insight, and plenty of others have voiced concern about the psychometric soundness of our measures in social psychology. As it turned out in this instance, there was reason to be concerned. The internal consistency estimates were well below what we would consider minimally adequate. There was tremendous measurement error, making it difficult to detect an effect even if one were there. My personal curiosity was certainly satisfied, but that did not matter. I was told not to use those data sets in any context. So, I have knowledge of what went wrong, and some insight into why (although I may be wrong), but no way to communicate that clearly to my peers. I cannot, for example, upload the data and code so that others can go over the data and scrutinize them, and perhaps offer insights I might not have considered. The data are merely tossed into the proverbial trash can and forgotten. And when dealing with influential researchers in our field, it is best to go along if they are the ones calling the shots. Somehow that does not feel right.
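To make the measurement-error point a bit more concrete, here is a minimal sketch in Python (with made-up numbers, not the actual study data) of the two ideas in play: estimating a scale's internal consistency with Cronbach's alpha, and the classic Spearman attenuation formula showing how unreliable measures shrink the effect you can actually observe.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x k_items) matrix of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def attenuated_r(true_r: float, rel_x: float, rel_y: float) -> float:
    """Spearman's attenuation formula: the observed correlation shrinks
    as the reliabilities of the two measures drop."""
    return true_r * np.sqrt(rel_x * rel_y)

# Illustrative numbers only, not the actual study data.
rng = np.random.default_rng(1)
noisy_items = rng.normal(size=(200, 10))          # ten essentially unrelated "items"
print(round(cronbach_alpha(noisy_items), 2))      # alpha near zero: poor internal consistency
print(round(attenuated_r(0.30, 0.45, 0.80), 2))   # a true r of .30 is observed as about .18
```

The point of the toy example is simply that a DV with weak internal consistency can swallow a real effect whole, which is exactly why those null results still carry information worth sharing.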

Regrettably, that is all I can disclose. I suspect that there are plenty of other grad students, post-docs, early-career faculty, and faculty at relatively low-prestige colleges and universities who end up with similar experiences. In these cases, I can only hope that the data sets survive long enough to get included in relevant meta-analyses, but even then, how often does that occur? These data sets, no matter how "inconvenient" they may appear on the surface, may be telling us something useful if we would only listen. They may also tell a much-needed story that our peers need to hear as they go about their own work in our research areas. I may be a relatively late convert to the basic tenets of open science, but I increasingly see openness as a necessary remedy for what is ailing us as social psychologists. We should certainly communicate our successes, but why not also communicate our null findings, or at minimum publicly archive them so that others may work with the data if we ourselves no longer wish to?

Wednesday, September 5, 2018

Do yourself and me a favor

Whatever else you do in life, please do not cite this article on the weapons priming effect. It has not aged well. I may have stood by those statements when I was drafting the manuscript in early 2016, and I probably still believed them when the article came out in print. It is now badly out of date, and we should sweep it into the dustbin of history. The meta-analysis on which I was the primary author, and more importantly the process that led to understanding what was really going on with this literature, awakened me from my dogmatic slumber (to borrow a phrase from Immanuel Kant). That meta-analysis is more worth citing if relevant - and even then it will need a thorough update within a mere handful of years.

Postscript to the preceding

I wish to repeat something I said earlier:

Really, the upshot to me is that if this line of research is worth bringing back, it needs to be done by individuals who are truly independent of Anderson, Bushman, and that particular cohort of aggression researchers. Nothing personal, but this is a badly politicized area of research, and we need investigators who can view this work from a fresh perspective as they design experiments. I also hope that some large-sample and multi-lab experiments are run in an attempt to replicate the old Berkowitz and LePage experiment, even if the replications are more conceptual in nature. Those findings will be what guide me as an educator going forward. If those findings indicate that there really is no effect, then I think we can pretty well abandon this notion once and for all. If, on the other hand, the findings appear to show that the weapons effect is viable, we will face another set of questions - including how meaningful that body of research is in everyday life. One caveat I can already anticipate is that any behavioral outcomes used will be mild in comparison with everyday aggression, and more importantly with violent behavior. I would not take any positive findings as a license to jump to conclusions regarding the risk of gun violence, for example; jumping to such conclusions would only politicize the research further, and that would turn me off.

Tuesday, September 4, 2018

Research Confidential: Let's Discuss Investigator Allegiance Effects

The meta-analysis on the weapons effect on which I am the primary author examined a number of potential moderators. Our list of moderators was far from exhaustive, and I often wondered throughout the process what we might be missing. One of those might be something that Luborsky et al. (2006) and others call an investigator allegiance effect. Early on, when a new phenomenon is first presented and examined, investigators who are closely affiliated with the principal investigator behind the original finding tend to successfully replicate it, while independent researchers have more difficulty doing so. That pattern appeared to occur in the early years of research on the weapons effect. Several former students and post-docs of Leonard Berkowitz (e.g., Frodi, Turner, Leyens, along with some of their own students) appeared to find additional evidence of a weapons effect. Others (e.g., Arnold Buss) often reported replication failures, even when faithfully reproducing the original Berkowitz and LePage (1967) procedures - Buss et al. (1972) is a great example; look at Experiments 5 and 6.

Anyway, I noticed this pattern, and it always sort of gnawed at me. Recently, someone on Twitter (Sanjay Srivastava) suggested to me that what I was looking at was an allegiance effect, and that clinical researchers routinely look for it in meta-analyses. It had honestly never occurred to me. Something I had a gut feeling about not only had a name, but could also be quantified. So, I went to my old CMA database, added a moderator column for a Berkowitz allegiance effect, and coded all behavioral studies as yes (affiliated with Berkowitz) or no (not affiliated with Berkowitz). Since Berkowitz and LePage (1967) presumed that the weapons effect occurred under conditions of high provocation, I focused my preliminary analyses on studies in which an aggressive behavioral outcome was the DV and in which there was a high-provocation condition. Here is what I found:

If you click the picture, you will get a full-sized image. Basically, in studies where the researchers were in some way connected with Berkowitz, the mean effect size was pretty strong, d = .513 (.110, .916). The mean effect size for those who were not affiliated with Berkowitz was d = .192 (-.111, .495). Note that the confidence interval for the mean effect size of those not affiliated with Berkowitz included zero, which makes sense, as many of those authors reported replication failures. It appears that there is indeed an allegiance effect.
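For readers curious about the mechanics of that kind of subgroup comparison, here is a rough Python sketch of a DerSimonian-Laird random-effects summary computed separately for each level of the allegiance moderator. The effect sizes and sampling variances below are hypothetical placeholders; the actual analyses were run in CMA, not with this code.

```python
import numpy as np

def random_effects_summary(d, v):
    """DerSimonian-Laird random-effects pooled d with a 95% confidence interval."""
    d, v = np.asarray(d, dtype=float), np.asarray(v, dtype=float)
    w = 1.0 / v                                    # fixed-effect (inverse-variance) weights
    d_fixed = np.sum(w * d) / np.sum(w)
    q = np.sum(w * (d - d_fixed) ** 2)             # Cochran's Q (heterogeneity)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(d) - 1)) / c)        # between-study variance estimate
    w_star = 1.0 / (v + tau2)                      # random-effects weights
    d_re = np.sum(w_star * d) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return d_re, d_re - 1.96 * se, d_re + 1.96 * se

# Hypothetical per-study effect sizes (d) and sampling variances, split by the moderator.
affiliated = random_effects_summary([0.70, 0.40, 0.60, 0.30], [0.05, 0.08, 0.06, 0.09])
independent = random_effects_summary([0.30, 0.00, 0.20, 0.10], [0.07, 0.06, 0.09, 0.08])
print("Berkowitz-affiliated:", [round(x, 3) for x in affiliated])
print("Independent:         ", [round(x, 3) for x in independent])
```

The tell-tale pattern for an allegiance effect is what the actual data showed: a pooled estimate clearly above zero in the affiliated subgroup, and a smaller pooled estimate with a confidence interval spanning zero in the independent subgroup.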

I ran a similar analysis on cognitive outcomes - adding Bushman and Anderson to the list, as these two authors drove much of the research literature on what we call a weapons priming effect. Full disclosure: I coauthored two of those articles with Anderson. Here is what I found in CMA:

One thing to note when you click on this photo is that the mean effect size really does not change regardless of whether the study was authored by Anderson, Bushman, or Berkowitz, or by someone affiliated with them. There is no apparent evidence of an allegiance effect for cognitive DVs.

Keep in mind that these are only preliminary findings. I have reason to be a bit more confident in the cognitive outcomes than I expected. That said, simply showing a cognitive effect means little if we cannot consistently observe tangible behavioral change; behavioral outcomes are truly another matter. One thing to note about the behavioral outcomes: very few new studies including a behavioral DV were published after Carlson et al. (1990) published their meta-analysis, which appeared to settle the debate about the weapons effect. There are some problems with that meta-analysis, which I should probably discuss at some point. The bottom line is that I think we as a field squandered an opportunity to keep checking for behavioral outcomes as our standards for adequate sample sizes and so on changed. Only three studies that I am aware of have included a behavioral DV this decade, and one of those was conducted by a former student of one of Berkowitz's students (Brad Bushman, who was mentored by Russell Geen). The Bushman study appears to show a clean replication. The other two do not. Make of that what you will.

Sunday, September 2, 2018

Research Confidential: Let's Just Get This Out of the Way

In many respects, I am a flawed messenger when it comes to understanding what business as usual means in my corner of the scientific universe, and how and why we need to improve how we go about our work when we report data and when we publish anything from research reports to literature review articles and book chapters. I don't pretend to be anything else. I am pretty open about my own baggage because I prefer going to sleep each night with a clean conscience. It is safe to assume that I am highly self-critical. It is also safe to assume that if I am critical of others, it is not out of malice, but rather out of concern about leaving our particular science better than it was when we began.

If you are visiting here, there is a chance that you saw coverage of work I coauthored: one article that was retracted, and one that required major revisions after a database error (the corrected version of the latter is now thankfully available). I have written about the latter and will minimize my comments on it here. The former does require some explaining. Thankfully, the journalist who covered the story was very fair and largely kept my remarks about both articles intact. Anything that Alison McCook left out was, I suspect, largely because she was not able to independently verify my statements with the sources in question. You can read McCook's article for yourself. I don't have much to add. The main takeaway is that using a previously published work as a template for a subsequent manuscript is a very bad idea, and none of us should do so. In the interest of time, cutting corners while drafting a manuscript is tempting, but the consequences can be disastrous. I agree with McCook that, whether born of carelessness (as was the case here) or malice, duplicate content is far from victimless. It wastes others' time and resources and adds nothing of value to the literature. I have learned my lesson and will move on.

Just to add a bit, let's talk about the process. Brad Bushman was guest editor for an issue of Current Opinion in Psychology, an Elsevier journal. He invited me to coauthor an article on the weapons effect. I agreed, although I was a bit worried about spreading myself too thin. Since we had collaborated well on an earlier review article on weapons as primes of aggressive thoughts and appraisals, I expected I could count on him to contribute significantly to the manuscript this time around. That was late 2016. Bushman asked me to hold off on drafting the manuscript until we had updated analyses from our meta-analysis on the weapons effect (it had been going through the peer review process for quite some time). By the time we had those results, we were up against a deadline. I used the manuscript from our prior work as a template, added a section on aggressive behavioral outcomes when exposed to weapons, and began to redo some of the other sections. I then kicked it over to Bushman. When I got a revision back from him, based on a superficial scan of the manuscript, it appeared as if the entire document had been changed. Later I found out that he had made only minor revisions and was largely focused on his dual role as guest editor and author or coauthor on four other papers for that edition. I have some questions about the wisdom of a guest editor being so heavily involved as a coauthor, given the workload involved, but maybe some people can handle that. I still think it is a bad idea. In other words, the collaboration I expected never occurred, and I was too careless to notice.

Now, let's talk about the upload portal and peer review process, which had numerous problems. McCook could never get anyone from Elsevier to address her questions, so what I am going to do is summarize what I disclosed to her. I will note that I can back up every claim I am making, as I still have the original emails documenting what happened safely archived. When the deadline approached, Evise appeared to be down. I don't know if that was system-wide, specific to the journal, or specific to something the guest editor did or failed to do, so I will refrain from speculating on that front. What I can note is that Bushman, as guest editor, had to ask all of the primary authors to email our manuscripts to him, and that he would then distribute them to whoever was assigned to review them.

The peer review process was extremely rapid. I think we received an email from Barbara Krahe, with suggestions for minor revisions, two days after emailing our submission. Krahe was apparently the action editor for the special edition. I do not know if she was the peer reviewer as well; Bushman acted at one point as if she might have been, but I cannot confirm that. I made the revisions and then had to email the revised manuscript to one of the journal's editorial staff members, April Nishimura, so that she could upload it manually through Evise. That portal never did work properly.

The upshot is that the whole process was backwards and far too rapid. In an ideal situation, the manuscript would have been uploaded to the portal, checked for duplicate content, and only then sent out for review. As the Associate Publisher responsible for Current Opinion in Psychology (Kate Wilson) unwittingly disclosed to me, Elsevier's policy about checking for duplicate content is that it may occur, which is a far cry from a guarantee that each document will be checked. Had the process worked properly, this manuscript would have been flagged, I would have been taken to task then and there, and whatever corrective action needed to occur would have happened long before we reached an eventual retraction decision. The review process was so minimal that I seriously doubt much time or care was put into inspecting each document. Eventually I saw proofs, and had all of maybe a couple of days to accept them. Basically, I was always uncomfortable with how that process unfolded. Although I ultimately shoulder the lion's share of the burden for what happened, as is the lot of any primary author, I cannot help but wonder if a better system would have led to a better outcome.

When alerted early this year that there was a problem with this particular article, I got in contact first with April Nishimura, who eventually connected me with Kate Wilson. I made sure these individuals were aware of what had been disclosed to me and to Bushman about possible duplicate content, offered a corrected manuscript in case that might be acceptable, and then asked them to investigate and make whatever decision they deemed necessary. Given that Bushman was very worried about another retraction on his record, I did pursue alternatives to retraction. After several conversations, including a phone conversation with the Executive Publisher, I felt very comfortable with the retraction decision. It was indeed the right call to make in this instance. Thankfully, only one person ever cited that article, and that was Bushman. It could have been far worse.

Given the problems I experienced with the whole process in early 2017, I had already decided I would be highly reluctant to accept any further invitations to publish my work in that particular journal. The lack of proper screening of manuscripts prior to review, the minimal peer review process, and the general haste with which each edition is put together lead me to question whether this journal adds any value to the science of psychology. I have had far better experiences with journals that have much lower impact factors.

As for the meta-analysis, I never saw asking for a retraction as a first option. I am aware that PSPR did consider retraction as one of the options, but it opted to allow us to submit a major revision of the manuscript based on analyses computed from the database once we corrected the flaw that Joe Hilgard detected when he examined the data. As an aside, Hilgard has always been clear that the particular error I made when setting up the database is surprisingly common and that what happened was an honest mistake.

There is one circumstance in which I would have insisted on a retraction, and thankfully it never came to pass. In early October 2017, Bushman floated the idea of removing me as the primary author and replacing me with the second author, Sven Kepes. Thankfully, Kepes balked at the idea. After all, as he noted, his expertise was not in the weapons effect, and placing him in the primary author role was not something he was comfortable with. Nor was I comfortable with that idea, as I had conceptualized the project before Bushman even entered the picture, and had done the bulk of the legwork on data entry, coding (including training one of my students to serve as a second coder), the initial data analyses in CMA, and much of the writing. Had Bushman persisted, I would have declared the whole project unsalvageable and expressed my concerns to the editor before walking away from what had been a significant amount of work. I question whether that would have been an appropriate course of action, but if placed in an untenable situation, I am not sure what the alternative would have been. Thankfully, it never came to that.

Kepes and I both put in considerable time over the next couple of months, ultimately determined that the database was correct and that we could each independently cross-validate its computations, and at that point redid our analyses. I ran a handful of supplemental analyses after Kepes was no longer available, primarily to appease Bushman, but I made it very clear that those need to be interpreted with considerable caution. The updated article is one I am comfortable with, and it essentially suggests that we need to seriously rethink the classic Berkowitz and LePage (1967) article that effectively launched the weapons effect as a line of research. The proper conclusion at this point is that there is not sufficient evidence that the mere presence of a weapon influences aggressive behavioral outcomes. I am not yet ready to write off the possibility that the original findings hold up, but I am well aware that this could be one of many zombie phenomena in my field. Oddly enough, I am comfortable with that possibility, and I am eagerly awaiting large-sample experiments that attempt on some level to replicate the weapons effect as a behavioral phenomenon. If those replications fail, this line of research needs to be viewed as one that simply did not stand the test of time. It happens.

I think it is likely that Bushman and I do not see eye to eye on how to interpret the meta-analytic findings. In hindsight, we came into this project with somewhat divergent agendas, and the end result is that the data have converted me from someone who saw this as a plausible phenomenon into someone who is considerably more skeptical. That skepticism is reflected in the revised article, and it will be reflected in any subsequent work I publish on the weapons effect, unless or until subsequent evidence suggests otherwise.
I think there is a lesson to be learned from this meta-analysis. For me, the main takeaway is to take concerns about potential errors in one's work seriously, and to cooperate with those who are critical of one's findings. We really need more of that - openness and cooperation in our science. We also need to make sure that those who do the hard work of double-checking our findings post-publication are reinforced for doing so. No one likes to be the bearer of bad tidings, but if the scientific record needs correcting, it is crucial that those who notice speak out. It is also critical that those who need to make corrections do so proactively and with a healthy dose of humility. We're only human, after all.

One final note: a significant part of my identity is as someone who has some expertise on the weapons effect. After all, two of my earlier publications examined facets of the weapons effect, including some of its potential limitations. Realizing that this line of research is not what it appeared to be - to me and perhaps to many others - required an adjustment in my point of view. In some fundamental sense, my identity remains intact, as I still know this literature fairly well. What does change is my perspective as I continue to speak and write about this line of research, as well as my thoughts on the theoretical models used to justify the weapons effect. What I have described is just one of the dark alleys that one might stumble into when making sense of social psychology. The process of doing research is sometimes quite ugly, and sometimes we make very costly errors. There is something to be learned from our mistakes, and I am hopeful that what we learn will lead to better research as we continue our work. Then again, I have always been something of an optimist.

Research Confidential: Let's Talk

Both in graduate school and in my post-grad school professional life, I have periodically seen the dark underbelly of the psychological sciences. I have witnessed questionable research practices, and at times been something of a participant. On those occasions, the experience became soul-draining, for lack of a better term. In some fundamental sense, I am still trying to figure out how to recover.

I think it is important to realize that there is science as it is idealized in textbooks and science as it is actually practiced. Although the two are not mutually exclusive, they are quite distinct. My particular corner of the sciences, social psychology, as actually practiced, leaves a lot to be desired. P-hacking, HARKing, hiding inconvenient data sets, and hastily composed manuscripts so fundamentally flawed that they should never make it to peer review are far closer to the norm than they should be. If you are a grad student, or an obscure professor at an institution that none of the big names has even heard of or cares about, you will likely be pressured into engaging in such behavior, or into looking the other way while those who believe they hold your fate in their hands engage in it.

Peer review, too, is not quite the defense against poor research it is portrayed to be in most college textbooks. The peer review system appears to me to be stretched beyond its limits. The upload portals for manuscript submission are themselves potentially powerful tools for detecting some serious problems (such as duplicate content), but those tools often go unused. Elsevier's Evise platform, for example, may be used to detect duplicate content, but it also may not be. That was something disclosed to me unwittingly while I was corresponding with an associate publisher responsible for a journal called Current Opinion in Psychology. I don't think she intended to let that slip, but there it is. So the one feature that distinguishes Evise from being little more than Adobe Acrobat Reader DC (which can convert Word files to PDF and merge multiple documents into one) is unlikely to be utilized. Unless you see explicit language otherwise, assume that the major publishing conglomerates are not checking for plagiarism. If you are an author or peer reviewer, you are well and truly on your own. Peer review also varies in the amount of time reviewers are given to examine a manuscript. Most journals will give you a month or two. Some - usually predatory journals, but also occasionally journals that are supposed to be reputable (looking at you, Current Opinion in Psychology) - expect a much quicker turnaround (perhaps a couple of days). That should worry all of us.
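To give a sense of the general idea behind duplicate-content screening, here is a toy sketch only - not how Evise, Crossref Similarity Check, or iThenticate actually work - in which a crude check compares overlapping word n-grams between a submission and previously published text. The snippets are hypothetical.

```python
def shingles(text: str, n: int = 5) -> set:
    """All word n-grams ("shingles") in a document, lowercased."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(0, len(words) - n + 1))}

def overlap(doc_a: str, doc_b: str) -> float:
    """Jaccard similarity of the two documents' shingle sets."""
    a, b = shingles(doc_a), shingles(doc_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Hypothetical snippets: a new submission reusing text from an earlier paper.
earlier = "exposure to weapons can prime aggressive thoughts in a variety of contexts"
submission = "exposure to weapons can prime aggressive thoughts in a variety of contexts and settings"
print(round(overlap(earlier, submission), 2))   # a high score would flag the submission for a closer look
```

Even something this simple, run automatically at upload, would have flagged a manuscript built from a prior article as a template long before it reached a reviewer.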

The academic world, from the upper echelons down to the regional colleges and universities, pressures faculty and grad students alike to adopt a publish-or-perish mindset. The bottom line is product, and specifically product in high-impact journals. Note that high impact does not necessarily equal high quality, as even a cursory scan of retractions and corrections in these outlets makes painfully obvious. So why do we bother? High impact equals high prestige, and high prestige is what earns a first job, or a promotion and tenure. As long as you're pumping out product, regardless of other demands on your time at work, you'll be deemed productive. Those who don't may well stagnate or find themselves elbowed out of the academic world altogether. Suffice it to say, the work-life balance of many academicians is atrocious. I know, because I have lived the dream, and it very nearly killed me; it might well have done so had I not walked away when I did. Note that I did not walk away from a job, but rather from a mentality that turns out to be unhealthy.

From time to time, I am going to go into some serious detail about how the research you consume in textbooks, in the mass media, or even directly from the original source material is made. I am going to do so not only because there is a need to advocate for a better science of psychology than the one that currently exists, but also because this is very personal. I walked away from a collaborative relationship earlier this spring. I am hardly a victim in what happened over a three-year period. However, the consequences of allowing that situation to go on for as long as it did turned out to be damaging both physically and psychologically. Had I allowed things to continue, I have little doubt I would have continued down a path that would have not only destroyed my career but ultimately ended my life. I note that not to be overly dramatic, but merely to highlight that the level of toxicity that exists in my field does inflict irreparable damage. Only now do I feel like I am even modestly recovering.

Just as one of my intellectual heroes, Anthony Bourdain, would argue against ordering a steak well-done, I am going to argue against consuming substandard research. I will do so by telling my story, and also by giving you some tools you can use to critically evaluate what you read. In the meantime, stay tuned. I am just getting warmed up.