Sunday, September 30, 2018

How France Created the Metric System

Although we mostly ignore the Metric System in the US, it is a standard of measurement that influences much of our lives, and even more so the lives of our fellow human beings. This quick BBC article highlights how the Metric System came into being, as well as the difficulty of mainstreaming it - as the article notes, that took about a century. It was an accomplishment with revolutionary origins, and one that truly changed the world.

Friday, September 28, 2018

Following up on Wansink

Andrew Gelman is on point in this post. I will give you this clip as a starting point:
I particularly liked this article by David Randall—not because he quoted me, but because he crisply laid out the key issues:
The irreproducibility crisis cost Brian Wansink his job. Over a 25-year career, Mr. Wansink developed an international reputation as an expert on eating behavior. He was the main popularizer of the notion that large portions lead inevitably to overeating. But Mr. Wansink resigned last week . . . after an investigative faculty committee found he had committed a litany of academic breaches: “misreporting of research data, problematic statistical techniques, failure to properly document and preserve research results” and more. . . . Mr. Wansink’s fall from grace began with a 2016 blog post . . . [which] prompted a small group of skeptics to take a hard look at Mr. Wansink’s past scholarship. Their analysis, published in January 2017, turned up an astonishing variety and quantity of errors in his statistical procedures and data. . . . A generation of Mr. Wansink’s journal editors and fellow scientists failed to notice anything wrong with his research—a powerful indictment of the current system of academic peer review, in which only subject-matter experts are invited to comment on a paper before publication. . . . P-hacking, cherry-picking data and other arbitrary techniques have sadly become standard practices for scientists seeking publishable results. Many scientists do these things inadvertently [emphasis added], not realizing that the way they work is likely to lead to irreplicable results. Let something good come from Mr. Wansink’s downfall.
But some other reports missed the point, in a way that I’ve discussed before: they’re focusing on “p-hacking” and bad behavior rather than the larger problem of researchers expecting routine discovery.
That, I think, is where our focus should be. This is partially about scientists engaging in questionable behavior, but the point should not be to pillory them. Rather, we should ask ourselves about a research culture that demands positive results each time we run a study. News flash: if we run enough studies, we are going to get a lot of findings that are at best inconclusive. We should also focus on the fundamentals of research design, along with making sure that any instruments used for measurement (whether behavioral, cognitive, attitudinal, etc.) are sufficiently reliable and have been validated. When I asked in an earlier post how many Wansinks there are, I should have added a clarifying statement: the bulk of the scientists who could become the next Wansink are often well-intentioned individuals who are attempting to adapt to a particular set of environmental contingencies [1] (ones that reinforce positive results, or what Gelman calls routine discovery), and who are using measures that are, quite frankly, barely warmed-over crap. In my area of social psychology, I would further urge making sure that the theoretical models we rely on in our particular specialty areas really do measure up. In aggression research, it is increasingly obvious to me that one model I have relied on since my grad school days needs to be rethought or abandoned altogether.

As we move forward, we do need to figure out what we can learn from the case of Brian Wansink, or from anyone else with a checkered history of questionable findings. I would recommend focusing less on the shortcomings of the individual (there is no need to create monsters) and focusing instead on the behaviors, and on how to change those behaviors (both individually and collectively).

[1] I am no Skinnerian, but I do teach Conditioning and Learning from time to time. I always loved that term, environmental contingencies.

Wednesday, September 26, 2018

One of my projects...

is not a project in the normal sense of the term. I have been interested in the work of a specific lab for a while. Some of the findings reported in an article I stumbled upon a couple of years back did not make sense to me, and extracting the effect sizes I needed for a project I was then updating proved rather frustrating and annoying. More recently, I came across newer work by this same lab (I was looking for fresh articles relevant to my primary research areas) and noticed the same basic pattern of apparent reporting mistakes. I've been sharing that publicly on Twitter, and so have others. A number of us stumbled onto a pattern of poor reporting in articles produced by this lab, in both predatory and legitimate journals. Thankfully there are tools one can use post-publication to examine findings (I've been partial to statcheck), and those have been, shall we say, illuminating. This is not the sort of thing I'll put on a CV. It won't count toward research as my institution defines it, nor will most folks end up really caring. What it can do is clean up a portion of a literature that is in desperate need of cleaning up - to correct the record wherever possible. I am often astounded and appalled at what manages to slip through peer review. We really need to do better.

Friday, September 21, 2018

Update

Hi! I am in the process of adding a few links that may be of use to you. I have added a widget with links to statistical tools you can use to double-check reported findings for yourself. Think of these as helpful for post-peer review. I will definitely vouch for statcheck. It works quite well. The others are newer, but look promising. SPRITE was one of the techniques used to successfully scrutinize Wansink's research (leading to over a dozen retractions and counting), which in and of itself makes it worth working with, in my humble opinion. Basically, we need to be able to look at findings and ask ourselves whether they are genuinely plausible, or whether some error or foul play may have been involved. I have also added some podcasts that I have found especially useful over the last several months, and I hope to encourage you all to give those a listen. Each is hosted by psychologists who are genuinely concerned with the current state of our science, and each will provoke a good deal of thought. If you are not listening to these podcasts, you are missing out. I will keep adding resources as time permits.
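For the curious, the core consistency check that a tool like statcheck automates is easy to sketch by hand: take a reported test statistic and its degrees of freedom, recompute the p-value, and compare it with the p-value the authors reported. Below is a minimal Python sketch of that idea (statcheck itself is an R package that also parses APA-formatted results out of full texts and accounts for rounding); the function name and tolerance here are my own illustrative choices.

```python
from scipy import stats

def check_t_report(t_value, df, reported_p, two_tailed=True, tol=0.0005):
    """Recompute the p-value implied by a reported t(df) statistic and flag it
    if it differs from the reported p by more than tol."""
    recomputed = stats.t.sf(abs(t_value), df)  # one-tailed upper-tail probability
    if two_tailed:
        recomputed *= 2
    return recomputed, abs(recomputed - reported_p) > tol

# Example: a result reported as t(28) = 2.20, p = .01 (hypothetical numbers).
# The recomputed two-tailed p is roughly .036, so this report would be flagged.
p, flagged = check_t_report(2.20, 28, 0.01)
print(f"recomputed p = {p:.4f}, flagged = {flagged}")
```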

Thursday, September 20, 2018

Data sleuths - a positive article

This article, The Truth Squad, is well worth your time. Here is a clip:
For scientists who find themselves in the crosshairs, the experience can feel bruising. Several years ago, the Tilburg group—now more than a dozen faculty members and students—unveiled an algorithm, dubbed statcheck, to spot potential statistical problems in psychology studies. They ran it on tens of thousands of papers and posted the troubling results on PubPeer, a website for discussion of published papers. Some researchers felt unfairly attacked; one eminent psychologist insinuated that the group was part of a “self-appointed data police” harassing members of the research community.

Van Assen and Wicherts say it was worth stepping on some toes to get the message across, and to flag mistakes in the literature. Members of the group have become outspoken advocates for statistical honesty, publishing editorials and papers with tips for how to avoid biases, and they have won fans. “I'm amazed that they were able to build that group. It feels very progressive to me,” says psychologist Simine Vazire of the University of California, Davis, a past chair of the executive committee of the Society for the Improvement of Psychological Science (SIPS).

The work by the Tilburg center and others, including SIPS and COS, is beginning to have an impact. The practice of preregistering studies—declaring a plan for the research in advance, which can lessen the chance of dodgy analyses—is growing rapidly (see story, p. 1192), as is making the data behind research papers immediately available so others can check the findings. Wicherts and others are optimistic that the perverse incentives of careerist academia, to hoard data and sacrifice rigor for headline-generating findings, will ultimately be fixed. “We created the culture,” Nosek says. “We can change the culture.”

Read the rest. One of the really cool things is finding their work on PubPeer (a website we social psychologists should utilize much more). This group's statcheck software, and what it can do, is truly amazing and necessary. Let's just say that when I see the name Nuijten in the comments on a particular article, I pay keen attention. Among the people I respect in my area, statcheck has generally found no errors, or only minor errors that don't change the basic thrust of their findings. Among some others, well, that's another story.

This is a useful article, and one that makes clear that although a bit of a paradigm shift is underway in our field, we're far from a warm embrace of an open science approach. I am optimistic that the expectations for open data sharing, open sharing of research protocols prior to running research, and so on will be far more favorable this time next decade, but I am girding myself for the possibility that it may take considerably longer to get to that point. I am guessing that when the paradigm truly shifts, it will seem sudden. The momentum is already there, and thankfully we've gone well beyond the mere talk of change that my cohort basically managed. So there is that. Progress of a sort.

Be smart: the main motive of the various data sleuths is to make our science better. These are people who are not trying to destroy careers or hurt feelings, but rather are simply making sure that the work we do is as close an approximation to the truth as is humanly possible. My advice for mid and late career researchers is to embrace this new paradigm rather than resist. I can guarantee that the new generation of psychological scientists will not have the patience for business as usual.

Brian Wansink has "retired"

The story began with a simple blog post. The post caught my eye initially because its author, Brian Wansink, seemed to be bragging about how he exploited an international student while making a very ham-handed point about work ethic. Those sorts of posts will get someone on my radar very quickly. But what was equally disturbing was that he essentially copped to engaging in a number of questionable research practices as if it were all perfectly okay. I'll give him points for being brazen. I made a brief blog post of my own when his post began making the rounds on Twitter. In the interim, Wansink's methodology has been challenged, data sets have been scrutinized, and he has ended up with upwards of 13 retractions and many more corrections. He continued over the last year and a half as if all was business as usual - or at least that seemed to be the public front. But behind the scenes it was apparently anything but. His university began an extensive investigation of allegations of misconduct. Yesterday, Cornell announced that a statement about Wansink would be coming out Friday. Well, I guess Thursday is the new Friday. That statement has now arrived. Wansink is "retiring" at the end of the academic year, will be removed from any teaching and research responsibilities, and will be paid handsomely to cooperate as his university continues its investigation. No matter how much Wansink spins the situation, Cornell makes it abundantly clear that his "retirement" is due to some very shoddy research practices. Cornell is hardly acting heroically. The institution is protecting itself now that it has become apparent that one of its prized researchers was not going to bring in the grant money he once did.

Wansink did not just do some hokey experiments that were somewhat eye-catching. He appeared on various morning news shows plugging his lab's findings, in the process fooling the public. His lab's reported findings were used by policymakers, and although perhaps the fact that those findings are in question is not quite life and death, they certainly did not benefit the public interest. Here is a tweet that gives you some idea of how policymakers used his research (from a plenary speech given at SPSP 2012):





The sleuths who did the grunt work to discover problems with Wansink's work will never be thanked. They will never receive awards, nor will they be able to put those efforts on their respective CVs. But we all owe them a huge debt of gratitude. For the record, they are Nick Brown, James Heathers, Jordan Anaya, and Tim van der Zee. They have exposed some questionable research practices at great risk to their own careers. Perhaps more to the point, they have exposed Wansink's research practices as symptomatic of an academic culture that privileges quantity over quality, publication in high impact journals over sound work, statistically significant findings over nonsignificant findings, clickbait-ready research over careful research, and secretiveness over openness. That broader culture is what needs to be changed. As James Heathers would no doubt argue, we change the culture by using the tools available to detect questionable practices, by rethinking how we do peer review, and by making certain that we instruct our students to do likewise. We need to be more open in sharing our methodology and our data (that is the point of registering or preregistering our research protocols and archiving our data so that our peers may examine them). We need to rethink what is important in doing our research. Is it about getting a flashy finding that can be easily published in high impact journals and net TED talks, or are we more interested in simply being truth seekers and truth tellers, no matter what the data are telling us? How many publications do we really need? How many citations do we really need? Those, to me, are questions we need to be asking at each step in our careers. How much should we, as peer reviewers, demand of editors and authors? Why should we take authors' findings as gospel? Could journals vet articles (possibly using software like SPRITE) to ascertain the plausibility of the data analyses, and if so, why are they not doing so?
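To make the vetting idea concrete, here is a minimal Python sketch of GRIM, Brown and Heathers' granularity check and the simplest relative of SPRITE: it asks whether a reported mean is even arithmetically possible given the sample size when responses are integers. The function name and the example values are mine, purely for illustration.

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM check: with n integer-valued responses, the true mean must equal
    some integer sum divided by n. Return True if at least one achievable
    mean rounds to the reported value, False if the report is impossible."""
    target = round(reported_mean, decimals)
    nearest_sum = round(reported_mean * n)
    # Check the nearest achievable sums; if none of them round to the reported
    # mean, that mean cannot have come from n integer responses.
    return any(round(s / n, decimals) == target
               for s in (nearest_sum - 1, nearest_sum, nearest_sum + 1))

print(grim_consistent(3.43, 23))  # True: 79/23 rounds to 3.43
print(grim_consistent(3.44, 23))  # False: no integer sum / 23 rounds to 3.44
```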

There is some speculation that had Wansink not made that fateful blog post in December of 2016, he would still be going about business as usual and would never have faced any repercussions for his faulty research. That is a distinct possibility. A more optimistic case can be made that the truth would have caught up to him eventually, as the replication crisis continues to unfold and as our research culture becomes more attuned to rooting out questionable work. Maybe he would not be retiring at the end of the spring term of 2019, but a few years later - still under a cloud. I also wonder how things might have played out if Wansink had tried a different approach. When his research practices were initially challenged, he doubled down. What if he had cooperated with the sleuths who wanted to get to the truth about his findings? What if, faced with evidence of his mistakes, he had acknowledged them and taken an active role in correcting the record and in changing practices in his lab? He might still have ended up with a series of retractions and faced plenty of uncomfortable questions from a variety of stakeholders. But the story might have had a less tragic ending.

This is not a moment for celebration, although there is some comfort in knowing that at least the record in one area of the sciences is being corrected. This is a moment for reflection. How did we arrive at this point? How many more Wansinks are in our midst? What can we do as researchers, as peer reviewers, and in our capacity to do post-peer review to leave our respective areas of the psychological sciences just a bit better than they were when we started? How do we make sure that we actually earn the trust of the public? Those are the questions I am asking myself tonight.

Friday, September 14, 2018

Reforming Psychology: Who Are These People?

Let's continue just a little bit from my last post. Right now I am merely thinking out loud, so take at least some of this with a few grains of salt. The Chronicle article I linked to in that earlier post was quite adept at finding some of the more extreme statements and magnifying them, and it was at times factually incorrect (Bem's infamous ESP article in JPSP was published in 2011, not 2010!). That makes for clicks, and presumably ad revenue, but it does not exactly shed light on who the stakeholders are.

Among the reformers, I suspect this is a varied group, representing multiple specialties and various levels of prominence within the academic world. Some are grad students who probably have the idealism and zeal I once had as a grad student, and who, like me back then, are legitimately frustrated by their relative lack of power to change a status quo that leaves a lot to be desired. Others are post-docs and early career researchers whose fates hang in the balance based on evaluations by some of the very people whose work they may be criticizing. Hiring and tenure decisions are certainly a consideration. Still others are primarily educators, who could nonetheless be caught in the crosshairs of those with considerably more prestige. For those of us who are a bit less prominent, it is easier for those used to getting their way to fling unfounded accusations at us, knowing full well that, for now, those accusations will be taken at face value in the public sphere. At least in these early moments, the effort to reform psychological science appears to be a high-risk enterprise.

There may be a great deal of diversity in terms of how to go about reform. Given my generally cautious nature, I prefer to tread carefully - test drive various approaches to making our work more transparent and see what works and what doesn't. Others may want a more immediate payoff. Some of us may disagree on methodological and statistical practices. The impression I get is that regardless of where the reformers stand, there is a consensus that the status quo no longer works and that the system needs to change. The other impression I get is that there is a passion for science in all of its messiness. These are not people with vendettas, but rather people who want to do work that matters, work that gets at closer approximations of the truth. If someone's work gets criticized, it has nothing to do with some need to take down someone famous; it is about getting at what is real or not real in the foundations underlying their claims in specific works. I wish this were understood better. For the educators among the reformers, we just want to know that what we teach our undergrads is actually reality-based. We may want to develop and/or find guidance on how to teach open science to research methods students, or on how to show in our content courses how a classic study was debunked. Of course, keep in mind that I am basing these observations on a relatively small handful of interactions over the last few months. Certainly I have not done any systematic data collection, nor am I aware of much of any. I do think it is useful to realize that SIPS is evenly split between men and women in its membership, and really does have diverse representation as far as career levels (though it skews early career), specialties, and teaching loads go. I think it is also useful to realize that SIPS is likely only one part of a broader cohort of reformers, and so any article discussing reforms to psychological science needs to take that into account.

As for those defending the status quo, I suspect there is also a great deal of variation. That said, the loudest voices are clearly mid and late career scholars, many of whom perceive they have a great deal to lose. There has to be some existential crisis that occurs when one realizes that the body of work making up a substantial portion of one's career was all apparently for nothing. I am under the impression that at least a subset have achieved a good deal of prestige, have leveraged that prominence to amass profits from book deals, speaking engagements, and the like, and that efforts to debunk their work could be seen as a threat to all the trappings of what they might consider success. Hence the temptation to occasionally lob phrases like "methodological terrorists" at the data sleuths among the reformers. As an outsider looking in on the upper echelons of the academic world, my impression is that most of the status quo folks are generally decent, well-intentioned people who have grown accustomed to a certain way of doing things and benefit from that status quo. I wish I could tell the most worried among them that their fears about a relatively new reform movement are unfounded. I know I would not be listened to. I have a bit of personal experience in that regard. Scholars scrutinizing data sets are not "out to get you"; they are interested in making sure that what you claimed in published reports checks out. I suspect that argument will fall on deaf ears.

I'd also like to add something else: I don't really think that psychology is any meaner now than it was when I started out as a grad student in the 1990s. I have certainly witnessed rather contentious debates and conversations at presentation sessions, have been told in no uncertain terms that my own posters were bullshit (I usually would try to engage those folks a bit, out of curiosity more than anything else), and have seen the work of early-career scholars ripped to shreds. What has changed is the technology. The conversation now plays out on blogs (although those are pretty old-school by now) and social media (Twitter, Facebook groups, etc.). We can now publicly witness, in as close to real time as our social media allow, what used to occur only behind the relatively closed doors of academic conferences and colloquia - and in journal article rebuttals that sat behind paywalls. Personally, I find the current environment refreshing. It is no more and no less "mean" than it was then. Some individuals in our field truly behave in a toxic manner - but that was true back in the day. What is also refreshing is that it is now easier to debunk findings, and easier to do so in the public sphere, than ever before. I see that not as a sign of a science in trouble, but of one that is actually in the process of figuring itself out at long last. I somehow doubt that mid-career and late-career scholars are leaving in droves because the environment now is not so comfortable. If that were the case, the job market for the rest of us would be insanely good right now. Hint: the job market is about as bleak as it was this time last year.

A bit about where I am coming from: right now I have my sleeves rolled up as I go about my work as an educator. I am trying to figure out how to convey what is happening in psych to my students so that they know what is coming their way as they enter the workforce, graduate school, and beyond. I am trying to figure out how to engage them in thinking constructively about what they read in their textbooks and in various mass media outlets, and in sorting out what it means when classic research turns out to be wrong. I am trying to work out how to create a more open-science-friendly environment in my methods courses. I want to teach stats just a bit better than I currently do. When I look at those particular goals, it is clear that what I want aligns well with those working to reform our field. I can also say from experience that my conversations with reformers have been nothing short of pleasant. And even when some work I was involved in got taken to task (I am assuming that if you are reading this, you know my history), nothing that was said was in any way undeserved or untoward. Quite the contrary.

I cast my lot with the reformers - first quietly and then increasingly vocally. I decided to do so because I remember what I wanted to see changed in psychology back when I was in grad school, and I am disappointed that so little reform transpired back then. There is now hope that things will be different, and that what emerges will be a psychology that really does live up to its billing as a science whose findings matter and can be trusted. I base that hope on evidence such as changes in editorial leadership, journals at least tentatively taking steps to require more openness from authors, and so on. It's a good start. As I might say in other contexts, there is so much to be done.

Postscript: as time permits, I will start linking to blogs and podcasts that I think will enlighten you. I have been looking at what I have in the way of links and blogroll and realize that it needs an overhaul. Stay tuned...

Reforming Psychology: We're Not Going to Burn it Down!

This post is merely a placeholder for something I want to spend some time discussing with those of you who come here later. There has been a spirited discussion on Twitter and Facebook regarding a recent article in The Chronicle of Higher Education (hopefully this link will get you past its paywall - if not, my apologies in advance). For the time being I will state that although I have not yet attended a SIPS conference (something I will make certain to correct in the near future), my impression of SIPS is a bit different from how it is characterized in the article. I get the impression that these are essentially reformers - reform being something increasingly near and dear to me - who want to take tangible actions to improve the work we do. I also get the impression that, in general, these are folks who largely share some things I value:

1. An interest in fostering a psychological science that is open, cooperative, supportive, and forgiving.

2. An interest in viewing our work as researchers and reformers as a set of tangible behaviors.

I've blogged before about the replication crisis. My views on what has emerged from the fallout have certainly evolved. It is very obvious that there are some serious problems, especially in my own specialty area (social psychology generally, and aggression research more specifically), and that those problems need to be addressed. They are also fixable. There is no need to burn anything down.

I'll have more to say in a bit.

A tipping point for academic publishing?

Perhaps. George Monbiot has a good opinion piece in The Guardian worth reading. This is hardly his first rodeo when it comes to writing about the transfer of wealth from the public sector to a handful of private sector publishing conglomerates. What is different is that some stakeholders, such as federal governments, grant agencies, and university libraries, are pushing back at long last. Of course, there's also Sci-Hub, which distributes for free what would otherwise sit behind a paywall. I am a bit late to the party when it comes to Sci-Hub, but I can say that for someone who wants to do some data or manuscript sleuthing, it can be a valuable resource. I've had occasion to use the site to determine whether a book chapter was (apparently) unwittingly self-plagiarized, far more efficiently than if I had waited for the actual book to arrive. When I reflect on why I decided to pursue a career as an academic psychologist, it was in large part because I wanted to give psychology away in the public interest (to paraphrase the late George Miller). Publishing articles that get paywalled is a failure to do so. I am not great at predicting the future, and I have no idea what business model will be in place for academic publishing a decade from now. What I can do is express hope that whatever evolves from the ashes of the current system does not bankrupt universities or individual researchers and labs, and that it truly democratizes our work and truly makes our research available to the public. After all, our research is a public good and should be treated as such. That attitude really should not be a revolutionary concept.

Saturday, September 8, 2018

Paywall: The Business of Scholarship

Paywall: The Business of Scholarship (Full Movie) CC BY 4.0 from Paywall The Movie on Vimeo. I thought this was an interesting documentary. It does a decent enough job of describing a legitimate problem and some of the possible solutions. The presentation is fairly nuanced. I finished watching it with the feeling that open access journals as a standard could potentially work, but there would be some unintended consequences. I note this simply because we do need to keep some nuance as we try to figure out better ways of conveying our work to our peers and to the public. Some solutions might sound cool until you realize that individual scholars would have to shell out thousands of dollars in publication fees, which would preserve part of the status quo: researchers who have little in the way of a personal or institutional budget would get shut out, and the problem of taxpayer money flowing to private publishers would remain in place. I don't have easy answers, but I do know that the current status quo cannot sustain itself indefinitely.

Friday, September 7, 2018

Research Confidential: The Hidden Data

For now I will be somewhat vague, but perhaps I can get more specific if I don't think I will be breaching some ethical boundary.

One of the ongoing problems in the social sciences is that although we conduct plenty of research, only a fraction of the findings ever get reported. What happens to the rest? That is a good question. I suspect many just end up buried.

It turns out I do have some experiences that are relevant. One of the problems I wrestle with is what to do about what I know. The difficulty is that although the data are ultimately the property of the taxpayers, the respective principal investigators control their dissemination. In one case, in graduate school, the findings were statistically significant. We found significant main effects and an interaction effect. The problem was that the pattern of results did not lend itself to an easy theoretical explanation. The PI and I puzzled over the findings for a bit before we reached a compromise of sorts: I would get to use the findings for a poster presentation, and then we would just forget that the experiment had ever happened. It was a shame; I thought then, and still think now, that the findings may have shed some light on how a mostly young adult sample was interacting with and interpreting the stimulus materials we were using. An identical experiment run by one of my grad student colleagues in the department produced data that squared with my PI's theoretical perspective, and those were the findings that got published.

The other set of findings is a bit more recent. Here, the PI had run a couple of online studies intended to replicate a weapons effect phenomenon that a student and I had stumbled upon. The first experiment failed. The use of an internet-administered lexical decision task was likely the problem: the amount of control that would have existed in the lab was simply not available in that research context. The other study was also administered online and used a word completion task as the DV. It also failed to yield statistical significance. This one was interesting, because I could get some convenient information on that DV's internal consistency. Why would I do that? I wondered if our problem stemmed from an often overlooked issue in my field: a failure to attend to the fundamentals of test construction and to establish that our measures are psychometrically sound (good internal consistency, test-retest reliability, etc.), which leads to problems with measurement error. The idea is hardly a novel insight, and plenty of others have voiced concern about the psychometric soundness of our measures in social psychology. As it turned out in this instance, there was reason to be concerned. The internal consistency numbers were well below what we would consider minimally adequate. There was tremendous measurement error, making it difficult to detect an effect even if one were there. My personal curiosity was certainly satisfied, but that did not matter. I was told not to use those data sets in any context. So, I have knowledge of what went wrong, and some insight into why (although I may be wrong), but no way to communicate any of it clearly to my peers. I cannot, for example, upload the data and code so that others can go over the data and scrutinize them - and perhaps offer insights that I might not have considered. The data are simply tossed into the proverbial trash can and forgotten. And when dealing with influential researchers in our field, it is best to go along if they are the ones calling the shots. Somehow that does not feel right.
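For readers who have not had to compute it by hand, internal consistency in this context usually means something like Cronbach's alpha, which compares the sum of the item variances to the variance of the total score. A minimal sketch in Python, with a toy data matrix of my own invention (not the actual study data):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                               # number of items
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Toy example: 5 respondents, 4 items. Values below roughly .70 are usually
# treated as inadequate for research use.
toy = [[3, 4, 3, 4],
       [2, 2, 3, 2],
       [4, 5, 4, 5],
       [1, 2, 1, 2],
       [3, 3, 4, 3]]
print(round(cronbach_alpha(toy), 2))
```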

Regrettably that is all I can disclose. I suspect that there are plenty of others who are grad students, post-docs, early career faculty, or faculty at relatively low-power colleges and universities who end up with similar experiences. In these cases, I can only hope that the data sets survive long enough to get included in relevant meta-analyses, but even then, how often does that occur? These data sets, no matter how "inconvenient" they may appear on the surface, may be telling us something useful if we would only listen. They may also tell a much needed story to our peers in our research areas that they need to hear as they go about their own work. I may be a relatively late convert to the basic tenets of open science, but increasingly I find openness as a necessary remedy to what is ailing us as social psychologists. We should certainly communicate our successes, but why not also communicate our null findings as well, or at minimum publicly archive them so that others may work with that data if we ourselves no longer wish to?

Wednesday, September 5, 2018

Do yourself and me a favor

Whatever else you do in life, please do not cite this article on the weapons priming effect. It has not aged well. I may have stood by those statements when I was drafting the manuscript in early 2016, and I probably still believed them when the article came out in print. It is now badly out of date, and we should sweep it into the dustbin of history. The meta-analysis on which I was the primary author, and more importantly the process that led to understanding what was really going on with this literature, awakened me from my dogmatic slumber (to borrow a phrase from Immanuel Kant). That meta-analysis is more worth citing, if relevant - and even it will need a thorough update within a handful of years.

Postscript to the preceding

I wish to repeat something I said earlier:

Really the upshot to me is that if this line of research is worth bringing back, it needs to be done by individuals who are truly independent of Anderson, Bushman, and that particular cohort of aggression researchers. Nothing personal, but this is a badly politicized area of research, and we need investigators who can view this work from a fresh perspective as they design experiments. I also hope that some large-sample and multi-lab experiments are run in an attempt to replicate the old Berkowitz and LePage experiment, even if the replications are more of a conceptual nature. Those findings will be what guide me as an educator going forward. If those findings show that there really is no effect, then I think we can pretty well abandon this notion once and for all. If, on the other hand, the findings appear to show that the weapons effect is viable, we face another set of questions - including how meaningful that body of research is in everyday life. One conclusion I can already anticipate is that any behavioral outcomes used are mild in comparison to everyday aggression, and more importantly to violent behavior. I would not take any positive findings as a basis for jumping to conclusions regarding the risk of gun violence, for example - jumping to such conclusions would needlessly politicize the research, and that would turn me off further.

Tuesday, September 4, 2018

Research Confidential: Let's Discuss Investigator Allegiance Effects

The meta-analysis on the weapons effect on which I am the primary author examined a number of potential moderators. Our list of moderators was far from exhaustive, and throughout the process I often wondered what we might be missing. One candidate is something that Luborsky et al. (2006) and others call an investigator allegiance effect. Early on, when a new phenomenon is presented and examined, investigators closely affiliated with the principal investigator behind the original finding tend to successfully replicate it, while independent researchers tend to have more difficulty doing so. That pattern appeared to occur in the early years of research on the weapons effect. Several ex-students and post-docs of Leonard Berkowitz (e.g., Frodi, Turner, and Leyens, along with some of their own students) appeared to find additional evidence of a weapons effect. Others (e.g., Arnold Buss) often reported replication failures, even when faithfully reproducing the original Berkowitz and LePage (1967) procedures - Buss et al. (1972) is a great example if you look at experiments 5 and 6.

Anyway, I noticed this pattern, and it always sort of gnawed at me. Recently, someone on Twitter (Sanjay Srivastava) suggested to me that what I was looking at was an allegiance effect, and that clinical researchers routinely look for it in meta-analyses. It had honestly never occurred to me. Something I had a gut feeling about not only had a name, but could also be quantified. So I went to my old CMA database, added a moderator column for Berkowitz Allegiance Effect, and coded all behavioral studies as yes (affiliated with Berkowitz) or no (not affiliated with Berkowitz). Since Berkowitz and LePage (1967) presumed that the weapons effect occurred under conditions of high provocation, I focused my preliminary analyses on studies that used an aggressive behavioral outcome as the DV and included a high provocation condition. Here is what I found:

If you click the picture, you will get a full-sized image. Basically, in studies where the researchers were in some way connected with Berkowitz, the mean effect size was pretty strong, d = .513 (.110, .916). The mean effect size for those who were not affiliated with Berkowitz was d = .192 (-.111, .495). Note that the confidence interval for the mean effect size of the studies not affiliated with Berkowitz includes zero, which makes sense, as many of those authors reported replication failures. It appears that there is indeed an allegiance effect.
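For anyone who wants to see the mechanics behind a comparison like this rather than take the CMA output on faith, the subgroup analysis boils down to pooling the study-level effect sizes separately within each group under a random-effects model and comparing the pooled estimates and their confidence intervals. Below is a minimal Python sketch of the DerSimonian-Laird estimator commonly used for that pooling; the function and the toy inputs are mine and are not the actual study data.

```python
import numpy as np

def dersimonian_laird(d, var_d):
    """Random-effects pooled effect size (DerSimonian-Laird).
    d: per-study standardized mean differences; var_d: their sampling variances.
    Returns (pooled d, lower 95% CI bound, upper 95% CI bound)."""
    d, var_d = np.asarray(d, float), np.asarray(var_d, float)
    w = 1.0 / var_d                                   # fixed-effect weights
    d_fixed = np.sum(w * d) / np.sum(w)
    q = np.sum(w * (d - d_fixed) ** 2)                # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(d) - 1)) / c)           # between-study variance
    w_re = 1.0 / (var_d + tau2)                       # random-effects weights
    d_pooled = np.sum(w_re * d) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return d_pooled, d_pooled - 1.96 * se, d_pooled + 1.96 * se

# Hypothetical usage: pool each subgroup separately, then check whether one
# group's interval crosses zero while the other's does not.
affiliated_d, affiliated_v = [0.7, 0.4, 0.5], [0.04, 0.05, 0.06]
print(dersimonian_laird(affiliated_d, affiliated_v))
```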

I ran a similar analysis on cognitive outcomes - adding Bushman and Anderson to the list, as these two authors drove much of the research literature on what we call a weapons priming effect. Full disclosure: I coauthored two of those articles with Anderson. Here is what I found in CMA:

One thing to note when you click on this photo is that the mean effect size really does not change regardless of whether the study was authored by Anderson, Bushman, or Berkowitz, or by someone affiliated with them. There is no apparent evidence for an allegiance effect for cognitive DVs.

Keep in mind that these are only preliminary findings. I have reason to be a bit more confident in the cognitive outcomes than I expected. That said, simply showing a cognitive effect means little if we cannot consistently observe tangible behavioral change. Behavioral outcomes are truly another matter. One thing to note about the behavioral outcomes: very few new studies that included a behavioral DV were published after Carlson et al. (1990) published their meta-analysis, which appeared at the time to settle the debate about the weapons effect. There are some problems with that meta-analysis, which I should probably discuss at some point. The bottom line is that I think we as a field squandered an opportunity to keep checking for behavioral outcomes as our standards for proper sample sizes and so on changed. Only three studies that I am aware of have included a behavioral DV this decade, and one of those was conducted by a former student of one of Berkowitz's students (Brad Bushman, who was mentored by Russell Geen). The Bushman study appears to show a clean replication. The other two do not. Make of that what you will.

Sunday, September 2, 2018

Research Confidential: Let's Just Get This Out of the Way

In many respects, I am a flawed messenger when it comes to understanding what business as usual means in my corner of the scientific universe, and how and why we need to improve how we go about our work when we report data and when we publish anything from research reports to literature review articles and book chapters. I don't pretend to be anything else. I am pretty open about my own baggage because I prefer going to sleep each night with a clean conscience. It is safe to assume that I am highly self-critical. It is also safe to assume that if I am critical of others, it is not out of malice, but rather out of concern about leaving our particular science better than it was when we began.

If you are visiting here, there is a chance you saw coverage of two articles I coauthored: one that was retracted, and one that required major revisions after a database error (the corrected version of the latter is now, thankfully, available). I have written about the latter before and will minimize my comments on it. The former does require some explaining. Thankfully, the journalist who covered the story was very fair and largely kept my remarks about both articles intact. Anything that Alison McCook left out was, I suspect, largely material she was not able to independently verify with the sources in question. You can read McCook's article for yourself. I don't have much to add. The main takeaway is that using a previously published work as a template for a subsequent manuscript is a very bad idea, and none of us should do so. Cutting corners while drafting a manuscript is tempting in the interest of time, but the consequences can be disastrous. I agree with McCook that, whether out of carelessness (as was the case here) or maliciousness, duplicate content is far from victimless. It wastes others' time and resources and adds nothing of value to the literature. I have learned my lesson and will move on.

Let's talk a bit about the process. Brad Bushman was guest editor for an issue of Current Opinion in Psychology, an Elsevier journal. He invited me to coauthor an article on the weapons effect. I agreed, although I was a bit worried about spreading myself too thin. Since we had collaborated well on an earlier review article on weapons as primes of aggressive thoughts and appraisal, I expected I could count on him to contribute significantly to the manuscript this time around. That was late 2016. Bushman asked me to wait to draft the manuscript until we had updated analyses from our meta-analysis on the weapons effect (it had been going through the peer review process for quite some time). By the time we had those results, we were up against a deadline. I used the manuscript from our prior work as a template, added a section on aggressive behavioral outcomes following exposure to weapons, and began to redo some of the other sections. I then kicked it over to Bushman. When I got a revision from him, based on a superficial scan of the manuscript, it appeared as if the entire document had been changed. Later I found out that he had made only minor revisions and was largely focused on his dual role as guest editor and coauthor on four other papers for that issue. I have some questions about the wisdom of a guest editor being so heavily involved as a coauthor, given the workload, but maybe some people can handle that. I still think it is a bad idea. In other words, the collaboration I expected never occurred, and I was too careless to notice.

Now, let's talk about the upload portal and peer review process, which had numerous problems. McCook could never get anyone from Elsevier to address her questions, so what I am going to do is summarize what I disclosed to her. I will note that I can back up every claim I am making, as I still have the original emails documenting what happened safely archived; I can produce the evidence behind my understanding of what unfolded. When the deadline approached, Evise appeared to be down. I don't know if that was system-wide, specific to the journal, or specific to something the guest editor did or failed to do, so I will refrain from speculation on that front. What I can note is that Bushman, as guest editor, had to ask all of the primary authors to email our manuscripts to him, and that he would then distribute them to whoever was assigned to review them.

The peer review process was extremely rapid. I think we received an email from Barbara Krahe, with suggestions for minor revisions, two days after emailing our submission. Krahe was apparently the action editor for the special issue. I do not know if she was the peer reviewer as well; Bushman acted at one point as if she might have been, but I cannot confirm that. I made the revisions, and then had to email the revised manuscript to one of the journal's editorial staff members, April Nishimura, so that she could upload it manually through Evise. That portal never did work properly. The upshot is that the whole process was backwards and far too rapid. In an ideal situation, the manuscript would have been uploaded to the portal, checked for duplicate content, and then and only then sent out for review. As the Associate Publisher responsible for Current Opinion in Psychology (Kate Wilson) unwittingly disclosed to me, Elsevier's policy about checking for duplicate content is that it may occur, which is a far cry from a guarantee that each document will be checked. Had the process worked properly, this manuscript would have been flagged, I would have been taken to task then and there, and whatever corrective action needed to occur would have happened long before we reached an eventual retraction decision. The review process was so minimal that I seriously doubt much time or care was put into inspecting each document. Eventually I saw proofs, and had all of maybe a couple of days to accept them. Basically, I was always uncomfortable with how that process unfolded. Although I ultimately shoulder the lion's share of the burden for what happened, as is the lot of any primary author, I cannot help but wonder if a better system would have led to a better outcome.

When alerted early this year that there was a problem with this particular article, I got in contact first with April Nishimura, who eventually connected me with Kate Wilson. I made sure these individuals were aware of what had been disclosed to me and to Bushman about possible duplicate content, offered a corrected manuscript in case that might be acceptable, and then asked them to investigate and make whatever decision they deemed necessary. Given that Bushman was very worried about another retraction on his record, I did pursue alternatives to retraction. After several conversations, including a phone conversation with the Executive Publisher, I felt very comfortable with the retraction decision. It was indeed the right call in this instance. Thankfully, only one person ever cited that article, and that was Bushman.
It could have been far worse. Given the problems I experienced with the whole process in early 2017, I had already decided I would be highly reluctant to ever accept further invitations to publish my work in that particular journal. The lack of proper screening of manuscripts prior to review, the minimal peer review process, and the general haste with which each issue is put together lead me to question whether this journal adds any value to the science of psychology. I have had far better experiences with journals with much lower impact factors.

As for the meta-analysis, I never saw asking for a retraction as a first option. I am aware that PSPR did consider retraction as one of the options, but it opted to allow us to submit a major revision of the manuscript based on analyses computed from the database once we corrected the flaw that Joe Hilgard detected when he examined the data. As an aside, Hilgard has always been clear that the particular error I made when setting up the database was surprisingly common and that what happened was an honest mistake. There is one circumstance in which I would have insisted on a retraction, and thankfully it never came to pass. In early October 2017, Bushman floated the idea of removing me as the primary author and replacing me with the second author, Sven Kepes. Thankfully, Kepes balked at the idea. After all, as he noted, his expertise was not in the weapons effect, and placing him in the primary author role was not something he was comfortable with. Nor was I comfortable with that idea, as I had conceptualized the project before Bushman even entered the picture and had done the bulk of the legwork on data entry, coding (including training one of my students to serve as a second coder), the initial data analyses in CMA, and much of the writing. Had Bushman persisted, I would have declared the whole project unsalvageable and expressed my concerns to the editor before walking away from what had been a significant amount of work. I question whether that would have been an appropriate course of action, but if placed in an untenable situation, I am not sure what the alternative would have been. Thankfully it never came to that.

Kepes and I both put in considerable time over the next couple of months, ultimately determined that the database was correct - each of us independently cross-validated the computations - and at that point we redid our analyses. I ran a handful of supplemental analyses after Kepes was no longer available, primarily to appease Bushman, but I made it very clear that those need to be interpreted with considerable caution. The updated article is one I am comfortable with, and it essentially suggests that we need a serious rethink of the classic Berkowitz and LePage (1967) article that effectively launched the weapons effect as a line of research. The proper conclusion at this point is that there is not sufficient evidence that the mere presence of a weapon influences aggressive behavioral outcomes. I am not yet ready to write off the possibility that the original findings hold up, but I am well aware that this line of research could well be one of many zombie phenomena in my field. Oddly enough, I am comfortable with that possibility, and I am eagerly awaiting large-sample experiments that attempt on some level to replicate the weapons effect as a behavioral phenomenon. If those replications fail, the line of research needs to be viewed as one that simply did not stand the test of time. It happens. I think it is likely that Bushman and I do not see eye to eye on how to interpret the meta-analytic findings. In hindsight, we came into this project with somewhat divergent agendas, and the end result is that the data have converted me from someone who saw this as a plausible phenomenon to someone who is considerably more skeptical. That skepticism is reflected in the revised article, and it will be reflected in any subsequent work I publish on the weapons effect, unless or until subsequent evidence suggests otherwise.
I think there is a lesson to be learned from this meta-analysis. For me the main takeaway is to take concerns about potential errors in one's work seriously, and to cooperate with those who are critical of one's findings. We really need more of that - openness and cooperation in our science. We also need to make sure that those who do the hard work of double checking our findings post-publication are reinforced for doing so. No one likes to be the bearer of bad tidings, but if the scientific record needs correcting it is crucial that those who notice speak out. It is also critical that those who need to make corrections do so proactively and with a healthy dose of humility. We're only human, after all.

One final note: a significant part of my identity is as someone with some expertise on the weapons effect. After all, two of my earlier publications examined facets of the weapons effect, including some of its potential limitations. Realizing that this line of research is not what it appeared to be - to me, and perhaps to many others - required an adjustment in my point of view. In some fundamental sense, my identity remains intact, as I still know this literature fairly well. What does change is my perspective as I continue to speak and write about this line of research, as well as my thoughts on the theoretical models used to justify the weapons effect. What I have described is just one of the dark alleys that one might stumble into when making sense of social psychology. The process of doing research is sometimes quite ugly, and sometimes we make some very costly errors. There is something to be learned from our mistakes, and I am hopeful that what we learn will lead to better research as we continue our work. Then again, I have always been something of an optimist.

Research Confidential: Let's Talk

Both in graduate school and in my post-grad school professional life, I have periodically seen the dark underbelly of the psychological sciences. I have witnessed questionable research practices, and at times been something of a participant. On those occasions, the experience became soul-draining, for lack of a better term. In some fundamental sense, I am still trying to figure out how to recover.

I think it is important to realize that there is science as it is idealized in textbooks and science as it is actually practiced. Although not mutually exclusive, they are quite distinct. My particular area of the sciences, social psychology, as actually practiced, leaves a lot to be desired. P-hacking, HARKing, hiding inconvenient data sets, and hastily composed manuscripts so fundamentally flawed that they should never make it to peer review are far closer to the norm than they should be. If you are a grad student, or an obscure professor at an institution that none of the big names has even heard of or cares about, you will likely be pressured into engaging in such behavior, or into looking the other way while those who believe they hold your fate in their hands engage in it.

Peer review, too, is not quite the defense against poor research it is portrayed to be in most college textbooks. The peer review system appears to me to be stretched beyond its limits. The upload portals for manuscript submission are themselves potentially powerful tools for detecting some serious problems (such as duplicate content), but those capabilities are often not used. Elsevier's Evise platform, for example, may be used to detect duplicate content, but it also may not be. That was something disclosed to me unwittingly while I was corresponding with an associate publisher responsible for a journal called Current Opinion in Psychology. I don't think she intended to let that slip, but there it is. So the one feature that distinguishes Evise from being little more than Adobe Acrobat Reader DC (which can also convert Word files to PDF and merge multiple documents into one) may well go unused. Unless you see explicit language otherwise, assume that the major publishing conglomerates are not checking for plagiarism. If you are an author or peer reviewer, you are well and truly on your own. Peer review also varies in terms of the time reviewers are given to examine a manuscript. Most journals will give you a month or two. Some - usually predatory journals, but occasionally journals that are supposed to be reputable (looking at you, Current Opinion in Psychology) - expect a much quicker turnaround (e.g., perhaps a couple of days). That should worry all of us.
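For what it's worth, the core of a duplicate-content screen is not exotic. A crude version breaks each manuscript into overlapping word n-grams ("shingles") and computes the Jaccard overlap between the two sets of shingles. This is only a sketch of the general idea, not what Evise or any commercial similarity checker actually runs; the function names, toy strings, and any threshold you might pick are mine.

```python
import re

def shingles(text, n=5):
    """Lower-cased word n-grams ('shingles') from a document."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard_overlap(text_a, text_b, n=5):
    """Share of distinct n-grams the two documents have in common."""
    a, b = shingles(text_a, n), shingles(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Toy example with two nearly identical sentences; a high overlap between two
# full manuscripts would be a cue for an editor to compare them before review.
prior = "The mere presence of a weapon can prime aggressive thoughts in some settings."
new = "The mere presence of a weapon can prime aggressive thoughts in many settings."
print(round(jaccard_overlap(prior, new), 2))
```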

The academic world, from the upper echelons down to the regional colleges and universities, pressures faculty and grad students alike to adopt a publish-or-perish mindset. The bottom line is product, and specifically product in high impact journals. Note that high impact does not necessarily equal high quality, as even a cursory scan of retractions and corrections in these outlets will make painfully obvious. So why do we bother? High impact equals high prestige, and high prestige is what earns a first job, or promotion and tenure. As long as you're pumping out product, regardless of other demands on your time at work, you'll be deemed productive. Those who don't may well stagnate or find themselves elbowed out of the academic world altogether. Suffice it to say, the work-life balance of many academicians is atrocious. I know, because I have lived the dream, and it might well have killed me had I not walked away when I did. Note that I did not walk away from a job, but rather from a mentality that turned out to be unhealthy.

From time to time, I am going to go into some serious detail about how the research you consume in textbooks, in the mass media, or even directly from the original source material is made. I am going to do so not only because there is a need to advocate for a better science of psychology than currently exists, but also because this is very personal. I walked away from a collaborative relationship earlier this spring. I am hardly a victim in what happened over a three-year period. However, the consequences of allowing that situation to go on for as long as it did turned out to be damaging, both physically and psychologically. Had I allowed things to continue, I have little doubt I would have continued down a path that would have not only destroyed my career but ultimately ended my life. I note that not to be overly dramatic, but merely to highlight that the level of toxicity that exists in my field does inflict irreparable damage. Only now do I feel like I am even modestly recovering.

Just as one of my intellectual heroes, Anthony Bourdain, would argue against ordering a steak well-done, I am going to argue against consuming substandard research. I will do so by telling my story, and also by giving you some tools you can use to critically evaluate what you read. In the meantime, stay tuned. I am just getting warmed up.