Sunday, December 9, 2018

Rethinking Turner, Layton, and Simons (1975)

Let's revisit a frequently cited set of field experiments purporting to support the notion that the mere presence of a weapon influences aggressive behavior:
The weapons effect occurs outside of the lab too. In one field experiment,[2] a confederate driving a pickup truck purposely remained stalled at a traffic light for 12 seconds to see whether the motorists trapped behind him would honk their horns (the measure of aggression). The truck contained either a .303-calibre military rifle in a gun rack mounted to the rear window, or no rifle. The results showed that motorists were more likely to honk their horns if the confederate was driving a truck with a gun visible in the rear window than if the confederate was driving the same truck but with no gun. What is amazing about this study is that you would have to be pretty stupid to honk your horn at a driver with a military rifle in his truck—if you were thinking, that is! But people were not thinking—they just naturally honked their horns after seeing the gun. The mere presence of a weapon automatically triggered aggression.

The above description could come from practically any social psychology textbook's treatment of the weapons effect, and it exemplifies why I increasingly hate teaching classic experiments in my own field, except perhaps as cautionary tales. As the title suggests, this is a typical description of a series of experiments reported by Turner, Layton, and Simons (1975). Joe Hilgard aptly sums up what appeared to have happened:

Turner, Layton, and Simons (1975) report a bizarre experiment in which an experimenter driving a pickup truck loitered at a traffic light. When the light turned green, the experimenter idled for a further 12 seconds, waiting to see if the driver trapped behind would honk. Honking, the researchers argued, would constitute a form of aggressive behavior.

The design was a 3 (Prime) × 2 (Visibility) design. For the Prime factor, the experimenter's truck featured either an empty gun rack (control), a gun rack with a fully-visible .303-caliber military rifle and a bumper sticker with the word "Friend" (Friendly Rifle), or a gun rack with a .303 rifle and a bumper sticker with the word "Vengeance" (Aggressive Rifle). The experimenter driving the pickup was made visible or invisible by the use of a curtain in the rear window.

There were 92 subjects, about 15/cell. The sample is restricted to males driving late-model privately-owned vehicles for some reason.

The authors reasoned that seeing the rifle would prime aggressive thoughts, which would inspire aggressive behavior, leading to more honking. They run five different planned complex contrasts and find that the Rifle/Vengeance combination inspired honking relative to the No Rifle and Rifle/Friend combo, but only when the curtain was closed, F(1, 86) = 5.98, p = .017. That seems like a very suspiciously post-hoc subgroup analysis to me.

A second study in Turner, Layton, and Simons (1975) collects a larger sample of men and women driving vehicles of all years. The design was a 2 (Rifle: present, absent) × 2 (Bumper Sticker: "Vengeance", absent) design with 200 subjects. They divide this further by driver's sex and by a median split on vehicle year. They find that the Rifle/Vengeance condition increased honking relative to the other three, but only among newer-vehicle male drivers, F(1, 129) = 4.03, p = .047. But then they report that the Rifle/Vengeance condition decreased honking among older-vehicle male drivers, F(1, 129) = 5.23, p = .024! No results were found among female drivers.

In summary, outside of perhaps one subgroup, and assuming one believes the findings, there appears to be not only no priming effect of a weapon on aggressive behavior, but arguably the opposite: seeing a weapon in a vehicle suppressed horn-honking. When I was computing effect sizes for my recently published weapons effect meta-analysis, I noticed that the overall Cohen's d was negative. That actually makes more sense to me.
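
For readers curious how such effect sizes get computed when a paper reports only test statistics, here is a rough sketch of the sort of conversion a meta-analyst does. To be clear about the assumptions: the group sizes below are placeholders I made up for illustration, not the actual Turner et al. (1975) cell sizes, and the sign of d has to be set by hand from the direction of the group means.

import math

def d_from_f(f_value, n1, n2):
    """Approximate Cohen's d from a two-group F statistic with 1 numerator df."""
    t = math.sqrt(f_value)                 # with df1 = 1, t = sqrt(F)
    return t * math.sqrt(1 / n1 + 1 / n2)

# Example: the reported F(1, 129) = 4.03, with two groups of 33 drivers each
# (an assumed split, not the paper's actual cell sizes)
print(round(d_from_f(4.03, 33, 33), 2))    # about 0.49; flip the sign if the
                                           # weapon condition honked *less*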

Here is a screen shot of Table 3, which summarizes Study 3:

At bare minimum, we might be able to make a case that privileged males (based on the cars they drove) were the one subsample that would honk their horns even when it seemed irrational. Otherwise, it appears that non-privileged males and females overall (no distinction was made as to whether female subjects drove newer or older cars) showed either no effect or a suppression effect!

Late last decade, a student and I attempted a replication of the old Turner et al. (1975) research. In our case, we used a different DV, latency of horn-honking: in other words, how long it took the driver behind the truck to start honking, measured in seconds (admittedly, my student's measure was crude: seconds were counted on a confederate's wristwatch, when a stopwatch would have been more appropriate). The prime stimulus was a bumper sticker of an AK-47 placed conspicuously on the rear window of the truck in the treatment condition. There was no sticker in the control condition. We ended up with null findings. If anything, the presence of the AK-47 sticker trended (although nonsignificantly) in a negative direction. Admittedly, our sample was small (cell sizes of 10 in each condition), and so my student merely wrote up the results to complete the requirement of a methods course he was taking. It is possible that with a large enough sample, we would have been able to show fairly conclusively that drivers generally have the good sense not to provoke those who drive with weapons or even images of weapons. Or we may have ended up with a simple null finding; given the low power of our study, that is a fair enough assessment.
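
For what it's worth, here is a back-of-the-envelope power calculation for the kind of two-condition design we used, sketched in Python with statsmodels. The assumed effect size (d = 0.3) is purely illustrative and is not an estimate from our data.

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Drivers needed per condition to detect an assumed small effect (d = 0.3)
# with 80% power at alpha = .05, two-sided
n_per_cell = analysis.solve_power(effect_size=0.3, alpha=0.05, power=0.80,
                                  alternative='two-sided')
print(round(n_per_cell))  # roughly 175 per condition

# Power with the 10 drivers per cell we actually had
print(round(analysis.power(effect_size=0.3, nobs1=10, alpha=0.05,
                           alternative='two-sided'), 2))  # around 0.10

Which is to say, with cells of 10, a null result was close to a foregone conclusion no matter what the truth is.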

I've often wondered what to make of this set of experiments, beyond the obvious conclusion that Turner et al. (1975) did not actually replicate the classic Berkowitz and LePage (1967) lab experiment. I am now wondering if there may be another plausible explanation. There is a body of research showing that individuals who are exposed to images of guns and knives embedded within an array of images are pretty good at primary threat appraisal. That is, they notice the images faster (based on reaction time) and they tend to show more caution (again based on reaction time) when primed with these images (see Sulikowski & Burke, 2014, for a recent set of experiments). The bottom line is that we may want to reinterpret the horn-honking experiments of Turner et al. (1975), and the work my student did with me, as follows: weapons do not increase horn-honking behavior, to the extent we have used it as a proxy for aggression. Rather, it is likely that weapons either have no impact on horn-honking, or suppress the impulse to honk. This latter conclusion is consistent with the findings of some evolutionary psychologists who study threat appraisal. Individuals who encounter a potentially threatening stimulus are probably going to be more cautious around those who display such stimuli, to the extent that they are motivated toward self-preservation. The adaptive response to seeing someone driving a vehicle with a gun on a gun rack, or a sticker of a weapons-grade firearm, is to refrain from horn-honking, and if that is not possible, to at least delay honking for as long as possible. At least that is an explanation that strikes me as sensible. Beyond that very tentative conclusion, I would suggest a lot of caution when interpreting not only field experiments purporting to demonstrate a link between short-term exposure to weapons or weapon images and aggressive behavioral outcomes, but lab experiments as well. In the meantime, stay skeptical.

Saturday, December 8, 2018

When is a replication not a replication?

Let's imagine a scenario. A researcher several years ago designs a study with five treatment conditions and is mainly interested in a planned contrast between condition 1 and the remaining four conditions. That contrast turns out statistically significant. A few years later, the same researcher runs a second experiment based on the same protocols, but with a larger sample (both good ideas), and finds that the same planned contrast is no longer significant. That is problematic for the researcher. So, what to do? Here is where we meet some forking paths. One choice is to report the findings as they appear and acknowledge that the original finding did not replicate. Admittedly, finding journals willing to publish non-replications is still a bit of a challenge (too much so, in my professional opinion), so that option may seem a bit unsavory. Perhaps another theory-driven path is available. The researcher could note that other than the controller used in condition 1 and condition 2 (and the same for condition 3 and condition 4), the stimulus is identical. So, taking a different path, the researcher combines conditions 1 and 2 to form a new category and does the same with conditions 3 and 4. Condition 5 remains the same. Now a significant ANOVA is obtainable, and the researcher can plausibly argue that the findings show this new category (conditions 1 and 2 combined) really is distinct from the neutral condition, thus supporting a theoretical model. The reported findings now look good for publication in a higher-impact journal. The researcher did not find what she/he initially set out to find, but did find something.

But did the researcher really replicate the original findings? Based on the prior published work, the answer appears to be no. The original planned contrast between condition 1 and the other conditions does not replicate. Does the researcher have a finding that tells us something possibly interesting or useful? Maybe. Maybe not. Does the revised analysis appear to be consistent with an established theoretical model? Apparently. Does the new finding tell us something about everyday life that the original would not have already told us had it successfully replicated? That's highly questionable. At bare minimum, in the strict sense of how we define a replication (i.e., a study that finds similar results to the original and/or to similar other studies), the study in question fails to do so. That happens with many psychological phenomena, especially ones that are quite novel and counter-intuitive.
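
To make the two paths concrete, here is a toy illustration in Python on simulated data. Nothing below comes from the actual study: the condition labels, cell sizes, and numbers are all made up, and the pooled t-test is just a stand-in for a proper planned contrast.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
conditions = ['c1', 'c2', 'c3', 'c4', 'c5']
data = {c: rng.normal(loc=0.0, scale=1.0, size=30) for c in conditions}

# Path 1: the original planned contrast, condition 1 vs. the other four
others = np.concatenate([data[c] for c in ['c2', 'c3', 'c4', 'c5']])
t_stat, p_contrast = stats.ttest_ind(data['c1'], others)
print('planned contrast p =', round(p_contrast, 3))

# Path 2: collapse conditions 1+2 and 3+4, keep 5, and run a one-way ANOVA
combined_12 = np.concatenate([data['c1'], data['c2']])
combined_34 = np.concatenate([data['c3'], data['c4']])
f_stat, p_anova = stats.f_oneway(combined_12, combined_34, data['c5'])
print('collapsed ANOVA p =', round(p_anova, 3))

Two defensible-looking analyses of the same data can answer different questions; only the analysis specified in advance tests the original claim, which is why the second path, however theoretically motivated, is not a replication of the first.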

This is a real scenario. I am remaining deliberately vague, both to avoid picking on someone (usually not productive, and likely only to result in defensiveness, which is the last thing we need as we move toward a more open science) and to point out, far more productively, that the process of following research from start to finish is one in which we find ourselves faced with many forking paths, and sometimes the ones we choose take us far from our intended destination. Someone else noted that what we sometimes call p-hacking or HARKing (the latter seems to have occurred in this scenario) is akin to experimenter bias, and should be treated as such. We as researchers make a number of decisions - often outside of conscious awareness - that influence the outcome of our work. That includes the statistical side of our work as well. I like that framing, as it avoids unnecessary shaming while allowing skeptics the space needed to point out potential problems that appear to have occurred. That seems healthier. As for the real scenario above, it did not take me long, once the presumed replication report was published online, to realize that the findings were not actually a replication. Poking around in the data set (it helped that the author made it available) was crucial, and what I was able to reproduce coincided with what others had noticed before me. The bottom line was that the registered protocols, the data, and the research report were all very helpful in determining that the conclusion the researcher wished to draw was apparently erroneous. It happens. In the process, I gained some insight into the paths the researcher followed in making the analysis decisions and drawing the conclusions that she/he did. The insights I arrived at were hardly novel, and others had drawn similar ones before me.

Here's the thing I have to keep in mind going forward. My little corner of the psychological sciences is going through some pretty massive changes. It sometimes feels like walking through a virtual minefield in a video game, given the names, egos, and reputations involved. I am optimistic that if we can place our focus where it belongs - on the methodology and the data themselves - rather than focusing on the personalities, we'll wind up with the sort of open science worthy of the name. Getting there will not be easy.

In the meantime, we will still deal with reported replications that really are not replications.

Friday, December 7, 2018

Better late than never, I suppose.

A few years ago, a student and I submitted a paper for consideration in a journal, based on a talk given at a conference honoring George Gerbner. Even after acceptance, it took a good couple of years for it to go into print. Rather belatedly, I'd find out it was actually published in 2016. As I learned, academic publishing in Hungary is far from seamless. But it did come out. Sara Oelke and I were grateful for that. It is nice when an undergraduate student project at a relatively obscure institution like mine can lead to a coauthored published article.

Update: A later replication of Experiment 2 (using identical protocols) actually fits the pattern the Open Science Collaboration reported in 2015. The finding in this case was still statistically significant, but the effect size was noticeably smaller in the replication study than in the original. So there is some potential for a decline effect there. I really need to write up the replication attempt, noting that it was not a faithful replication in terms of effect size, but still appeared to be statistically significant. As a result, I am a bit more cautious about the effectiveness-framing approach we used to influence attitudes toward torture.
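
If and when I do write it up, one simple way to gauge whether the drop in effect size is more than noise would be to compare the two standardized effects directly. The sketch below uses placeholder numbers, not our actual estimates, along with the usual approximation for the standard error of Cohen's d.

import math

def se_d(d, n1, n2):
    """Approximate standard error of Cohen's d for two independent groups."""
    return math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))

# Placeholder values only: assumed original and replication effects and per-group ns
d_orig, n_orig = 0.60, 40
d_rep, n_rep = 0.35, 80

z = (d_orig - d_rep) / math.sqrt(se_d(d_orig, n_orig, n_orig) ** 2 +
                                 se_d(d_rep, n_rep, n_rep) ** 2)
print(round(z, 2))  # |z| < 1.96 would mean the apparent decline could be noise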

Saturday, December 1, 2018

Is fragile masculinity related to voting preferences in the US?

The tentative answer appears to be that it might be. Regions where search terms associated with fragile masculinity (e.g., "erectile dysfunction", "how to get girls") were more common appeared to be ones that also voted for Trump by higher margins in 2016. The pattern does not seem to apply to prior Presidential cycles (2008, 2012). The authors also conducted a preregistered study examining whether this pattern extended to Congressional electoral cycles, and it appears that, at least for the 2018 cycle, the answer is tentatively yes. This is correlational research, and the authors don't try to make causal claims, which is to their credit. What to make of this set of findings? I am not really sure yet. File this one under curiosities.

Thursday, November 29, 2018

This Time Could Be Different

Here's a link to the podcast at The Black Goat. Give it a listen. There was certainly still at least some talk of reform back when I was in grad school. Obviously, that went nowhere in a hurry. So here we are. Maybe we'll get it right this time.

Wednesday, November 28, 2018

Motivation

No matter our background, no matter our vocation, there has to be something that gets us up in the morning. For me, lately, that is anger.

At what? Let's just say that the crisis which goes by many names (replication crisis, replicability crisis, methodological crisis) felt like a punch to the gut - and one I just did not see coming. As I digested what had happened and what was happening, I had to change my perspective about a field that defines a significant part of my identity. Initially I was a bit sanguine. Then, as reality sank in, I got pissed off. After all, to maintain any semblance of integrity, I had to alert students in many of my classes that there were whole sections of textbooks they were probably best off ignoring, or viewing only as cautionary tales. That meant accepting that students would ask me what was real, and that I would not necessarily have a satisfactory answer. A substantial chunk of work in my corner of our aching science seems to needlessly scare the hell out of people, and that work is not aging well. In fact, the moral panics over video games and violence, or over screen time and any of a number of purported negative psychological health outcomes, remind me of the moral panics I grew up with: Dungeons and Dragons was supposed to damage teens psychologically, as were the lyrics of songs from many of my favorite bands of the time (remember that I enjoyed and still enjoy punk and punk-derived music from the late 1970s to mid-1980s). At the time I would see people make causal claims from correlational data (or merely out of thin air) and I would just think, "bullshit." One could say that I did become an educator, and maybe that questionable life choice is an outcome of the questionable life choices I made in my youth, including my pop culture interests and activities of the day. I am pissed at a system of disseminating our work that relies (at least indirectly) on the funds of our citizens, but in which, once published, our work becomes the property of some conglomerate that then sells the content back to those citizens at an insane profit, sometimes with peer review and editorial standards that differ little from what most of us rightfully deride as predatory journals.

Thankfully, from punk I got both the attitude and the politics. The attitude is the easy part. The politics actually took a good deal of thought. And so here I am again. It would be easy to adopt a pose of casual contempt or indifference and merely sneer as I preview a textbook or read the latest journal article. That's not me. I actually care. So maybe a little anarchy (not in the sense of chaos!) will do us some good about now. Things get shaken up a bit, and if that leads to the sort of changes we need (more open communication and archiving of our work, more equality and equity in the profession as opposed to rigid hierarchies), I'm in. Reading much of what is coming out of the open science proponents is the equivalent of putting on an old familiar Black Flag or Dead Kennedys LP. Hell, sometimes I do both, especially if I am at the office on a weekend and can crank up the volume. The punks at their best were angry and thoughtful. They wanted to knock stuff down, but they also wanted to replace whatever was knocked down with something better (though exactly what that would entail was, of course, always an open-ended question). Whatever form that something better takes, I hope for a science that truly gives itself away in the public interest, rather than getting coopted into some neoliberal facsimile of open science that merely repeats the mistakes of the past. Doing what I can, as an educator and scholar who has little privilege or leverage to offer other than adding to the voices in the proverbial wilderness, is enough for now. That gets me up in the morning, like clockwork.

Tuesday, November 27, 2018

Everything Went Black

I nicked that title from an old Black Flag album, from when they were temporarily not Black Flag due to a legal battle with their former label. Apparently, issuing Damaged under Black Flag's new label really pissed off the suits at MCA. Eventually the label owned by MCA went under and Black Flag returned with a vengeance. Among bands in the hardcore scene circa the mid-1980s, Black Flag did not fit comfortably. It is not clear that they were even punk by the time they released some of my favorite recordings in 1984 and 1985. The band had moved into much more experimental territory, with elements of metal and, more importantly, free jazz thrown into the mix. Add to that a very confrontational set of artists who clearly did not relish the ever-present threat of violence at shows where their new sounds were increasingly alienating their core audience.

The way I saw it at the time, although I may not have worded it as such, is that there was a crisis in the punk scene. The old formulas just did not seem to work anymore, and openly admitting so was a good way to get sucker-punched, or stomped. One would certainly be shunned even if a punch were never thrown. So the old formulas remained in place, and punk became "another meaningless fad" (to nick a line from Dead Kennedys). What to do when what appeared to work before no longer does? One answer is to ignore it or wish it away. I certainly watched enough people come and go who did that back in the day. Another approach was to abandon what no longer worked and move in a different direction - ideally still embodying the ideals of the movement. Black Flag were quite adept at doing so for a few years. So was Flux (formerly Flux of Pink Indians), whose last album, Uncarved Block, was unlike any UK anarchopunk LP at the time. I probably should mention Chumbawamba while I am at it. There is something refreshing about searching for a new path when the old one has turned into a dead end. It happens in the arts, in the sciences, and in life. As someone who was never more than one of the scenesters during the 1980s punk era, I knew it was time to follow some different muses when it became obvious that all that was left at the clubs and parties were folks who had the style and the attitude down, but who never really understood the ideas or the politics.

On some level, what I recall from a formative part of my early years serves as an allegory for what has gone on in my aching corner of the sciences as a methodological crisis has continued to unfold. There is so much I would love to write about. The problems in my little corner of the psychological sciences are the same ones affecting the rest of our aching field. Unfortunately, when I am passionate about something that actually matters to me, I write with the heat of a thousand suns. Although that heat may not be aimed at one specific person or group, there is the chance it will be treated as such, placing me in a position that I find uncomfortable. Having to scrub this blog of content in order to prevent a situation from escalating is something I will not go through again. That is simply not tenable given the time it takes me to write, along with my numerous other commitments. When you are not a person of privilege (in the academic world, I and the institutions where I work are truly among the unprivileged), the consequences hit twice as hard as for anyone else. Don't feel bad. I don't. Just the way it is. If you want to feel anything, feel anger. Then do something to make academic life more equitable. I guess I never really left my punk roots, and perhaps there is a reason I do have a good deal of empathy for those among psychology's reformers who advocate burning everything to the ground.

I am honestly not sure what I am going to do with this blog. I have considered just deleting it altogether and looking for other avenues to work out ideas, to look at some problems that desperately need to be looked at, etc. Maybe that's the way to go. Maybe I will figure out a way to write as I wish. Time will tell.