Friday, September 7, 2018

Research Confidential: The Hidden Data

For now I will be somewhat vague; perhaps I can get more specific later if I am confident I would not be breaching an ethical boundary.

One of the ongoing problems in the social sciences is that although we conduct plenty of research, only a fraction of our findings ever get reported. What happens to the rest? That is a good question. I suspect many simply end up buried.

It turns out I do have some experiences that are relevant. One of the problems I wrestle with is what to do about what I know. Although the data are ultimately the property of the taxpayers, the respective principal investigators control their dissemination. In one case, in graduate school, the findings were statistically significant: we found significant main effects and an interaction effect. The problem was that the pattern of results did not lend itself to an easy theoretical explanation. The PI and I puzzled over the findings for a bit before we reached a compromise of sorts: I would get to use the findings for a poster presentation, and then we would simply forget that the experiment had ever happened. It was a shame; I thought then, and still think now, that the findings may have shed some light on how a mostly young adult sample was interacting with and interpreting the stimulus materials we were using. An identical experiment run by one of my grad student colleagues in the department produced data that squared with my PI's theoretical perspective, and those were the findings that got published.

The other set of findings is a bit more recent. Here, the PI had run a couple of online studies intended to replicate a weapons effect phenomenon that a student and I had stumbled upon. The first experiment failed. The use of an internet-administered lexical decision task was likely the problem: the amount of control we would have had in the lab was simply not available in that research context. The other study was also administered online and used a word completion task as the DV. It also failed to yield statistical significance. This one was interesting, because I could get some convenient information on that DV's internal consistency. Why would I do that? I wondered if our problem was an often overlooked issue in my field: we pay too little attention to the fundamentals of test construction and to establishing that our measures are psychometrically sound (good internal consistency, test-retest reliability, etc.), which leads to problems with measurement error. The idea is hardly a novel insight, and plenty of others have voiced concern about the psychometric soundness of our measures in social psychology. As it turned out in this instance, there was reason to be concerned. The internal consistency figures were well below what we would consider minimally adequate. There was tremendous measurement error, making it difficult to detect an effect even if one were there. My personal curiosity was satisfied, but that did not matter. I was told not to use those data sets in any context. So I have knowledge of what went wrong, and some insight into why (although I may be wrong), but no way to communicate it clearly to my peers. I cannot, for example, upload the data and code so that others can scrutinize them - and perhaps offer insights that I might not have considered. The data are merely tossed into the proverbial trash can and forgotten. And when dealing with influential researchers in our field, it is best to go along if they are the ones calling the shots. Somehow that does not feel right.
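
For what it is worth, here is a minimal sketch in Python of the kind of check I am describing: computing Cronbach's alpha for an item-level score matrix and then using Spearman's attenuation formula to show how low reliability shrinks the effect we could ever hope to observe. The data below are simulated purely for illustration; they are not the data sets I was told to shelve, and the particulars (item count, sample size, assumed reliabilities) are arbitrary.

import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated item-level responses: 10 binary items (e.g., 1 = aggressive word
# completion, 0 = neutral) that share only a weak common factor, so the scale
# hangs together poorly.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
noise = rng.normal(size=(200, 10))
items = (0.3 * latent + noise > 0).astype(float)

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha: {alpha:.2f}")  # lands well below the usual .70 benchmark

# Spearman's attenuation formula: the correlation we can observe is the true
# correlation shrunk by the square root of the two measures' reliabilities.
true_r = 0.30                                # assumed true effect size
observed_r = true_r * np.sqrt(alpha * 0.90)  # 0.90 = assumed reliability of the IV
print(f"Expected observable correlation: {observed_r:.2f}")

The point of that last line is just the arithmetic: with a reliability in that range, even a genuine effect of moderate size would show up considerably smaller in the observed data. Whether that is exactly what happened with our word completion measure I cannot say publicly, but it is the reason the low internal consistency worried me.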

Regrettably, that is all I can disclose. I suspect that plenty of other grad students, post-docs, early career faculty, and faculty at relatively low-power colleges and universities end up with similar experiences. In these cases, I can only hope that the data sets survive long enough to be included in relevant meta-analyses, but even then, how often does that occur? These data sets, no matter how "inconvenient" they may appear on the surface, may be telling us something useful if we would only listen. They may also tell a much needed story that our peers in our research areas need to hear as they go about their own work. I may be a relatively late convert to the basic tenets of open science, but increasingly I see openness as a necessary remedy to what is ailing us as social psychologists. We should certainly communicate our successes, but why not communicate our null findings as well, or at minimum publicly archive them so that others may work with those data if we ourselves no longer wish to?
