Monday, January 29, 2018

Cross-validating in meta-analysis

I thought I'd share a couple of techniques I've picked up that are useful for cross-validation purposes. Keep in mind that the sorts of meta-analyses I am interested in involve experimental designs, and so what I offer may or may not work for your particular purposes.

If you are estimating Cohen's d from between-subjects designs, the following formula for estimating N is recommended:


Here you simply need to know your estimate of d and the variance (v) for a particular comparison. If the N you estimate from the above formula comes reasonably close to the N actually reported for that comparison, you can be confident that your estimate of d is in the ballpark. Note that this formula works best when your treatment and control groups have equal sample sizes. Unequal sample sizes will not yield accurate estimates of N.
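As a rough sketch of this check in Python: the code below assumes the usual large-sample variance of d for two independent groups, v = (n1 + n2)/(n1 × n2) + d²/(2(n1 + n2)). With equal groups (n1 = n2 = N/2) that rearranges to N = (8 + d²)/(2v). That expression is my assumption about the formula in play, and the function names are made up for illustration.

```python
def variance_d_between(d, n1, n2):
    """Large-sample variance of Cohen's d for two independent groups (assumed form)."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))


def implied_n_between(d, v):
    """Total N implied by d and its variance v, assuming equal group sizes."""
    return (8 + d**2) / (2 * v)


d = 0.5

# Equal groups (40 + 40): the implied N recovers the true total of 80.
v_equal = variance_d_between(d, 40, 40)
print(round(implied_n_between(d, v_equal), 1))

# Unequal groups (60 + 20): same total N, but the implied N comes out
# well below 80 (roughly 60), so the check is not informative here.
v_unequal = variance_d_between(d, 60, 20)
print(round(implied_n_between(d, v_unequal), 1))
```

If the implied N lands far from the reported N even though the groups were roughly equal in size, that is a cue to recheck how d or v was computed for that comparison.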

The above formula will not work with within-subjects designs. The formula that I know does work for within-subjects designs is the following:


Note that the above formula assumes you know the exact correlation (r) between your variables, which may or may not be reported or available. When it is not, I have found that assuming r = .50 typically gives me accurate enough estimates of N from my calculations of d and variance (v). That said, for those in the process of conducting a meta-analysis, I recommend contacting the original authors or principal investigators when all you have to go on is a paired-samples t-test and a sample size (and potentially a p-value). Often, the authors are more than happy to provide the info you want or need, either in the form of the actual estimates of r they computed for each comparison or, better yet, the original data set and enough info so you can compute them yourself. That's easy with newer studies. Good luck if the research was published much earlier than this decade, though even then I have been amazed at how helpful authors will try to be.

For those cross-validating a meta-analyst's database: if the original correlational info is available, ideally it will be recorded in the database itself for within-subjects comparisons. If not, email the meta-analyst. Again, we should be able to provide that info easily enough.
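The within-subjects check can be sketched the same way. The code below assumes the commonly used large-sample variance of d for paired designs, v = (1/n + d²/(2n)) × 2(1 − r), which rearranges to n = 2(1 − r)(1 + d²/2)/v, where n is the number of pairs. That expression is my assumption about the formula in play, and the function names are made up for illustration.

```python
def variance_d_within(d, n, r):
    """Large-sample variance of d for a paired design with n pairs (assumed form)."""
    return (1 / n + d**2 / (2 * n)) * 2 * (1 - r)


def implied_n_within(d, v, r=0.5):
    """Number of pairs implied by d, its variance v, and correlation r.

    r defaults to .50, the fallback value used when the correlation is unknown.
    """
    return 2 * (1 - r) * (1 + d**2 / 2) / v


d, n_pairs, true_r = 0.4, 30, 0.7
v = variance_d_within(d, n_pairs, true_r)

# With the correlation that was used to compute v, the implied n is 30.
print(round(implied_n_within(d, v, r=true_r), 1))

# With the fallback r = .50, the implied n shifts, because the estimate
# is proportional to (1 - r).
print(round(implied_n_within(d, v), 1))
```

Under this form the implied n is proportional to (1 − r), so the further the true correlation sits from the assumed .50, the further the estimate drifts. That is one more reason to get the actual correlations from the original authors when you can.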

If you embark on a meta-analysis, keep in mind that others who eventually want to see your data will try to cross-validate your effect size estimates. Get ahead of that situation and cross-validate them yourself from the get-go. You'll know that you can trust your calculations of effect size, and you will be able to address concerns about those computations as they arise later. Ultimately that's the bottom line: you need to know that you can trust the process by which your effect sizes are computed, regardless of whether you are using a proprietary software package like CMA or an open-source language like R, and regardless of how seasoned you are as a meta-analyst. If you find problems cross-validating, then you can go back and check your code for possible errors. That'll undoubtedly save some heartache and heartburn, but the more important thing is that you can be confident that what you ultimately present to your particular audience is the closest approximation to the truth possible. Ultimately, that is all that matters. Hopefully the above is helpful to someone.
