Quotulatiousness

August 29, 2015

We need a new publication called The Journal of Successfully Reproduced Results

Filed under: Media, Science — Nicholas @ 04:00

We depend on scientific studies to provide us with valid information on so many different aspects of life … it’d be nice to know that the results of those studies actually hold up to scrutiny:

One of the bedrock assumptions of science is that for a study’s results to be valid, other researchers should be able to reproduce the study and reach the same conclusions. The ability to successfully reproduce a study and find the same results is, as much as anything, how we know that its findings are true, rather than a one-off result.

This seems obvious, but in practice far more effort goes into original studies designed to produce interesting conclusions than into the less glamorous work of reproducing earlier studies to see whether their results hold up.

Everyone wants to be part of the effort to identify new and interesting results, not the more mundane (and yet potentially career-endangering) work of reproducing the results of older studies:

Why is psychology research (and, it seems likely, social science research generally) so stuffed with dubious results? Let me suggest three likely reasons:

A bias towards research that is not only new but interesting: An interesting, counterintuitive finding that appears to come from good, solid scientific investigation gets a researcher more media coverage, more attention, more fame both inside and outside of the field. A boring and obvious result, or no result, on the other hand, even if investigated honestly and rigorously, usually does little for a researcher’s reputation. The career path for academic researchers, especially in social science, is paved with interesting but hard to replicate findings. (In a clever way, the Reproducibility Project gets around this issue by coming up with the really interesting result that lots of psychology studies have problems.)

An institutional bias against checking the work of others: This is the flipside of the first factor: Senior social science researchers often actively warn their younger colleagues — who are in many cases the best positioned to check older work — against investigating the work of established members of the field. As one psychology professor from the University of Southern California grouses to the Times, “There’s no doubt replication is important, but it’s often just an attack, a vigilante exercise.”

[…]

Small, unrepresentative sample sizes: In general, social science experiments tend to work with fairly small sample sizes — often just a few dozen people who are meant to stand in for everyone else. Researchers often have a hard time putting together truly representative samples, so they work with subjects they can access, which in a lot of cases means college students.

A couple of years ago, I linked to a story about the problem of using western university students as the default source of your statistical sample for psychological and sociological studies:

A notion that has popped up several times in the last couple of months is that easy access to willing test subjects (university students) introduces a strong bias into a lot of these studies. Yet until recently, the majority of studies disregarded the possibility that their results were unrepresentative of the general population.

August 14, 2015

QotD: When “the science” shows what you want it to show

Filed under: Media, Quotations, Science — Nicholas @ 01:00

To see what I mean, consider the recent tradition of psychology articles showing that conservatives are authoritarian while liberals are not. Jeremy Frimer, who runs the Moral Psychology Lab at the University of Winnipeg, realized that who you asked those questions about might matter — did conservatives defer to the military because they were authoritarians or because the military is considered a “conservative” institution? And, lo and behold, when he asked similar questions about, say, environmentalists, the liberals were the authoritarians.

It also matters because social psychology, and social science more generally, has a replication problem, which was recently covered in a very good article at Slate. Take the infamous “paradox of choice” study that found that offering a few kinds of jam samples at a supermarket was more likely to result in a purchase than offering dozens of samples. A team of researchers that tried to replicate this — and other famous experiments — completely failed. When they did a survey of the literature, they found that the array of choices generally had no important effect either way. The replication problem is bad enough in one subfield of social psychology that Nobel laureate Daniel Kahneman wrote an open letter to its practitioners, urging them to institute tougher replication protocols before their field implodes. A recent issue of Social Psychology was devoted to trying to replicate famous studies in the discipline; more than a third failed replication.

Let me pause here to say something important: Though I mentioned bias above, I’m not suggesting in any way that the replication problems mostly happen because social scientists are in on a conspiracy against conservatives to do bad research or to make stuff up. The replication problems mostly happen because, as the Slate article notes, journals are biased toward publishing positive and novel results, not “there was no relationship, which is exactly what you’d expect.” So readers see the one paper showing that something interesting happened, not the (possibly many more) teams that got muddy data showing no particular effect. If you do enough studies on enough small groups, you will occasionally get an effect just by random chance. But because those are the only studies that get published, it seems like “science has proved …” whatever those papers are about.

Megan McArdle, “The Truth About Truthiness”, Bloomberg View, 2014-09-08.
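
McArdle’s “effect just by random chance” point is easy to demonstrate. Below is a minimal simulation sketch (plain Python, with invented parameters: 1,000 studies, 30 subjects per group, and no real effect at all) that counts how many null studies still clear the usual significance bar; roughly five percent do, and under publication bias those are the ones readers ever see.

    import random
    import statistics

    def run_study(n_per_group=30):
        """One simulated study comparing two groups drawn from the SAME
        population, so any 'effect' it finds is pure sampling noise."""
        control = [random.gauss(0, 1) for _ in range(n_per_group)]
        treatment = [random.gauss(0, 1) for _ in range(n_per_group)]
        diff = statistics.mean(treatment) - statistics.mean(control)
        # Standard error of the difference between the two group means
        se = (statistics.variance(control) / n_per_group
              + statistics.variance(treatment) / n_per_group) ** 0.5
        # A difference beyond ~1.96 standard errors is roughly p < 0.05
        return abs(diff) > 1.96 * se

    random.seed(0)
    n_studies = 1000
    false_positives = sum(run_study() for _ in range(n_studies))
    print(f"{false_positives} of {n_studies} null studies looked 'significant'")
    # Expect on the order of 50 -- and only those tend to reach a journal.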

April 23, 2011

QotD: The debunking problem in media

Filed under: Media, Quotations, Science — Nicholas @ 12:23

[. . .] the second issue is how people find out about stuff. We exist in a blizzard of information, and stuff goes missing: as we saw recently, research shows that people don’t even hear about retractions of outright fraudulent work. Publishing a follow-up in the same venue that made an initial claim is one way of addressing this problem (and when the journal Science rejected the replication paper, even they said “your results would be better received and appreciated by the audience of the journal where the Daryl Bem research was published”).

The same can be said for the New York Times, who ran a nice long piece on the original precognition finding, New Scientist who covered it twice, the Guardian who joined in online, the Telegraph who wrote about it three times over, New York Magazine, and so on.

It’s hard to picture many of these outlets giving equal prominence to the new negative findings that are now emerging, in the same way that newspapers so often fail to return to a debunked scare, or a not-guilty verdict after reporting the juicy witness statements.

All the most interesting problems around information today are about structure: how to cope with the overload, and find sense in the data. For some eye-catching precognition research, this stuff probably doesn’t matter. What’s interesting is that the information architectures of medicine, academia and popular culture are all broken in the exact same way.

Ben Goldacre, “I foresee that nobody will do anything about this problem”, Bad Science, 2011-04-23.

January 3, 2011

Healthy skepticism about study results

Filed under: Bureaucracy, Media, Science — Nicholas @ 13:30

John Allen Paulos provides some useful mental tools to use when presented with unlikely published findings from various studies:

Ioannidis examined the evidence in 45 well-publicized health studies from major journals appearing between 1990 and 2003. His conclusion: the results of more than one third of these studies were flatly contradicted or significantly weakened by later work.

The same general idea is discussed in “The Truth Wears Off,” an article by Jonah Lehrer that appeared last month in the New Yorker magazine. Lehrer termed the phenomenon the “decline effect,” by which he meant the tendency for replication of scientific results to fail — that is, for the evidence supporting scientific results to seemingly weaken over time, disappear altogether, or even suggest opposite conclusions.

[. . .]

One reason for some of the instances of the decline effect is provided by regression to the mean, the tendency for an extreme value of a random quantity dependent on many variables to be followed by a value closer to the average or mean.

[. . .]

This phenomenon leads to nonsense when people attribute regression to the mean to something real, rather than to the natural behavior of any randomly varying quantity.
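
Paulos’s regression-to-the-mean point can be illustrated with a short simulation (a sketch in plain Python with made-up numbers: each score is a stable “true ability” plus random noise, measured twice). The subjects who land in the top ten percent on the first measurement score noticeably closer to average on the second, even though nothing about them has changed.

    import random
    import statistics

    random.seed(1)

    def measure(true_ability):
        """One noisy measurement: a stable underlying value plus random luck."""
        return true_ability + random.gauss(0, 1)

    # A population with varying true ability, measured twice independently
    abilities = [random.gauss(0, 1) for _ in range(10000)]
    first = [measure(a) for a in abilities]
    second = [measure(a) for a in abilities]

    # Select the people whose FIRST score was extreme (top 10%)
    cutoff = sorted(first)[int(0.9 * len(first))]
    top = [i for i, score in enumerate(first) if score >= cutoff]

    print("top group, first score :", round(statistics.mean(first[i] for i in top), 2))
    print("top group, second score:", round(statistics.mean(second[i] for i in top), 2))
    # The second average sits much closer to zero: the luck that pushed the
    # first scores up simply did not repeat. That is regression to the mean.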

[. . .]

In some instances, another factor contributing to the decline effect is sample size. It’s become common knowledge that polls that survey large groups of people have a smaller margin of error than those that canvass a small number. Not just a poll, but any experiment or measurement that examines a large number of test subjects will have a smaller margin of error than one having fewer subjects.

Not surprisingly, results of experiments and studies with small samples often appear in the literature, and these results frequently suggest that the observed effects are quite large — at one end or the other of the large margin of error. When researchers attempt to demonstrate the effect on a larger sample of subjects, the margin of error is smaller and so the effect size seems to shrink or decline.
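
That shrinkage is just the standard error at work: the margin of error of a sample mean scales roughly as one over the square root of the sample size, so a study with 25 subjects per group can produce a wildly inflated estimate that a study with 5,000 per group cannot. A minimal sketch (plain Python, with an invented true effect of 0.1 standard deviations):

    import random
    import statistics

    random.seed(2)

    TRUE_EFFECT = 0.1   # a small but real effect, in standard-deviation units

    def observed_effect(n_per_group):
        """Difference in group means from one simulated study of a given size."""
        control = [random.gauss(0, 1) for _ in range(n_per_group)]
        treatment = [random.gauss(TRUE_EFFECT, 1) for _ in range(n_per_group)]
        return statistics.mean(treatment) - statistics.mean(control)

    # Twenty small studies: the estimates bounce all over the place, and the
    # lucky ones (the publishable ones) overstate the true effect badly.
    small = sorted(round(observed_effect(25), 2) for _ in range(20))
    print("small studies (n=25 per group):  ", small)

    # One large study: the estimate lands close to the true value of 0.1.
    print("large study   (n=5000 per group):", round(observed_effect(5000), 2))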

[. . .]

Publication bias is, no doubt, also part of the reason for the decline effect. That is to say that seemingly significant experimental results will be published much more readily than those that suggest no experimental effect or only a small one. People, including journal editors, naturally prefer papers announcing or at least suggesting a dramatic breakthrough to those saying, in effect, “Ehh, nothing much here.”

The availability error, the tendency to be unduly influenced by results that, for one reason or another, are more psychologically available to us, is another factor. Results that are especially striking or counterintuitive, or consistent with experimenters’ pet theories, are also more likely to result in publication.
