Selection bias in polling « Quotulatiousness

December 29, 2022

Selection bias in polling

Filed under: Media, USA — Tags: Polls, Statistics, University — Nicholas @ 03:00

At Astral Codex Ten, Scott Alexander points out that it’s impossible to do any kind of poll without some selection bias, so you can’t automatically dismiss any given poll on that basis. The kind of selection bias, however, may indicate whether the results will have any relationship to reality:

The real studies by professional scientists usually use Psych 101 students at the professional scientists’ university. Or sometimes they will put up a flyer on a bulletin board in town, saying “Earn $10 By Participating In A Study!” in which case their population will be selected for people who want $10 (poor people, bored people, etc). Sometimes the scientists will get really into cross-cultural research, and retest their hypothesis on various primitive tribes — in which case their population will be selected for the primitive tribes that don’t murder scientists who try to study them. As far as I know, nobody in history has ever done a psychology study on a truly representative sample of the world population.

This is fine. Why?

Selection bias is disastrous if you’re trying to do something like a poll or census. That is, if you want to know “What percent of Americans own smartphones?” then any selection at all limits your result. The percent of Psych 101 undergrads who own smartphones is different from the percent of poor people who want $10 who own smartphones, and both are different from the percent of Americans who own smartphones. The same is potentially true about “how many people oppose abortion?” or “what percent of people are color blind?” or anything else trying to find out how common something is in the population. The only good ways to do this are a) use a giant government dataset that literally includes everyone, b) hire a polling company like Gallup which has tried really hard to get a panel that includes the exact right number of Hispanic people and elderly people and homeless people and every other demographic, c) do a lot of statistical adjustments and pray.

Selection bias is fine-ish if you’re trying to do something like test a correlation. Does eating bananas make people smarter because something something potassium? Get a bunch of Psych 101 undergrads, test their IQs, and ask them how many bananas they eat per day (obviously there are many other problems with this study, like establishing causation — let’s ignore those for now). If you find that people who eat more bananas have higher IQ, then fine, that’s a finding. If you’re right about the mechanism (something something potassium), then probably it should generalize to groups other than Psych 101 undergrads. It might not! But it’s okay to publish a paper saying “Study Finds Eating Bananas Raises IQ” with a little asterisk at the bottom saying “like every study ever done, we only tested this in a specific population rather than everyone in the world, and for all we know maybe it isn’t true in other populations, whatever.” If there’s some reason why Psych 101 undergrads are a particularly bad population to test this in, and any other population is better, then you should use a different population. Otherwise, choose your poison.

S	M	T	W	T	F	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

December 29, 2022

Selection bias in polling

No Comments