Quotulatiousness

January 3, 2011

Healthy skepticism about study results

Filed under: Bureaucracy, Media, Science — Tags: , , , , — Nicholas @ 13:30

John Allen Paulos provides some useful mental tools to use when presented with unlikely published findings from various studies:

Ioannidis examined the evidence in 45 well-publicized health studies from major journals appearing between 1990 and 2003. His conclusion: the results of more than one third of these studies were flatly contradicted or significantly weakened by later work.

The same general idea is discussed in “The Truth Wears Off,” an article by Jonah Lehrer that appeared last month in the New Yorker magazine. Lehrer termed the phenomenon the “decline effect,” by which he meant the tendency for replication of scientific results to fail — that is, for the evidence supporting scientific results to seemingly weaken over time, disappear altogether, or even suggest opposite conclusions.

[. . .]

One reason for some of the instances of the decline effect is provided by regression to the mean, the tendency for an extreme value of a random quantity dependent on many variables to be followed by a value closer to the average or mean.

[. . .]

This phenomenon leads to nonsense when people attribute the regression to the mean as the result of something real, rather than to the natural behavior of any randomly varying quantity.
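A quick way to see regression to the mean is to simulate it. The sketch below is only an illustration under assumed numbers (the function name, the normal noise model and the 5% cut-off are all made up for the example, not taken from Paulos): each subject has a stable underlying level, is measured twice with independent noise, and we look at how the top scorers on the first measurement fare on the second.

```python
import random

def regression_to_mean_demo(n_subjects=10_000, top_fraction=0.05, seed=0):
    """Measure a noisy quantity twice; compare the top first-round scorers' two averages."""
    rng = random.Random(seed)
    # Each subject has a stable "true" level plus independent noise on each measurement.
    true_levels = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    first = [t + rng.gauss(0.0, 1.0) for t in true_levels]
    second = [t + rng.gauss(0.0, 1.0) for t in true_levels]

    # Pick the subjects with the most extreme first measurement.
    k = int(n_subjects * top_fraction)
    top = sorted(range(n_subjects), key=lambda i: first[i], reverse=True)[:k]

    mean_first = sum(first[i] for i in top) / k
    mean_second = sum(second[i] for i in top) / k
    return mean_first, mean_second

mean_first, mean_second = regression_to_mean_demo()
print(f"top group, first measurement:  {mean_first:.2f}")
print(f"top group, second measurement: {mean_second:.2f}")
```

Nothing happens to the subjects between the two measurements, yet the top group's second average sits noticeably closer to the overall mean of zero. Any "treatment" applied in between would appear to have had an effect.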

[. . .]

In some instances, another factor contributing to the decline effect is sample size. It’s become common knowledge that polls that survey large groups of people have a smaller margin of error than those that canvass a small number. Not just a poll, but any experiment or measurement that examines a large number of test subjects will have a smaller margin of error than one having fewer subjects.

Not surprisingly, results of experiments and studies with small samples often appear in the literature, and these results frequently suggest that the observed effects are quite large — at one end or the other of the large margin of error. When researchers attempt to demonstrate the effect on a larger sample of subjects, the margin of error is smaller and so the effect size seems to shrink or decline.
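To put rough numbers on the sample-size point, here is a minimal sketch using the usual textbook approximation for the 95% margin of error of a proportion from a simple random sample. The function name and the sample sizes are mine, chosen for illustration; real polls and experiments have design effects that this ignores.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)

for n in (25, 100, 1000, 10000):
    print(f"n = {n:>5}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

Because the margin shrinks with the square root of the sample size, quadrupling the number of subjects only halves the margin of error. A study of 25 people leaves room for very large apparent effects that a study of 1,000 does not.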

[. . .]

Publication bias is, no doubt, also part of the reason for the decline effect. That is to say that seemingly significant experimental results will be published much more readily than those that suggest no experimental effect or only a small one. People, including journal editors, naturally prefer papers announcing or at least suggesting a dramatic breakthrough to those saying, in effect, “Ehh, nothing much here.”

The availability error, the tendency to be unduly influenced by results that, for one reason or another, are more psychologically available to us, is another factor. Results that are especially striking or counterintuitive or consistent with experimenters’ pet theories also more likely will result in publication.
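The combination of small samples and a preference for striking results can be imitated in a few lines. This is a toy model, not anything from the article: the true effect size, the study size and the "publish only if the estimate exceeds a threshold" rule are all assumptions picked to make the mechanism visible.

```python
import random
import statistics

def simulate_published_effects(true_effect=0.2, n_per_study=20, n_studies=2000,
                               threshold=0.5, seed=1):
    """Run many small studies of one modest effect; 'publish' only the striking ones."""
    rng = random.Random(seed)
    all_estimates = []
    for _ in range(n_studies):
        # Each study averages n_per_study noisy observations (noise SD = 1).
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n_per_study)]
        all_estimates.append(statistics.fmean(sample))
    # Hypothetical editorial filter: only estimates above the threshold reach print.
    published = [e for e in all_estimates if e > threshold]
    return statistics.fmean(all_estimates), statistics.fmean(published), len(published)

all_mean, published_mean, n_published = simulate_published_effects()
print(f"average estimate, all studies:       {all_mean:.2f}")
print(f"average estimate, published studies: {published_mean:.2f}")
print(f"studies published: {n_published} of 2000")
```

The studies taken together recover the modest true effect, but the published subset overstates it by construction. A later, larger replication would land near the true value and look like a "decline".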

December 30, 2010

Cartographic explanation for the order of secession

Filed under: History, USA — Tags: , , , — Nicholas @ 00:20

A fascinating NYT post looks at one of the most influential maps of the US Civil War period:

The 1860 Census was the last time the federal government took a count of the South’s vast slave population. Several months later, the United States Coast Survey — arguably the most important scientific agency in the nation at the time — issued two maps of slavery that drew on the Census data, the first of Virginia and the second of Southern states as a whole. Though many Americans knew that dependence on slave labor varied throughout the South, these maps uniquely captured the complexity of the institution and struck a chord with a public hungry for information about the rebellion.

The map uses what was then a new technique in statistical cartography: Each county not only displays its slave population numerically, but is shaded (the darker the shading, the higher the number of slaves) to visualize the concentration of slavery across the region. The counties along the Mississippi River and in coastal South Carolina are almost black, while Kentucky and the Appalachians are nearly white.

H/T to Walter Olson for the link.

December 20, 2010

Once again, correlation is not causation

Filed under: Britain, Media, Science — Tags: , , — Nicholas @ 12:50

An excellent example of what statistical analysis can and cannot show:

Do mobile phone towers make people more likely to procreate? Could it be possible that mobile phone radiation somehow aids fertilisation, or maybe there’s just something romantic about a mobile phone transmitter mast protruding from the landscape?

These questions are our natural response to learning that variation in the number of mobile phone masts across the country exactly matches variation in the number of live births. For every extra mobile phone mast in an area, there are 17.6 more babies born above the national average.

This was discovered by taking the publicly available data on the number of mobile phone masts in each county across the United Kingdom and then matching it against the live birth data for the same counties. When a regression line is calculated it has a “correlation coefficient” (a measure of how good the match is) of 98.1 out of 100. To be “statistically significant” a pattern in a dataset needs to be less than 5% likely to be found in random data (known as a “p-value”), and the masts-births correlation only has a 0.00003% probability of occurring by chance.

Part of the problem is that our brains have evolved to detect patterns and relationships — even when they’re not really there:

Mobile phone masts, however, have absolutely no bearing on the number of births. There is no causal link between the masts and the births despite the strong correlation. Both the number of mobile phone transmitters and the number of live births are linked to a third, independent factor: the local population size. As the population of an area goes up, so do both the number of mobile phone users and the number of people giving birth.

The problem is that our first instinct is to assume that a correlation means that one factor is causing the other. While this does not cause a problem when using pattern-spotting as an evolved survival tool, it does cause severe problems when assessing possible health scares based on a recently uncovered correlation. For the majority of cases, correlation does not indicate the presence of causality.
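The masts-and-births pattern is easy to reproduce with made-up data. In the sketch below every number is invented for illustration (county populations, one mast per 2,000 people, a birth rate of 1.2%); the only structural assumption is the one the article describes, namely that both counts scale with population and neither depends on the other.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate_counties(n_counties=200, seed=2):
    rng = random.Random(seed)
    populations = [rng.uniform(50_000, 1_000_000) for _ in range(n_counties)]
    # Both counts scale with population, each with its own independent noise.
    masts = [p / 2_000 * rng.uniform(0.8, 1.2) for p in populations]
    births = [p * 0.012 * rng.uniform(0.8, 1.2) for p in populations]
    return populations, masts, births

populations, masts, births = simulate_counties()
raw_r = pearson(masts, births)
# Divide out the confounder: masts per person vs births per person.
per_capita_r = pearson([m / p for m, p in zip(masts, populations)],
                       [b / p for b, p in zip(births, populations)])
print(f"raw counts:  r = {raw_r:.2f}")
print(f"per capita:  r = {per_capita_r:.2f}")
```

The raw counts are strongly correlated even though the simulation contains no link between them. Once population is divided out, the correlation all but vanishes.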

H/T to Maggie Koerth-Baker for the link.

December 17, 2010

Gay and lesbian couples’ income levels

Filed under: Economics, Education, Randomness, USA — Tags: , , — Nicholas @ 12:59

Just as this article predicts, I don’t remember where I heard the “fact” that gay couples had higher incomes than heterosexual couples, but it seemed likely to be true. Apparently not:

The myth of gay money holds that “gays” (really just gay males) are high-income or rich. Why? Mostly because they don’t have kids, especially not when two guys live together. (That would make them DINKs.)

This myth was relentlessly propagated through the 1990s and persists today. Maybe you couldn’t put your finger on where you heard it (perhaps in a newspaper article?), but the stereotype is out there. And it isn’t true.

[. . .]

Why do gay males have generally lower incomes than straight males?

  • Gay males have more education than straight males, but they do not choose male-dominated professions as often as straight males do. In fact, they choose female-dominated and/or service professions much more often. Male-dominated professions (like construction) have generally higher wages than female-dominated professions (like secretarial).
  • Gay males work fewer hours than straight males.

Why do lesbians generally have higher income than straight females? It’s almost the inverse of the gay-male trend.

  • Lesbians also have more education than straight females, but they work longer hours — because, generally speaking, they are less likely to have children to take care of at home.
  • Lesbians are overrepresented in male-dominated professions that pay better than female-dominated professions.

What about discrimination? It’s a ready excuse to explain away the “few” gays who don’t meet the stereotype of being affluent. (That’s what press coverage would tell you – that just a few of us aren’t affluent. In reality, it’s most of us.) But the statistical evidence for discrimination as a cause of lower gay incomes is weak at best, and of course falls down completely in the case of lesbians, who, most studies agree, have higher incomes than straight females. Discrimination is clearly a factor sometimes; it just isn’t a credible explanation for the whole effect, which doesn’t apply to half of the population we’re talking about.

H/T to Freakonomics blog for the link.

December 7, 2010

Chinese official acknowledged that official data is unreliable

Filed under: Bureaucracy, China, Economics, Railways — Tags: , , , , — Nicholas @ 07:32

I’ve been saying this for years now: China’s official GDP and associated economic numbers are just not reliable:

A senior Chinese official said in 2007 that much of the country’s local economic data are unreliable, according to a leaked diplomatic cable published by the WikiLeaks website.

The official, Li Keqiang, was at the time Communist Party secretary of the northeastern province of Liaoning, and has since been promoted to vice premier. Since landing that position, he has overseen many of the central government’s efforts to improve the quality of its economic statistics, which continue to face many questions over their accuracy and consistency.

[. . .]

China’s Foreign Ministry has said it will not comment on the content of the diplomatic cables published by WikiLeaks. The leaked cable reports comments Mr. Li made in a dinner in Beijing with then-U.S. Ambassador Clark Randt on March 12, 2007. His remarks focused on the challenges of administering the province of Liaoning, which because of its legacy of failed state-owned enterprises was burdened with a large number of unemployed workers.

“When evaluating Liaoning’s economy, he focuses on three figures: 1) electricity consumption, which was up 10% in Liaoning last year; 2) volume of rail cargo, which is fairly accurate because fees are charged for each unit of weight; and 3) amount of loans disbursed, which also tends to be accurate given the interest fees charged,” the cable says.

“By looking at these three figures, Li said he can measure with relative accuracy the speed of economic growth. All other figures, especially GDP statistics, are ‘for reference only,’ he said smiling,” the cable reads. “GDP figures are ‘man-made’ and therefore unreliable,” the cable paraphrases Mr. Li as saying.

As I said back in February, the reason for the made-up numbers is inherent in the Chinese system:

In this way, the PLA stopped being just the customer/end user. They cut out the middleman and absorbed the entire supply chain. The PLA became a significant economic player in the Chinese industrial economy . . . and this is still true today. The generals aren’t formally in charge, but they own the companies that do military production.

So what? So let’s look at how a civilian corporation’s incentives differ from one owned directly by the army. In a civilian corporation, the CEO runs the business with an eye to generating the largest profit possible while staying (for the most part) within the law. A CEO who deviates from this to ride a favourite hobby horse will eventually face the wrath of the stockholders who want that maximized profit. There are natural limits on how much freedom to invest in uneconomic activity any CEO will be given. Sensible stockholders don’t try to micromanage the firm, but do raise questions if too much of the company’s efforts are devoted to things clearly not related to the company’s long term benefit. Company accounts can be rigged, for a time, to show misleading results, but eventually (Enron, Worldcom, etc.) the truth will out.

A Chinese firm that’s owned by the army? Profit may be nice, but the “CEO” reports to a different master: the guys with the guns. The company accounts will show exactly what the guys with the guns want them to show . . . and the oversight and auditing committee members carry submachine guns. You’re told that your target is 10% growth? Don’t you think that the reported result will be at least 10%? Because your life may depend on the reported results being acceptable.

As my former virtual landlord says, this is one of my hobbyhorses:

December 3, 2010

Pay no attention to the statisticians behind the curtain

Filed under: Britain, Bureaucracy, Environment, Media — Tags: , , , — Nicholas @ 09:23

James Delingpole has a handy guide to assure you that man-made global warming is still happening:

“It’s all actually a sign that man made global warming is very much a live issue and that there’s more of it happening than ever,” says a top scientist, who holds the British record for securing grant-funding for global warming research projects so he must know what he’s talking about.

“Look at the Met office,” the scientist goes on. “They’ve just told us that 2010 is the hottest year since records began in 1850 and even though the stupid Central England Temperature record tells us something quite different and even though the year hasn’t actually finished yet they must know what they’re talking about and they definitely can’t have fiddled the data because the Met office is part of the government and they wouldn’t lie or get things wrong which is why that barbecue summer was such a scorcher.”

The big problem is, the scientist said, is that the public are really stupid. They think just because Dr David Viner of the Climatic Research Unit said in the Independent in 2000 that soon there’d be no snow because of global warming, when what he actually meant was that soon there’d be lots of snow and that this would be “proof” of global warming. The interviewer just missed out the word “proof” that’s all because journalists are lazy that way.

Yes, yes, confusing mere “weather” with climate again, I’m sure.

October 10, 2010

Calculate your odds of winning the lottery

Filed under: Randomness — Tags: , — Nicholas @ 10:09

By way of the ever-helpful Roger Henry, here’s a site that lets you simulate your lottery habits to determine how much you’ll win and how much it’ll cost:

Incredibly Depressing Mega Millions Lottery Simulator!

It is really hard to win the Mega Millions lottery. So hard that it can be difficult to comprehend what long odds confront its players.

Why not try for free on this Mega Millions lottery simulator? You’ll be able to try the same numbers over and over, simulating playing twice a week for a year or 10. You’ll never win.

I ran 1040 simulations. It would have cost me $1040. I would have won $243. Not a good return on investment. I actually “win” more by not buying a ticket at all: I avoid the loss of that $797 in ticket purchases.
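For anyone who wants to check the arithmetic, here is a small sketch. The ticket count and winnings are the figures from my run above (1,040 tickets is twice a week for ten years). The jackpot odds are not from the simulator page: they assume the Mega Millions format as I understand it for 2010, five numbers from 56 plus one Mega Ball from 46, so treat that part as an assumption.

```python
from math import comb

# Figures from the simulated run: 1,040 tickets at $1 each, $243 in winnings.
tickets = 1040
ticket_price = 1
winnings = 243

spent = tickets * ticket_price
net_loss = spent - winnings
return_rate = winnings / spent

# Assumed 2010-era format: 5 of 56 white balls plus 1 of 46 Mega Balls.
jackpot_combinations = comb(56, 5) * 46

print(f"net loss: ${net_loss}")
print(f"returned per dollar spent: ${return_rate:.2f}")
print(f"jackpot odds: 1 in {jackpot_combinations:,}")
```

Under that assumed format, twice-weekly play for ten years covers a vanishingly small share of the possible combinations, which is why the simulator can promise you'll never win.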

July 27, 2010

Four weeks in Canadian history

Filed under: Cancon, Humour — Tags: , , — Nicholas @ 07:59

Scott Feschuk has been out of the country for a while. He’s delighted to find that something actually happened in Canada while he was away, and provides a useful summary for those of us not paying attention:

I’ve been away from Canada for four of the past five weeks, and it’s always fun to return and see what’s been missed. A comprehensive review:

1. The dominant domestic news story of the past month hinges on the intricacies of statistical analysis.

2. Finally demonstrating a populist touch, Michael Ignatieff has started production on his own Speed sequel: If his party’s popularity in opinion polls falls below 25 per cent, the Liberal Express explodes! (Subplot: If the bus keeps stopping for Timbits, the occupants of the Liberal Express explode!)

3. Conrad Black has apparently tunneled out of prison and escaped.

4. Upon being informed of No. 3, David Radler has soiled himself.*

* Not reported, but a safe assumption.

Don’t ever change, Canada.

July 17, 2010

QotD: The census as legalized theft of time and resources

Filed under: Bureaucracy, Cancon, Economics, Government, Liberty, Quotations — Tags: — Nicholas @ 21:01

Those defending the Census’ mandatory long form have clothed their arguments in the public interest. We need, they argue, a detailed, fair and statistically accurate count of the population to ensure that government services and programs are effectively delivered to Canadians. Without going into how useful many of these programs really are, let’s agree that the Census provides an enormously valuable store of data. Data that is used not only by all three levels of government, but also market researchers, academics, corporations and charities.

The data gathered by the Census is a vital resource for both the public and private sector. But it is not the only valuable product or service used by governments. Governments also use large quantities of cement, asphalt, paper, sophisticated electronic equipment and the services of tens of thousands of Canadians. Yet it is expected that government pay for these products and services, from Canadians who voluntarily exchange their talents and energies.

If employees of the federal government started randomly seizing cement trucks, or conscripting people off the streets to build roads, such conduct would be rightly denounced. It would be the sort of behaviour one expects of thugs like Hugo Chavez or Fidel Castro, not the government of a free country like Canada. The Census, for all the recent beating of breasts and furrowing of brows, is just another service the government needs to conduct its affairs.

A mandatory census is less about some hazy notion of the public interest, and more about governments, corporations, academics and other consumers of Census data getting a free ride. Rather than having to conduct their own research, and make careful adjustments to compensate for possible distortions between samples and the overall population, these data consumers get the government to force ordinary Canadians to save them the bother.

Publius, “The Census: Government Information Theft”, Gods of the Copybook Headings, 2010-07-16

July 12, 2010

QotD: Silly census fuss

Filed under: Bureaucracy, Cancon, Liberty, Quotations — Tags: , , , , — Nicholas @ 12:20

[. . .] isn’t it just the slightest bit embarrassing for a government whose leader has trashed libertarians for their ethical myopia to have minions and media partisans present a libertarian pretext for an action that is not literally among the first 200 policy changes that would be implemented by an intelligent libertarian given plenary power?

Colby Cosh, “Census squabble: weak arguments shouldn’t have even worse foundations”, Maclean’s, 2010-07-12

June 16, 2010

Air pollution: unseen (and statistically unlikely) killer

Filed under: Cancon, Environment, Health — Tags: , , — Nicholas @ 00:09

Air pollution is bad, and the computer models used to determine how bad it is show that more than 100% of all deaths were due to pollution!

Air pollution cuts a deadly but invisible swath through Canada. We know this because the Canadian Medical Association says there were 21,000 deaths from exposure to air-borne pollutants in 2008. Of these, 2,682 Canadians were instantly struck down by the acute effects of pollution. By 2031, 710,000 people will have been slain by this unseen killer.

The evidence on this epic death toll is chillingly precise. According to the Ontario Medical Association, exactly 348 people died from air pollution in Waterloo Region in 2008. In Hamilton, 445 lives were cut short. And Manitoulin Island tragically lost 14 residents due to pollutants that year.

In Toronto, the Big Smoke of Canada, the figures are appropriately larger. Calculations by Toronto Public Health claim air pollution kills 1,700 people annually and sends 6,000 to the hospital. Ten percent of all non-trauma deaths in Toronto are directly attributed to air pollution.

Did you know that? I certainly didn’t. Oh, and wait . . . neither of us knew it because it’s junk scientific bullshit:

Consider what happens when you take Toronto’s computer model and use it to determine the death toll in previous eras, when the air was far more polluted than today. For example, average sulfur dioxide levels in downtown Toronto were more than 100 parts per billion in the mid-1960s. It’s now less than 10 ppb. No surprise then, that the death toll was much greater in the bad old days. Across the 1960s, half of all non-trauma deaths were the direct result of air pollution, according to Toronto’s model. And in February 1965, more than 100% of all deaths were due to pollution!

In other words, air pollution killed more people inside the computer model than actually died of all causes in the real world. How’s that for deadly?

I can confidently assure any modern-day pollution-panicked worrier that things were much, much worse in the 1960s and 70s: the air was much more difficult to breathe in downtown Toronto, the water was disgustingly polluted, and (we were assured) things could only get worse in our little slice of environmental hell. The air is far less polluted now than at any time in my life, and the lakes have largely recovered from the worst environmental damage we inflicted on them.

February 24, 2010

Rechecking the data (where it still exists) is the only solution

Filed under: Environment, Media, Science — Tags: , , , , — Nicholas @ 12:58

Given all the “missing”, “normalized”, and “cherry-picked” data in the climate change debate, this is the only rational way forward:

More than 150 years of global temperature records are to be re-examined by scientists in an attempt to regain public trust in climate science after revelations about errors and suppression of data.

The Met Office has submitted proposals for the reassessment by an independent panel in a tacit admission that its previous reports have been marred by their reliance on analysis by the University of East Anglia’s Climatic Research Unit (CRU).

Two separate inquiries are being held into allegations that the CRU tried to hide its raw data from critics and that it exaggerated the extent of global warming.

In a document entitled Proposal for a New International Analysis of Land Surface Air Temperature Data, the Met Office says: “We feel it is timely to propose an international effort to reanalyse surface temperature data in collaboration with the World Meteorological Organisation.”

As I’ve said several times, we may actually have a global problem with rising temperatures, and if so we need to consider the potential impact and possible ways to address it. However, the science is far from settled — in fact, it’s more unsettled now than it was at any time in the last fifteen years. Without reliable data, we can’t pretend to make any predictions or recommend any course of action because we don’t know whether global temperatures are rising or not.

February 23, 2010

Statistics can tell a lot . . . but not always truthfully

Filed under: Cancon, Economics, Europe — Tags: , , , , — Nicholas @ 12:46

Brian Lilley looks at a recent report which critiques the federal government’s claim that women earn only 84% of the wages that men earn. The report uses a different set of statistics to show that women only earn 70 cents for every dollar a man earns in Canada:

Were this true it would be a shocking and appalling state of affairs, the type of thing that government regulations must be called upon to rectify. I truly do not know anyone who would advocate that a man earn 42% more than a woman for working the same job, for the same number of hours. Of course this is not the case.

The report, dubbed a reality check by its authors, looks at the government’s claim that women earn 84 cents for every dollar a man makes and they dismiss it. Their reason for doing so? The government does not use the correct data. In the government report, the 84 cents on the dollar claim is arrived at by looking at wages on a dollar per hour basis using Statistics Canada’s July 2008 Labour Force Survey. In July of 2008 women earned an average of $19.14 per hour while men earned an average of $22.80 per hour, thus the 84 cents on the dollar figure.

In any argument over statistics, the chosen measurement is always the one that best supports your argument. This is fair play, when the statistics are comparable. It isn’t when your choice of stat measures something quite different:

The collective report by the labour and activist groups does not use dollar per hour compensation to show that women earn less than men, they use total year compensation. It is easy to understand why the group uses this formula, it will always show that women are being discriminated against while the other formula is showing improvements. A quick look at Stats Canada’s monthly Labour Force Survey shows one reason why men make more money than women; they work more hours. While this may not justify a difference in hourly wages, it would justify a difference in year end compensation. In the report cited by the government, men worked an average of 38.7 hours per week, a full five hours more than women who clocked in for 33.7 hours. For full-time workers, rather than all workers combined, there was still a difference, men working 40.7 hours per week to 38 hours for women. In reviewing several months of these reports over the past two years a consistent pattern emerges, men in full-time jobs work two to three hours more per week than women.
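The effect of switching from hourly wages to total compensation can be seen with nothing more than the figures quoted above. The sketch below is a back-of-envelope illustration, not the report's method: multiplying average hourly wage by average weekly hours is my simplification, and it is not the same as measured annual earnings.

```python
# Figures quoted from Statistics Canada's July 2008 Labour Force Survey.
women_hourly, men_hourly = 19.14, 22.80
women_hours, men_hours = 33.7, 38.7  # average weekly hours, all workers

# Ratio on an hourly basis (the government's measure).
hourly_ratio = women_hourly / men_hourly

# Illustrative only: average wage times average hours, as a stand-in for total pay.
weekly_ratio = (women_hourly * women_hours) / (men_hourly * men_hours)

print(f"hourly basis:       {100 * hourly_ratio:.0f} cents on the dollar")
print(f"wage x hours basis: {100 * weekly_ratio:.0f} cents on the dollar")
```

With identical hourly wages in both calculations, folding in the five-hour difference in the work week is enough to move the ratio from about 84 cents to about 73 cents, most of the way to the 70-cent figure.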

There may still be parts of the economy where male bosses or business owners irrationally discriminate against women (equally, there may be other forms of prejudice in play). Where laws exist to prohibit this, they should be enforced. However, trying to paint the numbers to show discrimination where it does not exist does not help anyone, and it makes it harder to achieve truly equal rights.

Update, 21 October: Ilkka at The Fourth Checkraise mentioned a related story from Finland:

Speaking of the male-female wage gap, I don’t know how I could forget the recent study by the Finnish emeritus researcher (who is thus free to speak his mind) Pauli Sumanen about this very issue. It concluded that Finnish men earn more on average (again, not the median) than Finnish women simply because they work more: if you control for actual hours worked, women get paid more than men so that a woman’s euro is not 80 cents but closer to 104. And if you look at the net salaries after the heavily progressive taxation, and include the fact that women live and receive pensions seven years longer on average (Finnish women pay 45% of total health care costs yet use 59% of health care), these numbers become vastly more dramatic for women.

Canadian women beat Finns 5-0, will face Team USA for gold on Thursday

Filed under: Cancon, Sports — Tags: , , — Nicholas @ 07:48


Photo by Julie Jacobson

Finland provided much more of a challenge for Canada, with excellent goaltending turning back many shots, but eventually the Canadians broke through. Cherie Piper scored the opening goal on a pass from Meghan Agosta, while Agosta broke the single-Olympics scoring record with her ninth of the games. Haley Irwin scored twice, and Caroline Ouellette got the other goal for Canada.

They will face Team USA on Thursday for the gold medal. This matchup was expected, as both Canada and the US have been dominant in their respective games through the preliminary and semi-final rounds, tallying 86 goals between the two teams, and allowing only 4.

Update: Colby Cosh is pessimistic about the men’s team making it all the way to the top podium:

Even on the explicit, historically derived premise that Canada has the strongest team in the tournament, it would be hard to peg our chances of winning gold at much higher than 25%. On Desjardins’ pretty reasonable estimates of underlying national team strength, the figure is not close to 25%. I crunched the numbers, leaving room for the possibility of being helped somewhere along the way by an upset of a strong rival, and I get about 19%. That’s assuming we have a 100% chance of beating Germany tonight, when the real figure is probably more like 93-95%.
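The reason even a favourite ends up with such modest odds is simple compounding: in a knockout format you have to win every game. Here is a minimal sketch, assuming the games are independent; the per-game probabilities are hypothetical numbers of mine, not Cosh's or Desjardins' estimates.

```python
def title_probability(per_game_win_probs):
    """Chance of winning every game in a single-elimination run, assuming independence."""
    prob = 1.0
    for p in per_game_win_probs:
        prob *= p
    return prob

# Hypothetical figures: a heavy favourite in the first game, then a 60% favourite three times.
example = title_probability([0.94, 0.60, 0.60, 0.60])
print(f"chance of winning all four games: {example:.1%}")
```

Being a clear favourite in each individual game still leaves the chance of running the table well under one in four.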

February 11, 2010

Nobody knows how many died in the Haiti earthquake

Filed under: Americas, Government, Media — Tags: , , — Nicholas @ 08:22

I can only assume it’s a slow news day for this to be a headline: “Differing death tolls raise suspicions that no one really knows how many died in Haiti quake”. Of course nobody knows: the Haitian government was barely functioning even before the quake hit, and not at all afterwards. They had no accurate idea of how many people lived in the area beforehand, and they still haven’t been able to recover all the bodies. Any death toll estimates will be inaccurate, almost by definition:

Wildly conflicting death tolls from Haitian officials have raised suspicions that no one really knows how many people died in the Jan. 12 earthquake.

The only thing that seems certain is the death toll is one of the highest in a modern disaster.

A day after Communications Minister Marie-Laurence Jocelyn Lassegue raised the official death toll to 230,000, her office put out a statement Wednesday quoting President Rene Preval as saying 270,000 bodies had been hastily buried by the government following the earthquake.

A press officer withdrew the statement, saying there was an error, but then reissued it within minutes. Later Wednesday, the ministry said there was a typo in the figure — the number should have read 170,000.

Even that didn’t clear things up. In the late afternoon, Preval and Lassegue appeared together at the government’s temporary headquarters.

Preval, speaking English, told journalists there were 170,000 dead, apparently referring to the number of bodies contained in mass graves.

Lassegue interrupted him in French, giving a number lower than she had given the previous day: “No, no, the official number is 210,000.”

Preval dismissed her. “Oh, she doesn’t know what she’s talking about,” he said, again in English.

What is not in dispute is that the death toll was very high, and that even with all the disaster relief efforts from other countries, there will still be many more deaths in the quake’s aftermath. Food, water, and medical aid are still not reaching everyone. That fact reduces the importance of the squabble over macabre numbers to a little bit of political theatre.

Update, 24 February: Radio Netherlands is claiming that the death toll has been vastly over-estimated and thinks the number of casualties will be under 100,000:

Haiti has buried an estimated 52,000 victims since the earthquake on 12 January 2010. More bodies still lie under the rubble, but the total number of casualties will not surpass 100,000 — that’s according to observation and research on the ground in Haiti, carried out by Radio Netherlands Worldwide.

This number is considerably smaller than the number of 217,000 victims the Haitian government claims to have counted so far, and far fewer than the estimated final count of 300,000 mentioned by President René Préval just last Sunday.
