
Aluthge ’15: Lies, damned lies and bad statistics

Women are three times more likely to wear red or pink when ovulating. Men with greater upper body strength are more likely to be fiscally conservative. And Republicans have “significantly different brain structure” compared to Democrats.

All three of these statements are fairly bold, provocative claims. All three were published in reputable, peer-reviewed, academic journals. And all three are, to put it politely, nothing but hot air.

The unfortunate reality is that all three of the research studies in question contained fatal flaws in the statistical reasoning their authors used to justify their conclusions, such as failing to account for multiple comparisons. Their conclusions, however flashy and provocative they may be, are thus not supported by statistically significant evidence. So while the Brown Republicans are welcome to start a recruiting campaign targeting gym-goers, there is no actual evidence that suggests they should.

The examples I have highlighted are just symptoms of a larger insidious trend. Statistical reasoning and methods are regularly misused in the academic world, and the consequences are serious and far-reaching.

Perhaps the most obvious consequence of the misuse of statistics is that it corrupts the overall body of research literature and ultimately hinders research progress. Take, for example, the field of cardiology. A 2007 study in the Journal of the American Medical Association compiled a list of 85 genetic variants that had been found, in various peer-reviewed papers, to be linked to acute coronary syndrome. When the researchers attempted to validate these 85 claims by testing actual patients, they found that only one of the genetic variants was significantly linked to the syndrome, implying serious flaws in most of the papers that were examined.
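To see how easily such spurious associations can arise, consider a minimal back-of-the-envelope sketch in Python. It assumes, purely for illustration, that each of the 85 variants is tested once against the conventional p < 0.05 threshold and that none of them has any real effect; it is not a reconstruction of how the original papers were actually analyzed.

```python
# Back-of-the-envelope sketch (illustrative assumptions, not the original analyses):
# if 85 variants are each tested against a p < 0.05 threshold and none of them
# actually matters, roughly 0.05 * 85 (about 4) will still look "significant" by chance.
import random

random.seed(1)
trials = 1000          # simulate many hypothetical 85-variant studies
num_variants = 85
counts = []
for _ in range(trials):
    # each "test" is a coin flip that comes up significant 5% of the time under the null
    counts.append(sum(1 for _ in range(num_variants) if random.random() < 0.05))

print("average spurious 'associations' per study:", sum(counts) / trials)  # about 4.25
```

Under those assumptions, chance alone produces about four "significant" associations per batch of 85 tests, which is exactly why findings like these need independent validation before anyone builds on them.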

But most researchers who read these papers did not perform their own validation tests. They assumed that the science and statistics were sound and took the conclusions to be legitimate; the articles were, after all, subjected to peer review. Their own work may then have been shaped and directed by these results, and since the results were not in fact correct, that work could have been largely wasted. In general, false results in the literature, no matter the field, can lead other researchers to waste both time and money pursuing fruitless leads. This places a financial burden on our research system and institutions while delaying researchers’ work, potentially by years.

In addition to the negative effects on research communities, the misuse of statistics can also serve to perpetuate harmful societal stereotypes and structures. To illustrate this, let’s look at a 2013 study entitled “The Fluctuating Female Vote.” In this study, a team of researchers found that women’s ovulatory cycles affected their political and religious views. For example, the authors claimed that “ovulation led single women to become more liberal, less religious, and more likely to vote for Barack Obama” in the 2012 presidential election.

Criticism of this paper’s statistical reasoning was justifiably widespread. Andrew Gelman, director of the Applied Statistics Center at Columbia University, pointed out some of the most glaring flaws, such as the arbitrary exclusion of data, a failure to account for noise and the use of causal language such as “influenced” and “affected” when discussing correlation. Gelman’s overall assessment? The paper was, as he bluntly put it, “sloppy work.”

What is especially disturbing here is how the use of improper statistical thinking resulted in an unsubstantiated conclusion that perpetuates damaging stereotypes. In this case, the paper served to support the antiquated notion that women are, in some sense, “controlled” by their ovulatory cycle. Thus, the publication of this paper lends credence to those seeking to defend structural and societal sexism through the use of “scientific” arguments. Every bad paper that is published hands ammunition to those who want to reinforce inequality.

The two previous types of consequences I discussed (damage to the research literature and community, and the reproduction of structural inequalities) are broad and long-term. But shoddy statistical reasoning can also have more immediate impacts. Take, for example, the Oct. 23 poll of Rhode Island voters conducted by the Taubman Center for Public Policy and American Institutions. Patrick Sweeney, campaign manager for Republican gubernatorial nominee Allan Fung, argued that the poll oversampled voters from Providence, which leans Democratic. In the weeks leading up to an election, the results reported by pollsters can sway voter opinion and cause candidates to alter their campaign strategies. So if those conducting a poll make statistical mistakes, such as oversampling a particular portion of the population, the consequences are immediate and real.
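To make the effect of oversampling concrete, here is a minimal sketch with entirely hypothetical numbers; the population shares and candidate-support levels below are assumptions for illustration, not figures from the Taubman Center poll or the actual Rhode Island electorate.

```python
# Minimal sketch with hypothetical numbers (not the Taubman Center's actual data):
# if one heavily Democratic city is overrepresented in a sample, the raw
# poll average drifts away from the population-weighted answer.

# Assumed shares of the statewide electorate (illustrative only)
population_share = {"Providence": 0.15, "Rest of RI": 0.85}
# Assumed support for the Democratic candidate in each area (illustrative only)
dem_support = {"Providence": 0.70, "Rest of RI": 0.48}
# Suppose the sample drew 30% of respondents from Providence instead of 15%
sample_share = {"Providence": 0.30, "Rest of RI": 0.70}

raw_estimate = sum(sample_share[a] * dem_support[a] for a in dem_support)
weighted_estimate = sum(population_share[a] * dem_support[a] for a in dem_support)

print(f"raw (oversampled) estimate:   {raw_estimate:.1%}")      # about 54.6%
print(f"population-weighted estimate: {weighted_estimate:.1%}")  # about 51.3%
```

Even with identical opinions in each area, drawing too many respondents from Providence shifts the headline number by a few percentage points, which is more than enough to change a race’s narrative in the final weeks of a campaign.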

Throughout this column, I’ve provided various examples of “bad statistics.” But I want to emphasize that I do not mean to single these authors out. The problem here is not a handful of researchers misusing statistics, but rather a pervasive culture in academia that allows these abuses to continue unchecked.

 

Dilum Aluthge ’15 MD’19 is an applied mathematics concentrator and can be reached at dilum_aluthge@brown.edu.
