Showing posts with label pseudoscience.

Wednesday, May 26, 2021

How Dominic Cummings And The Warner Brothers Saved The UK




Photo above shows the white board in the Prime Minister's office which Dominic Cummings and team (including the brothers Marc and Ben Warner) used to convince Boris Johnson to abandon the UK government's COVID herd immunity plan and enter lockdown. Date: March 13, 2020.

Only now can the full story be told. In early 2020 the UK government had a COVID herd immunity plan in place that would have resulted in disaster. The scientific experts (SAGE) advising the government strongly supported this plan -- there are public, on-the-record briefings to this effect. These are people who are not particularly good at order-of-magnitude estimates and first-principles reasoning. 

Fortunately Dom was advised by the brothers Marc and Ben Warner (both physics PhDs, now working in AI and data science), DeepMind founder Demis Hassabis, Fields Medalist Tim Gowers, and others. In the testimony (see ~23m, ~35m, ~1h02m, ~1h06m in the video below) he describes the rather dramatic events that led to a switch from the original herd immunity plan to a lockdown Plan B. More details in this tweet thread.


I checked my emails with Dom during February and March, and they confirm his narrative. I wrote the March 9 blog post Covid-19 Notes in part for Dom and his team, and I think it holds up over time. Tim Gowers' document reaches similar conclusions.


 

Seven hours of riveting Dominic Cummings testimony from earlier today. 


Shorter summary video (Channel 4). Summary live-blog from the Guardian.



This is a second white board used in the March 14 meeting with Boris Johnson:



Friday, April 23, 2021

How a Physicist Became a Climate Truth Teller: Steve Koonin

 

I read an early draft of Koonin's new book discussed in the WSJ article excerpted below, and I highly recommend it. 


Video above is from a 2019 talk discussed in this earlier post: Certainties and Uncertainties in our Energy and Climate Futures: Steve Koonin.
My own views (consistent, as far as I can tell, with what Steve says in the talk): 
1. Evidence for recent warming (~1 degree C) is strong. 
2. There exist previous eras of natural (non-anthropogenic) global temperature change of similar magnitude to what is happening now. 
3. However, it is plausible that at least part of the recent temperature rise is due to the increase in atmospheric CO2 from human activity. 
4. Climate models still have significant uncertainties. The direct effect of CO2 IR absorption is well understood and modest (~1 degree C per doubling, at the low end of current consensus model predictions), but second order effects like clouds and the distribution of water vapor in the atmosphere are not under good control. As a result, the warming from a doubling of atmospheric CO2 is still uncertain to a factor of 2-3, and at the low end of that range (e.g., 1.5 degrees C) it is not catastrophic. The potentially catastrophic outcomes come from the second order effects, which are not under good theoretical or computational control. (A back-of-envelope sketch of this point appears just after the list.) 
5. Even if a catastrophic outcome is only a low probability tail risk, it is prudent to explore technologies that reduce greenhouse gas production. 
6. A Red Team exercise, properly done, would clarify what is certain and uncertain in climate science. 
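Here is a back-of-envelope sketch of point 4 (my own illustration, not taken from Koonin's book): the standard logarithmic fit for CO2 forcing plus the Planck response gives the well-understood direct effect of roughly 1 degree C per doubling, while the uncertain net feedback factor f, dominated by clouds and water vapor, produces the factor of 2-3 spread in equilibrium sensitivity. The feedback values below are illustrative assumptions, not model output.

```python
import numpy as np

# Radiative forcing from a doubling of CO2, standard logarithmic fit:
# dF = 5.35 * ln(C/C0) W/m^2
dF = 5.35 * np.log(2.0)               # ~3.7 W/m^2

# No-feedback (Planck) response, ~3.2 W/m^2 per degree C:
lambda_planck = 3.2
dT_direct = dF / lambda_planck        # ~1.2 degrees C: the well-understood direct effect
print(f"direct (no-feedback) warming: {dT_direct:.1f} C")

# Feedbacks (water vapor, clouds, lapse rate, ...) amplify this:
# dT = dT_direct / (1 - f).  The uncertainty in the net feedback factor f
# is what produces the factor of ~2-3 spread in equilibrium sensitivity.
for f in (0.2, 0.5, 0.7):             # illustrative assumed values, not model output
    print(f"feedback factor f = {f:.1f} -> equilibrium sensitivity ~ {dT_direct / (1 - f):.1f} C")
```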
Simply stating these views can get you attacked by crazy people.
Buy Steve's book for an accessible and fairly non-technical explanation of these points.
WSJ: ... Barack Obama is one of many who have declared an “epistemological crisis,” in which our society is losing its handle on something called truth. 
Thus an interesting experiment will be his and other Democrats’ response to a book by Steven Koonin, who was chief scientist of the Obama Energy Department. Mr. Koonin argues not against current climate science but that what the media and politicians and activists say about climate science has drifted so far out of touch with the actual science as to be absurdly, demonstrably false. 
This is not an altogether innocent drifting, he points out in a videoconference interview from his home in Cold Spring, N.Y. In 2019 a report by the presidents of the National Academies of Sciences claimed the “magnitude and frequency of certain extreme events are increasing.” The United Nations Intergovernmental Panel on Climate Change, which is deemed to compile the best science, says all such claims should be treated with “low confidence.” 
... Mr. Koonin, 69, and I are of one mind on 2018’s U.S. Fourth National Climate Assessment, issued in Donald Trump’s second year, which relied on such overegged worst-case emissions and temperature projections that even climate activists were abashed (a revolt continues to this day). “The report was written more to persuade than to inform,” he says. “It masquerades as objective science but was written as—all right, I’ll use the word—propaganda.” 
Mr. Koonin is a Brooklyn-born math whiz and theoretical physicist, a product of New York’s selective Stuyvesant High School. His parents, with less than a year of college between them, nevertheless intuited in 1968 exactly how to handle an unusually talented and motivated youngster: You want to go cross the country to Caltech at age 16? “Whatever you think is right, go ahead,” they told him. “I wanted to know how the world works,” Mr. Koonin says now. “I wanted to do physics since I was 6 years old, when I didn’t know it was called physics.” 
He would teach at Caltech for nearly three decades, serving as provost in charge of setting the scientific agenda for one of the country’s premier scientific institutions. Along the way he opened himself to the world beyond the lab. He was recruited at an early age by the Institute for Defense Analyses, a nonprofit group with Pentagon connections, for what he calls “national security summer camp: meeting generals and people in congress, touring installations, getting out on battleships.” The federal government sought “engagement” with the country’s rising scientist elite. It worked. 
He joined and eventually chaired JASON, an elite private group that provides classified and unclassified advisory analysis to federal agencies. (The name isn’t an acronym and comes from a character in Greek mythology.) He got involved in the cold-fusion controversy. He arbitrated a debate between private and government teams competing to map the human genome on whether the target error rate should be 1 in 10,000 or whether 1 in 100 was good enough. 
He began planting seeds as an institutionalist. He joined the oil giant BP as chief scientist, working for John Browne, now Baron Browne of Madingley, who had redubbed the company “Beyond Petroleum.” Using $500 million of BP’s money, Mr. Koonin created the Energy Biosciences Institute at Berkeley that’s still going strong. Mr. Koonin found his interest in climate science growing, “first of all because it’s wonderful science. It’s the most multidisciplinary thing I know. It goes from the isotopic composition of microfossils in the sea floor all the way through to the regulation of power plants.” 
From deeply examining the world’s energy system, he also became convinced that the real climate crisis was a crisis of political and scientific candor. He went to his boss and said, “John, the world isn’t going to be able to reduce emissions enough to make much difference.” 
Mr. Koonin still has a lot of Brooklyn in him: a robust laugh, a gift for expression and for cutting to the heart of any matter. His thoughts seem to be governed by an all-embracing realism. Hence the book coming out next month, Unsettled: What Climate Science Tells Us, What It Doesn’t, and Why It Matters.
Any reader would benefit from its deft, lucid tour of climate science, the best I’ve seen. His rigorous parsing of the evidence will have you questioning the political class’s compulsion to manufacture certainty where certainty doesn’t exist. You will come to doubt the usefulness of centurylong forecasts claiming to know how 1% shifts in variables will affect a global climate that we don’t understand with anything resembling 1% precision. ...

Note Added from comments:

If you're older like Koonin or myself you can remember a time when climate change was entirely devoid of tribal associations -- it was not in the political domain at all. It is easier for us just to concentrate on where the science is, and indeed we can remember where it was in the 1990s or 2000s.

Koonin was MUCH more concerned about alternative energy and climate than the typical scientist, and that was part of his motivation for supporting the Berkeley Energy Biosciences Institute, created in 2007. The fact that it was a $500M partnership between Berkeley and BP was a big deal and much debated at the time, but there was never any evidence that the science they did was negatively impacted. 

It is IRONIC that his focus on scientific rigor now gets him labeled as a climate denier (or sympathetic to the "wrong" side). ALL scientists should be skeptical, especially about claims regarding long term prediction in complex systems.

Contrast the uncertainty estimates in the IPCC reports (which are not defensible and did not change for ~20y!) vs the (g-2) anomaly that was in the news recently.

When I was at Harvard the physics department and applied science and engineering school shared a coffee lounge. I used to sit there and work in the afternoon, and it happened that one of the climate modeling labs had their group meetings there. So for literally years I overheard their discussions about uncertainties concerning water vapor, clouds, etc. which to this day are not fully under control. This is illustrated in Fig. 1 at the link: https://infoproc.blogspot.c...

The gap between what real scientists say in private and what the public (or non-specialists) gets second hand through the media or politically-focused "scientific policy reports" is vast...

If you don't think we can have long-lasting public delusions regarding "settled science" (like a decade long stock or real estate bubble), look up nuclear winter, which has a lot of similarities to greenhouse gas-driven climate change. Note, I am not claiming that I know with high confidence that nuclear winter can't happen, but I AM claiming that the confidence level expressed by the climate scientists working on it at the time was absurd and communicated in a grotesquely distorted fashion to political leaders and the general public. Even now I would say the scientific issue is not settled, due to its sheer complexity, which is LESS than the complexity involved in predicting long term climate change!

https://en.wikipedia.org/wi... 

Tuesday, June 13, 2017

Climate Risk and AI Risk for Dummies

The two figures below come from recent posts on climate change and AI. Please read them.

The squiggles in the first figure illustrate uncertainty in how climate will change due to CO2 emissions. The squiggles in the second figure illustrate uncertainty in the advent of human-level AI.



Many are worried about climate change because polar bears, melting ice, extreme weather, sacred Gaia, sea level rise, sad people, etc. Many are worried about AI because job loss, human dignity, Terminator, Singularity, basilisks, sad people, etc.

You can choose to believe in any of the grey curves in the AI graph because we really don't know how long it will take to develop human level AI, and AI researchers are sort of rational scientists who grasp uncertainty and epistemic caution.

You cannot choose to believe in just any curve in a climate graph because if you pick the "wrong" curve (e.g., +1.5 degree Celsius sensitivity to a doubling of CO2, which is fairly benign, but within the range of IPCC predictions) then you are a climate denier who hates science, not to mention a bad person :-(

Thursday, September 22, 2016

Annals of Reproducibility in Science: Social Psychology and Candidate Gene Studies

Andrew Gelman offers a historical timeline for the reproducibility crisis in Social Psychology, along with some juicy insight into the one funeral at a time manner in which academic science often advances.
OK, that was a pretty detailed timeline. But here’s the point. Almost nothing was happening for a long time, and even after the first revelations and theoretical articles you could still ignore the crisis if you were focused on your research and other responsibilities. ...

Then, all of a sudden, the world turned upside down.

If you’d been deeply invested in the old system, it must be pretty upsetting to think about change. Fiske is in the position of someone who owns stock in a failing enterprise, so no wonder she wants to talk it up. The analogy’s not perfect, though, because there’s no one for her to sell her shares to. What Fiske should really do is cut her losses, admit that she and her colleagues were making a lot of mistakes, and move on. She’s got tenure and she’s got the keys to PPNAS, so she could do it. Short term, though, I guess it’s a lot more comfortable for her to rant about replication terrorists and all that.

... Why do I go into all this detail? Is it simply mudslinging? Fiske attacks science reformers, so science reformers slam Fiske? No, that’s not the point. The issue is not Fiske’s data processing errors or her poor judgment as journal editor; rather, what’s relevant here is that she’s working within a dead paradigm. A paradigm that should’ve been dead back in the 1960s when Meehl was writing on all this, but which in the wake of Simonsohn, Button et al., Nosek et al., is certainly dead today. It’s the paradigm of the open-ended theory, of publication in top journals and promotion in the popular and business press, based on “p less than .05” results obtained using abundant researcher degrees of freedom. It’s the paradigm of the theory that in the words of sociologist Jeremy Freese, is “more vampirical than empirical—unable to be killed by mere data.”

... In her article that was my excuse to write this long post, Fiske expresses concerns for the careers of her friends, careers that may have been damaged by public airing of their research mistakes. Just remember that, for each of these people, there may well be three other young researchers who were doing careful, serious work but then didn’t get picked for a plum job or promotion because it was too hard to compete with other candidates who did sloppy but flashy work that got published in Psych Science or PPNAS. It goes both ways. ...
An old timer who has seen it all before comments.
ex-social psychologist says:
September 21, 2016 at 5:36 pm

Former professor of social psychology here, now happily retired after an early buyout offer. If not so painful, it would almost be funny at how history repeats itself: This is not the first time there has been a “crisis” in social psychology. In the late 1960s and early 1970s there was much hand-wringing over failures of replication and the “fun and games” mentality among researchers; see, for example, Gergen’s 1973 article “Social psychology as history” in JPSP, 26, 309-320, and Ring’s (1967) JESP article, “Experimental social psychology: Some sober questions about some frivolous values.” It doesn’t appear that the field ever truly resolved those issues back when they were first raised–instead, we basically shrugged, said “oh well,” and went about with publishing by any means necessary.

I’m glad to see the renewed scrutiny facing the field. And I agree with those who note that social psychology is not the only field confronting issues of replicability, p-hacking, and outright fraud. These problems don’t have easy solutions, but it seems blindingly obvious that transparency and open communication about the weaknesses in the field–and individual studies–is a necessary first step. Fiske’s strategy of circling the wagons and adhering to a business-as-usual model is both sad and alarming.

I took early retirement for a number of reasons, but my growing disillusionment with my chosen field was certainly a primary one.
Geoffrey Miller also contributes
Geoffrey Miller says:
September 21, 2016 at 8:43 pm

There’s also a political/ideological dimension to social psychology’s methodological problems.

For decades, social psych advocated a particular kind of progressive, liberal, blank-slate ideology. Any new results that seemed to support this ideology were published eagerly and celebrated publicly, regardless of their empirical merit. Any results that challenged it (e.g. by showing the stability or heritability of individual differences in intelligence or personality) were rejected as ‘genetic determinism’, ‘biological reductionism’, or ‘reactionary sociobiology’.

For decades, social psychologists were trained, hired, promoted, and tenured based on two main criteria: (1) flashy, counter-intuitive results published in certain key journals whose editors and reviewers had a poor understanding of statistical pitfalls, (2) adherence to the politically correct ideology that favored certain kinds of results consistent with a blank-slate, situationist theory of human nature, and derogation of any alternative models of human nature (see Steven Pinker’s book ‘The blank slate’).

Meanwhile, less glamorous areas of psychology such as personality, evolutionary, and developmental psychology, intelligence research, and behavior genetics were trundling along making solid cumulative progress, often with hugely greater statistical power and replicability (e.g. many current behavior genetics studies involve tens of thousands of twin pairs across several countries). But do a search for academic positions in the APS job ads for these areas, and you’ll see that they’re not a viable career path, because most psych departments still favor the kind of vivid but unreplicable results found in social psych and cognitive neuroscience.

So, we’re in a situation where the ideologically-driven, methodologically irresponsible field of social psychology has collapsed like a house of cards … but nobody’s changed their hiring, promotion, or tenure priorities in response. It’s still fairly easy to make a good living doing bad social psychology. It’s still very hard to make a living doing good personality, intelligence, behavior genetic, or evolutionary psychology research.

In the title of this post I mention Candidate Gene Studies. Forget, for the moment, about goofy Social Psychology experiments conducted on undergraduates. Much more money was wasted in the early 21st century on under-powered genomics studies that looked for gene-trait associations using small samples. Researchers, overconfident in their vaunted biological or biochemical intuition, performed studies using p < 0.05 thresholds that produced (ultimately false) associations between candidate genes and a variety of traits. According to Ioannidis, almost none of these results replicate (more). When I first became aware of GWAS almost a decade ago, the field was in disarray, with some journals still publishing results at the p < 0.05 threshold, while others had adopted the corrected p < 5E-08 = 0.05 x 1E-06 "genome wide significance" threshold (based on multiple testing correction for ~1E6 SNPs)! The latter results routinely replicate, as expected.
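To see concretely why the naive threshold produced so many false candidate-gene hits, here is a minimal sketch (my own illustration): testing on the order of a million SNPs at p < 0.05 is expected to produce tens of thousands of false positives under the null, while the genome-wide threshold keeps the expected number well below one.

```python
# Why p < 0.05 fails for GWAS: expected false positives under the null.
n_snps = 1_000_000                      # order of magnitude of independent SNPs tested

naive_alpha = 0.05
gwas_alpha = 0.05 / n_snps              # Bonferroni correction -> 5E-08, "genome wide significance"

print(f"expected false hits at p < 0.05:  {naive_alpha * n_snps:,.0f}")   # ~50,000
print(f"expected false hits at p < 5E-8:  {gwas_alpha * n_snps:.2f}")     # ~0.05
```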

Clearly, many researchers fundamentally misunderstood basic statistics, or at least were grossly overconfident in their priors for no good reason. But as of today, genomics has corrected its practices and although no one wants to dwell on the 5+ years' worth of non-replicable published results, science is at least moving forward. I hope Social Psychology and other problematic areas (such as in biomedical research) can self-correct their practices as genomics has.

See also One funeral at a time?


Bonus Feature!
Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature

Denes Szucs, John PA Ioannidis
doi: http://dx.doi.org/10.1101/071530

We have empirically assessed the distribution of published effect sizes and estimated power by extracting more than 100,000 statistical records from about 10,000 cognitive neuroscience and psychology papers published during the past 5 years. The reported median effect size was d=0.93 (inter-quartile range: 0.64-1.46) for nominally statistically significant results and d=0.24 (0.11-0.42) for non-significant results. Median power to detect small, medium and large effects was 0.12, 0.44 and 0.73, reflecting no improvement through the past half-century. Power was lowest for cognitive neuroscience journals. 14% of papers reported some statistically significant results, although the respective F statistic and degrees of freedom proved that these were non-significant; p value errors positively correlated with journal impact factors. False report probability is likely to exceed 50% for the whole literature. In light of our findings the recently reported low replication success in psychology is realistic and worse performance may be expected for cognitive neuroscience.
From the paper. FRP = False Report Probability = the probability that the null hypothesis is true when we get a statistically significant finding.
... In all, the combination of low power, selective reporting and other biases and errors that we have documented in this large sample of papers in cognitive neuroscience and psychology suggest that high FRP are to be expected in these fields. The low reproducibility rate seen for psychology experimental studies in the recent Open Science Collaboration (Nosek et al. 2015a) is congruent with the picture that emerges from our data. Our data also suggest that cognitive neuroscience may have even higher FRP rates, and this hypothesis is worth evaluating with focused reproducibility checks of published studies. Regardless, efforts to increase sample size, and reduce publication and other biases and errors are likely to be beneficial for the credibility of this important literature.
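For readers who want to see where numbers like these come from, here is a minimal power calculation for a two-sample t-test (my own sketch; the per-group sample size is an assumption chosen for illustration, not a figure from the Szucs-Ioannidis data):

```python
import numpy as np
from scipy import stats

def power_two_sample_t(d, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample t-test when the true effect size is d."""
    df = 2 * n_per_group - 2
    nc = d * np.sqrt(n_per_group / 2.0)            # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # probability that |T| exceeds the critical value under the noncentral t
    return (1 - stats.nct.cdf(t_crit, df, nc)) + stats.nct.cdf(-t_crit, df, nc)

n = 20                                             # assumed typical per-group sample size
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label:6s} effect (d = {d}): power ~ {power_two_sample_t(d, n):.2f}")
```

With small groups like this, power to detect a small effect is far below 50%, which is exactly the regime in which a large fraction of "significant" findings are false reports.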

Sunday, March 27, 2016

Science vs Advocacy: the benefits of diversity

This manuscript is based on the Presidential Address that Alice H. Eagly (Professor of Psychology and of Management at Northwestern University) delivered at the 2015 conference of the Society for the Psychological Study of Social Issues, “A Road Less Traveled: Forging Links between Psychological Science and Social Policy,” Washington, DC.
When Passionate Advocates Meet Research on Diversity, Does the Honest Broker Stand a Chance?

Journal of Social Issues, 72: 199–222. doi: 10.1111/josi.12163

Abstract: In an ideal world, social science research would provide a strong basis for advocacy and social policy. However, advocates sometimes misunderstand or even ignore scientific research in pursuit of their goals, especially when research pertains to controversial questions of social inequality. To illustrate the chasm that can develop between research findings and advocates’ claims, this article addresses two areas: (a) the effects of the gender diversity of corporate boards of directors on firms’ financial performance and (b) the effects of the gender and racial diversity of workgroups on group performance. Despite advocates’ insistence that women on boards enhance corporate performance and that diversity of task groups enhances their performance, research findings are mixed, and repeated meta-analyses have yielded average correlational findings that are null or extremely small. Therefore, social scientists should (a) conduct research to identify the conditions under which the effects of diversity are positive or negative and (b) foster understanding of the social justice gains that can follow from diversity. Unfortunately, promulgation of false generalizations about empirical findings can impede progress in both of these directions. Rather than ignoring or furthering distortions of scientific knowledge to fit advocacy goals, scientists should serve as honest brokers who communicate consensus scientific findings to advocates and policy makers in an effort to encourage exploration of evidence-based policy options.
From the paper
... Establishing that the presence of women on corporate boards causes any of the positive or negative outcomes is far more challenging (see Adams, 2015). As in many other domains of nonexperimental research, relatively few researchers have addressed endogeneity in a manner that allows claims about causation (Antonakis, Bendahan, Jacquart, & Lalive, 2010). However, the “business case”—that is, the boldly causal claim that including women on corporate boards improves firms’ financial outcomes, lives on in communications directed to the public and business community (e.g., Committee for Economic Development, 2015), most often supported by citations of the least informative studies, which are those containing only simple group comparisons (e.g., Catalyst, 2004; Desvaux et al., 2007).

... Over the years, a very large research literature has accumulated relating workgroup diversity to group performance, published in academic journals mainly in industrial-organizational psychology and management. These investigators have distinguished two types of diversity: job-related, which pertains to differences in knowledge and expertise related to the problems that work groups are charged with solving, and demographic, which pertains to differences in attributes such as gender, race, and age (e.g., Mannix & Neale, 2005). Research has extensively examined both of these forms of diversity.

Several meta-analyses of the diversity-performance relation have been prominently published, with the latest and most inclusive produced by van Dijk, van Engen, and van Knippenberg (2012). Among this project's 146 studies, there were three types of settings: (a) laboratory experiments (b) field studies, and (c) studies conducted on teams composed of undergraduate or MBA students. These field and student studies generally provided correlational data relating amount of diversity to group performance. The finding that the classification of studies by these three types of settings did not moderate diversity-performance relations eases concerns about endogeneity, given the greater ability of the laboratory experiments to rule out alternative explanations based on uncontrolled variables.

The meta-analysis produced mainly very small average effect sizes: The key overall findings were that demographic diversity yielded a small negative relation to performance outcomes (r = –.02), which was present for both gender diversity (r = –.01) and racial/ethnic diversity (r = –.05); all of these relations were nonsignificant. In contrast, job-related diversity produced a significant, but small, positive relation (r = .05). These findings replicated four prior meta-analyses based on smaller samples of studies (Bell, Villado, Lukasik, Belau, & Briggs, 2011; Horwitz & Horwitz, 2007; Hülsheger, Anderson, & Salgado, 2009; Joshi & Roh, 2009). In addition, a meta-analysis of 68 studies produced a nonsignificant relation between gender diversity and team performance (r = -.01; Schneid, Isidor, Li, & Kabst, 2015). Moreover, these meta-analytic results were generally consistent with earlier narrative reviewers’ cautions that demographic diversity had yielded mixed and inconclusive effects (Harrison & Klein, 2007; Mannix & Neale, 2005; Milliken & Martins, 1996; Williams & O'Reilly, 1998). 

... In summary, when aggregated across studies, an extensive research literature on group performance has shown no overall advantage for demographically diverse groups, with a small tendency toward disadvantage, especially on subjective measures of performance. However, these meta-analytic averages encompassed heterogeneous outcomes, whereby some studies did produce positive effects of diversity. Yet, approximately as many studies yielded negative effects, producing average effects that were near zero. In this respect, these findings are similar to the correlations between the representation of women as corporate directors and financial outcomes.
Concluding paragraph
To conclude, this article conveys some ways in which science, advocacy, and policy have not related easily or harmoniously. I have told two somewhat complicated stories, one pertaining to women on corporate boards and the other to workgroup diversity—two domains with extensive social scientific research relating diversity to performance outcomes. Despite the striking lack of research support for the optimistic generalizations about these outcomes that have been widely shared among advocates, policy makers, and the general public, many social scientists with relevant expertise have remained silent. It is time for more social scientists to take stock of what diversity research has produced so far and join those who are addressing the complexities of diversity's effects on group and organizational performance. It is also time for all stakeholders in diversity initiatives to focus on the violations of social justice inherent in the limited access of women and minorities to decision making in most political and corporate contexts.

Tuesday, March 08, 2016

One funeral at a time?


A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it. -- Max Planck
I'm at the annual AAU meeting of research Vice-Presidents, and the "reproducibility crisis" (in some fields) is one of the topics on the agenda. Today I heard a nice talk by Brian Nosek on the Reproducibility Project and Open Science Framework. Afterwards I asked him whether other social psychologists were really absorbing the implications of the low reproducibility rate in their field. Nosek said that attitudes were changing rapidly and that in 10 years the field would be very different. If so, progress can happen much faster than in Max Planck's pessimistic quote. Let's hope Nosek is right!

As an example of soul searching (learning can hurt!) Nosek pointed me to this blog post, by Professor Michael Inzlicht of Toronto. To his credit, Inzlicht is taking seriously the replication difficulties of two effects he has worked on in the past: ego depletion (i.e., willpower fatigue) and stereotype threat. Anyone who has looked carefully at the literature (positive effects from small sample sizes, but failed replication or very small effect size results from much larger samples) has to question whether these much-hyped phenomena are real. There is also lots of evidence for publication bias.
Inzlicht: ... I have spent nearly a decade working on the concept of ego depletion, including work that is critical of the model used to explain the phenomenon. I have been rewarded for this work, and I am convinced that the main reason I get any invitations to speak at colloquia and brown-bags these days is because of this work. The problem is that ego depletion might not even be a thing. By now, many people are aware that a massive replication attempt of the basic ego depletion effect involving over 2,000 participants found nothing, nada, zip. Only three of the 24 participating labs found a significant effect, but even then, one of these found a significant result in the wrong direction!

There is a lot more to this registered replication than the main headline, and deep in my heart, it is hard to believe that fatigue is not a real phenomenon. I promise to get to it in a later post. But for now, we are left with a sobering question: If a large sample pre-registered study found absolutely nothing, how has the ego depletion effect been replicated and extended hundreds and hundreds of times? More sobering still: What other phenomena, which we now consider obviously real and true, will be revealed to be just as fragile?

As I said, I’m in a dark place. I feel like the ground is moving from underneath me and I no longer know what is real and what is not.

I edited an entire book on stereotype threat, I have signed my name to an amicus brief to the Supreme Court of the United States citing stereotype threat, yet now I am not as certain as I once was about the robustness of the effect. I feel like a traitor for having just written that; like, I’ve disrespected my parents, a no no according to Commandment number 5. But, a meta-analysis published just last year suggests that stereotype threat, at least for some populations and under some conditions, might not be so robust after all. P-curving some of the original papers is also not comforting. Now, stereotype threat is a politically charged topic and I really really want it to be real. I think a lot more pain-staking work needs to be done before I stop believing (and rumor has it that another RRR of stereotype threat is in the works), but I would be lying if I said that doubts have not crept in. ...
See also Is science self-correcting? Figure at top is from this meta-analysis of stereotype threat.
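Inzlicht mentions p-curving the original papers. For readers unfamiliar with the idea, here is a toy simulation (my own sketch, not taken from any of the papers discussed): when the studied effect is real, the significant p-values pile up near zero (a right-skewed p-curve); when the effect is null and only significant results get reported, they are roughly uniform between 0 and .05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def significant_pvals(true_d, n_per_group=20, n_studies=20_000, alpha=0.05):
    """Simulate many two-sample studies; keep only the 'publishable' significant p-values."""
    x = rng.normal(true_d, 1.0, size=(n_studies, n_per_group))
    y = rng.normal(0.0, 1.0, size=(n_studies, n_per_group))
    p = stats.ttest_ind(x, y, axis=1).pvalue
    return p[p < alpha]

for d in (0.0, 0.5):
    p = significant_pvals(d)
    print(f"true d = {d}: fraction of significant p-values below .01 = {np.mean(p < 0.01):.2f}")
# Null effect: ~0.20 (flat p-curve).  Real effect: much larger (right-skewed p-curve).
```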

Wednesday, June 10, 2015

Replication and cumulative knowledge in life sciences

See Ioannidis at MSU for video discussion of related topics with the leading researcher in this area, and also Medical Science? Is Science Self-Correcting?
The Economics of Reproducibility in Preclinical Research (PLoS Biology)

Abstract: Low reproducibility rates within life science research undermine cumulative knowledge production and contribute to both delays and costs of therapeutic drug development. An analysis of past studies indicates that the cumulative (total) prevalence of irreproducible preclinical research exceeds 50%, resulting in approximately US$28,000,000,000 (US$28B)/year spent on preclinical research that is not reproducible—in the United States alone. We outline a framework for solutions and a plan for long-term improvements in reproducibility rates that will help to accelerate the discovery of life-saving therapies and cures.
From the introduction:
Much has been written about the alarming number of preclinical studies that were later found to be irreproducible [1,2]. Flawed preclinical studies create false hope for patients waiting for lifesaving cures; moreover, they point to systemic and costly inefficiencies in the way preclinical studies are designed, conducted, and reported. Because replication and cumulative knowledge production are cornerstones of the scientific process, these widespread accounts are scientifically troubling. Such concerns are further complicated by questions about the effectiveness of the peer review process itself [3], as well as the rapid growth of postpublication peer review (e.g., PubMed Commons, PubPeer), data sharing, and open access publishing that accelerate the identification of irreproducible studies [4]. Indeed, there are many different perspectives on the size of this problem, and published estimates of irreproducibility range from 51% [5] to 89% [6] (Fig 1). Our primary goal here is not to pinpoint the exact irreproducibility rate, but rather to identify root causes of the problem, estimate the direct costs of irreproducible research, and to develop a framework to address the highest priorities. Based on examples from within life sciences, application of economic theory, and reviewing lessons learned from other industries, we conclude that community-developed best practices and standards must play a central role in improving reproducibility going forward. ...

Sunday, May 03, 2015

Replication is hard; understanding what that means is even harder



Bad news for psychology -- only 39 of 100 published findings were replicated in a recent coordinated effort.
Nature | News: An ambitious effort to replicate 100 research findings in psychology ended last week — and the data look worrying. Results posted online on 24 April, which have not yet been peer-reviewed, suggest that key findings from only 39 of the published studies could be reproduced. ...
The article goes on:
But the situation is more nuanced than the top-line numbers suggest (See graphic, 'Reliability test'). Of the 61 non-replicated studies, scientists classed 24 as producing findings at least “moderately similar” to those of the original experiments, even though they did not meet pre-established criteria, such as statistical significance, that would count as a successful replication.  [ Yeah, right. ]
This makes me suspect bounded cognition -- humans trusting their post hoc stories and intuition instead of statistical criteria chosen before planned replication attempts.

The most tragic thing about Ioannidis's work on low replication rates and wasted research funding is that while medical researchers might pay lip service to his results (which are highly cited), they typically have not actually grasped the implications for their own work. In particular, they typically have not updated their posteriors to reflect the low reliability of research results, even in the top journals.
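A minimal sketch of the posterior updating the previous paragraph has in mind (my own illustration, with assumed prior and power values in the spirit of Ioannidis's argument): the probability that a statistically significant finding reflects a real effect depends strongly on the prior odds and on the power of the study, even before bias and p-hacking are taken into account.

```python
def prob_true_given_significant(prior, power, alpha=0.05):
    """P(effect is real | result is significant), ignoring bias and p-hacking."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Illustrative assumed numbers only.
for prior, power in [(0.1, 0.2), (0.1, 0.8), (0.5, 0.8)]:
    ppv = prob_true_given_significant(prior, power)
    print(f"prior = {prior:.1f}, power = {power:.1f} -> P(true | significant) = {ppv:.2f}")
```

A long-shot hypothesis tested at low power yields a "significant" result that is more likely false than true; that is the posterior most readers of top journals have not internalized.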


Tuesday, October 27, 2009

Do flu vaccines work?

The more I learn about the quality standards of medical research, the less I believe what my doctor tells me! Notice how hard it is, even in a "scientific" context, to oppose the conventional wisdom.

Related posts: bounded cognition, Bouchard against group think, the future of innovation.

More from Gary Taubes on diet and nutrition issues, and WIRED on placebo mysteries.

As a postdoc I briefly dated a woman who was doing graduate work at Harvard's school of public health. I was a bit surprised at her low opinion of the scientific and statistical prowess of medical doctors, at least until I thought back to all the pre-med students I had taught while in grad school ;-) She pointed out to me that most MD's (as opposed to MD/PhD's) aren't scientists at all -- they simply apply what they're taught in medical school.

Atlantic Monthly: ... When Lisa Jackson, a physician and senior investigator with the Group Health Research Center, in Seattle, began wondering aloud to colleagues if maybe something was amiss with the estimate of 50 percent mortality reduction for people who get flu vaccine, the response she got sounded more like doctrine than science. “People told me, ‘No good can come of [asking] this,’” she says. “‘Potentially a lot of bad could happen’ for me professionally by raising any criticism that might dissuade people from getting vaccinated, because of course, ‘We know that vaccine works.’ This was the prevailing wisdom.”

Nonetheless, in 2004, Jackson and three colleagues set out to determine whether the mortality difference between the vaccinated and the unvaccinated might be caused by a phenomenon known as the “healthy user effect.” They hypothesized that on average, people who get vaccinated are simply healthier than those who don’t, and thus less liable to die over the short term. People who don’t get vaccinated may be bedridden or otherwise too sick to go get a shot. They may also be more likely to succumb to flu or any other illness, because they are generally older and sicker. To test their thesis, Jackson and her colleagues combed through eight years of medical data on more than 72,000 people 65 and older. They looked at who got flu shots and who didn’t. Then they examined which group’s members were more likely to die of any cause when it was not flu season.

Jackson’s findings showed that outside of flu season, the baseline risk of death among people who did not get vaccinated was approximately 60 percent higher than among those who did, lending support to the hypothesis that on average, healthy people chose to get the vaccine, while the “frail elderly” didn’t or couldn’t. In fact, the healthy-user effect explained the entire benefit that other researchers were attributing to flu vaccine, suggesting that the vaccine itself might not reduce mortality at all. Jackson’s papers “are beautiful,” says Lone Simonsen, who is a professor of global health at George Washington University, in Washington, D.C., and an internationally recognized expert in influenza and vaccine epidemiology. “They are classic studies in epidemiology, they are so carefully done.”

The results were also so unexpected that many experts simply refused to believe them. Jackson’s papers were turned down for publication in the top-ranked medical journals. One flu expert who reviewed her studies for the Journal of the American Medical Association wrote, “To accept these results would be to say that the earth is flat!” When the papers were finally published in 2006, in the less prominent International Journal of Epidemiology, they were largely ignored by doctors and public-health officials. “The answer I got,” says Jackson, “was not the right answer.”

... THE MOST vocal—and undoubtedly most vexing—critic of the gospel of flu vaccine is the Cochrane Collaboration’s Jefferson, who’s also an epidemiologist trained at the famed London School of Tropical Hygiene, and who, in Lisa Jackson’s view, makes other skeptics seem “moderate by comparison.” Among his fellow flu researchers, Jefferson’s outspokenness has made him something of a pariah. At a 2007 meeting on pandemic preparedness at a hotel in Bethesda, Maryland, Jefferson, who’d been invited to speak at the conference, was not greeted by any of the colleagues milling about the lobby. He ate his meals in the hotel restaurant alone, surrounded by scientists chatting amiably at other tables. He shrugs off such treatment. As a medical officer working for the United Nations in 1992, during the siege of Sarajevo, he and other peacekeepers were captured and held for more than a month by militiamen brandishing AK-47s and reeking of alcohol. Professional shunning seems trivial by comparison, he says.

“Tom Jefferson has taken a lot of heat just for saying, ‘Here’s the evidence: it’s not very good,’” says Majumdar. “The reaction has been so dogmatic and even hysterical that you’d think he was advocating stealing babies.” Yet while other flu researchers may not like what Jefferson has to say, they cannot ignore the fact that he knows the flu-vaccine literature better than anyone else on the planet. He leads an international team of researchers who have combed through hundreds of flu-vaccine studies. The vast majority of the studies were deeply flawed, says Jefferson. “Rubbish is not a scientific term, but I think it’s the term that applies.” Only four studies were properly designed to pin down the effectiveness of flu vaccine, he says, and two of those showed that it might be effective in certain groups of patients, such as school-age children with no underlying health issues like asthma. The other two showed equivocal results or no benefit.

... In the flu-vaccine world, Jefferson’s call for placebo-controlled studies is considered so radical that even some of his fellow skeptics oppose it.
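A toy simulation of the healthy-user effect described above (my own sketch; all numbers are made up for illustration): give the vaccine zero effect on mortality, but let healthier people be more likely to get vaccinated, and the vaccinated group still shows a large apparent mortality benefit.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Latent "frailty": higher means sicker.  Frail people are less likely to get the shot
# and more likely to die, but the vaccine itself has NO effect on mortality here.
frailty = rng.normal(0.0, 1.0, n)
vaccinated = rng.random(n) < 1.0 / (1.0 + np.exp(frailty))     # healthier -> more likely vaccinated
p_death = np.clip(0.02 * np.exp(0.8 * frailty), 0.0, 1.0)      # mortality rises with frailty
died = rng.random(n) < p_death

rate_v = died[vaccinated].mean()
rate_u = died[~vaccinated].mean()
print(f"mortality, vaccinated:    {rate_v:.2%}")
print(f"mortality, unvaccinated:  {rate_u:.2%}")
print(f"excess risk of the unvaccinated: {rate_u / rate_v - 1:.0%}  (vaccine effect is zero by construction)")
```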

Tuesday, August 04, 2009

Economics: is it science?

The Making of An Economist Redux by David Colander

The paper describes a survey taken by graduate students at seven top-ranking economics programs. There is an enormous spread in opinions on questions such as: Is neoclassical economics relevant? Do economists agree on fundamental issues? Can fiscal policy be an effective stabilizer? Should the Fed maintain a constant growth rate of the money supply? (See below for results.)

For example, just over 50% of students agreed somewhat or strongly that Economists agree on fundamental issues, whereas 44% disagreed. These results were barely changed over 20 years -- is that a strong indicator of little progress in the field, at least as far as fundamental issues are concerned? ***

(Isn't whether economists agree on fundamental issues itself a fundamental issue? If economists do not agree on whether economists agree on fundamental issues, isn't there an issue? ;-)

See the paper for more, or the book, which contains transcripts of interviews conducted at each of the schools. Some parts of the interviews are very funny, like when Colander asks the students Have you ever read any older economists, like Hayek, Robinson or Keynes? The discussion of sociology vs economics is also quite amusing.

The most tragic (yet deepest) question in the interviews is Do economists test theories? Colander obviously understands what he is asking but few of the students seem to.

In the survey results below, "Then" refers to an earlier survey done in the 1980s, "Now" refers to one done around 2002. Click below for larger images.










*** So as not to appear to be an ugly physicist, let me note that we have yet to find the Higgs particle, haven't discovered what the dark matter is but rather discovered a new mysterious substance called dark energy that no one can explain, and still don't know how quantum gravity works. I might also add that physicists disagree about the foundations / interpretation of quantum mechanics, although we don't disagree about the fact that we disagree :-)

Wednesday, February 11, 2009

The World's Greatest Economic Minds

Via Brad DeLong:

When a questioner suggested a summit of the nation's best economists, Krugman said something like: "We know what will happen if we bring together the greatest economic minds. It's spread across the blogosphere every day, and it's not pretty."

See also here:

Mr Krugman gives liberals the economics they want. Mr Barro gives conservatives the same service. They narrow or deny the common ground. Why does this matter? Because the views of readers inclined to one side or the other are further polarised; and in the middle, those of no decided allegiance conclude that economics is bunk.


Listen to this interview with former Goldman banker John Talbott for some refreshing straight talk (Leonard Lopate Show).



Friday, October 17, 2008

Modigliani-Miller, RIP

Even a casual observer knows that the current financial crisis was partially caused by high leverage ratios. But did you know that a Nobel Prize in Economics [sic] was awarded for a theorem that "proved" that leverage ratios don't matter? Yes, it is called the Modigliani-Miller theorem:

Leverage doesn't matter!

Consider two firms which are identical except for their financial structures. The first (Firm U) is unlevered: that is, it is financed by equity only. The other (Firm L) is levered: it is financed partly by equity, and partly by debt. The Modigliani-Miller theorem states that the value of the two firms is the same.

See if you can spot the completely unrealistic efficient market "no-arbitrage" assumption used to prove the theorem:

Proposition: V_U = V_L, where V_U is the value of an unlevered firm = price of buying a firm composed only of equity, and V_L is the value of a levered firm = price of buying a firm that is composed of some mix of debt and equity.

To see why this should be true, suppose an investor is considering buying one of the two firms U or L. Instead of purchasing the shares of the levered firm L, he could purchase the shares of firm U and borrow the same amount of money B that firm L does. The eventual returns to either of these investments would be the same. [Ha ha ha ha!] Therefore the price of L must be the same as the price of U minus the money borrowed B, which is the value of L's debt.

If U and L are banks investing in risky mortgage assets, which do you think is going to collapse due to a run when those assets are marked down? (See here for more problems.)
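A toy numerical sketch of the argument and of where it breaks (my own illustration; the parameters are made up): under the MM assumptions the investor can borrow at the same rate as the firm and there is no default, so the two strategies pay off identically in every state, and no-arbitrage forces the price of L's equity to equal V_U minus B, i.e. V_L = V_U. Allow default and limited liability, never mind bankruptcy costs, runs, or marked-down assets, and the "home-made leverage" replication no longer matches the levered firm's equity.

```python
import numpy as np

rng = np.random.default_rng(2)

# One-period toy model: random next-period value of the firm's assets.
asset_value = 100.0 * rng.lognormal(0.0, 0.5, size=100_000)
B = 80.0          # face value of firm L's debt, and of the investor's home-made borrowing
r = 0.0           # riskless rate, for simplicity

# Strategy 1: buy all the equity of levered firm L.
# Strategy 2: buy unlevered firm U and borrow B yourself at the same rate.
# Under the MM assumptions (same borrowing rate, no default, no taxes or frictions)
# the payoffs are identical state by state, so no-arbitrage equates the two prices.
payoff_L_no_default = asset_value - B * (1 + r)
payoff_homemade = asset_value - B * (1 + r)
print(np.allclose(payoff_L_no_default, payoff_homemade))       # True: the "proof" in a nutshell

# Allow default: corporate equity has limited liability, but the home-made borrower
# is still on the hook for B, so the replication argument no longer goes through.
payoff_L_limited = np.maximum(asset_value - B * (1 + r), 0.0)
print(f"P(default state):                 {np.mean(asset_value < B * (1 + r)):.2f}")
print(f"mean payoff, firm L equity:       {payoff_L_limited.mean():.2f}")
print(f"mean payoff, home-made leverage:  {payoff_homemade.mean():.2f}")
```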

Believe it or not, Nobelist Myron Scholes invokes Modigliani-Miller in this debate over stronger financial regulation at the Economist web site. ***

Here is what I wrote in a post back in March 2008 (Privatizing gains, socializing losses).

I'd like to hear a believer in efficient markets try to tell the story of Bear Stearns' demise. One week it was OK for them to be levered 30 to 1, the next week it wasn't? When the stock was at 65 people were comfortable with their exposure to mortgages, but then suddenly they weren't? Come on.

My corollary to the Modigliani-Miller theorem: A Nobel in Economics ain't no Nobel in Physics. (Sorry, Krugman.)


*** Footnote: I do agree with Scholes' point that any argument for regulation needs to take into account the benefits from innovation that we might be giving up.
