Tuesday, December 10, 2013

Is science self-correcting?


More fun from our man Ioannidis. See earlier posts Medical science?, NIH discovers reproducibility, and Bounded cognition.

A toy model of the dynamics of scientific research, with probability distributions for accuracy of experimental results, mechanisms for updating of beliefs by individual scientists, crowd behavior, bounded cognition, etc. can easily exhibit parameter regions where progress is limited (one could even find equilibria in which most beliefs held by individual scientists are false!). Obviously the complexity of the systems under study and the quality of human capital in a particular field are important determinants of the rate of progress and its character.
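One way such a toy model might look, as a minimal sketch rather than anything taken from the post: a population of scientists holds a binary belief about a single proposition. At each step a randomly chosen scientist either copies the current majority (with probability `herding`) or runs an experiment that gives the right answer with probability `accuracy`. The function name, parameters, and update rule are all illustrative assumptions.

```python
import random


def simulate(accuracy, herding, n_agents=1000, sweeps=50,
             init_correct=0.2, seed=0):
    """Return the final fraction of agents holding the correct belief.

    Hypothetical toy model: all names and the update rule are assumptions.
    """
    rng = random.Random(seed)
    # True means the agent currently holds the correct belief.
    beliefs = [rng.random() < init_correct for _ in range(n_agents)]
    n_correct = sum(beliefs)
    for _ in range(sweeps * n_agents):
        i = rng.randrange(n_agents)
        if rng.random() < herding:
            # Crowd behavior: adopt whatever the majority believes right now.
            new = n_correct * 2 > n_agents
        else:
            # Independent experiment: correct with probability `accuracy`.
            new = rng.random() < accuracy
        n_correct += new - beliefs[i]
        beliefs[i] = new
    return n_correct / n_agents


if __name__ == "__main__":
    for herding in (0.2, 0.5):
        frac = simulate(accuracy=0.8, herding=herding)
        print(f"herding={herding}: fraction correct = {frac:.2f}")
```

In this sketch, a population that starts out mostly wrong stays mostly wrong whenever (1 - herding) * accuracy is below 0.5, even though each individual experiment is right 80% of the time. With weaker herding the same starting point converges on the correct belief.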

In physics it is said that successful new theories swallow their predecessors whole. That is, even revolutionary new theories (e.g., special relativity or quantum mechanics) reduce to their predecessors in the previously studied circumstances (e.g., low velocity, macroscopic objects). Swallowing whole is a sign of proper function -- it means the previous generation of scientists was competent: what they believed to be true was (at least approximately) true. Their models were accurate in some limit and could continue to be used when appropriate (e.g., Newtonian mechanics).

In some fields (not to name names!) we don't see this phenomenon. Rather, we see new paradigms which wholly contradict earlier strongly held beliefs that were predominant in the field* -- there was no range of circumstances in which the earlier beliefs were correct. We might even see oscillations of mutually contradictory, widely accepted paradigms over decades.

It takes a serious interest in the history of science (and some brainpower) to determine which of the two regimes above describes a particular area of research. I believe we have good examples of both types in the academy.

* This means the earlier (or later!) generation of scientists in that field was incompetent. One or more of the following must have been true: their experimental observations were shoddy, they derived overly strong beliefs from weak data, they allowed overly strong priors to determine their beliefs.

Why Science Is Not Necessarily Self-Correcting 
(DOI: 10.1177/1745691612464056)

John P. A. Ioannidis
Stanford Prevention Research Center, Department of Medicine and Department of Health Research and Policy, Stanford University School of Medicine, and Department of Statistics, Stanford University School of Humanities and Sciences

The ability to self-correct is considered a hallmark of science. However, self-correction does not always happen to scientific evidence by default. The trajectory of scientific credibility can fluctuate over time, both for defined scientific fields and for science at-large. History suggests that major catastrophes in scientific credibility are unfortunately possible and the argument that “it is obvious that progress is made” is weak. Careful evaluation of the current status of credibility of various scientific fields is important in order to understand any credibility deficits and how one could obtain and establish more trustworthy results. Efficient and unbiased replication mechanisms are essential for maintaining high levels of scientific credibility. Depending on the types of results obtained in the discovery and replication phases, there are different paradigms of research: optimal, self-correcting, false nonreplication, and perpetuated fallacy. In the absence of replication efforts, one is left with unconfirmed (genuine) discoveries and unchallenged fallacies. In several fields of investigation, including many areas of psychological science, perpetuated and unchallenged fallacies may comprise the majority of the circulating evidence. I catalogue a number of impediments to self-correction that have been empirically studied in psychological science. Finally, I discuss some proposed solutions to promote sound replication practices enhancing the credibility of scientific results as well as some potential disadvantages of each of them. Any deviation from the principle that seeking the truth has priority over any other goals may be seriously damaging to the self-correcting functions of science.

23 comments:

  1. efalken 11:50 AM

    There are theories like String Theory, or Keynesianism, that are too broad to ever be falsified. Bad theories aren't rejected, they're orphaned.

  2. Diogenes 7:12 PM

    "Obviously the complexity of the systems under study and the quality of
    human capital in a particular field are important determinants of the
    rate of progress and its character."

    so, again, steve must claim that physicists are smarter than everyone else. but how smart could anyone be who chose physics? i'll take antibiotics and vaccines over modern physics any day.

  3. Diogenes 7:32 PM

    humans are very unlike all other animals in one respect: the variety of their behavior.


    humans live in greenland and the sahara. they live in manhattan high rises and in huts. they make their living writing (to people or to computers) and by hunting, by speculating in derivatives and by prostitution.



    eskimos and bedouin differ much more in culture than in genes.


    psychology is at best a branch of anthropology. at worst, it's a pseudoscience with a very clear ideological function.

  4. Endre Bakken Stovner 8:27 AM

    The paper was pretty funny.

    It includes a list of six rules to follow to avoid the problem of many false positive publications:

    1. Authors must decide the rule for terminating data collection before data collection begins and report this rule in the article.
    2. Authors must collect at least 20 observations per cell or else provide a compelling cost-of-data-collection justification.
    3. Authors must list all variables collected in a study.
    4. Authors must report all experimental conditions, including failed manipulations.
    5. If observations are eliminated, authors must also report what the statistical results are if those observations are included.
    6. If an analysis includes a covariate, authors must report the statistical results of the analysis without the covariate.

    Seems like something that would be useful to researchers in biology and medicine too.

  5. Emil Kirkegaard 10:02 AM

    You should read the entire special issue on reproducibility (of which this was the last paper). I put all the papers here: http://emilkirkegaard.dk/en/?p=3395

  6. Rudel 2:40 PM

    "so, again, steve must claim that physicists are smarter than everyone else."


    Not just physicists but East Asian physicists if the truth be known. And I agree, just as some areas of theoretical mathematics are meaningless, so it is with much unprovable cosmology and aberrant intellectual masturbations like string "theory" which are utterly useless. Just because a hypothesis is "elegant" doesn't make it true if it can't be tested empirically.

  7. Rudel 2:50 PM

    "eskimos and bedouin differ much more in culture than in genes."


    A generality that has ill-defined terms. On the surface I find that most tribal cultures exhibit boring similarities and that the Arab and Eskimo phenotypes imply strong selection due to climate.

  8. Carson Chow 6:16 PM

    Do you have an example toy model? If you do, I won't have to bother making one.

  9. Diogenes 7:45 PM

    yes they have adapted physically/genetically. perhaps their behavioral adaptation is partly genetic too. but i have no doubt that an eskimo raised by bedouins would learn to speak whatever bedouin language, learn to behave like a bedouin. he wouldn't grow up to say, "i'm sick of this tent living, i want an igloo. i'm sick of this mint tea, i want blubber."

    the emphasis on "individual differences" as explanation for success and failure ignores that the traits of the individual are affected by the culture he lives in. and his success or failure by the traits his culture values.

  10. dxie48 10:59 PM

    There seem to be things that replication cannot fix.

    DSM, the bible for psychologists,

    http://en.wikipedia.org/wiki/DSM-IV-TR

    "Diagnostic and Statistical Manual of Mental Disorders"

    quote,

    "The DSM has been praised for standardizing psychiatric diagnostic categories and criteria.
    It has also generated controversy and criticism. Critics, including the National Institute of Mental Health,
    argue that the DSM represents an unscientific and subjective system.[1]

    There are ongoing issues concerning

    the validity and reliability of the diagnostic categories;
    the reliance on superficial symptoms;
    the use of artificial dividing lines between categories and from 'normality';
    possible cultural bias;
    medicalization of human distress."

    There appears to be, using an expression I learned this week from of all people Paris Hilton, "so beyond".

    Example,

    http://en.wikipedia.org/wiki/Mental_disorders_diagnosed_in_childhood

    quote,

    "Intellectual disability

    DSM-IV-TR
    ...
    There are varying degrees of intellectual disability, which are identified by an IQ test.
    ...
    intellectual disability, Severity Unspecified:
    This unspecified diagnosis is given when there is a strong assumption that the child is mentally retarded,
    but cannot be tested because the individual ... not willing to take the IQ test or is an infant."

    This appears to be diagnosis by subjective assumption, not by test.

  11. dxie48 11:13 PM

    Off topic, but concerning the comment system,

    http://cornucopia-en.cornubot.se/2013/12/flash-disqus-cracked-security-flaw.html

    "FLASH: Disqus cracked - security flaw reveals user e-mail addresses"

    "The crack uses a serious security flaw in the Disqus API:s, enabling the extraction of MD5 hashes of user e-mail addresses."

    http://en.wikipedia.org/wiki/MD5

    "In 1996 a flaw was found in the design of MD5. While it was not a
    clearly fatal weakness, cryptographers began recommending the use of
    other algorithms ..."

    This raises the question of whether the security was intentionally weakened.

  12. David Coughlin 10:06 AM

    I have heard anecdotally that DSM-V is a complete clownshow.

  13. Richard Seiter 11:09 AM

    I think you are correct to a large degree, but wonder how substantial the physical adaptation issues are. Eskimos and bedouins live in VERY different environments. Body fat composition could affect survival dramatically (I wonder what the nature/nurture balance is for body types in those cultures). Another difference is eskimos (on their historical diet) have a high omega 3 fatty acid intake. Are they perhaps more dependent on that than other populations? (something that's useful as a comparison is the loss of the ability to produce vitamin C in primates and other species. Hypothesized to have originated as a survival advantage when living on a diet high in vitamin C, e.g. fruit) Would an eskimo/bedouin even survive if raised in the other's ancestral environment?


    The problem I have with the whole nature/nurture physical/culture etc. debate is that so few people are willing to have a serious conversation about the different roles each plays and instead spend all their time attacking extremist strawmen (and at their worst acting like extremist strawmen).

  14. Richard Seiter 11:16 AM

    Who would have guessed such a much derided movie could have an interesting message? Thank you.

  15. Richard Seiter 11:31 AM

    Kind of like the 2013 Economics Nobel Prize ;-)


    A nice optimistic take (I hope you are right). I would feel better if I saw more Synthesis and less Thesis and Antithesis shouting at each other (and most of all at any attempts at Synthesis it seems). This seems to be a common theme in current American public discourse.


    On a related note, can you suggest any good historical analogies for how Thesis/Antithesis/Synthesis works in practice? For example, is it just a matter of generations of researchers/advocates dying off, or is there a more sophisticated analysis explaining the timing of changes?

  16. Rudel 2:39 PM

    It's not the movie that is derided, it's the leading man.

  17. Richard Seiter 3:07 PM

    One can argue about the balance, but I think derision of the movie (or at least the title) was used to deride the leading man by association.

  18. Diogenes 7:14 PM

    the big physical differences i know of are that eskimos tend to be shaped like weebles http://en.wikipedia.org/wiki/Weeble as predicted by allen's rule http://en.wikipedia.org/wiki/Allen%27s_rule (a sphere has the highest volume to surface area ratio). also the bedouin tend to have big noses, supposedly an adaptation to low humidity.

    "... is that so few people are willing to have a serious conversation about the different roles each plays..."

    bingo. it's not as if: if the extreme hereditarians were right, that should mean less government and no concern for inequality.

    imho, inequality, per se, is ugly and if eugenics is required to remedy it, then eugenics should be govt policy.



    a "deep" reason for opposition to eugenics is it would mean the end to liberal capitalism. liberal capitalism cannot survive when the "you know those people. what do you expect?" argument is taken away. and any govt powerful enough to carry out eugenics would be a threat to capital's hegemony. but i don't think brave new world's cyprus would be the result.

    obviously the crimes of hitler and stalin limit acceptable opinion. but hitler was an animal lover and tree hugger too.

  19. dxie48 9:00 PM

    Interesting metric from,

    http://retractionwatch.com/2013/07/11/why-has-the-number-of-scientific-retractions-increased-new-study-tries-to-answer/

    "Time-to-retraction (from publication of article to publication of
    retraction) averaged 32.91 months. Among 714 retracted articles
    published in or before 2002, retraction required 49.82 months; among
    1,333 retracted articles published after 2002, retraction required 23.82
    months (p<0.0001). This suggests that journals are retracting papers
    more quickly than in the past, although recent articles requiring
    retraction may not have been recognized yet."


    A half-life metric might also be appropriate.
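As a quick consistency check on the quoted figures, the overall mean should be the size-weighted average of the two subgroup means. The sketch below uses only the numbers in the quote; the function name is a hypothetical helper.

```python
def pooled_mean(groups):
    """Size-weighted mean of (count, mean) pairs."""
    total = sum(count for count, _ in groups)
    return sum(count * mean for count, mean in groups) / total


# (number of retracted articles, mean months to retraction), from the quote.
groups = [(714, 49.82), (1333, 23.82)]

if __name__ == "__main__":
    print(f"pooled time-to-retraction: {pooled_mean(groups):.2f} months")
```

The result is about 32.89 months, close to the 32.91 reported; the small gap is plausibly due to rounding in the published subgroup means.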

  20. Diogenes 3:59 AM

    hegel never used those terms. his were "moment", negation, aufhebung or "absolute negativity". that is, the "synthesis" was not a compromise but a "neither this nor that". that is, again, the problem with thesis and antithesis was conceptual. it wasn't that one was true and the other false or there was some via media, it was that the picture of the world they expressed did not correspond to the world as it is.

    the aufhebung for nature v nurture, imho, is the simple proposition "each has an ideal environment wherein he will reach his potential. the prevailing environment may be closer to the ideal for some than for others, but potentials do differ."

    it should be kept in mind that those at the outer edge of the bell curve are more distant from their genetic true score than those in the middle. given a twin-twin rho of .68, if one twin scores 160, the likelihood the other twin will score lower is 98%.
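The regression-to-the-mean figure can be checked with a short calculation. This sketch assumes the twins' scores are bivariate normal with mean 100 and SD 15, which the comment does not state; only the correlation of .68 and the score of 160 come from the comment, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_cotwin_lower(score, rho, mean=100.0, sd=15.0):
    """P(co-twin scores below `score`), given one twin scored `score`.

    Assumes a bivariate normal with equal means and SDs (an assumption).
    """
    # Conditional distribution of the co-twin's score given the first twin's.
    cond_mean = mean + rho * (score - mean)
    cond_sd = sd * sqrt(1.0 - rho ** 2)
    return NormalDist(cond_mean, cond_sd).cdf(score)


if __name__ == "__main__":
    p = prob_cotwin_lower(score=160, rho=0.68)
    print(f"P(co-twin < 160) = {p:.3f}")
```

Under these assumptions the co-twin's expected score is 140.8 and the probability comes out at roughly 96%, a little below the 98% quoted, though the qualitative point about regression toward the mean is unchanged.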

  21. stevesailer 8:59 PM

    Strangely enough, a lot of the derision directed at the movie over the years by Johnny Carson wasn't aimed at Reagan but was Carson's joking at the expense of his producer Fred De Cordova, who had directed "Bedtime for Bonzo." De Cordova, who was the model for Rip Torn's great producer character Artie in the "Larry Sanders Show," was standing right off the edge of the stage throughout Carson's long run, so Carson made a lot of Bedtime for Bonzo jokes at his expense.

  22. Diogenes 1:04 AM

    all psychiatric disorders are syndromes without any tests. and psychiatrists are in business. and psychiatric disorders are often chronic so drug companies have an incentive to develop drugs for them. it's also possible that academic psychiatrists have career motivations for finding (making up) a new disorder. all of these contribute to the expansion in the number of disorders and the "pathologizing of difference". these mercenary motives are especially apparent in child psychiatry, which has exploded over the last 20 years. the children are easy marks.

    also, very unlike other medical concerns, many people want to have something psychiatric wrong with them, want to be special, and want to believe that all their troubles will go away with the right pill. i've read that self-diagnosis of asperger's is common in silicon valley.

    but just limiting to the apparent (to me) "natural kinds" of depression, ocd, scz, addiction, anxiety, manic depression, how can the check list method be improved on when the causes are unknown?


    all that said, psychiatrists would do well to read "industrial society and its future". if you've seen the "thin blue line" you know court psychiatrists can be deliberate frauds. i doubt whether ted k had anything more wrong with him than bin laden or any other terrorist.

  23. LondonYoung 11:26 AM

    Indeed. "Any deviation from the principle that seeking the truth has priority over any other goals..." If the outcome of the "science" is going to be govt policy, I suppose entire fields want to get out in front of that trade. The physicists are probably tame since changes in our understanding of cosmology are not going to influence govt policy.
