Tuesday, December 10, 2013

Is science self-correcting?


More fun from our man Ioannidis. See earlier posts: Medical science?, NIH discovers reproducibility, and Bounded cognition.

A toy model of the dynamics of scientific research, with probability distributions for accuracy of experimental results, mechanisms for updating of beliefs by individual scientists, crowd behavior, bounded cognition, etc. can easily exhibit parameter regions where progress is limited (one could even find equilibria in which most beliefs held by individual scientists are false!). Obviously the complexity of the systems under study and the quality of human capital in a particular field are important determinants of the rate of progress and its character.
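To make the idea concrete, here is a minimal sketch of one such toy model in Python. It is an illustration only, not a model taken from the Ioannidis paper below: the function name `simulate_field` and all parameters (`prior_true`, `power`, `alpha`, `replication_rate`, `rounds`) are hypothetical choices. Each claim is either true or false, an initial study "discovers" it with probability `power` (if true) or `alpha` (if false), and in each later round an accepted claim is re-tested with probability `replication_rate` and dropped if the replication fails.

```python
import random


def simulate_field(n_claims=20_000, prior_true=0.05, power=0.5, alpha=0.05,
                   replication_rate=0.0, rounds=5, seed=0):
    """Return the fraction of currently accepted claims that are false.

    All parameters are illustrative assumptions, not estimates for any field.
    """
    rng = random.Random(seed)

    def study_is_positive(is_true):
        # A study reports a positive result with probability `power`
        # for a true claim and `alpha` for a false one.
        return rng.random() < (power if is_true else alpha)

    truth = [rng.random() < prior_true for _ in range(n_claims)]
    # Discovery phase: the field believes whatever the first study found.
    accepted = [study_is_positive(t) for t in truth]

    # Replication phase: some accepted claims are re-tested each round,
    # and a failed replication removes the claim from the accepted set.
    for _ in range(rounds):
        for i, is_true in enumerate(truth):
            if accepted[i] and rng.random() < replication_rate:
                accepted[i] = study_is_positive(is_true)

    n_accepted = sum(accepted)
    n_false = sum(1 for t, a in zip(truth, accepted) if a and not t)
    return n_false / n_accepted if n_accepted else 0.0


if __name__ == "__main__":
    print("no replication:  ", simulate_field(replication_rate=0.0))
    print("with replication:", simulate_field(replication_rate=0.5))
```

With these made-up defaults and no replication, most accepted claims are false, because false claims vastly outnumber true ones before testing. Turning replication on lowers that fraction, at the cost of also discarding some true claims when power is modest.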

In physics it is said that successful new theories swallow their predecessors whole. That is, even revolutionary new theories (e.g., special relativity or quantum mechanics) reduce to their predecessors in the previously studied circumstances (e.g., low velocity, macroscopic objects). Swallowing whole is a sign of proper function -- it means the previous generation of scientists was competent: what they believed to be true was (at least approximately) true. Their models were accurate in some limit and could continue to be used when appropriate (e.g., Newtonian mechanics).

In some fields (not to name names!) we don't see this phenomenon. Rather, we see new paradigms which wholly contradict earlier strongly held beliefs that were predominant in the field* -- there was no range of circumstances in which the earlier beliefs were correct. We might even see oscillations of mutually contradictory, widely accepted paradigms over decades.

It takes a serious interest in the history of science (and some brainpower) to determine which of the two regimes above describes a particular area of research. I believe we have good examples of both types in the academy.

* This means the earlier (or later!) generation of scientists in that field was incompetent. One or more of the following must have been true: their experimental observations were shoddy, they derived overly strong beliefs from weak data, they allowed overly strong priors to determine their beliefs.

Why Science Is Not Necessarily Self-Correcting 
(DOI: 10.1177/1745691612464056)

John P. A. Ioannidis
Stanford Prevention Research Center, Department of Medicine and Department of Health Research and Policy, Stanford University School of Medicine, and Department of Statistics, Stanford University School of Humanities and Sciences

The ability to self-correct is considered a hallmark of science. However, self-correction does not always happen to scientific evidence by default. The trajectory of scientific credibility can fluctuate over time, both for defined scientific fields and for science at-large. History suggests that major catastrophes in scientific credibility are unfortunately possible and the argument that “it is obvious that progress is made” is weak. Careful evaluation of the current status of credibility of various scientific fields is important in order to understand any credibility deficits and how one could obtain and establish more trustworthy results. Efficient and unbiased replication mechanisms are essential for maintaining high levels of scientific credibility. Depending on the types of results obtained in the discovery and replication phases, there are different paradigms of research: optimal, self-correcting, false nonreplication, and perpetuated fallacy. In the absence of replication efforts, one is left with unconfirmed (genuine) discoveries and unchallenged fallacies. In several fields of investigation, including many areas of psychological science, perpetuated and unchallenged fallacies may comprise the majority of the circulating evidence. I catalogue a number of impediments to self-correction that have been empirically studied in psychological science. Finally, I discuss some proposed solutions to promote sound replication practices enhancing the credibility of scientific results as well as some potential disadvantages of each of them. Any deviation from the principle that seeking the truth has priority over any other goals may be seriously damaging to the self-correcting functions of science.

23 comments:

efalken said...

There are theories like String Theory, or Keynesianism, that are too broad to ever be falsified. Bad theories aren't rejected, they're orphaned.

Diogenes said...

"Obviously the complexity of the systems under study and the quality of
human capital in a particular field are important determinants of the
rate of progress and its character."

so, again, steve must claim that physicists are smarter than everyone else. but how smart could anyone be who chose physics? i'll take antibiotics and vaccines over modern physics any day.

Diogenes said...

humans are very unlike all other animals in one respect: the variety of their behavior.


humans live in greenland and the sahara. they live in manhattan high rises and in huts. they make their living writing (to people or to computers) and by hunting, by speculating in derivatives and by prostitution.



eskimos and bedouin differ much more in culture than in genes.


psychology is at best a branch of anthropology. at worst, it's a pseudoscience with a very clear ideological function.

Endre Bakken Stovner said...

The paper was pretty funny.

It includes a list of six rules to follow to avoid the problem of many false positive publications:

1. Authors must decide the rule for terminating data collection before data collection begins and report this rule in the article.
2. Authors must collect at least 20 observations per cell or else provide a compelling cost-of-data-collection justification.
3. Authors must list all variables collected in a study.
4. Authors must report all experimental conditions, including failed manipulations.
5. If observations are eliminated, authors must also report what the statistical results are if those observations are included.
6. If an analysis includes a covariate, authors must report the statistical results of the analysis without the covariate.

Seems like something that would be useful to researchers in biology and medicine too.

Emil Kirkegaard said...

You should read the entire special issue on reproducibility (of which this was the last paper). I put all the papers here: http://emilkirkegaard.dk/en/?p=3395

Rudel said...

"so, again, steve must claim that physicists are smarter than everyone else."


Not just physicists but East Asian physicists if the truth be known. And I agree, just as some areas of theoretical mathematics are meaningless, so it is with much unprovable cosmology and aberrant intellectual masturbations like string "theory" which are utterly useless. Just because a hypothesis is "elegant" doesn't make it true if it can't be tested empirically.

Rudel said...

"eskimos and bedouin differ much more in culture than in genes."


A generality that has ill-defined terms. On the surface I find that most tribal cultures exhibit boring similarities and that the Arab and Eskimo phenotypes imply strong selection due to climate.

Carson Chow said...

Do you have an example toy model? If you do, I won't have to bother making one.

Diogenes said...

yes they have adapted physically/genetically. perhaps their behavioral adaptation is partly genetic too. but i have no doubt that an eskimo raised by bedouins would learn to speak whatever bedouin language, learn to behave like a bedouin. he wouldn't grow up to say, "i'm sick of this tent living, i want an igloo. i'm sick of this mint tea, i want blubber."

the emphasis on "individual differences" as explanation for success and failure ignores that the traits of the individual are affected by the culture he lives in. and his success or failure by the traits his culture values.

dxie48 said...

There seem to be things that replication cannot fix.

DSM, the bible for psychologists,

http://en.wikipedia.org/wiki/DSM-IV-TR

"Diagnostic and Statistical Manual of Mental Disorders"

quote,

"The DSM has been praised for standardizing psychiatric diagnostic categories and criteria.
It has also generated controversy and criticism. Critics, including the National Institute of Mental Health,
argue that the DSM represents an unscientific and subjective system.[1]

There are ongoing issues concerning

the validity and reliability of the diagnostic categories;
the reliance on superficial symptoms;
the use of artificial dividing lines between categories and from 'normality';
possible cultural bias;
medicalization of human distress."

There appears to be, using an expression I learned this week from, of all people, Paris Hilton, "so beyond".

Example,

http://en.wikipedia.org/wiki/Mental_disorders_diagnosed_in_childhood

quote,

"Intellectual disability

DSM-IV-TR
...
There are varying degrees of intellectual disability, which are identified by an IQ test.
...
intellectual disability, Severity Unspecified:
This unspecified diagnosis is given when there is a strong assumption that the child is mentally retarded,
but cannot be tested because the individual ... not willing to take the IQ test or is an infant."

This appears to be diagnosis by subjective assumption, not by test.

dxie48 said...

Off topic, but concerning the comment system,

http://cornucopia-en.cornubot.se/2013/12/flash-disqus-cracked-security-flaw.html

"FLASH: Disqus cracked - security flaw reveals user e-mail addresses"

"The crack uses a serious security flaw in the Disqus API:s, enabling the extraction of MD5 hashes of user e-mail addresses."

http://en.wikipedia.org/wiki/MD5

"In 1996 a flaw was found in the design of MD5. While it was not a
clearly fatal weakness, cryptographers began recommending the use of
other algorithms ..."

This raises the question of whether the security was intentionally weakened.

David Coughlin said...

I have heard anecdotally that DSM-V is a complete clownshow.

Richard Seiter said...

I think you are correct to a large degree, but wonder how substantial the physical adaptation issues are. Eskimos and bedouins live in VERY different environments. Body fat composition could affect survival dramatically (I wonder what the nature/nurture balance is for body types in those cultures). Another difference is eskimos (on their historical diet) have a high omega 3 fatty acid intake. Are they perhaps more dependent on that than other populations? (something that's useful as a comparison is the loss of the ability to produce vitamin C in primates and other species. Hypothesized to have originated as a survival advantage when living on a diet high in vitamin C, e.g. fruit) Would an eskimo/bedouin even survive if raised in the other's ancestral environment?


The problem I have with the whole nature/nurture physical/culture etc. debate is that so few people are willing to have a serious conversation about the different roles each plays and instead spend all their time attacking extremist strawmen (and at their worst acting like extremist strawmen).

Richard Seiter said...

Who would have guessed such a much derided movie could have an interesting message? Thank you.

Richard Seiter said...

Kind of like the 2013 Economics Nobel Prize ;-)


A nice optimistic take (I hope you are right). I would feel better if I saw more Synthesis and less Thesis and Antithesis shouting at each other (and most of all at any attempts at Synthesis it seems). This seems to be a common theme in current American public discourse.


On a related note, can you suggest any good historical analogies for how Thesis/Antithesis/Synthesis works in practice? For example, is it just a matter of generations of researchers/advocates dying off, or is there a more sophisticated analysis explaining the timing of changes?

Rudel said...

It's not the movie that is derided, it's the leading man.

Richard Seiter said...

One can argue about the balance, but I think derision of the movie (or at least the title) was used to deride the leading man by association.

Diogenes said...

the big physical differences i know of are that eskimos tend to be shaped like weebles http://en.wikipedia.org/wiki/Weeble as predicted by allen's rule http://en.wikipedia.org/wiki/Allen%27s_rule (a sphere has highest volume to surface area ratio). also the bedouin tend to have big noses, an adaptation to low humidity supposedly.

"... is that so few people are willing to have a serious conversation about the different roles each plays..."

bingo. it's not as if, were the extreme hereditarians right, that should mean less government and no concern for inequality.

imho, inequality, per se, is ugly and if eugenics is required to remedy it, then eugenics should be govt policy.



a "deep" reason for opposition to eugenics is it would mean the end to liberal capitalism. liberal capitalism cannot survive when the "you know those people. what do you expect?" argument is taken away. and any govt powerful enough to carry out eugenics would be a threat to capital's hegemony. but i don't think brave new world's cyprus would be the result.

obviously the crimes of hitler and stalin limit acceptable opinion. but hitler was an animal lover and tree hugger too.

dxie48 said...

Interesting metric from,

http://retractionwatch.com/2013/07/11/why-has-the-number-of-scientific-retractions-increased-new-study-tries-to-answer/

"Time-to-retraction (from publication of article to publication of
retraction) averaged 32.91 months. Among 714 retracted articles
published in or before 2002, retraction required 49.82 months; among
1,333 retracted articles published after 2002, retraction required 23.82
months (p<0.0001). This suggests that journals are retracting papers
more quickly than in the past, although recent articles requiring
retraction may not have been recognized yet."


A half-life metric might also be appropriate.

Diogenes said...

hegel never used those terms. his were "moment", negation, aufhebung or "absolute negativity". that is, the "synthesis" was not a compromise but a "neither this nor that". that is, again, the problem with thesis and antithesis was conceptual. it wasn't that one was true and the other false or there was some via media, it was that the picture of the world they expressed did not correspond to the world as it is.

the aufhebung for nature v nurture, imho, is the simple proposition "each has an ideal environment wherein he will reach his potential. the prevailing environment may be closer to the ideal for some than for others, but potentials do differ."

it should be kept in mind that those at the outer edge of the bell curve are more distant from their genetic true score than those in the middle. given a twin-twin rho of .68, if one twin scores 160, the likelihood the other twin will score lower is 98%.
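The regression-to-the-mean arithmetic behind a figure like this can be sketched as follows. The sketch assumes, and these are assumptions not stated in the comment, that twin scores are bivariate normal with mean 100 and SD 15 and correlation .68; the helper name `prob_cotwin_lower` is hypothetical. Under those assumptions the co-twin's expected score is 100 + 0.68 × 60 = 140.8, and the chance of scoring below 160 comes out at roughly 96%, a bit under the quoted figure, which may rest on different assumptions.

```python
from math import sqrt
from statistics import NormalDist


def prob_cotwin_lower(score, rho, mean=100.0, sd=15.0):
    """P(co-twin scores below `score`), assuming a bivariate normal model.

    mean and sd are assumed population values, not taken from the comment.
    """
    # Conditional distribution of the co-twin's score given the first twin's.
    cond_mean = mean + rho * (score - mean)
    cond_sd = sd * sqrt(1.0 - rho ** 2)
    return NormalDist(cond_mean, cond_sd).cdf(score)


if __name__ == "__main__":
    print(f"{prob_cotwin_lower(160, 0.68):.3f}")
```

The same function shows the effect shrinking toward the middle of the distribution: at a score of 100 the probability is exactly one half.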

stevesailer said...

Strangely enough, a lot of the derision directed at the movie over the years by Johnny Carson wasn't aimed at Reagan but was Carson's joking at the expense of his producer Fred De Cordova, who had directed "Bedtime for Bonzo." De Cordova, who was the model for Rip Torn's great producer character Artie in the "Larry Sanders Show," was standing right off the edge of the stage throughout Carson's long run, so Carson made a lot of Bedtime for Bonzo jokes at his expense.

Diogenes said...

all psychiatric disorders are syndromes without any tests. and psychiatrists are in business. and psychiatric disorders are often chronic so drug companies have an incentive to develop drugs for them. it's also possible that academic psychiatrists have career motivations for finding (making up) a new disorder. all of these contribute to the expansion in the number of disorders and the "pathologizing of difference". these mercenary motives are especially apparent in child psychiatry, which has exploded over the last 20 years. the children are easy marks.

also, very unlike other medical concerns, many people want to have something psychiatric wrong with them, want to be special, and want to believe that all their troubles will go away with the right pill. i've read that self-diagnosis of asperger's is common in silicon valley.

but just limiting to the apparent (to me) "natural kinds" of depression, ocd, scz, addiction, anxiety, manic depression, how can the check list method be improved on when the causes are unknown?


all that said, psychiatrists would do well to read "industrial society and its future". if you've seen the "thin blue line" you know court psychiatrists can be deliberate frauds. i doubt whether ted k had anything more wrong with him than bin laden or any other terrorist.

LondonYoung said...

Indeed. "Any deviation from the principle that seeking the truth has priority over any other goals..." If the outcome of the "science" is going to be govt policy, I suppose entire fields want to get out in front of that trade. The physicists are probably tame since changes in our understanding of cosmology are not going to influence govt policy.
