Thursday, September 01, 2011

Epistasis vs additivity

Continuing the discussion from my previous post: strong interactions at the level of individual genes do not preclude a linear (additive) analysis of population variation and natural selection.

On epistasis: why it is unimportant in polygenic directional selection

[Phil. Trans. R. Soc. B (2010) 365, 1241–1244 doi:10.1098/rstb.2009.0275]

James F. Crow*
Genetics Laboratory, University of Wisconsin, Madison, WI 53706, USA

There is a difference in viewpoint of developmental and evo-devo geneticists versus breeders and students of quantitative evolution. The former are interested in understanding the developmental process; the emphasis is on identifying genes and studying their action and interaction. Typically, the genes have individually large effects and usually show substantial dominance and epistasis. The latter group are interested in quantitative phenotypes rather than individual genes. Quantitative traits are typically determined by many genes, usually with little dominance or epistasis. Furthermore, epistatic variance has minimum effect, since the selected population soon arrives at a state in which the rate of change is given by the additive variance or covariance. Thus, the breeder’s custom of ignoring epistasis usually gives a more accurate prediction than if epistatic variance were included in the formulae.

Why did Crow have to write this 2010 paper? Don't evo-devo folks understand population genetics? Why do they find the dominance of additive heritability to be so counter-intuitive? Which of the two groups of scientists has a better understanding of how evolution works? Evo-devo folks seem to be from the traditional "revel in complexity" branch of biology: perfectly happy to find that living creatures are too complicated to be modeled by equations. (But are they?)

Some excerpts from the paper:

... Recent years have seen an increased emphasis on epistasis (e.g. Wolf et al. 2000; Carlborg & Haley 2004). Students of development and evo-devo, as well as some human geneticists, have paid particular interest to interactions. For those in these fields, epistasis is an interesting phenomenon on its own and studying it gives deeper insights into developmental and evolutionary processes. Ultimately one wants to know which individual genes are involved, and if one is studying the effects of such genes, it is natural to consider the ways in which they interact. Historically, among many other uses, epistasis has provided a means for identifying steps in biochemical and developmental sequences. More generally, including epistasis is part of the description of gene effects. So epistasis, despite methodological challenges, is usually welcomed as providing further insights. Students of development or evo-devo typically study genes of major effect. Of course, genes with major effects are more easily discovered, so they may be providing a biased sample. But we can say that at least some of the genes involved have large effects. And such genes typically show considerable dominance and epistasis.

In contrast, animal and plant breeders have traditionally regarded epistasis as a nuisance, akin to noise in impeding or obscuring the progress of selection. It may seem surprising that the traditional practice of ignoring epistasis has not led to errors in prediction equations. Why? It is this seeming paradox that I wish to discuss.

Continuously distributed quantitative traits typically depend on a large number of factors, each making a small contribution to the quantitative measurement. In general, the smaller the effects, the more nearly additive they are. Experimental evidence for this is abundant. This is expected for reasons analogous to those for which taking only the first term of a Taylor series provides a good estimate. ...
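A toy illustration of the Taylor-series point (not from Crow's paper): the sketch below assumes a haploid model with independent loci at allele frequency 1/2 and a deliberately non-additive phenotype, exp(effect × number of "1" alleles). The function name `additive_fraction` and all parameter values are made up for the example. As the per-locus effect shrinks, the share of genetic variance captured by a purely additive fit should rise toward 1, even though the underlying map is multiplicative.

```python
from itertools import product
from math import exp


def additive_fraction(effect, n_loci=8):
    """Share of genetic variance captured by a purely additive fit.

    Assumed toy model: haploid, independent loci, allele frequency 1/2,
    phenotype = exp(effect * count of '1' alleles), so loci interact
    multiplicatively on the measured scale.
    """
    # Enumerate every genotype; with frequency 1/2 each is equally likely.
    genotypes = list(product((0, 1), repeat=n_loci))
    values = [exp(effect * sum(g)) for g in genotypes]

    grand_mean = sum(values) / len(values)
    total_var = sum((v - grand_mean) ** 2 for v in values) / len(values)

    # With independent loci, additive variance is the sum over loci of
    # p * (1 - p) * (average effect of the allele)^2, with p = 1/2.
    additive_var = 0.0
    for locus in range(n_loci):
        carriers = [v for g, v in zip(genotypes, values) if g[locus] == 1]
        others = [v for g, v in zip(genotypes, values) if g[locus] == 0]
        avg_effect = sum(carriers) / len(carriers) - sum(others) / len(others)
        additive_var += 0.25 * avg_effect ** 2

    return additive_var / total_var


fractions = {a: additive_fraction(a) for a in (1.0, 0.3, 0.1, 0.03)}
for a, frac in fractions.items():
    print(f"per-locus effect {a:>4}: additive share of variance = {frac:.3f}")
```

The exponential map is only one choice of nonlinearity; any smooth map should behave similarly for small enough effects, which is the sense of the first-order Taylor analogy.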

The most extensive selection experiment, at least the one that has continued for the longest time, is the selection for oil and protein content in maize (Dudley 2007). These experiments began near the end of the nineteenth century and still continue; there are now more than 100 generations of selection. Remarkably, selection for high oil content and similarly, but less strikingly, selection for high protein, continue to make progress. There seems to be no diminishing of selectable variance in the population. The effect of selection is enormous: the difference in oil content between the high and low selected strains is some 32 times the original standard deviation.

... Students of development, evo-devo and human genetics often place great emphasis on epistasis. Usually they are identifying individual genes, and naturally the interactions among these are of the very essence of understanding. The individual gene effects are usually large enough for considerable epistasis to be expected.

Quantitative genetics has a contrasting view. The foregoing analysis shows that, under typical conditions, the rate of change under selection is given by the additive genetic variance or covariance. Any attempt to include epistatic terms in prediction formulae is likely to do more harm than good. Animal and plant breeders who ignored epistasis, for whatever reasons, good or bad, were nevertheless on the right track. And prediction formulae based on simple heritability measurements are appropriate.
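For readers who have not seen a "prediction formula based on simple heritability", here is a minimal sketch using the standard textbook breeder's equation, R = h² S, with narrow-sense heritability h² = V_A / V_P. The equation is not quoted in the excerpt above, and the variance components and selection differential below are hypothetical numbers chosen only to show the mechanics.

```python
def predicted_response(v_additive, v_phenotypic, selection_differential):
    """Breeder's equation: R = h^2 * S, with h^2 = V_A / V_P."""
    if v_phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    h2 = v_additive / v_phenotypic
    return h2 * selection_differential


# Hypothetical variance components (arbitrary trait units squared).
v_a, v_epistatic, v_env = 0.40, 0.10, 0.50
v_p = v_a + v_epistatic + v_env

# Hypothetical selection differential: selected parents exceed the
# population mean by 2 trait units.
s = 2.0

additive_only = predicted_response(v_a, v_p, s)
with_epistasis = predicted_response(v_a + v_epistatic, v_p, s)

print(f"prediction from additive variance only: {additive_only:.2f}")
print(f"prediction if epistatic variance is added in: {with_epistasis:.2f}")
```

Adding the epistatic component to the numerator gives a larger predicted response. Crow's argument is that the additive-only figure is usually the better guide to what selection actually delivers.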

The power of using microscopic knowledge (genes) to develop macroscopic theory (phenotypes), whereby phenotypic measurements are used to develop prediction formulae, is beautifully illustrated by quantitative genetics theory.

Can we understand evolution without mathematics? Two more useful references:

Statistical Mechanics and the Evolution of Polygenic Quantitative Traits

The Evolution of Multilocus Systems Under Weak Selection

Note I am at BGI right now so there may be some latency in communication.


Guy_Brodude said...

I've come to the conclusion that biology is best left to those trained in other areas. Think about all the great biologists of the 20th century: they were trained in physics (Crick, Wilkins and Delbrück), medicine (Avery, MacLeod, McCarty and Luria) and chemistry (Chargaff, Franklin, Hershey, Meselson, Miller and Urey). The only solid counterexample would be Jim Watson, who was really not that great of a scientist, but a mediocre disciple of Luria's who crow-barred his way into Crick's lab and got very lucky.

Biology tends to attract the soft-minded, those lacking in mathematical skill but who nonetheless want to drape themselves in the prestigious robes of science. It is non-peer.

esmith said...

I think that you're approaching the whole thing from the completely wrong angle. There were studies very similar to what you're trying to do here. One of them failed to find any genes that contributed more than 0.4% to g. (And in the gene that allegedly contributed 0.4%, the SNP was in an intron.) And, as I argued a few posts back, the idea of many genes with tiny additive effects is incompatible with the real world.

So what are you missing? Copy numbers! Skip the SNP phase altogether and start by looking at copy numbers of any genes. Start with the NBPF family and work from there.

MtMoru said...

"And, as I argued a few posts back, the idea of many genes with tiny additive effects is incompatible with the real world."

Totally wrong.

MtMoru said...

"...lacking in mathematical skill but who nonetheless want to drape themselves in the prestigious robes of science..."

Fundamental attribution error and self-serving no doubt.

MtMoru said...

It is a theme of this blog that "physicists smart everyone else stupid".

Apparently Steve hasn't thought very hard about what physics really is.

It's too complicated to explain briefly, but Steve expecting that those formally trained in biology or whatever should have the mathematical muscles of those formally trained in physics is like a biologist expecting Steve, "his fellow natural scientist", to be able to work in his laboratory at the drop of a hat.

The precise mathematical description of natural and sometimes social (econophysics) phenomena is what physics is essentially. The phenomena which physicists per se concern themselves with are just exactly those phenomena most susceptible to precise mathematical description. Déformation professionnelle being human nature, Steve and every physicist I've ever read or heard thinks of physics, in its essential sense, as the whole of science.

Does synthetic organic chemistry require a mathematical roid monster? No. So OK why doesn't Steve cook up some Halichondrin B in his spare time?

Guy_Brodude said...

How many undergraduate programs require students to study statistics in detail? How many even offer elective credit for statistics coursework?

Most biology courses require nothing more than the mindless regurgitation of trivia. If you look at the SMPY statistics Steve posted a while back you'll see that students majoring in Biology had below-average math scores comparable to the average for humanities majors. High V, Low M!

Guy_Brodude said...

This is all true, but serious problems are created when biologists (and there are admittedly some exceptions) launch ill-informed and/or misleading criticisms of scientifically and mathematically sound studies and theories. These criticisms are all too often used as an intellectual cudgel by activists and social scientists (pardon the redundancy) to promote ill-conceived and destructive social policy.

If the BGI study yields the expected results, egalitarians will be looking with hopeful eyes to "socially conscious" biologists to provide a rebuttal. And there will be many innumerate biologists with a significant public profile (insert for yourself the obvious names) who will be all too eager to provide that rebuttal.

steve hsu said...

I wouldn't be very useful in a physics lab, let alone a chemistry or bio lab!

But that has nothing to do with certain people claiming to "understand" evolution when in fact they may not. (Despite the fact that important results are available for them to learn by merely picking up a classic textbook, attending a course offered at their own university or even reading a Wikipedia entry... I have no explanation other than bounded cognition and hubris.)

MtMoru said...

That is all true, but " tends to attract the soft-minded, those lacking in mathematical skill" means lacking in such ski

botti said...

James F Crow born in 1916 and still going strong :)

steve hsu said...

The population was both SMPY and SVPY, and the cutoff for the group from which the plot was constructed was 1 in 200 in at least M or V. So the below average individuals in either M or V might be fairly unexceptional (e.g., +1 SD).

>A lot of science requires no more than arithmetic

This is becoming less and less true over time.

gwern said...

Another interesting example of long-term selection possibility:

> In the November 02013 issue of Science, Lenski and two members of his lab – Michael J. Wiser and Noah Ribeck – published their most recent work looking at fitness over the 50,000 generations. They measured how much the evolved bacteria have improved relative to their ancestors under the same environmental setup.
> They found that all 12 lines show consistent responses to selective pressures. For example, their descendants now grow faster in their standard sugary broth, and all populations show an increase in cell size.
> Yet variation lies hidden underneath these parallel changes. The fitness increases were nearly uniform in all 12 lineages, but not exact; the cell size grew in all of the populations, but by different amounts. When Lenski and his colleagues studied the bacteria’s DNA, they found that after thousands of generations, the populations’ genomes were full of alterations. These changes were different in each population and had accumulated at very different rates, suggesting a prominent role of chance in setting evolution’s course.
> In November 02013, after hitting the 50,000 generation mark, Lenski published a blog piece thinking about the long-term fate of his long-term experiment. He questions who will take over when he retires, and how the experiment will be sustained. He imagines his experiment being carried out by another 49,999 generations of scientists, each one overseeing another 50,000 bacterial generations. That is 50,000² generations, or 2.5 billion generations in total, and would take about a million years to achieve. If this were to happen, Lenski predicts that the bacteria will reduce their doubling time from their ancestors’ ~55 minutes to ~23 minutes–which would also require a lot of freezer space.
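
A quick sanity check of the arithmetic in the quoted passage: the generation count is 50,000 squared, and the "about a million years" figure follows if the cultures turn over at roughly 6.6 generations per day. That rate is an assumption here (log2 of a 100-fold daily dilution), not something stated in the quote.

```python
# 50,000 scientist generations, each overseeing 50,000 bacterial generations.
bacterial_gens_per_scientist = 50_000
scientist_generations = 50_000
total_generations = bacterial_gens_per_scientist * scientist_generations

# Assumption: about 6.64 bacterial generations per day.
GENERATIONS_PER_DAY = 6.64
DAYS_PER_YEAR = 365.25

years_needed = total_generations / GENERATIONS_PER_DAY / DAYS_PER_YEAR

print(f"total generations: {total_generations:,}")
print(f"years needed at {GENERATIONS_PER_DAY} generations/day: {years_needed:,.0f}")
```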
