Friday, November 07, 2014

Adaptive evolution and non-coding regions

This morning I attended an excellent talk: Adaptive Evolution of Gene Expression (see paper and video below), by Hunter Fraser of Stanford.

His results support the hypothesis that non-coding regions of the genome play at least as large a role in evolution and heritable variation as protein coding genes.

From an information-theoretic perspective, it seems obvious that there is much more information in the whole genome than in the ~20k coding regions. Without the additional information, it would not be possible to produce diverse organisms such as flies, worms, fish, and humans from very similar sets of genes/proteins. Strangely, though, I've found most biologists to be overly focused on protein sequences. Perhaps results like these will finally modify this prior.
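A rough back-of-envelope sketch of the information-theoretic point above. The genome size (~3.1 billion base pairs) and the protein-coding fraction (~1.5%) are approximate figures I am assuming for illustration; they are not taken from the talk or the papers quoted here. The calculation counts raw sequence capacity at 2 bits per base, which is only an upper bound and not a measure of functional information.

```python
import math

# Assumed, approximate inputs (illustrative only, not from the talk or papers).
GENOME_BP = 3.1e9         # rough haploid human genome size, in base pairs
CODING_FRACTION = 0.015   # rough share of the genome that is protein-coding
BITS_PER_BASE = math.log2(4)  # four possible bases -> 2 bits per position


def raw_capacity_megabytes(base_pairs: float) -> float:
    """Upper bound on raw sequence capacity, in megabytes (10^6 bytes)."""
    return base_pairs * BITS_PER_BASE / 8 / 1e6


def capacity_ratio(coding_fraction: float) -> float:
    """How many times larger the whole genome is than its coding part."""
    return 1.0 / coding_fraction


if __name__ == "__main__":
    whole = raw_capacity_megabytes(GENOME_BP)
    coding = raw_capacity_megabytes(GENOME_BP * CODING_FRACTION)
    print(f"whole genome : {whole:.1f} MB (raw upper bound)")
    print(f"coding only  : {coding:.1f} MB (raw upper bound)")
    print(f"ratio        : {capacity_ratio(CODING_FRACTION):.0f}x")
```

With these assumed inputs the whole genome comes to roughly 775 MB of raw capacity against roughly 12 MB for coding sequence. How much of the non-coding remainder is actually functional is, of course, exactly the open question.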
Gene expression drives local adaptation in humans
Hunter B. Fraser
Department of Biology, Stanford University, Stanford, CA, 94305.

The molecular basis of adaptation—and in particular the relative roles of protein-coding vs. gene expression changes—has long been the subject of speculation and debate. Recently, the genotyping of diverse human populations has led to the identification of many putative “local adaptations” that differ between populations. Here I show that these local adaptations are over 10-fold more likely to affect gene expression than amino acid sequence. In addition, a novel framework for identifying polygenic local adaptations detects recent positive selection on the expression levels of genes involved in UV radiation response, immune cell proliferation, and diabetes-related pathways. These results provide the first examples of polygenic gene expression adaptation in humans, as well as the first genome-scale support for the hypothesis that changes in gene expression have driven human adaptation.

This is a video of a similar talk at Berkeley.

Note Added: As I mentioned above, simple considerations suggest that the machinery of life must be much more complex than the diversity of specific proteins.
Evolution at Two Levels: On Genes and Form
Sean B Carroll

(This article is based on the Allan Wilson Memorial Lectures given at the University of California at Berkeley in October 2004.)

In their classic paper “Evolution at Two Levels in Humans and Chimpanzees,” published exactly 30 years ago, Mary-Claire King and Allan Wilson described the great similarity between many proteins of chimpanzees and humans [1]. They concluded that the small degree of molecular divergence observed could not account for the anatomical or behavioral differences between chimps and humans. Rather, they proposed that evolutionary changes in anatomy and way of life are more often based on changes in the mechanisms controlling the expression of genes than on sequence changes in proteins.

This article was a milestone in three respects. First, because it was the first comparison of a large set of proteins between closely related species, it may be considered one of the first contributions to “comparative genomics” (although no such discipline existed for another two decades). Second, because it extrapolated from molecular data to make inferences about the evolution of form, it may also be considered a pioneering study in evolutionary developmental biology. And third, its focus on the question of human evolution and human capabilities, relative to our closest living relative, marked the beginning of the quest to understand the genetic basis of the origins of human traits. Like much of Wilson and his colleagues' body of work, this contribution had a great influence on paleoanthropologists as well as molecular biologists.

The 30th anniversary of this landmark article arrives at a moment when comparative genomics, evolutionary developmental biology, and evolutionary genetics are pouring forth unprecedented amounts of new data, and the entire chimpanzee genome is available for study. It is therefore an opportune time to examine what has been and is being revealed about the relationship between evolution at the two levels of molecules and organisms, and to assess the status of King and Wilson's hypothesis concerning the predominant role of regulatory mutations in organismal evolution.

King and Wilson used the phrase “ways of life” to include both physiology and behavior (M.-C. King, personal communication) and proposed that the evolution of both anatomy and ways of life was governed by regulatory changes in the expression of genes. From the outset of this review, I make the sharp distinction between the evolution of anatomy and the evolution of physiology. Changing the size, shape, number, or color patterns of physical traits is fundamentally different from changing the chemistry of physiological processes. There is ample evidence from studies of the evolution of proteins directly involved in animal vision [2], respiration [3], digestive metabolism [4], and host defense [5] that the evolution of coding sequences plays a key role in some (but not all) important physiological differences between species. In contrast, the relative contribution of coding or regulatory sequence evolution to the evolution of anatomy stands as the more open question, and will be my primary focus.

The amount of direct evidence currently in hand is modest, and includes examples of both the evolution of coding and of non-coding, regulatory sequences contributing to morphological evolution. However, I will develop the argument, on the basis of theoretical considerations and a rapidly expanding body of empirical studies, that regulatory sequence evolution must be the major contributor to the evolution of form.

This conclusion poses particular challenges to comparative genomics. While we are often able to infer coding sequence function from primary sequences, we are generally unable to decipher functional properties from mere inspection of non-coding sequences. This has led to a bias in comparative genomics and evolutionary genetics toward the analysis and reporting of readily detectable events in coding regions, such as gene duplications and protein sequence evolution, while non-coding, regulatory sequences are often ignored. However, approximately two-thirds of all sequences under purifying selection in our genome are non-coding [6]. One consequence of the underconsideration of non-coding, regulatory sequences is unrealistic expectations about what can currently be learned about the genetic bases of morphological diversity from comparisons of genome sequences alone. The visible diversity of any group is not reflected by the most visible components of gene diversity—that is, the diversity of gene number or of coding sequences. In order to understand the evolution of anatomy, we have to study and understand regulatory sequences, as well as the proteins that connect them into the regulatory circuits that govern development. I will begin with some historical and theoretical considerations about regulatory and coding sequence evolution, then delve into the insights offered by specific experimental models of anatomical evolution, and finally, I will revisit King and Wilson's original focus and discuss how our emerging knowledge of the evolution of form bears on current efforts to understand human evolution. ...

... Thus, while the coding sequences of the structural and regulatory proteins are constrained by pleiotropy, modular cis-regulatory regions enable a great diversity of patterns to arise from alterations in regulatory circuits through the evolution of novel combinations of sites for regulatory proteins in cis-regulatory elements [35]. This diversity is produced by the sort of “tinkering” with existing components envisaged by Jacob [19]. ... The available evidence suggests ... that the diversification of other traits that are governed by highly pleiotropic and well-conserved proteins can also be accounted for by regulatory sequence evolution.

... Based upon (i) empirical studies of the evolution of traits and of gene regulation in development, (ii) the rate of gene duplication and the specific histories of important developmental gene families, (iii) the fact that regulatory proteins are the most slowly evolving of all classes of proteins, and (iv) theoretical considerations concerning the pleiotropy of mutations, I argue that there is adequate basis to conclude that the evolution of anatomy occurs primarily through changes in regulatory sequences. ...


Wie der Wind said...

Dear Mr. Hsu, I'd like to ask a question or two not specifically related to this article. They have actually been bugging me for quite a while.

1) I heard that screening cells in general for their genes/genome is actually harmful because it involves technologies (for example: radiation) that cause harm. What is your take on that? Is it possible for the screening process to not cause harm at all?

2) Is it possible that in the future, we can engineer sperm with the genetic information we want (to fertilize the eggs)? Also, if we can do this to the eggs as well, that would be germline genetic engineering, right? It can open up a lot of very interesting opportunities, correct?

3) As far as I understand, different human races have different genetic means of IQ. For the Blacks it is 85, for the Whites it is 100, and for the East-Asians it is 106.

This theoretically means that we can increase the genetic mean IQ with genetic engineering. In your view, Sir, how much can we increase the genetic mean of a population's IQ? How many Standard Deviations (+15 IQ)?

4) Will it be possible in the future to harvest a woman's eggs by extracting one of her ovaries (she could retain the other if she wants, to have babies that way as well) and putting it to work? With this method, one couple can have thousands of children, provided the technical means are there, right? How far (in years) do you think we are from being able to do this technically?

I readily admit that some of these issues can be considered highly speculative, but the most recent scientific findings do not contradict these, in fact they mostly support them.

Thank you in advance.


Something that just popped into my mind: I read that Creativity and Intelligence diverge once past 130 IQ, which means that highly creative people are almost never that intelligent, and vice versa. My thinking here: are we facing one of Nature's 'quid pro quo's? You can have one (genius intellect or high creativity), but you can't have both at the same time?

DK said...

On a protein sequence level, ~80% of proteins are NOT identical between humans and chimps. All the people talking about how human and chimp proteomes are nearly the same need to remember this simple fact.

Wie der Wind said...

Dear Mr. Hsu, may I get some answers as to why my comment was removed?

steve hsu said...

It was very off-topic.

Titus Brown said...

This is not remotely news to most biologists :). There's lots and lots of direct molecular biology/functional evidence going back decades to support the role of non-coding DNA, never mind the tons of genetic and evolutionary evidence that shows that coding sequence simply isn't responsible for all the interesting variation. There are various arguments about the relative importance and size of the roles of cis-regulatory DNA/transcription factor binding sites, transcribed non-coding sequence, and more structural variation in the genome, but very little disagreement that as a whole it's very important. (References available on request, if you have a specific interest.)

The question, however, is what do we do with that knowledge? There are still relatively few ways to nail down precisely what a bit of non-coding DNA does, while there is at least another obvious level of analysis and perturbation available for coding sequence. Plus, in large genomes, the coding sequence is so much smaller than the non-coding sequence that it's enriched for functional variation on a per-base level, which makes life easier.

And then there's the fact that exome sequencing is cheaper :).

So when you combine cost with increased interpretability, you can understand why so many drunk genomicists are looking for their keys under the lamp post rather than scavenging around in the darkness.

Although there are still plenty of people who work on non-coding stuff, so it's not completely ignored or anything.

Bibibibibib Blubb said...

They have deleted chunks of non-coding DNA from animals and it had very little (if any) effect.

Gene deserts are also where a lot of these IQ alleles were found.

Titus Brown said...

Structurally? They are. Sequence wise? They align at 95% plus. So, basically... very similar.

Matt MacManes said...

Might I ask you to make a correction: the video is from the Museum of Vertebrate Zoology (UC Berkeley) lunchtime seminar, NOT Stanford.

DK said...

What for you is "very similar" can be life and death in real life. E.g., there are many seriously ill people with mutations in actin - but almost all of them have actin that does not seem to be any different from wild type in all in vitro assays. IOW, that 5% might kill you (or turn you into a gorilla). All the really bad actin mutants were never born, of course.
