Sunday, August 17, 2014

Genetic Architecture of Intelligence (arXiv:1408.3421)

This paper is based on talks I've given in the last few years. See here and here for video. Although there isn't much that hasn't already appeared in the talks or on this blog (other than some Compressed Sensing results for the nonlinear case) it's nice to have it in one place. The references are meant to be useful to people seriously interested in this subject, although I imagine they are nowhere near comprehensive. Apologies to anyone whose work I missed.

If you don't like the word "intelligence" just substitute "height" and everything will be OK. We live in strange times.
On the genetic architecture of intelligence and other quantitative traits (arXiv:1408.3421)
Categories: q-bio.GN
Comments: 30 pages, 13 figures

How do genes affect cognitive ability or other human quantitative traits such as height or disease risk? Progress on this challenging question is likely to be significant in the near future. I begin with a brief review of psychometric measurements of intelligence, introducing the idea of a "general factor" or g score. The main results concern the stability, validity (predictive power), and heritability of adult g. The largest component of genetic variance for both height and intelligence is additive (linear), leading to important simplifications in predictive modeling and statistical estimation. Due mainly to the rapidly decreasing cost of genotyping, it is possible that within the coming decade researchers will identify loci which account for a significant fraction of total g variation. In the case of height analogous efforts are well under way. I describe some unpublished results concerning the genetic architecture of height and cognitive ability, which suggest that roughly 10k moderately rare causal variants of mostly negative effect are responsible for normal population variation. Using results from Compressed Sensing (L1-penalized regression), I estimate the statistical power required to characterize both linear and nonlinear models for quantitative traits. The main unknown parameter s (sparsity) is the number of loci which account for the bulk of the genetic variation. The required sample size is of order 100s, or roughly a million in the case of cognitive ability.
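The closing estimate in the abstract is plain arithmetic: "100s" means 100 times the sparsity s, so s of roughly 10k gives about a million samples. Here is a minimal sketch of that calculation. The function name and the `factor` parameter are illustrative choices, not taken from the paper, and the factor of 100 is only the order-of-magnitude figure quoted above.

```python
def required_sample_size(sparsity: int, factor: int = 100) -> int:
    """Order-of-magnitude sample size n ~ factor * s, as quoted in the abstract.

    `factor` defaults to 100 (assumption: the abstract's "of order 100s");
    the true constant depends on details this sketch ignores.
    """
    if sparsity <= 0 or factor <= 0:
        raise ValueError("sparsity and factor must be positive")
    return factor * sparsity


# s ~ 10k causal variants is the abstract's rough figure for cognitive ability.
n = required_sample_size(10_000)
print(n)
```

With s = 10,000 this reproduces the "roughly a million" figure. A smaller or larger s scales the estimate linearly.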


  1. Matthew Stern 12:03 PM

    In case you haven't heard, UIUC is conducting a fairly massive study investigating the effect of various interventions (e.g. exercise, tDCS, working memory training) on G. The study is taking DNA samples to see if results "have a genetic component" (this is admittedly vague, but it's all I could glean from being a participant). I hear results will be available within 4-8 months; here's the website:

  2. Fil Carvalho 7:05 PM

    Stephen, is it completely impossible that there are tradeoffs between different mental capacities? That would explain why certain individuals with outstanding analytical prowess have language delay (e.g., Feynman, Einstein and Teller).

  3. There could be tradeoffs, and the amount of pleiotropy is an open question.

    However, I would bet that in a 10k-dimensional space there are directions that, e.g., increase one or a few traits radically without causing problems in others. Certainly among humans that have already existed there are examples of individuals with very strong characteristics in almost all categories. The "maximal type" according to any chosen combination of characteristics almost certainly exceeds types already produced by random chance.

  4. James Thompson 11:49 AM

    Dear Stephen, Just to let you know I have posted a brief summary of your paper, and would appreciate it if you were to point out any errors.

  5. Thanks, James. The summary looks fine. I perhaps should point out that the toy model with 10k causal variants could turn out to be wrong. But it does illustrate some basic points.

  6. James Thompson 2:02 PM

    Thanks. Yes, a model is just that, and reality has the last laugh.

  7. Matthew Stern 6:07 PM

    After reading Steve's post more carefully, I think "massive" should be deleted from my comment. The scale of the Insight study is nowhere near what Steve's analysis suggests is necessary, although I get the feeling that the genetic aspect of Insight is secondary.

  8. Your point about orthogonal directions in a 10k-space is interesting, but I'd like to point out that the space of cognitive processes to which these loci contribute may be much smaller than that, and much more curved.

  9. HerrWolfHitler 3:22 AM

    ...could turn out to be wrong...

    Or rather IS with apodictic certainty wrong.

    I heard you hung out with Obersturmbannfuhrer Greg Cochran in Chicago. Is that true?

  10. reuvenavram 8:19 PM

    Einstein didn't have "language delay". That's just a myth advanced by parents of developmentally disabled or slow children. By all accounts, Einstein was a precocious, smart kid who spoke several languages and played the violin.

  11. cognous 2:41 PM

    Excellent article -- thank you for putting it together.

    I recently listened to the BBC Radio 4 series Intelligence: Born Smart, Born Equal, Born Different, which also addressed the topic but taught me nothing and left me with a mixture of sadness and rage: