Wednesday, August 17, 2011

@Google: Genetics and Intelligence

I'll be giving a talk at Google tomorrow (Thursday August 18) at 5 pm. The slides are here. The video will probably be available on Google's TechTalk channel on YouTube, perhaps after some delay.

The Cognitive Genomics Lab at BGI is using this talk to kick off the drive for US participants in our intelligence GWAS. More information at www.cog-genomics.org, including automatic qualifying standards for the study, which are set just above +3 SD. Participants will receive free genotyping and help with interpreting the results. (The functional part of the site should be live after August 18.)

Title: Genetics and Intelligence

Abstract: How do genes affect cognitive ability? I begin with a brief review of psychometric measurements of intelligence, introducing the idea of a "general factor" or IQ score. The main results concern the stability, validity (predictive power), and heritability of adult IQ. Next, I discuss ongoing Genome Wide Association Studies which investigate the genetic basis of intelligence. Due mainly to the rapidly decreasing cost of sequencing, it is likely that within the next 5-10 years we will identify genes which account for a significant fraction of total IQ variation.

We are currently seeking volunteers for a study of high cognitive ability. Participants will receive free genotyping.

80 comments:

  1. You might want to contact the Davidson Institute for Talent Development to find volunteers. To get in you must test at the 99.9th percentile.
    http://www.davidsongifted.org/youngscholars/

  2. I can't believe Google would allow anyone to deliver a talk on such a controversial topic. Have the organizers seen your blog?

  3. Will you update us on how well they received your talk? By the way, how did BGI determine the cutoffs (exam scores) for automatic admission to the study?

  4. esmith 6:29 PM

    Just to clarify, by "genotyping", do you mean full genome sequencing (and then providing full 3 GB of data to each participant), or something more limited?

    Also, is your US sample open to immigrants, or are you going for any kind of ancestral background restrictions?

  5. I don't even understand. You're a physicist; since when do physicists give talks about IQ?

  6. 1. The first stage of genotyping will involve SNPs only; in the future some or all participants may receive more extensive genotyping such as exome or full genome sequencing.

    2. There are no restrictions on ethnicity or citizenship. However, if you reside outside the US or Canada, there may be some delay in receiving a saliva DNA kit from BGI after qualifying for the study.

  7. Angela, you seriously think that Google would not be interested in finding the genetic basis for high IQ?

  8. Angela, haven't you been paying attention to this blog? :) If you had been, surely you would've remembered Steve discussing how physicists, with their superior cognitive horsepower, often dabble in other fields such as economics, psychometrics, biology, computer science, etc., but that the converse seldom happens, if ever. And also, as Steve has discussed, many of these other fields could benefit from the superior cognitive horsepower of physicists, as some of the conceptual muddle could certainly be cleared up by intellectually superior minds. ;)

  9. Thanks, it's on our to do list :-)

  10. botti 9:30 PM

    Heh, yes well biology has certainly benefited from the involvement of physicists.

    "According to Crick, the experience of learning physics had taught him something important—hubris—and the conviction that since physics was already a success, great advances should also be possible in other sciences such as biology. Crick felt that this attitude encouraged him to be more daring than typical biologists who tended to concern themselves with the daunting problems of biology and not the past successes of physics."

    http://en.wikipedia.org/wiki/Francis_Crick
    http://en.wikipedia.org/wiki/Maurice_Wilkins

  11. William_JD 9:46 PM

    Can I participate anonymously? 

  12. I don't think Google is stupid enough to step into the political minefield of the genetic basis of IQ. All they need is high IQ employees, not the genetic basis of high IQ. Frankly I'm surprised Steve Hsu still has a job at Univ. of Oregon considering how controversial his views are. He certainly has no shortage of hubris in maintaining an HBD blog using his real name and identity. That takes cajones, considering how liberal and politically correct academia is.

  13. Will you take an LSAT score?

  14. Will you take the LSAT?

  15. athelas314 10:17 PM

    I'm interested in participating. In addition to the raw SNP data, will there be an interpretation like that provided by 23andMe, or at least links to places where we can find known associations between SNPs and phenotype?

  16. Question

    Is 'Automatic qualifying criteria' the minimum criteria for qualification?

    I scored approx. 700, 700, 700! (SAT V, M and GMAT). I am interested in participating just to get my genotype. But if that is too low a score, oh well :-)

    Joker

  17. Cool, I'd love to see my genotyping.  Mind if I spread this around to some other Tech alums?

  18. Not sure. There's a place in the survey where you can submit your LSAT score (and other additional information), but please submit your other scores or other academic information as well.

  19. I second this question.

    I miss the GRE cutoff by 10 verbal points, but I got a 10 on the AIME math exam in high school.  Should I apply?

  20. I think the math/verbal split poses a bit of a problem for the cutoffs. I know people who qualified for USAMO, but who didn't meet the V cutoff.

  21. If you miss the automatic criteria you might still get in based on supplemental information. The form allows you to submit additional information and documentation.

  22. Those are just automatic criteria. USAMO would probably get someone in as long as their other scores are somewhat close to the cutoffs.

  23. You can't remain anonymous to us but your identity is protected under the privacy agreement. See web site.

  24. Something that came to my mind when looking at this and some of the earlier posts. Aren't IQ scores censored data? Surely a sufficiently large population that gains, on average, a total of 105 points on an IQ test over one hour, but a total of 120 over two, is smarter than one that gains a total of 105 over one hour but makes no improvement in the second hour. Why is this aspect never considered? I suppose one can argue that how 'fast' one can think is an important component of IQ, but in that case you can penalize the score gain by the additional time spent...

  25. esmith 12:41 AM

    The LSAT should definitely be used; in my opinion, it is a much stronger indicator of high-end g than the GRE. (I tried them both.) GRE-M will be maxed out by 6% of all students who try to apply to grad schools in the country (which means that the score of 800 only indicates +2 SD or so). In contrast, the LSAT is much tougher to max out. The "perfect 180" is achieved by one out of ten thousand (!) test takers.

  26. How many people have super high LSATs but can't make the SAT/ACT cutoff?

    I want to stress again that people can qualify without meeting any of the automatic criteria. The survey form on the site is sufficiently general that you can make your case: e.g., I got 180 on the LSAT; here is a scan of my score report ...

  27. esmith 1:32 AM

    Very true. I guess what I'm trying to say is that the automatic SAT/ACT/especially GRE cutoffs are rather low. You're saying that automatic qualifiers are set above +3 SD, but, in reality, they will allow lots of people below 2.5 SD and occasionally even below 2 SD. If you just want to get a bunch of smart guys, SAT and ACT will suffice. (Or you could just run an ad in a Mensa newsletter.) If you really want a list of people above 3 SD without shelling out $1000/person on a psychometrist to retest everyone using real, in-person IQ tests, that's where the LSAT comes in useful.

  28. Seems to me like Steve's study is geared more towards people with high mathematical ability, given the various qualifying criteria. It's possible that using the LSAT will lead to some high M people perhaps missing the cutoff.

  29. esmith 1:46 AM

    High V low M types will end up buried by the logic games section.

    Try this one: http://www.griffonprep.com/logicgame.html (there are answers below, don't peek!)

  30. MtMoru 2:54 AM

    So 1. Steve finds out who you are and 2. You might get a letter saying you're a lot smarter than you should be.

  31. Reactionary_Konkvistador 2:54 AM

    The slides are really familiar. You've already given this talk, or at least a very similar one, before, haven't you? Can't wait to see the video.

  32. Matthew Carnegie 6:08 AM

    Yan,

    I think that in terms of V and M scores, it's that the study is all about finding g, and having high V and M subtests is a better indicator of having higher g than having a lower M but higher V or, conversely, a lower V but higher M, which are more indicative of having a lower g and a higher subfactor related to maths in some way. These subfactors may or may not be heritable (current evidence for subfactor heritability, I believe, being low within populations), but they aren't what's being looked for here in any case.

    There shouldn't be too much difference since g is the largest common factor, but it's probably better to optimise for high g using high V and M.

    The LSAT does seem like a fairly general test that isn't strongly skewed towards verbal capabilities, assuming the http://en.wikipedia.org/wiki/Law_School_Admission_Test list of scores is accurate, even if the pool of test takers might be.

  33. ben_g 7:11 AM

    In an earlier thread I raised the problem of gene-environment correlations. Here's a specific one to consider. Suppose that parents with high IQ genes are able to impart better environments for their children. If these environments have any effect, then they'll be correlated with the signals being picked up by Steve's study. Furthermore, there's no reason to believe this correlation wouldn't exist in other societies, so replicating in other populations wouldn't avoid this issue.

  34. sykes.1 7:45 AM

    Angela is right. Nowadays, physicists are not permitted to have opinions on IQ.

  35. Funny.  I used to make it a habit of applying for things I was 'kind-of' to 'mostly' qualified for in case I got lucky.

  36. Leor Jacobi 8:22 AM

    I think I'm pretty smart, but I haven't been able to figure out when one can apply.

  37. Leor Jacobi 8:23 AM

    When can we apply?   Maybe I'm not smart enough to figure that one out...

  38. The volunteer page will go live in a few hours.

  39. James_Lee 9:42 AM

    What you are saying is that the "average excess" will exceed the "average effect" as a result of a kind of population structure: ability-increasing alleles are confounded with beneficial environments. See Ronald Fisher's paper on the subject for definitions of these terms and explications of their meanings. I do not think it can be rigorously shown that Fisher's implicit regression of the phenotype on all loci in the genome properly isolates the effect of an actual causal locus (see this book for what this means)--but it is certainly a very reasonable notion. In any case many tools devised by statistical geneticists (e.g., EIGENSTRAT and EMMAX) can be seen as approximations of Fisher's ideal, and these have proven to be extremely successful in the control of population structure. See this new paper on multiple sclerosis for an impressive example.

  40. James Lee 9:50 AM

    The GIANT Consortium has confirmed that the great majority of their height loci discovered in population samples replicate in within-family designs. Since nature randomly selects which allele a heterozygous parent passes on to an offspring, within-family designs are immune from population structure. In the future, as GWAS expands for any given phenotype, this kind of confirmation of population associations in (smaller) samples of families will be highly desirable.
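
A toy simulation may make the within-family argument concrete. The sketch below is only an illustration under assumed parameters: a single biallelic locus at 50% frequency, a true effect `beta`, and a shared family environment that tracks parental genotype with strength `confounding`. All names and values are hypothetical and are not taken from the GIANT analysis. It compares the regression slope among unrelated individuals with the slope estimated from sibling differences.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate(n_families=20000, beta=0.3, confounding=1.0, seed=0):
    """Return (population slope, sibling-difference slope) for one toy locus."""
    rng = random.Random(seed)
    geno, pheno, d_geno, d_pheno = [], [], [], []
    for _ in range(n_families):
        # Each parent carries 0, 1 or 2 copies of the allele (frequency 0.5).
        mother = (rng.random() < 0.5) + (rng.random() < 0.5)
        father = (rng.random() < 0.5) + (rng.random() < 0.5)
        # Shared family environment is correlated with parental genotype.
        family_env = confounding * (mother + father) + rng.gauss(0, 1)
        sibs = []
        for _ in range(2):
            # A parent with k copies transmits the allele with probability k/2.
            g = (rng.random() < mother / 2) + (rng.random() < father / 2)
            y = beta * g + family_env + rng.gauss(0, 1)
            sibs.append((g, y))
        (g1, y1), (g2, y2) = sibs
        geno.append(g1)          # one child per family: the "unrelated" sample
        pheno.append(y1)
        d_geno.append(g1 - g2)   # sibling differences cancel family_env
        d_pheno.append(y1 - y2)
    return ols_slope(geno, pheno), ols_slope(d_geno, d_pheno)


population_slope, within_family_slope = simulate()
print(f"population slope:    {population_slope:.2f}")
print(f"within-family slope: {within_family_slope:.2f}")
```

With these settings the population slope should come out well above `beta`, while the sibling-difference slope stays close to `beta`, because the shared environment drops out of the difference.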

  41. James Lee 9:54 AM

    The relation between time taken and ability is rather complex. A rough generalization is that more able people take less time on easy items and more time on hard items; less able people tend to give up quickly on harder items.

    Psychometricians have proposed using the time taken on a given item to update the provisional estimate of an examinee's ability in computer-adaptive testing. Taking temporal information into account should thus extract more information from a fixed number of items. I do not know if any operational testing programs have actually incorporated a proposal of this kind.

  42. saucyskeptic 10:38 AM

    ROFL! Y'all are amusingly obsessed (OBSESSED!) with your own brilliance and the need for proof thereof. I look forward to watching the talk. A thought on the use of LSAT scores -- I smoked the LSAT... but that was way back in 1990.  Year by year the LSAT has become much harder. Perhaps the exam administrator (is that LSAC?) has a meaningful way to compare scores from different years but I suspect you'll run into the same "low ceiling" problem with old LSAT scores that you have with SAT scores.

  43. William_JD 12:38 PM

    My SAT scores make me an automatic qualifier, but how will you verify this claim? 

  44. Free beer ought to be a bigger draw than free genotyping.

  45. ben_g 2:09 PM

    James, thanks for the response.

    First, is my example really a case of population stratification? I thought population stratification required that there be sub-populations with systematic differences in allele frequencies outside of the genes that have an effect. What I raised would be a problem even if the high IQ people only correlated on the IQ effect genes. So I don't see how it can be controlled for in the same way as typical population stratification.

    Second, isn't the use of principal components to control for population structure somewhat controversial? For example, see this criticism of the method: http://www.cell.com/AJHG/retrieve/pii/S0002929711002187

  46. James Lee 4:10 PM

    Thanks for bringing that letter to our attention. 

    The letter addresses a slightly different issue than the control of population structure in determining the effect of a single locus. The letter criticizes a method introduced by Goddard and colleagues for estimating the total genetic variance associated with the SNPs that happen to be present on a genotyping chip (without regard for individual loci). They point out that the method produces a massively biased estimate if there is extreme population structure, a bias that is only partially removed by the use of PCs as regression covariates. (I use the term "population structure," here, to mean any confounding of genotype with other causes of the trait, including environmental causes. The reply by Goddard et al. makes finer distinctions.) 

    What are the implications of this for identifying individual causal variants? Well, the letter cites the EMMAX method as being appropriate in this context, so according to the letter--not much. Thinking more about your own example of ability being confounded with the environmental boost of being raised by smart parents, I am no longer certain that genomic background can fully control for it. (Even God would not be able to predict the ability of your parents with complete accuracy from just your genome.)  However, as I said, family designs are immune to confounding, and in the future I anticipate that such designs will be used to verify any results in samples of unrelated individuals.

    The separate issue of whether the letter casts any doubts on applications of SNP-based heritability estimation is also an interesting one. In their reply, I think Goddard et al. get the better of the argument.

  47. ben_g 4:49 PM

    James, thanks for the great replies and good luck on the study!

  48. Hao Ye 6:35 PM

    Taking the 2010 stats, ~130k took the AMC 10/12 (about 60k each).  About ~500 qualify for the USAMO/USAJMO, or about 1 in 250.  There's definitely some self-selection bias for people taking the test, so it seems reasonable to me.  Oddly enough, if you go back further, there were more students taking the AHSME (240k in 1999) with fewer USAMO qualifiers (around 200 in 1999, I think).

    I wonder if the self-selection bias has increased over time? Or maybe the increase of standardized testing has edged out the AMC?

  49. Hao Ye 6:43 PM

    What makes you think the cutoffs are low?
    2010 stats for the SAT (http://professionals.collegeboard.com/profdownload/sat-percentile-ranks-composite-cr-m-2010.pdf) indicate 4646 with 1560+ for V+M out of ~1500k.  Even if you suppose all of those scorers are 800M, that's still +2.7 SD.

    Also keep in mind that there's huge selection bias for the GRE already, so the percentiles are not what they would be for the general population.
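
The conversion from a tail count to an SD figure can be reproduced with the Python standard library. This is a rough sketch that assumes scores are normally distributed and treats the test-taking pool as if it were the whole population (both are simplifications); `tail_to_sd` is a hypothetical helper name.

```python
from statistics import NormalDist


def tail_to_sd(count_above, pool_size):
    """z-score whose upper tail contains count_above out of pool_size people."""
    tail = count_above / pool_size
    return NormalDist().inv_cdf(1 - tail)


# Figures quoted in the comment: 4646 scorers at 1560+ out of roughly 1,500,000.
z_sat = tail_to_sd(4646, 1_500_000)
print(f"SAT 1560+ within the test-taking pool: +{z_sat:.1f} SD")
```

Because the pool of test takers is itself selected, the same cutoff would sit somewhat higher relative to the general population.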

  50. Hao Ye 6:48 PM

    There have been plenty of papers showing heritability, so looking for a genetic basis is a logical next step.  I don't see it as a political minefield unless you start throwing in race and gender (see Larry Summers).

  51. sykes.1 8:18 AM

    Again, Angela is correct. I taught at colleges and universities for 37 years. Nowadays, each institution has an administrative office dedicated to detecting and suppressing politically incorrect ideas, usually imposing such punishments as outright dismissal, loss of salary or tenure or rank, or public humiliation. Prof. Hsu has a job only because the office at UO hasn't found his blog yet.

    By the way, Angela, it's "cojones."

  52. I made a 1500 (740 verbal, 760 math) and then a 1560 (760 verbal, 800 math) on the SATs in late 2006. Would my higher score get me in?

  53. Why is the genetic basis of intelligence controversial? This is the mainstream view in psychology and biology. Whether or not you or millions of average Americans like this, it will not prevent other countries from studying this to benefit their people, nor will it prevent the truth from coming out.
    Last I checked, even a scholar like Philippe Rushton is still tenured at the University of Western Ontario. If you call Steve's research controversial, what do you call Philippe Rushton's research?

  54. Biology is built on physical laws, unless you think intelligence has no biological basis.
    Plus have you ever heard of a term called polymath?

  55. James Lee 10:53 PM

    We will count your most recent score.

  56. whatisgoingon whatisgoingon 11:17 PM

    Aww. 800 math, 800 math 2, 800 physics, 740 verbal

    Damn, so close. I guess I have to wait for graduate school.

  57. whatisgoingon whatisgoingon 12:01 AM

    Not quite. The GRE-M is taken by college students applying to graduate school. So that means that those taking the test not only got into college, but then had higher-than-average GPAs there. So it may be closer to +2.5 SD for the math. You need to account for the fact that those applying to grad school probably have an average IQ of 110 at least. Well, hopefully.

  58. James Lee 10:54 AM

    We cannot give many additional details regarding the design of this study (or others we are carrying out) for several reasons. One is that our potential and actual collaborators may not want to be disclosed at the moment.

  59. TheGuyFromEarlier 3:31 PM

    My brother makes the cutoff.  Alas, I do not.  Le sigh.

    ...but mama says i'm good at other stuff...!  I can draw real good.

  60. MtMoru 11:40 PM

    I'm an automatic, but in the consent form there's this:
    At an advanced stage of the study, BGI-CGL may provide you access to your genetic data and interpretations thereof with respect to ancestry, disease risk, and predicted trait levels (including level of cognitive ability). Is that estimate of cognitive ability for the non-automatics only? If not there's a problem. If you

  61. MtMoru 11:52 PM

    I'm an automatic, but in the consent form there's this: At an advanced stage of the study, BGI-CGL may provide you access to your genetic data and interpretations thereof with respect to ancestry, disease risk, and predicted trait levels (including level of cognitive ability). Is that estimate of cognitive ability for the non-automatics only? If not there's a huge problem.

  62. The discrepancy between the trait prediction and your actual phenotype is an (increasingly noisy, as the relative contribution of environment increases) estimator of how much is still unknown about your genome.

  63. ben_g 2:17 PM

    I'm interested in the answer to point #1. On that note, what if you have a curiosity gene, which made you want to be a case? Or a teaching gene that made you want to be in a PhD program? Anything that separates the case group from smart people as a whole could confound the study.

  64. James Lee 7:28 PM

    Your genetic data will probably never provide as much information about your phenotype as measurements of the phenotype itself. If you want to know how fast you are, use a stopwatch; don't bother to measure your ACTN3 genotype. That said, even elite athletes often *are* curious about their ACTN3 genotype, and there seems to be no harm in allowing that itch to be scratched. 

    Hopefully we can tell whether an association arises from population stratification.

  65. James Lee 7:31 PM

    A case-control design that relies on volunteers to fill out the case group cannot get around this problem. There will need to be replication in other designs that do not suffer from this flaw.

  66. esmith 5:07 PM

    I was thinking about prospects of genetic engineering (what would a person with all IQ genes turned on look like?) and estimates on page 28 of the slides made me realize something.

    It assumes that intelligence is determined by many (10^3) genes of equal small effect. But it can't work like that! Either the number N must be much smaller, or some genes are significantly more important than others.

    Suppose that there are in fact N genes of equal effect. For simplicity, assume that they all have normal allele frequencies of 50%. Then we should be able to construct an "intelligence measure" equal to the share of positive alleles among these N, which correlates linearly with physically measurable quantities, e.g. the speed of solving Raven's matrices of fixed difficulty.

    If N=10^3, then the average person has 50% of positive alleles and the person at +3 SD has 57% of positive alleles. It means that the person at +3 SD would only be 10-15% better/faster on any such test. But that is obviously not the case.

    The very difficulty of devising tests that measure IQ much further than that would suggest that people at +3..4 SD have their abilities nearly "saturated", which could happen if N is rather low. For example, if N=50, then the median person has 25 positive alleles and the person at +4 SD has 47 positive alleles (and the remaining 3 would not matter much).

    Of course, the very idea of additive IQ is rather crude, because, at some points, there are qualitative shifts. Hence IQ is hard to reduce to a linear performance measure like height or a 100-meter sprint time. But still, that is a useful perspective.

    If I think about it some more, I should be able to come up with estimates of N and tests that help us measure it.
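
The scaling in this toy picture can be checked directly. The sketch below assumes the simplest counting convention, which may differ from the one behind the figures quoted above: each of the N loci contributes one allele that is positive with probability 0.5, so the number of positive alleles is binomial. `positive_share` is a hypothetical helper name.

```python
import math


def positive_share(n_loci, z, p=0.5):
    """Share of positive alleles for someone z SDs above the mean."""
    mean = n_loci * p
    sd = math.sqrt(n_loci * p * (1 - p))   # binomial standard deviation
    count = min(n_loci, mean + z * sd)     # cannot exceed the number of loci
    return count / n_loci


# Compare a small, a medium and a large number of loci at +3 SD.
for n in (50, 1000, 10000):
    print(n, round(positive_share(n, 3), 3))
```

Under this convention the excess over 50% at a fixed number of SDs shrinks like 1/sqrt(N), which is the scaling at issue here.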

  67. James Lee 6:12 PM

    The simplifying assumptions in that slide are made purely to allow the relevant point to be made with a minimum of complications. The assumptions themselves should not be taken too seriously.

    There is a quantitative-genetic literature on the estimation of gene number. To summarize, these methods are not very informative. 

  68. The model in the slides is just a toy model to illustrate scaling. In reality there will be distributions in effect sizes and allele frequencies in a particular population. See the height results which are starting to flesh this out for a different quantitative phenotype.

  69. esmith 8:48 PM

    Are you aware of any studies that quantify the relationship between processing speed and IQ? (Preferably the problem-solving processing speed, and not things like reaction time.) I'm trying to quantify it, and I'm getting curious results (see image). The huge dynamic range leads me to suspect that the mean frequency of IQ-positive alleles is very low (maybe 10-20%). But it's hard to reproduce the high-end behavior, regardless of the model I try to use.

  70. esmith 11:02 PM

    On second thought, that high-end behavior is EXACTLY what we should expect ... Suppose that we break down the time to execute a task into N pieces, and the time to execute each piece depends on a single gene, and the total time is a simple sum of all pieces. Having a few "strong" genes which can significantly reduce the processing time, and a lot of "weak" genes, each of which independently shaves off a percent or two, would produce the relationship between 'g' and processing speed as shown above.

    Let me see if I can come up with a good fit now.

  71. MtMoru 9:55 AM

    "Then we should be able to construct an "intelligence measure" equal to the share of positive alleles among these N, which correlates linearly with physically measurable quantities."
     
    If you wanted to, but there'd still be a bell shaped curve.
     
    "It means that the person at +3 SD would only be 10-15% better/faster on any such test."
     
    It doesn't mean that. The small effect is in IQ points. That's the measure of better/worse. With 500,000 SNPs, all with a +1/2 or -1/2 point effect for homozygotes and 0 points for heterozygotes, with probabilities 1/4, 1/4, 1/2, the SD is sqrt(500,000 × 1/8) = 250 points = A LOT, assuming no covariance.
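
The SD in this back-of-envelope model follows from summing independent per-SNP variances. A minimal sketch, assuming independent SNPs and the genotype probabilities stated above (1/4 for each homozygote, 1/2 for the heterozygote); `total_sd` is a hypothetical helper name.

```python
import math


def total_sd(n_snps, effect=0.5, p_plus=0.25, p_minus=0.25):
    """SD of a sum of n_snps independent effects of +effect, -effect or 0."""
    mean = effect * (p_plus - p_minus)              # per-SNP mean
    second_moment = effect ** 2 * (p_plus + p_minus)
    variance = second_moment - mean ** 2            # per-SNP variance
    return math.sqrt(n_snps * variance)             # variances add when independent


print(total_sd(500_000))
```

With the defaults the per-SNP variance is 1/8, so the total SD grows as sqrt(N/8); any covariance between SNPs would change this.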

  72. William_JD 9:46 AM

    On the "Volunteer" page, when I enter my email address and click submit, I receive the following message:  "Check your email for instructions."  It's been a week since I submitted my email address, and I have yet to receive any instructions.  When can I expect them?

  73. After you've entered your email address on the volunteer page once, any further submits don't send additional emails.  Check your spam filter on the day you first tried to submit.

  74. William_JD 11:16 AM

    I've received an email finally -- thanks.

  75. efalken 6:33 PM

    I thought humans had only 25k genes.  1000 relate to intelligence?  That seems a lot. Our bodies have a lot going on other than g-related activity.

  76. esmith 3:00 AM

    But a surprisingly large part of those 25k is responsible for brain development or functioning of the nervous system. There was an article a few years ago that estimated that 58% of human transcriptome is expressed in brains of at least 5% of humans. The human brain map at http://human.brain-map.org identifies around 1000 genes which may be relevant here.

  77. esmith 4:24 AM

    Hmm, I could swear I made a response to this, but it's not visible any more?

    Anyway. Have you ever heard of the protein domain DUF1220? This is a protein domain of unknown function that is encoded independently by at least 30 and possibly over 60 different genes (some of them also do it multiple times); it's highly specific to humans (we have 6 times the number of copies of higher apes and it's almost nonexistent in other mammals); it's expressed primarily in regions of the brain responsible for higher cognitive function, and its copy number variation is correlated with things like brain size, the risk of autism, and the risk of schizophrenia. I'd expect to see a correlation with IQ as well. That's 60 genes right there. And it's just one pathway of many.

  78. Steve,
    Are open discussions allowed on your blog? I felt compelled to comment on this study of intelligence, and yesterday I posted some points on what I felt was an incongruence between your intended study of intelligence and the qualifying criteria listed, but the post seems to have been deleted.

    A reply would be appreciated.

  79. Not sure why the Disqus spam filter grabbed your comment. But I've now released it.
