GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment (Science)
Clueless extremists may want to point and shout "Eugenics!" at these researchers, but I wouldn't recommend it. Sample author affiliations below -- no sinister Chinese institutions as far as I can tell ;-)
A genome-wide association study of educational attainment was conducted in a discovery sample of 101,069 individuals and a replication sample of 25,490. Three independent SNPs are genome-wide significant (rs9320913, rs11584700, rs4851266), and all three replicate. Estimated effects sizes are small (R² ≈ 0.02%), approximately 1 month of schooling per allele. A linear polygenic score from all measured SNPs accounts for ≈ 2% of the variance in both educational attainment and cognitive function. Genes in the region of the loci have previously been associated with health, cognitive, and central nervous system phenotypes, and bioinformatics analyses suggest the involvement of the anterior caudate nucleus. These findings provide promising candidate SNPs for follow-up work, and our effect size estimates can anchor power analyses in social-science genetics.
1 Department of Applied Economics, Erasmus School of Economics, Erasmus University Rotterdam, 3000 DR Rotterdam, The Netherlands.
2 Department of Epidemiology, Erasmus Medical Center, Rotterdam 3000 CA, The Netherlands.
3 Queensland Institute of Medical Research, 300 Herston Road, Brisbane, Queensland 4006, Australia.
4 Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO 80309–0447, USA.
5 University of Queensland Diamantina Institute, The University of Queensland, Princess Alexandra Hospital, Brisbane, Queensland 4102, Australia.
125 Centre for Medical Systems Biology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands.
126 Department of Economics, Cornell University, Ithaca, NY 14853, USA.
127 Center for Experimental Social Science, Department of Economics, New York University, New York, NY 10012, USA.
128 Division of Social Science, New York University Abu Dhabi, PO Box 129188, Abu Dhabi, UAE.
129 Research Institute of Industrial Economics, Stockholm 102 15, Sweden.
I expect the future of this kind of research to look like earlier GWAS, with steady accumulation of hits now that we have passed the statistical power threshold.
Related: Myopia GWAS results (Nature Genetics).
Note Added: Several people have asked me whether this is a discouraging result. If effect sizes are so small, won't it take enormous sample sizes to detect specific alleles accounting for a big chunk of total genetic variance? There is some relevant discussion in the supplement to the paper (see figure S22 and section 7). The answer really depends on the correlation between g and years of education (most of the data these researchers had access to specified educational attainment as the phenotype, with no direct measurement of g). If, for example, the correlation is 0.5, and it is actually g that is driving the effect on years of education, then the corresponding g effect size for these alleles is (1/0.5)^2 = 4 times larger in variance units. That would make the g effect size, in variance units, about 5 times smaller than that of the largest height locus. However, if the correlation is only 0.25, the factor is (1/0.25)^2 = 16, and the g effect is about as big as the largest height locus.
Having looked at correlations between SAT and college GPA, I'd guess that 0.5 is too large; on the other hand, in the Swedish sample for which they have both g and years of education, the correlation is 0.46. Taking 0.5 as the correct correlation, the minimal sample size with actual g data needed to detect these three SNPs is in the 20-50k range (see Fig S22). I'd guess that, worst case, the sample size requirements are still less than an order of magnitude larger for g than for height. However, one can't be very confident of any guess because of the uncertainties discussed above, and because we've only seen the first few alleles.
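For anyone who wants to play with the numbers, here is a minimal sketch of the arithmetic above. It takes the per-SNP R² of 0.02% from the abstract and rescales it by 1/corr², assuming the SNP affects years of education only through g. The sample-size function is a textbook normal approximation, not the calculation behind Fig S22, and the significance threshold (5e-8) and 80% power are conventional values I am assuming, not numbers taken from the paper. The function names are made up for illustration.

```python
from statistics import NormalDist

# Per-SNP variance explained in years of education (0.02%, from the abstract).
R2_EDU = 0.0002


def g_variance_explained(r2_edu: float, corr: float) -> float:
    """Rescale R^2 for education to R^2 for g.

    Assumes the SNP acts on education only through g, so the
    education effect is the g effect attenuated by corr**2.
    """
    return r2_edu / corr ** 2


def approx_sample_size(r2: float, alpha: float = 5e-8, power: float = 0.8) -> float:
    """Rough N needed to detect one SNP explaining a fraction r2 of variance.

    Normal approximation: N ~ (z_{1 - alpha/2} + z_{power})^2 / r2.
    alpha and power are assumed conventional values, not the paper's.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) ** 2 / r2


if __name__ == "__main__":
    for corr in (0.5, 0.25):
        r2_g = g_variance_explained(R2_EDU, corr)
        n = approx_sample_size(r2_g)
        print(f"corr={corr}: R2 for g = {r2_g:.4%}, approx N = {n:,.0f}")
```

With a correlation of 0.5 the rescaled R² is 0.08%, and this crude approximation puts the required sample on the order of 50,000, at the upper end of the range quoted above. Different power or threshold choices move the number around, so treat it as a sanity check rather than a substitute for the supplement.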