Tuesday, September 07, 2021

Kathryn Paige Harden Profile in The New Yorker (Behavior Genetics)

This is a good profile of behavior geneticist Paige Harden (UT Austin professor of psychology, former student of Eric Turkheimer), with a balanced discussion of polygenic prediction of cognitive traits and the culture war context in which it (unfortunately) exists.
Can Progressives Be Convinced That Genetics Matters? 
The behavior geneticist Kathryn Paige Harden is waging a two-front campaign: on her left are those who assume that genes are irrelevant, on her right those who insist that they’re everything. 
Gideon Lewis-Kraus
Gideon Lewis-Kraus is a talented writer who also wrote a very nice article on the NYTimes / Slate Star Codex hysteria last summer.

Some references related to the New Yorker profile:
1. The paper Harden was attacked for sharing while a visiting scholar at the Russell Sage Foundation: Game Over: Genomic Prediction of Social Mobility 

2. Harden's paper on polygenic scores and mathematics progression in high school: Genomic prediction of student flow through high school math curriculum 

3. Vox article; Turkheimer and Harden drawn into debate including Charles Murray and Sam Harris: Scientific Consensus on Cognitive Ability?

A recent talk by Harden, based on her forthcoming book The Genetic Lottery: Why DNA Matters for Social Equality

Regarding polygenic prediction of complex traits 

I first met Eric Turkheimer in person (we had corresponded online prior to that) at the Behavior Genetics Association annual meeting in 2012, which was back to back with the International Conference on Quantitative Genetics, both held in Edinburgh that year (photos and slides [1] [2] [3]). I was completely new to the field but they allowed me to give a keynote presentation (if memory serves, together with Peter Visscher). Harden may have been at the meeting but I don't recall whether we met. 

At the time, people were still doing underpowered candidate gene studies (there were many talks on this at BGA although fewer at ICQG) and struggling to understand GCTA (the Visscher group's work showing that one can estimate heritability from modestly large GWAS datasets, with results consistent with earlier twin and adoption work). Consequently a theoretical physicist talking about genomic prediction using AI/ML and a million genomes seemed like an alien time traveler from the future. Indeed, I was.

My talk is largely summarized here:
On the genetic architecture of intelligence and other quantitative traits 
How do genes affect cognitive ability or other human quantitative traits such as height or disease risk? Progress on this challenging question is likely to be significant in the near future. I begin with a brief review of psychometric measurements of intelligence, introducing the idea of a "general factor" or g score. The main results concern the stability, validity (predictive power), and heritability of adult g. The largest component of genetic variance for both height and intelligence is additive (linear), leading to important simplifications in predictive modeling and statistical estimation. Due mainly to the rapidly decreasing cost of genotyping, it is possible that within the coming decade researchers will identify loci which account for a significant fraction of total g variation. In the case of height analogous efforts are well under way. I describe some unpublished results concerning the genetic architecture of height and cognitive ability, which suggest that roughly 10k moderately rare causal variants of mostly negative effect are responsible for normal population variation. Using results from Compressed Sensing (L1-penalized regression), I estimate the statistical power required to characterize both linear and nonlinear models for quantitative traits. The main unknown parameter s (sparsity) is the number of loci which account for the bulk of the genetic variation. The required sample size is of order 100s, or roughly a million in the case of cognitive ability.
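The arithmetic behind the abstract's last sentence can be sketched in a few lines, reading "100s" as 100 times the sparsity s. The function name and the `factor` parameter are illustrative choices made here, not anything taken from the paper, and the value of s is only the abstract's rough figure of ~10k causal variants.

```python
def required_sample_size(sparsity: int, factor: int = 100) -> int:
    """Rule-of-thumb sample size n ~ factor * s, as quoted in the abstract."""
    return factor * sparsity


# s ~ 10k causal variants is the abstract's rough estimate, not a measured value.
s = 10_000
n = required_sample_size(s)
print(f"s = {s:,} loci -> n ~ {n:,} genotype-phenotype pairs")
```

With s of about 10k this gives a sample size of about one million, which is the figure quoted for cognitive ability.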
The predictions in my 2012 BGA talk and in the 2014 review article above have mostly been validated. Research advances often pass through the following phases of reaction from the scientific community:
1. It's wrong ("genes don't affect intelligence! anyway too complex to figure out... we hope")
2. It's trivial ("ofc with lots of data you can do anything... knew it all along")
3. I did it first ("please cite my important paper on this")
Or, as sometimes attributed to Gandhi: "First they ignore you, then they laugh at you, then they fight you, then you win."

Technical note

In 2014 I estimated that ~1 million genotype-phenotype pairs would be enough to capture most of the common SNP heritability for height and cognitive ability. This was accomplished for height in 2017. However, the sample of well-phenotyped individuals available for cognitive ability is much smaller, even in 2021, than the one available for height in 2017. For example, in UK Biobank the cognitive test is very brief (~5 minutes IIRC, a dozen or so questions), and it has not yet been administered to the full cohort. In the Educational Attainment studies the phenotype EA is only moderately correlated (~0.3 ?) with actual cognitive ability.

Hence, although the most recent EA4 results use 3 million individuals [1], and produce a predictor which correlates ~0.4 with actual EA, the statistical power available is still less than what I predicted would be required to train a really good cognitive ability predictor.

In our 2017 height paper, which also briefly discussed bone density and cognitive ability prediction, we built a cognitive ability predictor roughly as powerful as EA3 using only ~100k individuals with the noisy UKB test data. So I remain confident that ~1 million individuals with good cognitive scores (e.g., SAT, AFQT, full IQ test) would deliver results far beyond what we currently have available. We also found that our predictor, built using actual (albeit noisy) cognitive scores, exhibits less power reduction in within-family (sibling) analyses than EA predictors do. So there is evidence that (no surprise) EA is more influenced by environmental factors, including so-called genetic nurture effects, than is cognitive ability.

A predictor which captures most of the common SNP heritability for cognitive ability might correlate ~0.5 or 0.6 with actual ability. Applying such a predictor to existing datasets in, e.g., studies of social mobility, educational success, or even longevity would yield dramatic results.
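To put the correlations quoted in this note on a common scale, the sketch below converts each one to the fraction of phenotype variance accounted for, using the standard relation that a linear predictor correlating r with a trait accounts for r^2 of its variance. The function name and labels are illustrative, and the 0.5 and 0.6 values are the hypothetical figures from the paragraph above, not results.

```python
def variance_explained(r: float) -> float:
    """Fraction of trait variance accounted for by a linear predictor with correlation r."""
    return r * r


# Correlations quoted in the note; the last two are hypothetical projections.
examples = [
    ("EA4 predictor vs. actual EA", 0.4),
    ("cognitive ability predictor, lower guess", 0.5),
    ("cognitive ability predictor, upper guess", 0.6),
]

for label, r in examples:
    print(f"{label}: r = {r:.1f}, r^2 = {variance_explained(r):.2f}")
```

On this scale, moving from r of about 0.4 to about 0.6 roughly doubles the variance accounted for (0.16 versus 0.36).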
