The new material is in slides 13-17. See also Live Long and Prosper: Genetic Architecture of Complex Traits and Disease Risk Predictors. I believe the sibling validation results are extremely important: typically, most of the predictive power persists in within-family validation tests. We have not released this paper but will soon -- the slides are a preview. To be honest, I fully anticipated these results: the large number of out-of-sample predictor validations using unrelated individuals strongly suggests that real genetic effects are at work. However, many people are irrationally biased against -- that is, have strong priors against -- genetic causation of complex traits (even disease risks). These family designs provide important "gold standard" evidence which, one can hope, will enlighten even the most stubborn. The sad alternative is progress one funeral at a time...
Otherwise the talk is similar to the one I gave at the Berkeley/UCSF Innovative Genomics Institute last summer. Video of IGI talk.
Title: Genomic Prediction of Complex Traits and Disease Risks via AI/ML and Large Genomic Datasets

Some photos. The ones on the wall of the seminar room capture a golden era in molecular biology and the study of DNA. Leo Szilard on the right in the one below. Also, Jacques Monod, Crick and Watson, Wally Gilbert, Max Delbruck, Frank Stahl, Francois Jacob, David Baltimore. Of these individuals I have known four in person. I would give a lot to have met Crick and especially Szilard. While at CSHL I learned that James Watson is still alive and intellectually active.
Abstract: The talk is divided into two parts. The first gives an overview of the rapidly advancing area of genomic prediction of disease risks using polygenic scores. We can now identify risk outliers (e.g., with 5 or 10 times normal risk) for about 20 common disease conditions, ranging from diabetes to heart disease to breast cancer, using inexpensive SNP genotypes (such as those offered by 23andMe). We can also predict some complex quantitative traits (e.g., adult height with an accuracy of a few cm, using ~20k SNPs). I discuss application of these results in precision medicine as well as embryo selection in IVF, and give some details about genetic architectures. The second part covers the AI/ML used to build these predictors, with an emphasis on "sparse learning" and phase transitions in high dimensional statistics.
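For readers unfamiliar with the term, here is a toy sketch of what a polygenic score is: a weighted sum of allele counts over SNPs, which can then be standardized across a cohort to find outliers. This is only an illustration under simple assumptions, not the predictors discussed in the talk. The weights, genotypes, and function names below are all hypothetical, and real predictors use thousands of SNPs with weights learned from large datasets.

```python
from statistics import mean, pstdev

# Hypothetical per-SNP effect weights (made-up values for illustration only).
WEIGHTS = [0.5, -0.25, 0.0, 1.0]


def polygenic_score(dosages, weights=WEIGHTS):
    """Weighted sum of allele dosages (0, 1, or 2 copies per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Convert raw scores to z-scores relative to the cohort."""
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / sigma for s in scores]


# Toy cohort: each row is one individual's dosages at the four SNPs above.
COHORT = [
    [0, 1, 2, 0],
    [1, 1, 0, 0],
    [2, 0, 1, 2],
    [0, 2, 2, 0],
]

SCORES = [polygenic_score(g) for g in COHORT]
Z_SCORES = standardize(SCORES)

# The individual with the largest z-score is the candidate "risk outlier".
OUTLIER_INDEX = Z_SCORES.index(max(Z_SCORES))
```

Note that a zero weight (the third SNP here) means that SNP contributes nothing to the score. Sparse learning methods tend to produce predictors in which most candidate SNPs end up with exactly zero weight.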
See H. Judson's The Eighth Day of Creation (PDF) for a brilliant but readable history of the golden age of molecular biology.