New paper, prepared for the book Genomic Prediction of Complex Traits, in the Springer Nature series Methods in Molecular Biology.
From Genotype to Phenotype: polygenic prediction of complex human traits
arXiv:2101.05870 (q-bio), 33 pages, 7 figures, 1 table
Timothy G. Raben, Louis Lello, Erik Widen, Stephen D.H. Hsu
Decoding the genome confers the capability to predict characteristics of the organism (phenotype) from DNA (genotype). We describe the present status and future prospects of genomic prediction of complex traits in humans. Some highly heritable complex phenotypes such as height and other quantitative traits can already be predicted with reasonable accuracy from DNA alone. For many diseases, including important common conditions such as coronary artery disease, breast cancer, type I and II diabetes, individuals with outlier polygenic scores (e.g., top few percent) have been shown to have 5 or even 10 times higher risk than average. Several psychiatric conditions such as schizophrenia and autism also fall into this category. We discuss related topics such as the genetic architecture of complex traits, sibling validation of polygenic scores, and applications to adult health, in vitro fertilization (embryo selection), and genetic engineering.
From the introduction:
I, on the other hand, knew nothing, except ... physics and mathematics and an ability to turn my
hand to new things. — Francis Crick
The challenge of decoding the genome has loomed large over biology since the time of Watson and Crick. Initially, decoding referred to the relationship between DNA and specific proteins
or molecular mechanisms, but the ultimate goal is to deduce the relationship between DNA and
phenotype — the character of the organism itself. How does Nature encode the traits of the organism in DNA? In this review we describe recent advances toward this goal, which have resulted
from the application of machine learning (ML) to large genomic data sets. Genomic prediction is
the real decoding of the genome: the creation of mathematical models which map genotypes to
complex traits.
It is a peculiarity of ML and artificial intelligence (AI) applied to complex systems that these
methods can often “solve” a problem without explicating, in a manner that humans can absorb,
the intricate mechanisms that lie intermediate between input and output. For example, AlphaGo
[1] achieved superhuman mastery of an ancient game that had been under serious study for
thousands of years. Yet nowhere in the resulting neural network with millions of connection
strengths is there a human-comprehensible guide to Go strategy or game dynamics. Similarly, genomic prediction has produced mathematical functions which predict quantitative human traits
with surprising accuracy — e.g., height, bone density, and cholesterol or lipoprotein A levels in
blood (see Table 1); using typically thousands of genetic variants as input (see next section for
details) — but without explicitly revealing the role of these variants in actual biochemical mechanisms. Characterizing these mechanisms — which are involved in phenomena such as bone
growth, lipid metabolism, hormonal regulation, protein interactions — will be a project which
takes much longer to complete.
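To make "thousands of genetic variants as input" concrete, here is a minimal sketch of one common form such a predictor can take: a polygenic score computed as a weighted sum of allele counts. The excerpt does not specify the model, so the additive form, the function names, and all of the numbers below (weights, allele frequencies, population size) are illustrative assumptions on simulated data, not the authors' predictors or results.

```python
import random

def polygenic_score(genotype, weights):
    """Weighted sum of allele counts (0, 1, or 2) over the selected variants."""
    return sum(w * g for w, g in zip(weights, genotype))

def percentile_rank(score, population_scores):
    """Fraction of the population scoring strictly below `score`."""
    below = sum(1 for s in population_scores if s < score)
    return below / len(population_scores)

# Everything below is simulated purely for illustration (assumption).
rng = random.Random(0)
n_variants = 1000
weights = [rng.gauss(0.0, 0.05) for _ in range(n_variants)]    # hypothetical effect sizes
freqs = [rng.uniform(0.05, 0.95) for _ in range(n_variants)]   # hypothetical allele frequencies

def random_genotype():
    # Two independent allele draws per variant give a count of 0, 1, or 2.
    return [(rng.random() < f) + (rng.random() < f) for f in freqs]

population = [polygenic_score(random_genotype(), weights) for _ in range(500)]
individual = polygenic_score(random_genotype(), weights)
rank = percentile_rank(individual, population)
print(f"score={individual:.3f}, percentile={100 * rank:.1f}")
```

Note that the score is just a number attached to a genotype. Nothing in the weights says what any variant does biochemically, which is the point the paragraph above makes about prediction without mechanism.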
If recent trends persist, in particular the continued growth of large genotype | phenotype
data sets, we will likely have good genomic predictors for a host of human traits within the next
decade. ...