Wednesday, June 10, 2015

Sparsity estimates for complex traits

Note the estimate of a few thousand to ten thousand causal SNP variants, consistent with my estimates for height and cognitive ability.

Sparsity (number of causal variants), along with heritability, determines the amount of data necessary to "solve" a specific trait. See Genetic architecture and predictive modeling of quantitative traits.

T1D looks like it could be cracked with only a limited amount of data.
Simultaneous Discovery, Estimation and Prediction Analysis of Complex Traits Using a Bayesian Mixture Model

PLoS Genet 11(4): e1004969. doi:10.1371/journal.pgen.1004969

Gene discovery, estimation of heritability captured by SNP arrays, inference on genetic architecture and prediction analyses of complex traits are usually performed using different statistical models and methods, leading to inefficiency and loss of power. Here we use a Bayesian mixture model that simultaneously allows variant discovery, estimation of genetic variance explained by all variants and prediction of unobserved phenotypes in new samples. We apply the method to simulated data of quantitative traits and Wellcome Trust Case Control Consortium (WTCCC) data on disease and show that it provides accurate estimates of SNP-based heritability, produces unbiased estimators of risk in new samples, and that it can estimate genetic architecture by partitioning variation across hundreds to thousands of SNPs. We estimated that, depending on the trait, 2,633 to 9,411 SNPs explain all of the SNP-based heritability in the WTCCC diseases. The majority of those SNPs (>96%) had small effects, confirming a substantial polygenic component to common diseases. The proportion of the SNP-based variance explained by large effects (each SNP explaining 1% of the variance) varied markedly between diseases, ranging from almost zero for bipolar disorder to 72% for type 1 diabetes. Prediction analyses demonstrate that for diseases with major loci, such as type 1 diabetes and rheumatoid arthritis, Bayesian methods outperform profile scoring or mixed model approaches.
Table S5 below gives estimates of sparsity for various disease conditions.


Coronary artery disease (CAD)
Type 1 diabetes (T1D)
Type 2 diabetes (T2D)
Crohn's disease (CD)
Hypertension (HT)
Bipolar disorder (BD)
Rheumatoid arthritis (RA)
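
To make the idea of sparsity in a mixture model concrete, here is a toy sketch in Python. It is not the paper's method or code. It only draws SNP effects from a zero-mean normal mixture in which most SNPs fall in a zero-effect component, then counts the nonzero effects (the sparsity) and reports how the summed squared effects split across components. The SNP count, mixing proportions and component variances are hypothetical values chosen for illustration, and the sum of squared effects stands in for genetic variance only under the assumption of standardized, uncorrelated genotypes.

```python
import random


def simulate_effects(n_snps, mix_props, comp_vars, seed=0):
    """Draw one effect per SNP from a zero-mean normal mixture.

    Returns a list of (component_index, effect) pairs.
    Component 0 is assumed to have variance 0 (no effect).
    """
    rng = random.Random(seed)
    components = rng.choices(range(len(mix_props)), weights=mix_props, k=n_snps)
    effects = []
    for k in components:
        sd = comp_vars[k] ** 0.5
        beta = rng.gauss(0.0, sd) if sd > 0 else 0.0
        effects.append((k, beta))
    return effects


def summarize(effects, n_components):
    """Return (number of SNPs outside the zero component,
    each component's share of the summed squared effects)."""
    sparsity = sum(1 for k, _ in effects if k != 0)
    shares = [0.0] * n_components
    for k, beta in effects:
        shares[k] += beta * beta
    total = sum(shares)
    if total > 0:
        shares = [s / total for s in shares]
    return sparsity, shares


# Hypothetical illustration values, NOT estimates from the paper.
N_SNPS = 300_000
MIX_PROPS = [0.98, 0.018, 0.0018, 0.0002]  # zero, small, medium, large
COMP_VARS = [0.0, 1e-4, 1e-3, 1e-2]        # relative effect variances

effects = simulate_effects(N_SNPS, MIX_PROPS, COMP_VARS)
sparsity, shares = summarize(effects, len(COMP_VARS))

print("SNPs with nonzero effect:", sparsity)
print("share of squared effects by component:", [round(s, 3) for s in shares])
```

Changing MIX_PROPS shifts the balance between many small effects and a few large ones, which is the kind of difference the abstract describes between bipolar disorder and type 1 diabetes.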
