Tuesday, April 01, 2014

Sequencing and GWAS

A very nice discussion of the challenges associated with sequence data, as opposed to SNP array output, in GWAS. All of these issues are familiar to our team as we work with our high cognitive ability sample at BGI.

8 Realities of the Sequencing GWAS

For several years, the genome-wide association study (GWAS) has served as the flagship discovery tool for genetic research, especially in the arena of common diseases. The wide availability and low cost of high-density SNP arrays made it possible to genotype 500,000 or so informative SNPs in thousands of samples. These studies spurred development of tools and pipelines for managing large-scale GWAS, and thus far they’ve revealed hundreds of new genetic associations.

As we all know, the cost of DNA sequencing has plummeted. Now it’s possible to do targeted, exome, or even whole-genome sequencing in cohorts large enough to power GWAS analyses. While we can leverage many of the same tools and approaches developed for SNP array-based GWAS, the sequencing data comes with some very important differences.

...

These caveats of the sequencing GWAS, while important, should not detract from the advantages over SNP array-based experiments. Sequencing studies enable the discovery, characterization, and association of many forms of sequence variation — SNPs, DNPs, indels, etc. — in a single experiment. They capture known as well as unknown variants.

Sequencing also produces an archive that can be revisited and re-analyzed in the future. That’s why submitting BAM files and good clinical data to public repositories — like dbGaP — is so important. Single analyses and meta-analyses of sequencing GWAS may ultimately help us understand the contribution of all forms of genetic variation (common, rare, SNPs, indels) to important human traits.
