8 Realities of the Sequencing GWAS
For several years, the genome-wide association study (GWAS) has served as the flagship discovery tool for genetic research, especially in the arena of common diseases. The wide availability and low cost of high-density SNP arrays made it possible to genotype 500,000 or so informative SNPs in thousands of samples. These studies spurred development of tools and pipelines for managing large-scale GWAS, and thus far they’ve revealed hundreds of new genetic associations.
As we all know, the cost of DNA sequencing has plummeted. Now it’s possible to do targeted, exome, or even whole-genome sequencing in cohorts large enough to power GWAS analyses. While we can leverage many of the same tools and approaches developed for SNP array-based GWAS, the sequencing data comes with some very important differences.
...
These caveats of the sequencing GWAS, while important, should not detract from the advantages over SNP array-based experiments. Sequencing studies enable the discovery, characterization, and association of many forms of sequence variation — SNPs, DNPs, indels, etc. — in a single experiment. They capture known as well as unknown variants.
Sequencing also produces an archive that can be revisited and re-analyzed in the future. That’s why submitting BAM files and good clinical data to public repositories — like dbGaP — is so important. Single analyses and meta-analyses of sequencing GWAS may ultimately help us understand the contribution of all forms of genetic variation (common, rare, SNPs, indels) to important human traits.
Pessimism of the Intellect, Optimism of the Will Favorite posts | Manifold podcast | Twitter: @hsu_steve
Tuesday, April 01, 2014
Sequencing and GWAS
A very nice discussion of the challenges associated with sequence data, as opposed to SNP array output, in GWAS. All of these issues are familiar to our team as we work with our high cognitive ability sample at BGI.
Labels:
bgi,
genetics,
genomics,
technology
Subscribe to:
Post Comments (Atom)
Blog Archive
-
▼
2014
(184)
-
▼
04
(12)
- Larry's rules
- The Soul of the Research University
- Early intervention and long term outcomes
- No Exit
- Two cultures in Chicago
- Cognition über alles
- Human Capital, Genetics and Behavior
- The essential difference
- Measuring Wealth Inequality
- CRISPR symposium at MSU
- Implications of cosmological tensor modes
- Sequencing and GWAS
-
▼
04
(12)
Labels
- physics (420)
- genetics (325)
- globalization (301)
- genomics (295)
- technology (282)
- brainpower (280)
- finance (275)
- american society (261)
- China (249)
- innovation (231)
- ai (206)
- economics (202)
- psychometrics (190)
- science (172)
- psychology (169)
- machine learning (166)
- biology (163)
- photos (162)
- genetic engineering (150)
- universities (150)
- travel (144)
- podcasts (143)
- higher education (141)
- startups (139)
- human capital (127)
- geopolitics (124)
- credit crisis (115)
- political correctness (108)
- iq (107)
- quantum mechanics (107)
- cognitive science (103)
- autobiographical (97)
- politics (93)
- careers (90)
- bounded rationality (88)
- social science (86)
- history of science (85)
- realpolitik (85)
- statistics (83)
- elitism (81)
- talks (80)
- evolution (79)
- credit crunch (78)
- biotech (76)
- genius (76)
- gilded age (73)
- income inequality (73)
- caltech (68)
- books (64)
- academia (62)
- history (61)
- intellectual history (61)
- MSU (60)
- sci fi (60)
- harvard (58)
- silicon valley (58)
- mma (57)
- mathematics (55)
- education (53)
- video (52)
- kids (51)
- bgi (48)
- black holes (48)
- cdo (45)
- derivatives (43)
- neuroscience (43)
- affirmative action (42)
- behavioral economics (42)
- economic history (42)
- literature (42)
- nuclear weapons (42)
- computing (41)
- jiujitsu (41)
- physical training (40)
- film (39)
- many worlds (39)
- quantum field theory (39)
- expert prediction (37)
- ufc (37)
- bjj (36)
- bubbles (36)
- mortgages (36)
- google (35)
- race relations (35)
- hedge funds (34)
- security (34)
- von Neumann (34)
- meritocracy (31)
- feynman (30)
- quants (30)
- taiwan (30)
- efficient markets (29)
- foo camp (29)
- movies (29)
- sports (29)
- music (28)
- singularity (27)
- entrepreneurs (26)
- conferences (25)
- housing (25)
- obama (25)
- subprime (25)
- venture capital (25)
- berkeley (24)
- epidemics (24)
- war (24)
- wall street (23)
- athletics (22)
- russia (22)
- ultimate fighting (22)
- cds (20)
- internet (20)
- new yorker (20)
- blogging (19)
- japan (19)
- scifoo (19)
- christmas (18)
- dna (18)
- gender (18)
- goldman sachs (18)
- university of oregon (18)
- cold war (17)
- cryptography (17)
- freeman dyson (17)
- smpy (17)
- treasury bailout (17)
- algorithms (16)
- autism (16)
- personality (16)
- privacy (16)
- Fermi problems (15)
- cosmology (15)
- happiness (15)
- height (15)
- india (15)
- oppenheimer (15)
- probability (15)
- social networks (15)
- wwii (15)
- fitness (14)
- government (14)
- les grandes ecoles (14)
- neanderthals (14)
- quantum computers (14)
- blade runner (13)
- chess (13)
- hedonic treadmill (13)
- nsa (13)
- philosophy of mind (13)
- research (13)
- aspergers (12)
- climate change (12)
- harvard society of fellows (12)
- malcolm gladwell (12)
- net worth (12)
- nobel prize (12)
- pseudoscience (12)
- Einstein (11)
- art (11)
- democracy (11)
- entropy (11)
- geeks (11)
- string theory (11)
- television (11)
- Go (10)
- ability (10)
- complexity (10)
- dating (10)
- energy (10)
- football (10)
- france (10)
- italy (10)
- mutants (10)
- nerds (10)
- olympics (10)
- pop culture (10)
- crossfit (9)
- encryption (9)
- eugene (9)
- flynn effect (9)
- james salter (9)
- simulation (9)
- tail risk (9)
- turing test (9)
- alan turing (8)
- alpha (8)
- ashkenazim (8)
- data mining (8)
- determinism (8)
- environmentalism (8)
- games (8)
- keynes (8)
- manhattan (8)
- new york times (8)
- pca (8)
- philip k. dick (8)
- qcd (8)
- real estate (8)
- robot genius (8)
- success (8)
- usain bolt (8)
- Iran (7)
- aig (7)
- basketball (7)
- free will (7)
- fx (7)
- game theory (7)
- hugh everett (7)
- inequality (7)
- information theory (7)
- iraq war (7)
- markets (7)
- paris (7)
- patents (7)
- poker (7)
- teaching (7)
- vietnam war (7)
- volatility (7)
- anthropic principle (6)
- bayes (6)
- class (6)
- drones (6)
- econtalk (6)
- empire (6)
- global warming (6)
- godel (6)
- intellectual property (6)
- nassim taleb (6)
- noam chomsky (6)
- prostitution (6)
- rationality (6)
- academia sinica (5)
- bobby fischer (5)
- demographics (5)
- fake alpha (5)
- kasparov (5)
- luck (5)
- nonlinearity (5)
- perimeter institute (5)
- renaissance technologies (5)
- sad but true (5)
- software development (5)
- solar energy (5)
- warren buffet (5)
- 100m (4)
- Poincare (4)
- assortative mating (4)
- bill gates (4)
- borges (4)
- cambridge uk (4)
- censorship (4)
- charles darwin (4)
- computers (4)
- creativity (4)
- hormones (4)
- humor (4)
- judo (4)
- kerviel (4)
- microsoft (4)
- mixed martial arts (4)
- monsters (4)
- moore's law (4)
- soros (4)
- supercomputers (4)
- trento (4)
- 200m (3)
- babies (3)
- brain drain (3)
- charlie munger (3)
- cheng ting hsu (3)
- chet baker (3)
- correlation (3)
- ecosystems (3)
- equity risk premium (3)
- facebook (3)
- fannie (3)
- feminism (3)
- fst (3)
- intellectual ventures (3)
- jim simons (3)
- language (3)
- lee kwan yew (3)
- lewontin fallacy (3)
- lhc (3)
- magic (3)
- michael lewis (3)
- mit (3)
- nathan myhrvold (3)
- neal stephenson (3)
- olympiads (3)
- path integrals (3)
- risk preference (3)
- search (3)
- sec (3)
- sivs (3)
- society generale (3)
- systemic risk (3)
- thailand (3)
- twitter (3)
- alibaba (2)
- bear stearns (2)
- bruce springsteen (2)
- charles babbage (2)
- cloning (2)
- david mamet (2)
- digital books (2)
- donald mackenzie (2)
- drugs (2)
- dune (2)
- exchange rates (2)
- frauds (2)
- freddie (2)
- gaussian copula (2)
- heinlein (2)
- industrial revolution (2)
- james watson (2)
- ltcm (2)
- mating (2)
- mba (2)
- mccain (2)
- monkeys (2)
- national character (2)
- nicholas metropolis (2)
- no holds barred (2)
- offices (2)
- oligarchs (2)
- palin (2)
- population structure (2)
- prisoner's dilemma (2)
- singapore (2)
- skidelsky (2)
- socgen (2)
- sprints (2)
- star wars (2)
- ussr (2)
- variance (2)
- virtual reality (2)
- war nerd (2)
- abx (1)
- anathem (1)
- andrew lo (1)
- antikythera mechanism (1)
- athens (1)
- atlas shrugged (1)
- ayn rand (1)
- bay area (1)
- beats (1)
- book search (1)
- bunnie huang (1)
- car dealers (1)
- carlos slim (1)
- catastrophe bonds (1)
- cdos (1)
- ces 2008 (1)
- chance (1)
- children (1)
- cochran-harpending (1)
- cpi (1)
- david x. li (1)
- dick cavett (1)
- dolomites (1)
- eharmony (1)
- eliot spitzer (1)
- escorts (1)
- faces (1)
- fads (1)
- favorite posts (1)
- fiber optic cable (1)
- francis crick (1)
- gary brecher (1)
- gizmos (1)
- greece (1)
- greenspan (1)
- hypocrisy (1)
- igon value (1)
- iit (1)
- inflation (1)
- information asymmetry (1)
- iphone (1)
- jack kerouac (1)
- jaynes (1)
- jazz (1)
- jfk (1)
- john dolan (1)
- john kerry (1)
- john paulson (1)
- john searle (1)
- john tierney (1)
- jonathan littell (1)
- las vegas (1)
- lawyers (1)
- lehman auction (1)
- les bienveillantes (1)
- lowell wood (1)
- lse (1)
- machine (1)
- mcgeorge bundy (1)
- mexico (1)
- michael jackson (1)
- mickey rourke (1)
- migration (1)
- money:tech (1)
- myron scholes (1)
- netwon institute (1)
- networks (1)
- newton institute (1)
- nfl (1)
- oliver stone (1)
- phil gramm (1)
- philanthropy (1)
- philip greenspun (1)
- portfolio theory (1)
- power laws (1)
- pyschology (1)
- randomness (1)
- recession (1)
- sales (1)
- skype (1)
- standard deviation (1)
- starship troopers (1)
- students today (1)
- teleportation (1)
- tierney lab blog (1)
- tomonaga (1)
- tyler cowen (1)
- venice (1)
- violence (1)
- virtual meetings (1)
- wealth effect (1)
7 comments:
"Expect Many Ugly Variants... sequencing picks up a lot of the 'ugly' variants that SNP arrays so meticulously avoid: low-complexity regions, indels, triallelic SNPs, tandem repeats, etc... These caveats of the sequencing GWAS, while important, should not detract from the advantages over SNP array-based experiments."
Oh, no! Sequencing will study all of the candidate repeat alleles that the GWAS Jihadists have been dismissing all this time without any evidence at all. Thanks for the warning!
Didn't some mathematicians recently figure out that a whole other set of genes were hidden in centromeres. When does the real whole-genome analysis begin?
But in the end it's still just an attempt to find the best linear
approximation to the function which takes vertices of a very high
dimensional hypercube to the quantitative trait or conditional
probability of a binary trait. And obviously a least squares linear
approximation means nothing if the conditional expectation surface isn't
linear.
As soon as one "looks" at the real world gxe to trait function this best linear approximation looks very bad and ridiculous.
So multiple gwases will overlap, but the hard core of non-zero variants will be elusive. It's a wild goose chase.
I like two quotes Vischer gave in his review paper:
From Tim Crow, Molecular Psychiatry 20112:
“There comes a point at which the genetic skeptic can be pardoned
the suggestion that if the genes are so small and so multiple, what they
are hardly matters, the dividing line between polygenes and no genes is
of little practical consequence. Have we reached this point”?
Jonathan Latham, on guardian.co.uk, 17 April 2011:
“Among all the genetic findings for common illnesses, such as heart
disease, cancer and mental illnesses, only a handful are of genuine
significance for human health. Faulty genes rarely cause, or even mildly
predispose us, to disease, and as a consequence the science of human
genetics is in deep crisis.Since the Collins paper [Manolio et al.
20093] was published nothing has happened to change that conclusion. It
now seems that the original twin-study critics were more right than they
imagined. The most likely explanation for why genes for common diseases
have not been found is that, with few exceptions, they do not exist.”
But in the end it's still just an attempt to find the best linear approximation to the function which takes vertices of a very high dimensional unit cube to the quantitative trait or conditional probability of a binary trait.
As soon as one "looks" at the real world gxe to trait function this best linear approximation looks very bad.
So multiple gwases with L1 "penalization" will overlap, but the hard core will be elusive. It's a wild goose chase.
From Tim Crow, Molecular Psychiatry 20112:
"There comes a point at which the genetic skeptic can be pardoned the suggestion that if the genes are so small and so multiple, what they are hardly matters, the dividing line between polygenes and no genes is of little practical consequence. Have we reached this point?"
For various reasons, I thought genome sequencing would have diminishing marginal returns compared to GWAS.
In the real world plant and animal breeders have fairly accurate predictive models built using similar methods. BTW, effect sizes are not 0 or 1.
The vertices of the unit cube indicate genome not effect size. For a long list of variants, the genome is a sequence of 1s and 0s. The linear estimate vector is the vector of effect sizes. Multiply them and you get the estimate of trait or probability of trait. Some vertices will have no value as some 1s require 0s in the genome vector.
And BTW, humans are homogeneous at genetic level compared to apes, yet are more heterogeneous in behavior than any species. An eskimo isn't born knowing how to hunt narwhal. Domesticated animals and plants may be even more homogeneous genetically, but behavior isn't like other traits, especially in humans.
Plomin, R., Caspi, A., Corley, R., Fulker, D.W., & DeFries, J. (1998).
Adoption results for self-reported personality: Evidence for nonadditive
genetic effects? Journal of Personality and Social Psychology, 75, 211-218.
In contrast to twin studies, we find little evidence for hereditary or shared
environmental influences in parent-offspring and sibling analyses of self-report
personality data. Although several factors might contribute to the discrepancy
between twin and adoption results, we suggest that nonadditive genetic
effects, which can be detected by twin studies but not adoption studies,
are a likely culprit. This has important implications for attempts to
identify specific genes responsible for genetic influence on personality.
Yeah. That's it. That's the ticket.
Post a Comment