Wednesday, October 30, 2013

Project Einstein


I met Jonathan Rothberg, a real pioneer in genetic sequencing technology, at Scifoo back in 2008 (see Gene machines). Jonathan's foundation is now backing an effort similar to the BGI Cognitive Genomics project. He may not remember, but we had a long conversation about this topic on the bus from the hotel to the Googleplex.

I've agreed to participate in Project Einstein (I am not worthy!) as a DNA donor, and I hope that our projects will someday share data and resources. Rothberg's attitude is typical of a true innovator: damn the critics, full speed ahead!
Nature: He founded two genetic-sequencing companies and sold them for hundreds of millions of dollars. He helped to sequence the genomes of a Neanderthal man and James Watson, who co-discovered DNA’s double helix. Now, entrepreneur Jonathan Rothberg has set his sights on another milestone: finding the genes that underlie mathematical genius.

Rothberg and physicist Max Tegmark, who is based at the Massachusetts Institute of Technology in Cambridge, have enrolled about 400 mathematicians and theoretical physicists from top-ranked US universities in a study dubbed ‘Project Einstein’. They plan to sequence the participants’ genomes using the Ion Torrent machine that Rothberg developed.

The team will be wading into a field fraught with controversy. Critics have assailed similar projects, such as one at the BGI (formerly the Beijing Genomics Institute) in Shenzhen, China, that is sequencing the genomes of 1,600 people identified as mathematically precocious children in the 1970s (see Nature 497, 297–299; 2013).

... Rothberg has long been interested in cognition. He is also in awe of the abilities of famous scientists. “Einstein said ‘the most incomprehensible thing about the Universe is that it is comprehensible’,” he says. “I’d love to find the genes that make the Universe comprehensible.”

There is precedent to the concept of sequencing extreme outliers in a population in the hunt for influential genes. Scientists have used the technique to sift for genes that influence medical conditions such as high blood pressure and bone loss. Some behavioural geneticists, such as Robert Plomin at King’s College London, who is involved with the BGI project, say that there is no reason that this same approach won’t work for maths ability. As much as two-thirds of a child’s mathematical aptitude seems to be influenced by genes (Y. Kovas et al. Psychol. Sci. 24, 2048–2056; 2013).

... The Rothberg Institute for Childhood Diseases, Rothberg’s private foundation based in Guilford, Connecticut, is the study’s sponsor. But Rothberg won’t say who is funding the project, which other geneticists estimate will cost at least US$1 million. Some speculate that Rothberg is funding it himself. In 2001, Fortune estimated his net worth to be $168 million, and that was before he sold the sequencing companies he founded — 454 Life Sciences and Ion Torrent, both based in Connecticut — for a combined total of $880 million.

Rothberg is adamant that the project is well worth the time and the money, whoever is paying for it. “This study may not work at all,” he says — before adding, quickly, that it “is not a crazy thing to do”. For a multimillionaire with time on his hands, that seems to be justification enough.
Let me repeat the scientific motivations for this type of project. The human brain is arguably the most complex object we know of in the universe. Yet, it is constructed from a blueprint containing less than a few gigabits of information. Unlocking the genetic architecture of cognition is one of the greatest challenges -- now feasible in the age of genomics that Rothberg and others helped bring into existence.

For a discussion of previous GWAS results on general cognition, and their implications for the prospects of studies like Project Einstein, see First GWAS hits for cognitive ability. For general background on the science, watch this video. Or read these: MIRI interview, FAQ.

Nabokov on teaching


Nabokov was professor of literature at Cornell from 1948-1959. The excerpt below is from a 1964 Playboy interview, reproduced at longform.org (a site I highly recommend).
Nabokov: I gave up teaching—that’s about all in the way of change. Mind you, I loved teaching, I loved Cornell, I loved composing and delivering my lectures on Russian writers and European great books. But around 60, and especially in winter, one begins to find hard the physical process of teaching, the getting up at a fixed hour every other morning, the struggle with the snow in the driveway, the march through long corridors to the classroom, the effort of drawing on the blackboard a map of James Joyce’s Dublin or the arrangement of the semi-sleeping car of the St. Petersburg-Moscow express in the early 1870s—without an understanding of which neither Ulysses nor Anna Karenina, respectively, makes sense. For some reason my most vivid memories concern examinations. Big amphitheater in Goldwin Smith. Exam from 8 a.m. to 10:30. About 150 students—unwashed, unshaven young males and reasonably well-groomed young females. A general sense of tedium and disaster. Half-past eight. Little coughs, the clearing of nervous throats, coming in clusters of sound, rustling of pages. Some of the martyrs plunged in meditation, their arms locked behind their heads. I meet a dull gaze directed at me, seeing in me with hope and hate the source of forbidden knowledge. Girl in glasses comes up to my desk to ask: “Professor Kafka, do you want us to say that…? Or do you want us to answer only the first part of the question?” The great fraternity of C-minus, backbone of the nation, steadily scribbling on. A rustle arising simultaneously, the majority turning a page in their bluebooks, good teamwork. The shaking of a cramped wrist, the failing ink, the deodorant that breaks down. When I catch eyes directed at me, they are forthwith raised to the ceiling in pious meditation. Windowpanes getting misty. Boys peeling off sweaters. Girls chewing gum in rapid cadence. Ten minutes, five, three, time’s up.
The first paragraph of Lolita, one of my favorites in all of literature:
Lolita, light of my life, fire of my loins. My sin, my soul. Lo-lee-ta: the tip of the tongue taking a trip of three steps down the palate to tap, at three, on the teeth. Lo. Lee. Ta.

Saturday, October 26, 2013

Annals of psychometry: adult cognitive skills by country

You may have read (NYTimes; enjoy the 500 expert comments!) about the recent OECD study of adult skills, which showed Americans lagging behind most other advanced countries. The outcomes more or less recapitulate the PISA results, which are produced by the same organization, led by Andreas Schleicher (Without data, you are just another person with an opinion; Schleicher was trained in physics before moving into educational assessment). Studying adults rather than students makes generational differences apparent: young Koreans scored near the top in the world, whereas older Koreans, who grew up in a much poorer country, do less well. This was also true for Finns, but countries like the UK, US and Australia showed almost no difference between the young and old. It would appear that the Flynn Effect has abated for the current adult population in those countries.

The most striking findings in the report, to someone already familiar with this kind of country-level data, are that:

(I) Job growth in areas requiring superior ability is high, whereas there is almost no growth in other sectors (see figure below and figure 1.3 in the report). The biggest decline in jobs is in manufacturing-related areas, which are characterized by low, but not the lowest, scores. The number of menial jobs, characterized by the lowest scores, held steady. In other words, in developed countries jobs for factory workers declined while those for maids, gardeners and servants held their own... a grim future of income inequality based on cognitive ability?

(The difference between the mean scores of skilled and "elementary" occupations is about 1 population SD -- Table A3.19. This is about the same as the mean score difference between college graduates and those who did not reach upper secondary education! Table A3.9 (l))

(II) Scores on numeracy and literacy showed high correlations: 0.8 -- 0.9 in every country. This is much higher than, for instance, the correlation between the Math and Verbal sections of the SAT.




Here are national score ranges for math skills:


This figure shows that Japanese high school graduates have better literacy skills than Italian college graduates. There is plenty more data like this. Interestingly, the standard deviations in scores for Japan and Korea are on the low end compared to other countries.


Roughly every fifth Finn and Japanese reads at high levels (Level 4 or 5 on the Survey of Adult Skills). This means, for example, that they can perform multiple-step operations to integrate, interpret, or synthesise information from complex or lengthy texts that involve conditional and/or competing information; and they can make complex inferences and appropriately apply background knowledge as well as interpret or evaluate subtle truth claims or arguments. They are also good at numbers: they can analyse and engage in complex reasoning about quantities and data, statistics and chance, spatial relationships, change, proportions and formulae; perform tasks involving multiple steps and select appropriate problem-solving strategies and processes; and understand arguments and communicate well-reasoned explanations for answers or choices.
How does someone who can't do these things obtain a good college degree? If only 20% of Finns and Japanese can do these things (the largest fraction among all nations), what fraction of US students are ready for serious college work?

Thursday, October 24, 2013

The Fate of Empires

John Bagot Glubb, a British officer in the First and Second World Wars and British Commander of the Arab Legion during the Arab-Israeli war of 1948, wrote a number of books, including The Fate of Empires, which examines regularities in the rise and fall of 11 empires over 3000 years. The empires Glubb studied each had a lifespan of about ten human generations, or two hundred and fifty years, despite changing factors such as technology. Glubb describes a pattern of growth and decline with six stages: the Ages of Pioneers, Conquest, Commerce, Affluence, Intellect and Decadence. He pointedly avoided writing about India or China, stating that his knowledge was inadequate to the task, and focused instead on middle and western Eurasia.

Note that six stages in 10 generations means that significant change can occur over one or two generations -- a nation can pass from one stage to the next, as I believe we have in America during my lifetime.
The Fate of Empires

... There does not appear to be any doubt that money is the agent which causes the decline of this strong, brave and self-confident people. The decline in courage, enterprise and a sense of duty is, however, gradual. The first direction in which wealth injures the nation is a moral one. Money replaces honour and adventure as the objective of the best young men. Moreover, men do not normally seek to make money for their country or their community, but for themselves. Gradually, and almost imperceptibly, the Age of Affluence silences the voice of duty. The object of the young and the ambitious is no longer fame, honour or service, but cash. Education undergoes the same gradual transformation. No longer do schools aim at producing brave patriots ready to serve their country. [ Or to discover great things for all mankind! ] Parents and students alike seek the educational qualifications which will command the highest salaries.


The inadequacy of intellect

Perhaps the most dangerous by-product of the Age of Intellect is the unconscious growth of the idea that the human brain can solve the problems of the world. Even on the low level of practical affairs this is patently untrue. Any small human activity, the local bowling club or the ladies’ luncheon club, requires for its survival a measure of self-sacrifice and service on the part of the members. In a wider national sphere, the survival of the nation depends basically on the loyalty and self-sacrifice of the citizens. The impression that the situation can be saved by mental cleverness, without unselfishness or human self-dedication, can only lead to collapse.

Thus we see that the cultivation of the human intellect seems to be a magnificent ideal, but only on condition that it does not weaken unselfishness and human dedication to service. Yet this, judging by historical precedent, seems to be exactly what it does do. Perhaps it is not the intellectualism which destroys the spirit of self-sacrifice—the least we can say is that the two, intellectualism and the loss of a sense of duty, appear simultaneously in the life-story of the nation.  [  Correlation != Causation  :-)  The point is that the ages of intellectualism and decadence occur at a similar stage of development. ]

Indeed it often appears in individuals, that the head and the heart are natural rivals. The brilliant but cynical intellectual appears at the opposite end of the spectrum from the emotional self-sacrifice of the hero or the martyr. Yet there are times when the perhaps unsophisticated self-dedication of the hero is more essential than the sarcasms of the clever.


... Neither is decadence physical. The citizens of nations in decline are sometimes described as too physically emasculated to be able to bear hardship or make great efforts. This does not seem to be a true picture. Citizens of great nations in decadence are normally physically larger and stronger than those of their barbarian invaders ...
See also Duty, Honor, Country:
The unbelievers will say they are but words, but a slogan, but a flamboyant phrase. Every pedant, every demagogue, every cynic, every hypocrite, every troublemaker, and I am sorry to say, some others of an entirely different character, will try to downgrade them even to the extent of mockery and ridicule.
The 21st century American reality (the Age of Decadence):
"Yeah, I calculated the NPV, and, you know, it's just not worth it for me. I really believe in your project, though. And, I share your passion. Good luck."

Wednesday, October 23, 2013

Number sense and math ability


The result below is consistent with my experience as a parent and educator: my guess is that number sense is a cognitive module, at least somewhat distinct from general intelligence, and somewhat hardwired.
Number sense in infancy predicts mathematical abilities in childhood (PNAS)

Abstract: Human infants in the first year of life possess an intuitive sense of number. This preverbal number sense may serve as a developmental building block for the uniquely human capacity for mathematics. In support of this idea, several studies have demonstrated that nonverbal number sense is correlated with mathematical abilities in children and adults. However, there has been no direct evidence that infant numerical abilities are related to mathematical abilities later in childhood. Here, we provide evidence that preverbal number sense in infancy predicts mathematical abilities in preschool-aged children. Numerical preference scores at 6 months of age correlated with both standardized math test scores and nonsymbolic number comparison scores at 3.5 years of age, suggesting that preverbal number sense facilitates the acquisition of numerical symbols and mathematical abilities. This relationship held even after controlling for general intelligence, indicating that preverbal number sense imparts a unique contribution to mathematical ability. These results validate the many prior studies purporting to show number sense in infancy and support the hypothesis that mathematics is built upon an intuitive sense of number that predates language.

Saturday, October 19, 2013

Halloween


My son wanted a mutant cyclops jack-o'-lantern :-)

Did I mention my kids love Calvin and Hobbes?






Tuesday, October 15, 2013

Compressed sensing and genomes

For more discussion of our recent paper (The human genome as a compressed sensor), see this blog post by my collaborator Carson Chow and another on the machine learning blog Nuit Blanche. One of our main points in the paper is that the phase transition between the regimes of poor and good recovery of the L1 penalized algorithm (LASSO) is readily detectable, and that the scaling behavior of the phase boundary allows theoretical estimates for the necessary amount of data required for good performance at a given sparsity. Apparently, this reasoning has appeared before in the compressed sensing literature, and has been used to optimize hardware designs for sensors. In our case, the sensor is the human genome, and its statistical properties are fixed. Fortunately, we find that genotype matrices are in the same universality class as random matrices, which are good compressed sensors.

The black line in the figure below is the theoretical prediction (Donoho 2006) for the location of the phase boundary. The shading shows results from our simulations. The scale on the right is L2 (norm squared) error in the recovered effects vector compared to the actual effects.


Perhaps we are approaching a D-T moment in genomics ;-)
... a Donoho-Tao moment in the Radar community at the next CoSeRa meeting :-). As a reminder the Donoho-Tao moment was well put in this 2008 IPAM newsletter: .... It’s David Donoho [5] reportedly exclaiming [to] a panel of NSF folks “You’ve got Terry Tao (a Fields medalist [6]) talking to geoscientists, what do you want?” ....

In previous discussions I predicted that on the order of millions of phenotype-genotype pairs would be sufficient to extract the genetic architecture of complex traits like height or g. This estimate is based on two ingredients:

1. The sparsity of these traits is probably no greater than s ~ 10k (evidence for this comes from looking at genomic Hamming distance as a function of phenotype distance).

2. The compressed sensing results suggest that good recovery can be achieved above a data threshold of roughly n ~ 30 s (assuming 1E06 SNPs and additive heritability h2 = 0.5 or so).

Including an extra order of magnitude to be safe, this leads to n ~ millions.
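The arithmetic behind this estimate can be written out as a small sketch. The function name and the safety_factor parameter are hypothetical, introduced only for illustration; the inputs (s ~ 10k, n ~ 30 s, one extra order of magnitude) are the rough figures quoted above, not precise values.

```python
def sample_size_estimate(sparsity, ratio=30, safety_factor=1):
    """Rough sample size n ~ ratio * s, optionally padded by a safety factor.

    ratio=30 is the approximate threshold quoted above for h2 ~ 0.5 and
    ~1E06 SNPs; it is an order-of-magnitude figure, not an exact constant.
    """
    return sparsity * ratio * safety_factor


s = 10_000  # assumed upper bound on the sparsity of the trait

n_threshold = sample_size_estimate(s)                 # n ~ 30 s
n_safe = sample_size_estimate(s, safety_factor=10)    # extra order of magnitude

print(f"threshold: n ~ {n_threshold:,}")
print(f"with safety margin: n ~ {n_safe:,}")
```

With these inputs the threshold comes out at 300,000 individuals and the padded figure at 3,000,000, which is where "n ~ millions" comes from.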

Sunday, October 13, 2013

Tap or Snap



The dreaded heel hook in action. Palhares has been banned from the UFC after this fight, for holding the submission too long. It doesn't actually seem that egregious to me in the video -- when the stakes are high, the fighter should not release the hold until instructed by the referee. See also Snap, Crackle, Pop.
Back in the day when grappling and BJJ were still fringe activities, I often had to travel to strange clubs to find training. It was intimidating to visit a new school where I didn't know anyone, even more so to spar with people who could easily injure me. The one submission I was most afraid of was the heel hook. The two serious injuries I sustained in years of training were from a straight armbar (juji gatame) and a heel hook, which sprained the tendons around my knee. The heel hook is much more effective on the street, where the opponent is likely to be wearing shoes and pants (escaping by pulling the leg out is much harder than in MMA), although there are also reasons not to pull guard in a street fight.

Palhares reportedly cried at the weigh-in after a hard cut to 170 lbs. In the past he's fought at 185 and probably walks around at 200 or so.

Thursday, October 10, 2013

Bezos quotes

These quotes appear in an excerpt from a new biography by Brad Stone. Stone breaks some new ground -- tracking down Bezos' long-lost biological father, who was unaware(!) that his son had become a billionaire e-commerce titan.

I had always heard that Bezos has a Vulcan or hyper-rational management style. It's good to know he occasionally loses his temper like everyone else.
“Are you lazy or just incompetent?”

“I’m sorry, did I take my stupid pills today?”

“Do I need to go down and get the certificate that says I’m CEO of the company to get you to stop challenging me on this?”

“Are you trying to take credit for something you had nothing to do with?”

“If I hear that idea again, I’m gonna have to kill myself.”

“We need to apply some human intelligence to this problem.”

[After reviewing the annual plan from the supply chain team] “I guess supply chain isn’t doing anything interesting next year.”

[After reading a start-of-meeting memo] “This document was clearly written by the B team. Can someone get me the A team document? I don’t want to waste my time with the B team document.”

[After an engineer’s presentation] “Why are you wasting my life?”

... To the amazement and irritation of employees, Bezos’s criticisms are almost always on target. Bruce Jones, a former Amazon supply chain vice president, describes leading a five-engineer team figuring out ways to make the movement of workers in fulfillment centers more efficient. The group spent nine months on the task, then presented their work to Bezos. “We had beautiful documents, and everyone was really prepared,” Jones says. Bezos read the paper, said, “You’re all wrong,” stood up, and started writing on the whiteboard.

“He had no background in control theory, no background in operating systems,” Jones says. “He only had minimum experience in the distribution centers and never spent weeks and months out on the line.” But Bezos laid out his argument on the whiteboard, and “every stinking thing he put down was correct and true,” Jones says. “It would be easier to stomach if we could prove he was wrong, but we couldn’t. That was a typical interaction with Jeff. He had this unbelievable ability to be incredibly intelligent about things he had nothing to do with, and he was totally ruthless about communicating it.”
See also Bezos on the big brains ;-)
Jeff Bezos: Yeah. So, I went to Princeton primarily because I wanted to study physics, and it's such a fantastic place to study physics. Things went fairly well until I got to quantum mechanics and there were about 30 people in the class by that point and it was so hard for me. I just remember there was a point in this where I realized I'm never going to be a great physicist. There were three or four people in the class whose brains were so clearly wired differently to process these highly abstract concepts, so much more. I was doing well in terms of the grades I was getting, but for me it was laborious, hard work. And, for some of these truly gifted folks -- it was awe-inspiring for me to watch them because in a very easy, almost casual way, they could absorb concepts and solve problems that I would work 12 hours on, and it was a wonderful thing to behold.
"Abstract geniuses" like the ones Bezos encountered at Princeton might not have the common sense or practical inclination necessary to run an organization like Amazon, but on the other hand, perhaps a few do! 8^)  In any case, credit to Bezos for being so brutally honest and logical about his own abilities and limitations. Most people, when confronted by an obviously superior intellect (even if confined to some narrow subset of abilities), resort to comforting rationalizations: "I could understand quantum mechanics if I really wanted to. But I don't, so who cares!"

Global innovation and entrepreneurship

Photos from the World Summit on Innovation and Entrepreneurship, held at the Museum of the Moving Image, NYC.

It seems that every region in the world is trying to replicate the Silicon Valley model. There were delegations at this event from the EU, Norway, Jordan, Egypt, Japan, you name it. Lots of interesting NYC startups represented.

The NSA will note that I do a lot of blogging while traveling -- in particular while waiting for a flight  8^/








Creation, Myths and Twitter

Great article by Nick Bilton on the creation myth (and true story) behind Twitter. To see that luck plays an unimaginably huge role in life you just need to look carefully at the story behind any successful company or entrepreneur.
NYTimes: ... Soon, the question of a name came up. Williams jokingly suggested calling the project “Friendstalker,” which was ruled out as too creepy. Glass became obsessive, flipping through a physical dictionary, almost word by word, looking for the right name. One late afternoon, alone in his apartment, he reached over to his cellphone and turned it to silent, which caused it to vibrate. He quickly considered the name “Vibrate,” which he nixed, but it led him to the word “twitch.” He dismissed that too, but he continued through the “Tw” section of the dictionary: twist, twit, twitch, twitcher, twitchy . . . and then, there it was. He read the definition aloud. “The light chirping sound made by certain birds.” This is it, he thought. “Agitation or excitement; flutter.” Twitter.

... Whatever his reasons, Dorsey had recently met with Williams and threatened to quit if Glass wasn’t let go. And for Williams, the decision was easy. Dorsey had become the lead engineer on Twitter, and Glass’s personal problems were affecting his judgment. (For a while, portions of the company existed entirely on Glass’s I.B.M. laptop.) After conferring with the Odeo board, around 6 p.m. on Wednesday, July 26, 2006, Williams asked Glass to join him for a walk to South Park. Sitting on a green bench, Williams gave his old friend an ultimatum: six months’ severance and six months’ vesting of his Odeo stock, or he would be publicly fired. Williams said the decision was his alone.

... Williams and Dorsey started meeting for weekly dinners to discuss the problems, but one night Dorsey became defensive. “Do you want to be C.E.O.?” he said abruptly. Williams tried to evade the question, but eventually replied: “Yes, I want to be C.E.O. I have experience running a company, and that’s what Twitter needs right now.” ... told him that they were replacing him as C.E.O. with Williams. Dorsey sat before a bowl of uneaten yogurt and granola as he was offered stock, a $200,000 severance and a face-saving role as the company’s “silent” chairman. No one in the industry had to know that he was fired. (Investors would not want to be seen as pitting one founder against another anyway.) But Dorsey had no voting rights at the company. He was, essentially, out.

... Access to the tech blogosphere and press can help percolate a fledgling start-up into a multibillion-dollar business. But this access often relies on having a narrative — being an entrepreneur with just the right creation story. ... After he was stripped of his power at Twitter, Dorsey went on a media campaign to promote the idea that he and Williams had switched roles. He also began telling a more elaborate story about the founding of Twitter. In dozens of interviews, Dorsey completely erased Glass from any involvement in the genesis of the company. He changed his biography on Twitter to “inventor”; before long, he started to exclude Williams and Stone too.

... Without Williams and Stone influencing its development with the lessons they learned from Blogger, it still would not have taken off. Making it a company required Williams’s money, then Wilson, Sabet and Fenton’s and dozens of other investors, not to mention Costolo, who turned it into viable business, and 2,000 employees who helped shape it into one of the biggest social networks on the planet. Such is the case with every company in Silicon Valley, though you never hear it in their creation myth. Dorsey will make $400 million to $500 million when Twitter goes public. Glass stands to make about as much as Dorsey’s secretary at Square. ...
This is from a 2009 post Me and Twitter. I had met Williams at Foo Camp, probably in 2007. In the 2009 post I didn't mention that most of the conversation was about Odeo, Williams' podcasting startup from which Twitter sprang as an almost accidental creation. His description to me of how the Twitter idea originated was a bit different than what Bilton reports.
I met Twitter founder Evan Williams a few years ago, before Twitter was anywhere near a big thing. He told me about Blogger, which he sold to Google, and then the inevitable "So what are you working on now?" question came up.

He described Twitter to me, and two thoughts entered my mind. The first shows I am old, or out of touch, or have no feel for Web 2.0 consumer startups: "Who would use that?" I said to myself.

The second thought, which I actually verbalized, turns out to be a good question (still unanswered) and shows I may have VC potential: "How are you going to monetize that?" :-)

Wednesday, October 09, 2013

The human genome as a compressed sensor



Compressed sensing (see also here) is a method for efficient solution of underdetermined linear systems: y = Ax + noise , using a form of penalized regression (L1 penalization, or LASSO). In the context of genomics, y is the phenotype, A is a matrix of genotypes, x a vector of effect sizes, and the noise is due to nonlinear gene-gene interactions and the effect of the environment. (Note the figure above, which I found on the web, uses different notation than the discussion here and the paper below.)

Let p be the number of variables (i.e., genetic loci = dimensionality of x), s the sparsity (number of variables or loci with nonzero effect on the phenotype = nonzero entries in x) and n the number of measurements of the phenotype (i.e., the number of individuals in the sample = dimensionality of y). Then  A  is an  n x p  dimensional matrix. Traditional statistical thinking suggests that  n > p  is required to fully reconstruct the solution  x  (i.e., reconstruct the effect sizes of each of the loci). But recent theorems in compressed sensing show that  n > C s log p  is sufficient if the matrix A has the right properties (is a good compressed sensor). These theorems guarantee that the performance of a compressed sensor is nearly optimal -- within an overall constant of what is possible if an oracle were to reveal in advance which  s  loci out of  p  have nonzero effect. In fact, one expects a phase transition in the behavior of the method as  n  crosses a critical threshold given by the inequality. In the good phase, full recovery of  x  is possible.
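To make the setup concrete, here is a toy sketch of L1-penalized recovery of a sparse x from y = Ax + noise with fewer measurements than variables. Everything specific in it is an assumption chosen for illustration: A is a Gaussian random matrix rather than a real genotype matrix, the dimensions (p = 120, s = 3) are tiny, the penalty lam = 0.1 is picked by hand, and the solver is a plain cyclic coordinate descent written in pure Python, not the code used in the paper. It compares the relative squared error of the recovered effects for a sample size far below the threshold and one well above it.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal map of the L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(A, y, lam, sweeps=100):
    """Minimize (1/2n)||y - A x||^2 + lam * ||x||_1 by cyclic coordinate descent."""
    n, p = len(A), len(A[0])
    cols = [[A[i][j] for i in range(n)] for j in range(p)]
    col_sq = [sum(v * v for v in c) / n for c in cols]
    x = [0.0] * p
    resid = list(y)  # residual y - A x, kept up to date as x changes
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            # Correlation of column j with the residual that excludes locus j.
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + col_sq[j] * x[j]
            new_xj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_xj - x[j]
            if delta != 0.0:
                resid = [ri - delta * ci for ri, ci in zip(resid, c)]
                x[j] = new_xj
    return x


def relative_error(n, p, x_true, rng, noise_sd=0.1, lam=0.1):
    """Simulate n individuals and return ||x_hat - x||^2 / ||x||^2."""
    A = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(a * b for a, b in zip(row, x_true)) + rng.gauss(0.0, noise_sd)
         for row in A]
    x_hat = lasso_cd(A, y, lam)
    num = sum((a - b) ** 2 for a, b in zip(x_hat, x_true))
    return num / sum(b * b for b in x_true)


rng = random.Random(0)
p, s = 120, 3  # toy number of loci and number of loci with nonzero effect
x_true = [0.0] * p
for j in rng.sample(range(p), s):
    x_true[j] = rng.choice([-1.0, 1.0])

# One sample size far below the recovery threshold, one well above it.
errors = {n: relative_error(n, p, x_true, rng) for n in (6, 60)}
for n, err in errors.items():
    print(f"n = {n:3d}  relative squared error = {err:.3f}")
```

Even at n = 60 the system is underdetermined (n < p), yet the sparse effects vector should be recovered up to the small shrinkage bias that the L1 penalty introduces, whereas n = 6 should not suffice. Sweeping n over a finer grid, and averaging over several random draws, is one way to look for the phase transition described above.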

In the paper below, available on arxiv, we show that

1. Matrices of human SNP genotypes are good compressed sensors and are in the universality class of random matrices. The phase behavior is controlled by scaling variables such as  rho = s/n  and our simulation results predict the sample size threshold for future genomic analyses.

2. In applications with real data the phase transition can be detected from the behavior of the algorithm as the amount of data  n  is varied. A priori knowledge of  s  is not required; in fact one deduces the value of  s  this way.

3. For heritability h2 = 0.5 and p ~ 1E06 SNPs, the value of  C log p  is ~ 30. For example, a trait which is controlled by s = 10k loci would require a sample size of n ~ 300k individuals to determine the (linear) genetic architecture.
Application of compressed sensing to genome wide association studies and genomic selection
http://arxiv.org/abs/1310.2264
Authors: Shashaank Vattikuti, James J. Lee, Stephen D. H. Hsu, Carson C. Chow
Categories: q-bio.GN
Comments: 27 pages, 4 figures; Supplementary Information 5 figures

We show that the signal-processing paradigm known as compressed sensing (CS)
is applicable to genome-wide association studies (GWAS) and genomic selection
(GS). The aim of GWAS is to isolate trait-associated loci, whereas GS attempts
to predict the phenotypic values of new individuals on the basis of training
data. CS addresses a problem common to both endeavors, namely that the number
of genotyped markers often greatly exceeds the sample size. We show using CS
methods and theory that all loci of nonzero effect can be identified (selected)
using an efficient algorithm, provided that they are sufficiently few in number
(sparse) relative to sample size. For heritability h2 = 1, there is a sharp
phase transition to complete selection as the sample size is increased. For
heritability values less than one, complete selection can still occur although
the transition is smoothed. The transition boundary is only weakly dependent on
the total number of genotyped markers. The crossing of a transition boundary
provides an objective means to determine when true effects are being recovered.
For h2 = 0.5, we find that a sample size that is thirty times the number
of nonzero loci is sufficient for good recovery.
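The rule of thumb in the abstract (sample size about thirty times the number of nonzero loci, at h2 = 0.5 and p ~ 1E06) can be turned into a back-of-the-envelope calculator. This is only a sketch: the function name and the rescaling with p are illustrative, and it assumes the constant C does not itself change with p at fixed heritability.

```python
import math

# Anchor values quoted in the post: C log p ~ 30 at h2 = 0.5 and p ~ 1e6 SNPs.
REF_FACTOR = 30.0
REF_P = 1e6


def required_sample_size(s, p=REF_P):
    """Rough n ~ (C log p) * s, rescaling the quoted factor of 30 by log p.

    Assumes C is independent of p at fixed heritability (an assumption of
    this sketch). The ratio of logs does not depend on the log base.
    """
    factor = REF_FACTOR * math.log(p) / math.log(REF_P)
    return round(factor * s)


print(required_sample_size(10_000))         # s = 10k loci at p = 1e6
print(required_sample_size(10_000, p=1e7))  # same trait, ten times more markers
```

Under these assumptions the first call reproduces the n ~ 300k figure from point 3, and a tenfold increase in the number of markers raises the estimate only to about 350k, which is the weak dependence on p that the abstract describes.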

Tuesday, October 08, 2013

Nobels for Higgs and Englert


Congratulations to Peter Higgs and François Englert on their Nobel prize. A bit of background from an earlier post How the Higgs boson became the Higgs boson:
IIRC, I met Peter Higgs in Erice in 1990. He was quite a nice fellow, but the story below by Steve Weinberg illustrates how capricious is the allocation of credit in science.

NYBooks: (Footnote 1) In his recent book, The Infinity Puzzle (Basic Books, 2011), Frank Close points out that a mistake of mine was in part responsible for the term “Higgs boson.” In my 1967 paper on the unification of weak and electromagnetic forces, I cited 1964 work by Peter Higgs and two other sets of theorists. This was because they had all explored the mathematics of symmetry-breaking in general theories with force-carrying particles, though they did not apply it to weak and electromagnetic forces. As known since 1961, a typical consequence of theories of symmetry-breaking is the appearance of new particles, as a sort of debris. A specific particle of this general class was predicted in my 1967 paper; this is the Higgs boson now being sought at the LHC.
As to my responsibility for the name “Higgs boson,” because of a mistake in reading the dates on these three earlier papers, I thought that the earliest was the one by Higgs, so in my 1967 paper I cited Higgs first, and have done so since then. Other physicists apparently have followed my lead. But as Close points out, the earliest paper of the three I cited was actually the one by Robert Brout and François Englert. In extenuation of my mistake, I should note that Higgs and Brout and Englert did their work independently and at about the same time, as also did the third group (Gerald Guralnik, C.R. Hagen, and Tom Kibble). But the name “Higgs boson” seems to have stuck.

[ Note that to Higgs' credit his is the only paper that clearly works out the properties of the excitation now known as the Higgs boson. ]
Jeffrey Goldstone showed (1961) that when rigid ("global") continuous symmetries are spontaneously broken by the vacuum (the vacuum configuration is not invariant under the symmetry), a massless boson necessarily results. This boson is the eponymous Goldstone boson: the particle excitation corresponding to small perturbations of the vacuum state in the direction of the symmetry. The natural next step is to ask what happens if the broken symmetry is a gauge (local) symmetry. This is the problem that Higgs et al. solved. But Goldstone had one of the first cracks at the problem. Indeed, Jeffrey deduced the existence of a massive excitation (i.e., the Higgs boson), but its physical reality was in question -- only apparent in certain "choices of gauge"; gauge theory was not then very well understood. According to legend, Sidney Coleman convinced Goldstone that the boson was only a gauge artifact. For years afterward Goldstone would say that Sidney, despite his obvious brilliance, was, when it really counted, always wrong!

I met Englert for the first time in 2008 at a workshop in Paris on the black hole information problem. Over coffee, he explained to me some mysterious comments 't Hooft had made in his talk. A real gentleman, and still very sharp.

A photo from the summer school in Erice, Sicily 1990. Higgs is in the blue socks and sandals, holding a glass of wine. I'm in a maroon shirt two rows back.


A portrait of Higgs in the physics department of the University of Edinburgh.

Monday, October 07, 2013

UK to sequence 100k genomes



Here's to the NHS! The plan is to complete this in 5 years.
Welcome to Genomics England

We are a new company set up by the Department of Health to help deliver the 100k Genome Project first announced by the Prime Minister, David Cameron, in December 2012.

This project will sequence the personal DNA code – known as a genome – of up to 100,000 patients over the next five years. This unrivalled knowledge will help doctors’ understanding, leading to better and earlier diagnosis and personalised care. Based on expert scientific advice, we will start by tackling cancer, rare diseases and infectious diseases.

The company will manage contracts for sequencing, data linkage and analysis, and set standards for patient consent.

Friday, October 04, 2013

Fuzzballs, black holes and firewalls




Yesterday Samir Mathur gave a colloquium here on the black hole information paradox. I've known Samir for many years; he was an assistant professor at MIT when I was a postdoc up the river. I've always found him to be a very precise and clear thinker.

On his web page there is a very simple introduction to the paradox. The initial presentation emphasizes the role of negative binding energy in black hole physics, which is related to the question of monsters: configurations in classical general relativity with more entropy than black holes of the same mass. (Slides.)

Here is a recent paper in which Samir discusses the black hole firewall problem and subadditivity of entropy.

Samir in action giving a more technical seminar earlier today:


Wednesday, October 02, 2013

Deemed Naughty by Nature


Nature editorial condemns research in cognitive genomics. A slight exaggeration, but consistent with the level of reporting in the accompanying article. (Click through and vote in their poll!)

I am quoted as follows:
After this summer's furore over Miller's interview [in Vice Magazine, of all places; Miller referred to Chinese embryo selection for "genius babies"], Hsu played down the potential for abuse. “There's a big gap between finding a few hits and finding thousands of hits — enough to predict the trait on the basis of the genotype — and we were never saying we were going to get to that point,” he says. But in 2011, before the uproar over the study, Hsu told Nature: “I'm 100% sure that a technology will eventually exist for people to evaluate their embryos or zygotes for quantitative traits, like height or intelligence. I don't see anything wrong with that.”
The first quote refers to the discovery power of our sample of 2000 gifted individuals. We would be quite happy to find even one genome-wide significant hit. The second quote refers to my prediction for what will be possible eventually (perhaps decades from now). Juxtaposing the quotes this way is deliberately misleading.

Beanbag genetics: blood pressure

As I wrote in an earlier post Beanbags and Causal Variants:
Not only do these results implicate common causal variants as the source of heritability in disease susceptibility, but they also suggest that gene-gene (epistasis) and gene-environment interactions are of limited impact. Both the genetic and environmental backgrounds for a particular allele vary across Eurasia, so replicability puts an upper limit on their influence. See also Epistasis vs Additivity.
How can it be? But what about the marvelous incomprehensible beautiful sacred complexity of Nature? But But But ...

In the blood pressure (BP) study cited below, the data include East and South Asians, African Americans and Europeans. The effect sizes of variants in one population are well correlated with effect sizes in other populations, despite changes in the genetic background (i.e. other genes) and environments with which they interact. This suggests the interaction effects are small.
Genome-wide Association Analysis of Blood-Pressure Traits in African-Ancestry Individuals Reveals Common Associated Genes in African and Non-African Populations

Abstract: ... We also demonstrate that validated EA BP GWAS loci, considered jointly, show significant effects in AA samples. Consequently, these findings suggest that BP loci might have universal effects across studied populations, demonstrating that multiethnic samples are an essential component in identifying, fine mapping, and understanding their trait variability.


(COGENT = African Americans, ICBP = European Americans)

Long live "beanbag genetics"! :-)
A Defense of Beanbag Genetics

JBS Haldane

My friend Professor Ernst Mayr, of Harvard University, in his recent book Animal Species and Evolution, which I find admirable, though I disagree with quite a lot of it, has the following sentences on page 263.
The Mendelian was apt to compare the genetic contents of a population to a bag full of colored beans. Mutation was the exchange of one kind of bean for another. This conceptualization has been referred to as “beanbag genetics”. Work in population and developmental genetics has shown, however, that the thinking of beanbag genetics is in many ways quite misleading. To consider genes as independent units is meaningless from the physiological as well as the evolutionary viewpoint.  [Italics mine]
... In another place Mayr made a more specific challenge. He stated that Fisher, Wright, and I “have worked out an impressive mathematical theory of genetical variation and evolutionary change. But what, precisely, has been the contribution of this mathematical school to evolutionary theory, if I may be permitted to ask such a provocative question?” “However,” he continued in the next paragraph, “I should perhaps leave it to Fisher, Wright, and Haldane to point out what they consider their major contributions.” ...

Now, in the first place I deny that the mathematical theory of population genetics is at all impressive, at least to a mathematician. On the contrary, Wright, Fisher, and I all made simplifying assumptions which allowed us to pose problems soluble by the elementary mathematics at our disposal, and even then did not always fully solve the simple problems we set ourselves. Our mathematics may impress zoologists but do not greatly impress mathematicians. Let me give a simple example. ...
See also Eric, why so gloomy?
Fisher's Fundamental Theorem of Natural Selection identifies additive variance as the main driver of evolutionary change in the limit where selection timescales are much longer than recombination (e.g., due to sexual reproduction) timescales. Thus it is reasonable to expect that most of the change in genus Homo [traits which have been under selection] over the last few million years is encoded in a linear genetic architecture.
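For readers who want the statement in symbols, here is a textbook-style sketch. The symbols (mean fitness \bar{w}, additive variance in fitness Var_A(w), selection differential S, response R, phenotypic variance V_P) are introduced here for illustration and do not appear in the post.

```latex
% Fisher's fundamental theorem (textbook form): per-generation change in
% mean fitness attributable to selection
\Delta \bar{w} \approx \frac{\mathrm{Var}_A(w)}{\bar{w}}

% Breeder's equation for a quantitative trait: response R to a selection
% differential S, with narrow-sense heritability h^2
R = h^2 S, \qquad h^2 = \frac{V_A}{V_P}
```

In both expressions only the additive component of genetic variance enters, which is the sense in which additive (linear) effects drive the response to selection.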

Tuesday, October 01, 2013

Too much homework?

I don't recall doing much homework until I was in high school. My grade school kids already have quite a bit to do, although I wouldn't say it has reached an excessive level. It is, however, clear that school is more serious now than when I was a kid. In elementary school they didn't really know what to do with me so I spent a lot of time reading in the library. For some reason our library had all six volumes of Gibbon's The History of the Decline and Fall of the Roman Empire and Shirer's The Rise and Fall of the Third Reich. I could read pretty well but unfortunately no one tried to teach me any advanced math.
Atlantic Monthly: My Daughter’s Homework Is Killing Me

What happens when a father, alarmed by his 13-year-old daughter's nightly workload, tries to do her homework for a week

... whenever I bring up the homework issue with teachers or administrators, their response is that they are required by the state to cover a certain amount of material. There are standardized tests, and everyone—students, teachers, schools—is being evaluated on those tests. I’m not interested in the debates over teaching to the test or No Child Left Behind. What I am interested in is what my daughter is doing during those nightly hours between 8 o’clock and midnight, when she finally gets to bed. During the school week, she averages three to four hours of homework a night and six and a half hours of sleep.

... My daughter has the misfortune of living through a period of peak homework.

It turns out that there is no correlation between homework and achievement. According to a 2005 study by the Penn State professors Gerald K. LeTendre and David P. Baker, some of the countries that score higher than the U.S. on testing in the Trends in International Mathematics and Science Study—Japan and Denmark, for example—give less homework, while some of those scoring lower, including Thailand and Greece, assign more. Why pile on the homework if it doesn’t make even a testable difference, and in fact may be harmful?

“It’s a response to this whole globalized, competitive process,” says Richard Walker, a co-author of the book Reforming Homework. “You get parents demanding their children get more homework because their children are competing against the whole world.”

The irony is that some countries where the school systems are held up as models for our schools have been going in the opposite direction of the U.S., giving less homework and implementing narrower curricula built to encourage deeper understanding rather than broader coverage. ...