Tuesday, May 24, 2016

Free Harvard, Fair Harvard: Overseer election results

None of the Free Harvard, Fair Harvard candidates were among the winners of the Harvard Overseer election, which ended last Friday. I didn't expect to win, but I thought Ralph Nader had a good chance. Nevertheless, it was worthwhile to bring more attention to important issues such as admissions transparency and use of the endowment. My thanks to the thousands of Harvard alumni who supported our efforts and voted for the FHFH ticket.
NYTimes: Group Urging Free Tuition at Harvard Fails to Win Seats on Board

A rebellious slate of candidates who this year upset the normally placid balloting for the Board of Overseers at Harvard has failed to secure positions on the board, which helps set strategy for the university.

Calling itself Free Harvard, Fair Harvard, the group ran on a proposal that Harvard should be free to all undergraduates because the university earns so much money from its $37.6 billion endowment. It tied the notion to another, equally provocative question: Does Harvard shortchange Asian-Americans in admissions?

The outsider slate, which was formed in January, proposed five candidates against a slate of eight candidates officially nominated by the Harvard Alumni Association. After 35,870 alumni votes were counted, five winners were announced from the alumni group on Monday. ...
Perhaps our efforts emboldened other groups to push for important changes:
WSJ: Asian-American Groups Seek Investigation Into Ivy League Admissions

A coalition of Asian-American organizations asked the Department of Education on Monday to investigate Brown University, Dartmouth College and Yale University, alleging they discriminate against Asian-American students during the admissions process.

While the population of college-age Asian-Americans has doubled in 20 years and the number of highly qualified Asian-American students “has increased dramatically,” the percentage accepted at most Ivy League colleges has flatlined, according to the complaint. It alleges this is because of “racial quotas and caps, maintained by racially differentiated standards for admissions that severely burden Asian-American applicants.” ...
See also
NYTimes: Professors Are Prejudiced, Too

... To find out, we conducted an experiment. A few years ago, we sent emails to more than 6,500 randomly selected professors from 259 American universities. Each email was from a (fictional) prospective out-of-town student whom the professor did not know, expressing interest in the professor’s Ph.D. program and seeking guidance. These emails were identical and written in impeccable English, varying only in the name of the student sender. The messages came from students with names like Meredith Roberts, Lamar Washington, Juanita Martinez, Raj Singh and Chang Huang, names that earlier research participants consistently perceived as belonging to either a white, black, Hispanic, Indian or Chinese student.

... Professors were more responsive to white male students than to female, black, Hispanic, Indian or Chinese students in almost every discipline and across all types of universities. We found the most severe bias in disciplines paying higher faculty salaries and at private universities. ... our own discipline of business showed the most bias, with 87 percent of white males receiving a response compared with just 62 percent of all females and minorities combined.

... Were Asians favored, given the model minority stereotype they supposedly benefit from in academic contexts? No. In fact, Chinese students were the most discriminated-against group in our sample. ...

Saturday, May 21, 2016

Garwin and the Mike shot

Richard Garwin designed the first H-Bomb, based on the Teller-Ulam mechanism, while still in his early twenties. See also One hundred thousand brains.

From Kenneth Ford's Building the H-Bomb: A Personal History:
... In 1951 Dick Garwin came for his second summer to Los Alamos. He was then twenty-three and two years past his Ph.D.* Edward Teller, having interacted with Garwin at the University of Chicago, knew him to be an extraordinarily gifted experimental physicist as well as a very talented theorist. He knew, too, that Fermi had called Garwin the best graduate student he ever had. [5] So when Garwin came to Teller shortly after arriving in Los Alamos that summer (probably in June 1951) asking him “what was new,” [6] Teller was ready to pounce. He referred Garwin to the Teller-Ulam report of that March and then asked him to “devise an experiment that would be absolutely persuasive that this would really work.” Garwin set about doing exactly that and in a report dated July 25, 1951, titled “Some Preliminary Indications of the Shape and Construction of a Sausage, Based on Ideas Prevailing in July 1951,”[7] he laid out a design with full specifics of size, shape, and composition, for what would be the Mike shot fired the next year. ...

Wikipedia: Ivy Mike was the codename given to the first test of a full-scale thermonuclear device, in which part of the explosive yield comes from nuclear fusion. It was detonated on November 1, 1952 by the United States on Enewetak, an atoll in the Pacific Ocean, as part of Operation Ivy. The device was the first full test of the Teller-Ulam design, a staged fusion bomb, and was the first successful test of a hydrogen bomb. ...

Sunday, May 15, 2016

University quality and global rankings

The paper below is one of the best I've seen on university rankings. Yes, there is a univariate factor one might characterize as "university quality" that correlates across multiple measures. As I have long suspected, the THE (Times Higher Education) and QS rankings, which are partially survey/reputation based, are biased in favor of UK and Commonwealth universities. There are broad quality bands in which many schools are more or less indistinguishable.

The figure above is from the paper, and the error bars displayed (an advanced concept!) show 95% confidence intervals.

Sadly, many university administrators will not understand the methodology or conclusions of this paper.
Measuring University Quality

Christopher Claassen

This paper uses a Bayesian hierarchical latent trait model, and data from eight different university ranking systems, to measure university quality. There are five contributions. First, I find that ratings tap a unidimensional, underlying trait of university quality. Second, by combining information from different systems, I obtain more accurate ratings than are currently available from any single source. And rather than dropping institutions that receive only a few ratings, the model simply uses whatever information is available. Third, while most ratings focus on point estimates and their attendant ranks, I focus on the uncertainty in quality estimates, showing that the difference between universities ranked 50th and 100th, and 100th and 250th, is insignificant. Finally, by measuring the accuracy of each ranking system, as well as the degree of bias toward universities in particular countries, I am able to rank the rankings.
From the paper:
... The USN-GU, Jeddah, and Shanghai rating systems are the most accurate, with R2 statistics in excess of 0.80.

... Plotting the six eigenvalues from the ... global ratings correlation matrix ... the observed data is strongly unidimensional: the first eigenvalue is substantially larger than the others ...

... This paper describes an attempt to improve existing estimates of university quality by building a Bayesian hierarchical latent trait model and inputting data from eight rankings. There are five main findings. First, despite their different sources of information, ranging from objective indicators, such as citation counts, to subjective reputation surveys, existing rating systems clearly tap a unidimensional latent variable of university quality. Second, the model combines information from multiple rankings, producing estimates of quality that offer more accurate ratings than can be obtained from any single ranking system. Universities that are not rated by one or more rating systems present no problem for the model: they simply receive more uncertain estimates of quality. Third, I find considerable error in measurement: the ratings of universities ranked around 100th position are difficult to distinguish from those ranked close to 30th; similarly for those ranked at 100th and those at 250th. Fourth, each rating system performs at least adequately in measuring university quality. Surprisingly, the national ranking systems are the least accurate, which may be due to their usage of numerous indicators, some extraneous. Finally, three of the six international ranking systems show bias toward the universities in their home country. The two unbiased global rankings, from the Center for World University Rankings in Jeddah, and US News & World Report are also the two most accurate.
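The paper's unidimensionality claim (a single dominant eigenvalue in the ratings correlation matrix) is easy to illustrate on synthetic data. The sketch below invents a latent "quality" trait and six noisy rating systems; nothing here uses the paper's actual inputs:

```python
import numpy as np

# If several ranking systems all tap one latent "quality" trait plus
# independent noise, the first eigenvalue of their correlation matrix
# should dominate the rest. Synthetic data for illustration only.
rng = np.random.default_rng(0)
n_universities, n_systems = 500, 6

quality = rng.normal(size=n_universities)            # latent trait
loadings = rng.uniform(0.7, 0.95, size=n_systems)    # each system's fidelity
noise = rng.normal(size=(n_universities, n_systems))
ratings = quality[:, None] * loadings + noise * np.sqrt(1 - loadings**2)

# Eigenvalues of the 6x6 correlation matrix across rating systems,
# sorted in descending order.
eigenvalues = np.sort(np.linalg.eigvalsh(np.corrcoef(ratings.T)))[::-1]
print(eigenvalues)  # first eigenvalue much larger than the others
```

With loadings in the 0.7-0.95 range the first eigenvalue absorbs most of the total variance, mirroring the "strongly unidimensional" eigenvalue plot described in the paper.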

To discuss a particular example, here are the inputs (all objective) to the Shanghai (ARWU) rankings:

One could critique these measures in various ways. For example:
Counting Nature and Science papers biases towards life science and away from physical science, computer science, and engineering. Inputs are overall biased toward STEM subjects.

Nobel Prizes are a lagging indicator (ARWU provides an Alternative Rank with prize scoring removed).

Per-capita measures better reflect quality, as opposed to weighting toward quantity (sheer size).
One can see the effects of some of these factors in the figure below. The far left column shows the Alternative Rank (prizes removed), the middle column (Rank in ARWU) shows the result using all of the criteria above, and the far right column shows scores after per capita normalization by size of faculty. On this last measure, one school dominates all the rest, by margins that may appear shocking ;-)

Note added: Someone asked me about per capita (intensive) vs total quantity (extensive) measures. Suppose there are two physics departments of roughly equal quality, but one with 60 faculty and the other with 30. The former should produce roughly twice the papers, citations, prize winners, and grant support as the latter. If the two departments (without normalization) are roughly equal in these measures, then the latter is probably much higher quality. This argument could be applied to the total faculty of a university. One characteristic that distorts rankings considerably is the presence of a large research medical school and hospital(s). Some schools (Harvard, Stanford, Yale, Michigan, UCSD, Washington, etc.) have them, others (Princeton, Berkeley, MIT, Caltech, etc.) do not. The former group gains an advantage from this medical activity relative to the latter group in aggregate measures of grants, papers, citations, etc. Normalizing by number of faculty helps to remove such distortionary effects. Ideally, one could also normalize these output measures by the degree to which the research is actually reproducible (i.e., real) -- this would place much more weight on some fields than others ;-)
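The intensive-vs-extensive point can be made concrete with a toy calculation (department sizes and paper counts below are invented for illustration):

```python
# Toy illustration of intensive (per-capita) vs extensive (total) measures:
# two departments with equal total output but different faculty sizes.
departments = {
    "Dept A": {"faculty": 60, "papers": 600},
    "Dept B": {"faculty": 30, "papers": 600},
}

per_capita = {
    name: d["papers"] / d["faculty"] for name, d in departments.items()
}
print(per_capita)  # {'Dept A': 10.0, 'Dept B': 20.0}

# Equal totals, but Dept B produces twice as much per faculty member;
# without normalization an aggregate ranking treats the two as identical.
```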

Friday, May 13, 2016

Evidence for (very) recent natural selection in humans

This new paper describes a technique for detecting recent (i.e., last 2k years) selection on both Mendelian and polygenic traits. The authors find evidence for selection on a number of phenotypes, ranging from hair and eye color, to height and head size (the data set they applied their method to was UK10K whole genomes, so results are specific to the British). This is a remarkable result, which confirms the hypothesis that humans have been subject to strong selection in the recent past -- i.e., during periods documented by historical record.

See this 2008 post Recent natural selection in humans, in which I estimate that significant selection on millennial (1000 year) timescales is plausible. Evidence for selection on height in Europe over the last 10k years or less has been accumulating for some time: see, e.g., Genetic group differences in height and recent human evolution.

How does the new method work?

Strong selection in the recent past can cause allele frequencies to change significantly. Consider two different SNPs, which today have equal minor allele frequency (for simplicity, let this be equal to one half). Assume that one SNP was subject to strong recent selection, while the other (neutral) SNP has had approximately zero effect on fitness. The advantageous version of the first SNP was less common in the far past, and rose in frequency recently (e.g., over the last 2k years). In contrast, the two versions of the neutral SNP have been present in roughly the same proportion (up to fluctuations) for a long time. Consequently, in the total past breeding population (i.e., going back tens of thousands of years) there have been many more copies of the neutral alleles (and the chunks of DNA surrounding them) than of the positively selected allele. Each of the chunks of DNA around the SNPs we are considering is subject to a roughly constant rate of mutation.

Looking at the current population, one would then expect a larger variety of mutations in the DNA region surrounding the neutral allele (both versions) than near the favored selected allele (which was rarer in the population until very recently, and whose surrounding region had fewer chances to accumulate mutations). By comparing the difference in local mutational diversity between the two versions of the neutral allele (should be zero modulo fluctuations, for the case MAF = 0.5), and between the (+) and (-) versions of the selected allele (nonzero, due to relative change in frequency), one obtains a sensitive signal for recent selection. See figure at bottom for more detail. In the paper what I call mutational diversity is measured by looking at distance distribution of singletons, which are rare variants found in only one individual in the sample under study.

Some numbers: for a unique lineage, ~100 de novo mutations per generation over ~100 generations, spread across a ~3Gb genome, works out to roughly 1 de novo mutation per ~300kb, similar to the singleton interval length scale. Note that singletons are defined in a sample of 10k individuals from the current population; the distribution would vary with sample size.
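The back-of-envelope arithmetic above in a few lines:

```python
# ~100 de novo mutations per generation, accumulated over ~100 generations
# along a unique lineage, spread over a ~3e9 bp genome.
mutations_per_generation = 100
generations = 100
genome_bp = 3e9

bp_per_mutation = genome_bp / (mutations_per_generation * generations)
print(bp_per_mutation)  # 300000.0, i.e. ~1 de novo mutation per 300 kb
```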
Detection of human adaptation during the past 2,000 years
bioRxiv: doi: http://dx.doi.org/10.1101/052084

Detection of recent natural selection is a challenging problem in population genetics, as standard methods generally integrate over long timescales. Here we introduce the Singleton Density Score (SDS), a powerful measure to infer very recent changes in allele frequencies from contemporary genome sequences. When applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past 2,000 years. We see strong signals of selection at lactase and HLA, and in favor of blond hair and blue eyes. Turning to signals of polygenic adaptation we find, remarkably, that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we report suggestive new evidence for polygenic shifts affecting many other complex traits. Our results suggest that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.

Flipping DNA switches

The recently published SSGAC study (Nature News) found 74 genome-wide significant hits related to educational attainment, using a discovery sample of ~300k individuals. The UK Biobank sample of ~110k individuals was used as a replication check of the results. If both samples are combined into a single discovery sample, 162 SNPs are identified at genome-wide significance. These SNPs are likely tagging causal variants that have some effect on cognitive ability.

The SNP hits discovered are common variants -- both (+) and (-) versions are found throughout the general population, neither being very rare. This means that a typical individual could carry 80 or so (-) variants. (A more precise estimate can be obtained using the minor allele frequencies of each SNP.)

Imagine that we knew the actual causal genetic variants that are tagged by the discovered SNPs (we don't, yet), and imagine that we could edit the (-) version to a (+) version (e.g., using CRISPR; note I'm not claiming this is easy to do -- it's a gedanken experiment). How much would the IQ of the edited individual increase? Estimated effect sizes for these SNPs are uncertain, but could be in the range of 1/4 or 1/10 of an IQ point. Multiplying by ~80 gives a crude estimate of perhaps 10 or 15 IQ points up for grabs, just from the SSGAC hits alone.
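The crude estimate above can be written out explicitly. The allele frequency of 0.5 is the simplifying assumption from the text, and the two per-SNP effect sizes are the rough figures quoted there, not published estimates:

```python
# With 162 common SNPs at allele frequency ~0.5, a typical person carries
# about half of them in the (-) version. Multiplying by an assumed
# per-SNP effect (0.1 or 0.25 IQ points) bounds the aggregate effect of
# flipping them all.
n_snps = 162
minus_freq = 0.5                       # simplifying assumption
expected_minus = n_snps * minus_freq   # ~80 (-) variants per person

estimates = {effect: expected_minus * effect for effect in (0.1, 0.25)}
print(estimates)  # brackets the ~10-15 IQ point figure in the text
```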

Critics of the study point out that only a small fraction of the expected total genetic variance in cognitive ability is accounted for by SSGAC SNPs. But the estimate above shows that the potential biological effect of these SNPs, taken in aggregate, is not small! Indeed, once many more causal variants are known (eventually, perhaps thousands in total), an unimaginably large enhancement of human cognitive ability might be possible.

See also
Super-intelligent humans are coming
On the genetic architecture of intelligence and other quantitative traits

(Super-secret coded message for high g readers: N >> sqrt(N), so lots of SDs are up for grabs! ;-)

Wednesday, May 11, 2016

74 SNP hits from SSGAC GWAS

The SSGAC discovery of 74 SNP hits on educational attainment (EA) is finally published in Nature. Nature News article.

EA was used in order to assemble as large a sample as possible (~300k individuals). Specific cognitive scores are only available for a much smaller number of individuals. But SNPs associated with EA are likely to also be associated with cognitive ability -- see figure above.

The evidence is strong that cognitive ability is highly heritable and highly polygenic. With even larger samples we'll eventually be able to build good genomic predictors for cognitive ability.
Genome-wide association study identifies 74 loci associated with educational attainment A. Okbay et al. Nature http://dx.doi.org/10.1038/nature17671; 2016

Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals1. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample1,2 of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.

Here's what I wrote back in September of 2015, based on a talk given by James Lee on this work.
James Lee talk at ISIR 2015 (via James Thompson) reports on 74 hits at genome-wide statistical significance (p < 5E-8) using educational attainment as the phenotype. Most of these will also turn out to be hits on cognitive ability.

To quote James: "Shock and Awe" for those who doubt that cognitive ability is influenced by genetic variants. This is just the tip of the iceberg, though. I expect thousands more such variants to be discovered before we have accounted for all of the heritability.
James J Lee

University of Minnesota Twin Cities
Social Science Genetic Association Consortium

Genome-wide association studies (GWAS) have revealed much about the biological pathways responsible for phenotypic variation in many anthropometric traits and diseases. Such studies also have the potential to shed light on the developmental and mechanistic bases of behavioral traits.

Toward this end we have undertaken a GWAS of educational attainment (EA), an outcome that shows phenotypic and genetic correlations with cognitive performance, personality traits, and other psychological phenotypes. We performed a GWAS meta-analysis of ~293,000 individuals, applying a variety of methods to address quality control and potential confounding. We estimated the genetic correlations of several different traits with EA, in essence by determining whether single-nucleotide polymorphisms (SNPs) showing large statistical signals in a GWAS meta-analysis of one trait also tend to show such signals in a meta-analysis of another. We used a variety of bio-informatic tools to shed light on the biological mechanisms giving rise to variation in EA and the mediating traits affecting this outcome. We identified 74 independent SNPs associated with EA (p < 5E-8). The ability of the polygenic score to predict within-family differences suggests that very little of this signal is due to confounding. We found that both cognitive performance (0.82) and intracranial volume (0.39) show substantial genetic correlations with EA. Many of the biological pathways significantly enriched by our signals are active in early development, affecting the proliferation of neural progenitors, neuron migration, axonogenesis, dendrite growth, and synaptic communication. We nominate a number of individual genes of likely importance in the etiology of EA and mediating phenotypes such as cognitive performance.
For a hint at what to expect as more data become available, see Five Years of GWAS Discovery and On the genetic architecture of intelligence and other quantitative traits.

What was once science fiction will soon be reality.
Long ago I sketched out a science fiction story involving two Junior Fellows, one a bioengineer (a former physicist, building the next generation of sequencing machines) and the other a mathematician. The latter, an eccentric, was known for collecting signatures -- signed copies of papers and books authored by visiting geniuses (Nobelists, Fields Medalists, Turing Award winners) attending the Society's Monday dinners. He would present each luminary with an ornate (strangely sticky) fountain pen and a copy of the object to be signed. Little did anyone suspect the real purpose: collecting DNA samples to be turned over to his friend for sequencing! The mathematician is later found dead under strange circumstances. Perhaps he knew too much! ...

Saturday, May 07, 2016

What If Tinder Showed Your IQ? (Dalton Conley in Nautilus)

Dalton Conley is University Professor of Sociology and Medicine and Public Policy at Princeton University. He earned a PhD in Sociology (Columbia), and subsequently one in Behavior Genetics (NYU).

His take on the application of genetic technologies in 2050 is a bit more dystopian than mine (see below). I think he underrates the ability of genetic engineers to navigate pleiotropic effects. The genomic space for any complex trait is very high dimensional, and we know of examples of individuals who had outstanding characteristics in many areas, seemingly without negative compromises. However, Dalton is right to emphasize unforeseen outcomes and human folly in the use of any new technology, be it Tinder or genetic engineering :-)

For related discussion, see Slate Star Codex.
Nautilus: The not-so-young parents sat in the office of their socio-genetic consultant, an occupation that emerged in the late 2030s, with at least one practitioner in every affluent fertility clinic. They faced what had become a fairly typical choice: Twelve viable embryos had been created in their latest round of in vitro fertilization. Anxiously, they pored over the scores for the various traits they had received from the clinic. Eight of the 16-cell morulae were fairly easy to eliminate based on the fact they had higher-than-average risks for either cardiovascular problems or schizophrenia, or both. That left four potential babies from which to choose. One was going to be significantly shorter than the parents and his older sibling. Another was a girl, and since this was their second, they wanted a boy to complement their darling Rita, now entering the terrible twos. Besides, this girl had a greater than one-in-four chance of being infertile. Because this was likely to be their last child, due to advancing age, they wanted to maximize the chances they would someday enjoy grandchildren.

That left two male embryos. These embryos scored almost identically on disease risks, height, and body mass index. Where they differed was in the realm of brain development. One scored a predicted IQ of 180 and the other a “mere” 150. A generation earlier, a 150 IQ would have been high enough to assure an economically secure life in a number of occupations. But with the advent of voluntary artificial selection, a score of 150 was only above average. By the mid 2040s, it took a score of 170 or more to insure your little one would grow up to become a knowledge leader.

... But there was a catch. There was always a catch. The science of reprogenetics—self-chosen, self-directed eugenics—had come far over the years, but it still could not escape the reality of evolutionary tradeoffs, such as the increased likelihood of disease when one maximized on a particular trait, ignoring the others. Or the social tradeoffs—the high-risk, high-reward economy for reprogenetic individuals, where a few IQ points could make all the difference between success or failure, or where stretching genetic potential to achieve those cognitive heights might lead to a collapse in non-cognitive skills, such as impulse control or empathy.

... The early proponents of reprogenetics failed to take into account the basic genetic force of pleiotropy: that the same genes have not one phenotypic effect, but multiple ones. Greater genetic potential for height also meant a higher risk score for cardiovascular disease. Cancer risk and Alzheimer’s probability were inversely proportionate—and not only because if one killed you, you were probably spared the other, but because a good ability to regenerate cells (read: neurons) also meant that one’s cells were more poised to reproduce out of control (read: cancer).3 As generations of poets and painters could have attested, the genome score for creativity was highly correlated with that for major depression.

But nowhere was the correlation among predictive scores more powerful—and perhaps in hindsight none should have been more obvious—than the strong relationship between IQ and Asperger’s risk.4 According to a highly controversial paper from 2038, each additional 10 points over 120 also meant a doubling in the risk of being neurologically atypical. Because the predictive power of genotyping had improved so dramatically, the environmental component to outcomes had withered in a reflexive loop. In the early decades of the 21st century, IQ was, on average, only two-thirds genetic and one-third environmental in origin by young adulthood.5 But measuring the genetic component became a self-fulfilling prophecy. That is, only kids with high IQ genotypes were admitted to the best schools, regardless of their test scores. (It was generally assumed that IQ was measured with much error early in life anyway, so genes were a much better proxy for ultimate, adult cognitive functioning.) This pre-birth tracking meant that environmental inputs—which were of course still necessary—were perfectly predicted by the genetic distribution. This resulted in a heritability of 100 percent for the traits most important to society—namely IQ and (lack of) ADHD, thanks to the need to focus for long periods on intellectually demanding, creative work, as machines were taking care of most other tasks.

Who can say when this form of prenatal tracking started? Back in 2013, a Science paper constructed a polygenic score to predict education.6 At first, that paper, despite its prominent publication venue, did not attract all that much attention. That was fine with the authors, who were quite happy to fly under the radar with their feat: generating a single number based on someone’s DNA that was correlated, albeit only weakly, not only with how far they would go in school, but also with associated phenotypes (outcomes) like cognitive ability—the euphemism for IQ still in use during the early 2000s.

The approach to constructing a polygenic score—or PGS—was relatively straightforward: Gather up as many respondents as possible, pooling any and all studies that contained genetic information on their subjects as well as the same outcome measure. Education level was typically asked not only in social science surveys (that were increasingly collecting genetic data through saliva samples) but also in medical studies that were ostensibly focused on other disease-related outcomes but which often reported the education levels of the sample.

That Science paper included 126,000 people from 36 different studies across the western world. At each measured locus—that is, at each base pair—one measured the average difference in education level between those people who had zero of the reference (typically the rarer) nucleotide—A, T, G, or C—and those who had one of the reference base and those who had two of those alleles. The difference was probably on the order of a thousandth of a year of education, if that, or a hundredth of an IQ point. But do that a million times over for each measured variant among the 30 million or so that display variation within the 3 billion total base pairs in our genome, and, as they say, soon you are talking about real money.

That was the beauty of the PGS approach. Researchers had spent the prior decade or two pursuing the folly of looking for the magic allele that would be the silver bullet. Now they could admit that for complex traits like IQ or height or, in fact, most outcomes people care about in their children, there was unlikely to be that one, Mendelian gene that explained human difference as it did for diseases like Huntington’s or sickle cell or Tay-Sachs.

That said, from a scientific perspective, the Science paper on education was not Earth-shattering in that polygenic scores had already been constructed for many other less controversial phenotypes: height and body mass index, birth weight, diabetes, cardiovascular disease, schizophrenia, Alzheimer’s, and smoking behavior—just to name some of the major ones. Further, muting the immediate impact of the score’s construction was the fact that—at first—it only predicted 3 percent or so of the variation in years of schooling or IQ. Three percent was less than one-tenth of the variation in the bell curve of intelligence that was reasonably thought to be of genetic origin.

Instead of setting off a stampede to fertility clinics to thaw and test embryos, the lower predictive power of the scores in the first couple decades of the century set off a scientific quest to find the “missing” heritability—that is, the genetic dark matter where the other, estimated 37 percent of the genetic effect on education was (or the unmeasured 72 percentage points of IQ’s genetic basis). With larger samples of respondents and better measurement of genetic variants by genotyping chips that were improving at a rate faster than Moore’s law in computing (doubling in capacity every six to nine months rather than the 18-month cycle postulated for semiconductors), dark horse theories for missing heritability (such as Lamarckian, epigenetic transmission of environmental shocks) were soon slain and the amount of genetic dark matter quickly dwindled to nothing. ...
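Stripped of the fiction, the polygenic score described in the excerpt is just a weighted sum of allele counts. A minimal sketch with synthetic genotypes and made-up weights (none of the numbers correspond to real GWAS effect sizes):

```python
import numpy as np

# A polygenic score (PGS) is a weighted sum of allele counts: 0, 1, or 2
# copies of the reference allele at each SNP, weighted by the tiny
# per-SNP effects estimated in a GWAS meta-analysis. All synthetic here.
rng = np.random.default_rng(0)
n_people, n_snps = 5, 100_000

genotypes = rng.integers(0, 3, size=(n_people, n_snps))  # allele counts
effects = rng.normal(0.0, 1e-3, size=n_snps)             # per-SNP weights

pgs = genotypes @ effects   # one score per person
print(pgs.shape)  # (5,)
```

Each individual weight is tiny (a hundredth of an IQ point or less), but summed over hundreds of thousands of variants the score carries real predictive signal, which is the "real money" point made in the excerpt.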

Dalton and I participated in a panel discussion on this topic recently:

See also this post of 12/25/2015: Nativity 2050

And the angel said unto them, Fear not: for, behold, I bring you good tidings of great joy, which shall be to all people.
Mary was born in the twenties, when the tests were new and still primitive. Her mother had frozen a dozen eggs, from which came Mary and her sister Elizabeth. Mary had her father's long frame, brown eyes, and friendly demeanor. She was clever, but Elizabeth was the really brainy one. Both were healthy and strong and free from inherited disease. All this her parents knew from the tests -- performed on DNA taken from a few cells of each embryo. The reports came via email, from GP Inc., by way of the fertility doctor. Dad used to joke that Mary and Elizabeth were the pick of the litter, but never mentioned what happened to the other fertilized eggs.

Now Mary and Joe were ready for their first child. The choices were dizzying. Fortunately, Elizabeth had been through the same process just the year before, and referred them to her genetic engineer, a friend from Harvard. Joe was a bit reluctant about bleeding edge edits, but Mary had a feeling the GP engineer was right -- their son had the potential to be truly special, with just the right tweaks ...
See also [1], [2], and [3].

Friday, May 06, 2016

HLi and genomic prediction of facial morphology

This WIRED article profiles Human Longevity, Inc., a genomics and machine learning startup led by Craig Venter. Its stated goal is to sequence 1 million people in the next few years.

The figure above is an example of facial morphology prediction from DNA. Face recognition algorithms decompose a given face into a finite feature set (e.g., coefficients of eigenfaces). As we know from the resemblance between identical twins, these features/coefficients are highly heritable, and hence can be predicted from genomic data.
WIRED: ... "From just the fingerprint on your pen, we can sequence your genome and identify how you look," Venter explains. "It's good enough to pick someone out of a ten-person line-up and it's getting better all the time." These prediction algorithms were developed at Venter's latest venture, biosciences startup Human Longevity, Inc (HLi) by measuring 30,000 datapoints from across the faces of a thousand volunteers, then using machine learning to identify patterns between their facial morphology and their entire genetic code. "We could take foetal cells from a mother's bloodstream, sequence the genome and give her a picture of what her future child will look like at 18," he says.
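The pipeline described above (decompose faces into eigenface coefficients, then fit a genomic predictor for each coefficient) can be illustrated with ridge regression on synthetic data. Every number below is hypothetical; HLi's actual models are of course far larger and not public.

```python
import numpy as np

rng = np.random.default_rng(1)
n_people, n_snps, n_face_coeffs = 500, 2000, 10

# Synthetic genotypes (allele counts) and synthetic "true" effects.
X = rng.integers(0, 3, size=(n_people, n_snps)).astype(float)
B_true = rng.normal(scale=0.02, size=(n_snps, n_face_coeffs))

# Eigenface coefficients: genetic signal plus noise.
Y = X @ B_true + rng.normal(size=(n_people, n_face_coeffs))

# Ridge regression: one linear predictor per face coefficient.
lam = 10.0
B_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_snps), X.T @ Y)

# Predicted eigenface coefficients for each genome.
Y_pred = X @ B_hat
```

Reconstructing an approximate face is then just the weighted sum of the eigenfaces using the predicted coefficients.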
HLi's sequencing and phenotyping are done in San Diego, but the machine learning group hangs out at this third wave cafe in Mountain View :-)

I gave this talk there last year.

Wednesday, May 04, 2016

Atavist Magazine: The Mastermind

Highly recommended! Fantastic long-form reporting -- 2 years in the making -- by Evan Ratliff. Podcast interview with the author.

Le Roux ran a global crime empire that accumulated hundreds of millions of dollars, conducted assassinations in multiple countries, and had its own private army. Most criminals are stupid, but Le Roux is highly intelligent, disciplined, hard-working, and totally amoral.

(The prisoner in the photo above is not Le Roux, but one of his lieutenants, a former US soldier captured in Thailand.)
Atavist Magazine: The Mastermind: He was a brilliant programmer and a vicious cartel boss, who became a prized U.S. government asset. The Atavist Magazine presents a story of an elusive criminal kingpin, told in weekly installments.

"Not even in a movie. This is real stuff. You see James Bond in the movie and you’re saying, “Oh, I can do that.” Well, you’re gonna do it now. Everything you see, or you’ve thought about you’re gonna do. It’s, it’s real and it’s up to you. You know how the government says if you work through the government [U/I] we don’t know you. Same thing with this job. No different right? So, that’s how it is. Same thing you do in the military except you’re doing for these guys you know? If you get caught in war, you get killed, right? Unless you surrender if they let you surrender or if you get you know, the same thing. This is… Everything’s just like you’re in war [U/I] now."
Here are the final paragraphs:
... It seemed to me that he tried to apply the detached logic of software to real life. That’s why the DEA schemes must have appealed to him as much as his own. His approach was algorithmic, not moral: Set the program in motion and watch it run.

But Lulu’s comment about infamy stuck with me. Perhaps that wasn’t Le Roux’s aim at first, but over time it became something he coveted. Le Roux had known all along that he’d get caught—ultimately, the program could only lead to one outcome. But that meant that I, too, was part of the design.

One afternoon, two months ago, I met an Israeli former employee of Le Roux’s at a quiet upstairs table in a cafĂ© inside a Tel Aviv mall. I’d had a difficult time persuading this man to talk to me at all. He was free of Le Roux’s organization, on to new things. He hadn’t been indicted in the prescription-drug case, despite working in one of the call centers, although he said he planned to wait a few years before traveling to the U.S., just in case. I asked him this question, too: What did Le Roux want? “He wanted to be the biggest ever caught,” he said.

As we said good-bye, he told me, “What’s important is that justice be done, for what Paul did.” Then he leaned in, pointing at my notebook. “If you publish this story, ultimately you are giving him what he wanted. And by talking to you I guess I am, too. This is what he wanted. This story to be told, in this way.”

Tuesday, May 03, 2016

Homo Sapiens 2.0? (Jamie Metzl, TechCrunch)

Jamie Metzl writes in TechCrunch.
Homo Sapiens 2.0? We need a species-wide conversation about the future of human genetic enhancement:

After 4 billion years of evolution by one set of rules, our species is about to begin evolving by another.

Overlapping and mutually reinforcing revolutions in genetics, information technology, artificial intelligence, big data analytics, and other fields are providing the tools that will make it possible to genetically alter our future offspring should we choose to do so. For some very good reasons, we will.

Nearly everybody wants to have cancers cured and terrible diseases eliminated. Most of us want to live longer, healthier and more robust lives. Genetic technologies will make that possible. But the very tools we will use to achieve these goals will also open the door to the selection for and ultimately manipulation of non-disease-related genetic traits — and with them a new set of evolutionary possibilities.

As the genetic revolution plays out, it will raise fundamental questions about what it means to be human, unleash deep divisions within and between groups, and could even lead to destabilizing international conflict.

And the revolution has already begun. ...
See also this panel discussion with Metzl, Steve Pinker, Dalton Conley, and me.

When Everyone Goes to College: a Lesson From South Korea

South Korea leads the world in college attendance rate, which is approaching 100%. This sounds great at first, until you consider that the majority of the population (in any country) lacks the cognitive ability to pursue a rigorous college education (or at least what used to be defined as a rigorous college education).
Chronicle of Higher Education: ... Seongho Lee, a professor of education at Chung-Ang University, criticizes what he calls "college education inflation." Not all students are suited for college, he says, and across institutions, their experience can be inconsistent. "It’s not higher education anymore," he says. "It’s just an extension of high school." And subpar institutions leave graduates ill prepared for the job market.

A 2013 study by McKinsey Global Institute, the economic-research arm of the international consulting firm, found that lifetime earnings for graduates of Korean private colleges were less than for workers with just a high-school diploma. In recent years, the unemployment rate for new graduates has topped 30 percent.

"The oversupply in college education is a very serious social problem," says Mr. Lee, even though Korea, with one of the world’s lowest fertility rates, has a declining college-age population. The country, he worries, is at risk of creating an "army of the unemployed." ...
See also Brutal, Just Brutal.

Sunday, May 01, 2016

The Future of Machine Intelligence

See you at Foo Camp in June! Get a free copy of this book at the link.
The Future of Machine Intelligence 
Perspectives from Leading Practitioners
By David Beyer

Publisher: O'Reilly
Released: March 2016

Advances in both theory and practice are throwing the promise of machine learning into sharp relief. The field has the potential to transform a range of industries, from self-driving cars to intelligent business applications. Yet machine learning is so complex and wide-ranging that even its definition can change from one person to the next.

The series of interviews in this exclusive report unpack concepts and innovations that represent the frontiers of ever-smarter machines. You’ll get a rare glimpse into this exciting field through the eyes of some of its leading minds.

In these interviews, these ten practitioners and theoreticians cover the following topics:

Anima Anandkumar: high-dimensional problems and non-convex optimization
Yoshua Bengio: Natural Language Processing and deep learning
Brendan Frey: deep learning meets genomic medicine
Risto Miikkulainen: the startling creativity of evolutionary algorithms
Ben Recht: a synthesis of machine learning and control theory
Daniela Rus: the autonomous car as a driving partner
Gurjeet Singh: using topology to uncover the shape of your data
Ilya Sutskever: the promise of unsupervised learning and attention models
Oriol Vinyals: sequence-to-sequence machine learning
Reza Zadeh: the evolution of machine learning and the role of Spark

About the editor: David Beyer is an investor with Amplify Partners, an early-stage VC focused on the next generation of infrastructure IT, data, and information security companies. Part of the founding team at Patients Know Best, one of the world’s leading cloud-based Personal Health Record (PHR) companies, he was also the co-founder and CEO of Chartio.com, a pioneering provider of cloud-based data visualization and analytics.

Friday, April 29, 2016

Gell-Mann on quantum foundations

Google knows enough about me that my YouTube feed now routinely suggests content of real interest. A creepy but positive development ;-)

Today YouTube suggested this video of Murray Gell-Mann talking about Everett, decoherence, and quantum mechanics. I had seen this video on another web site years ago and blogged about it (post reproduced below), but now someone has uploaded it to YouTube.

More on Murray here and here:
After the talk I had a long conversation with John Preskill about many worlds, and he pointed out to me that both Feynman and Gell-Mann were strong advocates: they would go so far as to browbeat visitors on the topic. In fact, both claimed to have invented the idea independently of Everett.
See also my recent paper The measure problem in many worlds quantum mechanics.

Gell-Mann, Feynman, Everett

This site is a treasure trove of interesting video interviews -- including with Francis Crick, Freeman Dyson, Sydney Brenner, Marvin Minsky, Hans Bethe, Donald Knuth, and others. Many of the interviews have transcripts, which are much faster to read than listening to the interviews themselves.

Here's what Murray Gell-Mann has to say about quantum foundations:
In '63…'64 I worked on trying to understand quantum mechanics, and I brought in Felix Villars and for a while some comments... there were some comments by Dick Feynman who was nearby. And we all agreed on a rough understanding of quantum mechanics and the second law of thermodynamics and so on and so on, that was not really very different from what I'd been working on in the last ten or fifteen years.

I was not aware, and I don't think Felix was aware either, of the work of Everett when he was a graduate student at Princeton and worked on this, what some people have called 'many worlds' idea, suggested more or less by Wheeler. Apparently Everett was, as we learned at the Massagon [sic] meeting, Everett was an interesting person. He… it wasn't that he was passionately interested in quantum mechanics; he just liked to solve problems, and trying to improve the understanding of quantum mechanics was just one problem that he happened to look at. He spent most of the rest of his life working for the Weapon System Evaluation Group in Washington, WSEG, on military problems. Apparently he didn't care much as long as he could solve some interesting problems! [Some of these points, concerning Everett's life and motivations, and Wheeler's role in MW, are historically incorrect.]

Anyway, I didn't know about Everett's work so we discovered our interpretation independent of Everett. Now maybe Feynman knew about… about Everett's work and when he was commenting maybe he was drawing upon his knowledge of Everett, I have no idea, but… but certainly Felix and I didn't know about it, so we recreated something related to it.

Now, as interpreted by some people, Everett's work has two peculiar features: one is that this talk about many worlds and equally… many worlds equally real, which has confused a lot of people, including some very scholarly students of quantum mechanics. What does it mean, 'equally real'? It doesn't really have any useful meaning. What the people mean is that there are many histories of the… many alternative histories of the universe, many alternative coarse-grained, decoherent histories of the universe, and the theory treats them all on an equal footing, except for their probabilities. Now if that's what you mean by equally real, okay, but that's all it means; that the theory treats them on an equal footing apart from their probabilities. Which one actually happens in our experience, is a different matter and it's determined only probabilistically. Anyway, there's considerable continuity between the thoughts of '63-'64 and the thoughts that, and… and maybe earlier in the ‘60s, and the thoughts that Jim Hartle and I have had more recently, starting around '84-'85.
Indeed, Feynman was familiar with Everett's work -- see here and here.

Where Murray says "it's determined only probabilistically" I would say there is a subjective probability which describes how surprised one is to find oneself on a particular decoherent branch or history of the overall wavefunction -- i.e., how likely or unlikely we regard the outcomes we have observed to have been. For more see here.

Murray against Copenhagen:
... although the so-called Copenhagen interpretation is perfectly correct for all laboratory physics, laboratory experiments and so on, it's too special otherwise to be fundamental and it sort of strains credulity. It's… it’s not a convincing fundamental presentation, correct though… though it is, and as far as quantum cosmology is concerned it's hopeless. We were just saying, we were just quoting that old saw: describe the universe and give three examples. Well, to apply the… the Copenhagen interpretation to quantum cosmology,  you'd need a physicist outside the universe making repeated experiments, preferably on multiple copies of the universe and so on and so on. It's absurd. Clearly there is a definition to things happening independent of human observers. So I think that as this point of view is perfected it should be included in… in teaching fairly early, so that students aren't convinced that in order to understand quantum mechanics deeply they have to swallow some of this…very… some of these things that are very difficult to believe. But in the end of course, one can use the Copenhagen interpretations perfectly okay for experiments.

Thursday, April 28, 2016

Can't we all just get along? Albion's Seed reviewed on Slate Star Codex

Highly recommended! Scott Alexander reviews and summarizes Albion's Seed. The copy I read years ago belonged to the university library. Scott mentions so many interesting topics I had forgotten that I just now got a personal copy of the book.
Slate Star Codex: Albion’s Seed by David Fischer is a history professor’s nine-hundred-page treatise on patterns of early immigration to the Eastern United States. It’s not light reading and not the sort of thing I would normally pick up. I read it anyway on the advice of people who kept telling me it explains everything about America. And it sort of does.

In school, we tend to think of the original American colonists as “Englishmen”, a maximally non-diverse group who form the background for all of the diversity and ethnic conflict to come later. Fischer’s thesis is the opposite. Different parts of the country were settled by very different groups of Englishmen with different regional backgrounds, religions, social classes, and philosophies. The colonization process essentially extracted a single stratum of English society, isolated it from all the others, and then plunked it down on its own somewhere in the Eastern US. ...

... If America is best explained as a Puritan-Quaker culture locked in a death-match with a Cavalier-Borderer culture, with all of the appeals to freedom and equality and order and justice being just so many epiphenomena – well, I’m not sure what to do with that information. Push it under the rug? Say “Well, my culture is better, so I intend to do as good a job dominating yours as possible?” Agree that We Are Very Different Yet In The End All The Same And So Must Seek Common Ground? Start researching genetic engineering? Maybe secede? I’m not a Trump fan much more than I’m an Osama bin Laden fan; if somehow Osama ended up being elected President, should I start thinking “Maybe that time we made a country that was 49% people like me and 51% members of the Taliban – maybe that was a bad idea“.

I don’t know. But I highly recommend Albion’s Seed as an entertaining and enlightening work of historical scholarship which will be absolutely delightful if you don’t fret too much over all of the existential questions it raises.

Tuesday, April 26, 2016

New Yorker on epigenetics

This is a fairly balanced account of recent progress in epigenetics (at least the part I excerpt below). But see here for negative reactions.
New Yorker: ... But, if epigenetic information can be transmitted through sperm and eggs, an organism would seem to have a direct conduit to the heritable features of its progeny. Such a system would act as a wormhole for evolution—a shortcut through the glum cycles of mutation and natural selection.

My visit with Allis had ended on a cautionary note. “Much about the transmission of epigenetic information across generations is unknown, and we should be careful before making up theories about the kind of information or memory that is transmitted,” he told me. By bypassing the traditional logic of genetics and evolution, epigenetics can arouse fantasies about warp-speeding heredity: you can make your children taller by straining your neck harder. Such myths abound and proliferate, often dangerously. A child’s autism, the result of genetic mutation, gets attributed to the emotional trauma of his great-grandparents. Mothers are being asked to minimize anxiety during their pregnancy, lest they taint their descendants with anxiety-ridden genes. Lamarck is being rehabilitated into the new Darwin.

These fantasies should invite skepticism. Environmental information can certainly be etched on the genome. But such epigenetic scratch marks are rarely, if ever, carried forward across generations. A man who loses a leg in an accident bears the imprint of that accident in his cells, wounds, and scars, but he does not bear children with shortened legs. A hundred and forty generations of circumcision have not made the procedure any shorter. ...

Monday, April 25, 2016

Prince Rogers Nelson

If you're a true fan, and like guitar, you'll enjoy this 20-minute live version of Purple Rain. The whole concert.

Prince was at his peak in 1985. I saw him on the Purple Rain tour in Los Angeles, with Sheila E. opening. We camped out at a Tower Records in Pasadena to get the tickets.

When the movie opened in 1984 I was home for the summer in Iowa after my freshman year. I think all of my high school friends were big fans. Eventually I had it on VHS and watched it many times with my girlfriend.

Here's Sheila E. doing The Glamorous Life. 80's forever! :-)

Thursday, April 21, 2016

Deep Learning tutorial: Yoshua Bengio, Yann Lecun NIPS 2015

I think these are the slides.

One topic I've remarked on before is the absence of local minima in the high-dimensional optimization required to tune these DNNs. In the limit of high dimensionality, a critical point is overwhelmingly likely to be a saddle point (i.e., to have at least one negative eigenvalue of the Hessian). This means that even though the loss surface is not convex, the optimization is tractable: a generic critical point still offers a downhill direction, so gradient descent does not get stuck.
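A quick numerical illustration, under the toy assumption that the Hessian at a random critical point looks like a random symmetric (GOE-like) matrix: the fraction of such matrices that are positive definite, i.e., that correspond to true local minima, collapses as the dimension grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def frac_minima(dim, trials=2000):
    """Fraction of random symmetric matrices with all eigenvalues positive."""
    count = 0
    for _ in range(trials):
        A = rng.normal(size=(dim, dim))
        H = (A + A.T) / 2  # symmetrize: a GOE-like toy "Hessian"
        if np.linalg.eigvalsh(H)[0] > 0:  # smallest eigenvalue positive?
            count += 1
    return count / trials

for dim in (1, 2, 4, 8):
    print(dim, frac_minima(dim))
```

In one dimension half of all critical points are minima; by dimension eight essentially none are. Real DNN Hessians are not GOE matrices, but the qualitative point (minima become exponentially rare relative to saddles) is the one at issue.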

Wednesday, April 20, 2016

Free Harvard, Fair Harvard

Desert scale model of the solar system

Magnetic brain stimulation and autism

If this account is true, it's simply amazing.
NY Magazine: What It’s Like to ‘Wake Up’ From Autism After Magnetic Stimulation

... Though he wasn’t diagnosed with autism until he was 40, John Elder Robison felt isolated and disconnected throughout his entire youth and early adulthood. But in 2008, at 50, he took part in what became a three-year research project looking at brain function in individuals with autism spectrum disorders and exploring the use of transcranial magnetic stimulation (TMS) to help them.

TMS is a noninvasive procedure that uses magnetic pulses to stimulate nerve cells in the brain. During treatment, a coil is placed against the patient’s scalp and the TMS energy passes through the skull into the outermost layer of the brain. ...

The treatment left Robison momentarily crippled by the weight of other people’s feelings, and he spoke with Science of Us about his experience, which he also discusses in his recently released book, Switched On: A Memoir of Brain Change and Emotional Awakening. ...

Do you understand now what was happening?
TMS modified my emotional response to what you might call ordinary situations. I often put it this way: You might be crossing the street and you fall and you skin your knee. I’d say, “Come on, get up!” The very best advice I could give is come on, get going, this car could run you over. People would see my practical response as cold and emotionless. After TMS, I’d look at you and wince at your skinned knee. I never did that before. And I now realize that wincing at your skinned knee is the response most people have. I still have the autistic response, but I’m also aware of what you might now call the “empathetic response from personal experience.” People can tell you about something a million times, and it won’t mean anything to you until you experience it. That said, it’s important to understand that I always had the ability to feel your pain. Like, if you were my girlfriend and you got sick I’d be more worried about you than your own mother. I was always that way. But no matter how much I cared about you, if we were crossing the street, you fell down and skinned your knee, I would see your skinned knee and I would say “Come on, we gotta get going,” or I would say, “Here, I’ll get you a Band-Aid.” I would have a practical response. The way I responded is no reflection on how much I cared for you. I could care for you with all the love in the world and still I’d respond practically.

So you don’t feel you’d really lacked empathy before?
No. In fact, studies have shown that autistic people feel things more deeply, not less at all. It’s true that autism is described as a condition with communication impairment. And so, to be diagnosed with autism, you must have an impaired ability to speak, to understand speech, or to understand or convey unspoken cues.

So what exactly happened when you first started noticing emotional cues?
It hit me all at once with an intensity that was absolutely scary. As I lay in bed, trying to fall asleep, the world started revolving. I became afraid I was having a stroke. I’d close my eyes and the world would spin like I was drunk, about to throw up. I don’t drink or do drugs. So for me to have the world spinning like that made me think there was something terribly wrong. And not only was the world spinning, I would close my eyes and I would have these really vivid, half-awake, half-asleep dreams that were a collage of things from the past and things that had just happened that day and they were just so real. The experience was so unsettling that I woke up and wrote a 1,500-word missive to the scientists describing what had happened. Then, finally, I was able to fall asleep.

The next day at work I looked at one of my colleagues and I thought to myself: He has the most beautiful brown eyes. That’s the type of thought I simply do not have. I don’t usually have any comment on your eyes because I don’t look in anyone’s eyes. For me to look in your eyes and say that they are beautiful is totally out of character. When I got to work I walked into the waiting room, as I usually do, and I looked at everyone and there was this flood of emotion. I could see it all: They were scared and anxious and eager, and never in my life had I seen something like that. I had to step out of the room because I didn’t know how to cope. It felt like ESP. Maybe in the past I used the logical part of my brain to look at people around me and carefully analyze. I figured out situations using logic. So I had that powerful ability but now the screen of emotion was turned on, too. ...

Tuesday, April 19, 2016

Julian Assange, Eric Schmidt, and real-time psychometry

In this excerpt from his 2014 book, Julian Assange analyzes Google CEO Eric Schmidt and his sidekick Jared Cohen. See also Chief Executives: brainpower, personality, and height.
Google is not what it seems: ... The stated reason for the visit was a book. Schmidt was penning a treatise with Jared Cohen, the director of Google Ideas, an outfit that describes itself as Google’s in-house “think/do tank.” I knew little else about Cohen at the time. In fact, Cohen had moved to Google from the US State Department in 2010. He had been a fast-talking “Generation Y” ideas man at State under two US administrations, a courtier from the world of policy think tanks and institutes, poached in his early twenties. He became a senior advisor for Secretaries of State Rice and Clinton. At State, on the Policy Planning Staff, Cohen was soon christened “Condi’s party-starter,” channeling buzzwords from Silicon Valley into US policy circles and producing delightful rhetorical concoctions such as “Public Diplomacy 2.0.”

... Schmidt was a good foil. A late-fiftysomething, squint-eyed behind owlish spectacles, managerially dressed—Schmidt’s dour appearance concealed a machinelike analyticity. His questions often skipped to the heart of the matter, betraying a powerful nonverbal structural intelligence. It was the same intellect that had abstracted software-engineering principles to scale Google into a megacorp, ensuring that the corporate infrastructure always met the rate of growth. This was a person who understood how to build and maintain systems: systems of information and systems of people. My world was new to him, but it was also a world of unfolding human processes, scale, and information flows.

For a man of systematic intelligence, Schmidt’s politics—such as I could hear from our discussion—were surprisingly conventional, even banal. He grasped structural relationships quickly, but struggled to verbalize many of them, often shoehorning geopolitical subtleties into Silicon Valley marketese or the ossified State Department microlanguage of his companions. He was at his best when he was speaking (perhaps without realizing it) as an engineer, breaking down complexities into their orthogonal components.

I found Cohen a good listener, but a less interesting thinker, possessed of that relentless conviviality that routinely afflicts career generalists and Rhodes scholars. As you would expect from his foreign-policy background, Cohen had a knowledge of international flash points and conflicts and moved rapidly between them, detailing different scenarios to test my assertions. But it sometimes felt as if he was riffing on orthodoxies in a way that was designed to impress his former colleagues in official Washington. ...
See Creators and Rulers and this earlier Assange post, specifically the reference to his early pursuit of theoretical physics and his thoughts on high cognitive ability. He registered the domain IQ.org for his (now defunct) blog, and cites psychologist Leta Hollingworth in his 23 Sept 2006 post:
IQ.org: ... Observation shows that there is a direct ratio between the intelligence of the leader and that of the led. To be a leader of his contemporaries a child must be more intelligent but not too much more intelligent than those to be led... But generally speaking, a leadership pattern will not form--or it will break up--when a discrepancy of more than about 30 points of IQ comes to exist between leader and led ...

See also Human capital mongering: M-V-S profiles. Note deviation scores (SDs) below are relative to the average among the gifted kids in the sample, not relative to the general population. The people in this sample are probably above average in the general population on each of M-V-S.

The figure below displays the math, verbal and spatial scores of gifted children tested at age 12, and their eventual college majors and career choices. This group is cohort 2 of the SMPY/SVPY study: each child scored above the 99.5th percentile on at least one of the M-V sections of the SAT.

Scores are normalized in units of SDs. The vertical axis is V, the horizontal axis is M, and the length of the arrow reflects spatial ability: pointing to the right means above the group average, to the left means below average; note the arrow for business majors should be twice as long as indicated but there was not enough space on the diagram. The spatial score is obviously correlated with the M score.

Upper right = high V, high M (e.g., physical science)
Upper left = high V, lower M (e.g., humanities, social science)
Lower left = lower V, lower M (e.g., business, law)
Lower right = lower V, high M (e.g., math, engineering, CS)

Thursday, April 14, 2016

The story of the Monte Carlo Algorithm

George Dyson is Freeman's son. I believe this talk was given at SciFoo or Foo Camp.

More Ulam (neither he nor von Neumann was really a logician, at least not primarily).

Wikipedia on Monte Carlo Methods. I first learned these methods in Caltech's Physics 129: Mathematical Methods, which used the textbook by Mathews and Walker. This book was based on lectures taught by Feynman, emphasizing practical techniques developed at Los Alamos during the war. The students in the class were about half undergraduates and half graduate students. For example, Martin Savage was a first-year graduate student that year. Martin is now a heavy user of Monte Carlo in lattice gauge theory :-)
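For readers who haven't seen it, the core idea fits in a few lines: estimate a quantity by averaging over random samples. The standard toy example (mine, not from Mathews and Walker) estimates π from the fraction of uniform random points that land inside the quarter-circle.

```python
import random

def estimate_pi(n_samples=1_000_000, seed=42):
    """Monte Carlo estimate of pi: fraction of random points in the
    unit square that land inside the quarter-circle of radius 1."""
    rng = random.Random(seed)
    inside = sum(
        1 for _ in range(n_samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / n_samples

print(estimate_pi())
```

The statistical error falls off like 1/sqrt(N) independent of dimension, which is what made the method so useful at Los Alamos for otherwise intractable high-dimensional integrals.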

Tuesday, April 12, 2016

Applied genomics: the genetic "super cow"

This figure and the excerpt below come from the genomics column of The Bullvine, a publication aimed at dairy breeders (via Carl Shulman). In the highly developed world of genomic cattle breeding, it has been speculated that the theoretical "maximal type" bull could have ten times the merit (defined in terms of increased earnings of its progeny, see below) of the top breeding bulls that exist today. Sound familiar? See This is for PZ Myers for related discussion concerning human cognition.

While the "maximal type" may never be achieved, it is interesting that each new top sire surpasses the previous record holder, and previous generations, by a substantial margin. In the human context, this would correspond to a steady stream of individuals, each of greater ability, all of whom surpass the greatest historical geniuses.

By Andrew Hunt

During the recent “Advancing Dairy Cattle Genetics: Genomics and Beyond”, Paul VanRaden with USDA’s Animal Improvement Programs Laboratory pointed out that “If we took the best haplotypes (genes) from all the cows genomic tested to date, we would have a cow at $7515 Net Merit”. That’s pretty spectacular considering the current top Sire on the $NM list is DE VOLMER DG SUPERSHOT at 1000 $NM.

Now to put that into perspective, the current rate of gain is $80 per year. So breeding that $7515 animal would take us 81 years at the current rate. That raises the question of whether such an animal is actually achievable, and whether there is technology out there that could accelerate the process of getting that Super Cow.

The interesting fact here is the greatly accelerated rate of genetic gain since the introduction of genomics. This results for the most part from the greatly shortened generation interval. Females are now being used as bull mothers 18 months sooner than in the past (as yearlings vs. mid first lactation), and sires of sons are now being used 24 and sometimes 36 months sooner than they were in the past (genomic indexes vs. waiting for proven sires). The almost 40% increase in reliability of estimated transmitting ability has breeders and AI companies contracting and working with these elite animals at a significantly younger age. ...
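The 81-year figure quoted above is just the merit gap divided by the annual rate of gain. A one-line check, using the numbers from the excerpt:

```python
top_sire_nm = 1000   # DE VOLMER DG SUPERSHOT, current top sire ($NM)
max_type_nm = 7515   # theoretical "maximal type" cow ($NM)
gain_per_year = 80   # current rate of genetic gain ($NM per year)

years = (max_type_nm - top_sire_nm) / gain_per_year
print(round(years, 1))  # -> 81.4, roughly 81 years at the current rate
```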
See also this post from 2012: Genomic prediction: no bull
The Atlantic: ... Data-driven predictions are responsible for a massive transformation of America's dairy cows. While other industries are just catching on to this whole "big data" thing, the animal sciences -- and dairy breeding in particular -- have been using large amounts of data since long before VanRaden was calculating the outsized genetic impact of the most sought-after bulls with a pencil and paper in the 1980s. Dairy breeding is perfect for quantitative analysis. Pedigree records have been assiduously kept; relatively easy artificial insemination has helped centralize genetic information in a small number of key bulls since the 1960s; there are a relatively small and easily measurable number of traits -- milk production, fat in the milk, protein in the milk, longevity, udder quality -- that breeders want to optimize; each cow works for three or four years, which means that farmers invest thousands of dollars into each animal, so it's worth it to get the best semen money can buy. The economics push breeders to use the genetics.

The bull market (heh) can be reduced to one key statistic, lifetime net merit, though there are many nuances that the single number cannot capture. Net merit denotes the likely additive value of a bull's genetics. The number is actually denominated in dollars because it is an estimate of how much a bull's genetic material will likely improve the revenue from a given cow. A very complicated equation weights all of the factors that go into dairy breeding and -- voila -- you come out with this single number. For example, a bull that could help a cow make an extra 1000 pounds of milk over her lifetime only gets an increase of $1 in net merit while a bull who will help that same cow produce a pound more protein will get $3.41 more in net merit. An increase of a single month of predicted productive life yields $35 more. When you add it all up, Badger-Fluff Fanny Freddie has a net merit of $792. No other proven sire ranks above $750 and only seven bulls in the country rank above $700.
... theoretical calculations suggest that even outliers with net merit of $700-800 will be eclipsed by specimens with 10x higher merit that can be produced by further selection on existing genetic variation. Similar results apply to humans. ...
