Information Processing: 10/2010

Friday, October 29, 2010

Group effectiveness and intelligence

A colleague sent me this interesting Science paper on the effectiveness of groups at solving problems. The effectiveness of a particular group, while not strongly correlated with the average or maximum intelligence of group members, does depend on the nature of social interactions. I think this could be generalized to large societies as well -- some societies with less human capital might nevertheless be more effective for other reasons related to individual tendencies for cooperation or mode of organization (communism vs market economies is an obvious example). Some academic departments have a very low collective intelligence, not just due to the individual average ;-)

Evidence for a Collective Intelligence Factor in the Performance of Human Groups

Psychologists have repeatedly shown that a single statistical factor—often called “general intelligence”—emerges from the correlations among people’s performance on a wide variety of cognitive tasks. But no one has systematically examined whether a similar kind of “collective intelligence” exists for groups of people. In two studies with 699 people, working in groups of two to five, we find converging evidence of a general collective intelligence factor that explains a group’s performance on a wide variety of tasks. This “c factor” is not strongly correlated with the average or maximum individual intelligence of group members but is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group.

[Note: I find their use of a 10-minute version of Raven's Advanced Progressive Matrices (RAPM) to measure individual intelligence a bit problematic. The correlation of the 10-minute RAPM with the full RAPM is probably not that high (say, < .7). So they only have a noisy measure of individual intelligence. I didn't see that they took this uncertainty into account in a sophisticated way when analyzing the relation between c and avg or max IQ. This information is in the supplement.]
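To make the concern about a noisy IQ measure concrete, here is a minimal Python sketch of the classical attenuation formula. The function names and the "true" correlation of 0.5 are hypothetical, chosen only for illustration; the 0.7 figure is the rough upper bound guessed above. It assumes measurement error is uncorrelated with everything else, and it treats the full RAPM as a stand-in for the underlying quantity, which is itself an approximation.

```python
def attenuated_correlation(true_r, validity_x, validity_y=1.0):
    """Correlation expected between two noisy measures.

    validity_* is the correlation between each measured score and the
    quantity it is meant to capture (1.0 means no measurement noise).
    Assumes measurement errors are uncorrelated with everything else.
    """
    return true_r * validity_x * validity_y


def disattenuated_correlation(observed_r, validity_x, validity_y=1.0):
    """Invert the formula above to estimate the underlying correlation."""
    return observed_r / (validity_x * validity_y)


# Hypothetical numbers: an underlying correlation of 0.5 between c and
# average IQ, measured with a short test that correlates 0.7 with the full one.
print(attenuated_correlation(0.5, 0.7))
print(disattenuated_correlation(0.35, 0.7))
```

Under these assumptions a short test shrinks any real relationship between c and individual IQ by roughly 30 percent before sampling error is even considered.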

These results remind me (perhaps tangentially) of remarks by W.D. Hamilton in his paper Innate Social Aptitudes of Man:

The incursions of barbaric pastoralists seem to do civilizations less harm in the long run than one might expect. Indeed, two dark ages and renaissances in Europe suggest a recurring pattern in which a renaissance follows an incursion by about 800 years. It may even be suggested that certain genes or traditions of pastoralists revitalize the conquered people with an ingredient of progress which tends to die out in a large panmictic population for the reasons already discussed. I have in mind altruism itself, or the part of the altruism which is perhaps better described as self-sacrificial daring. By the time of the renaissance it may be that the mixing of genes and cultures (or of cultures alone if these are the only vehicles, which I doubt) has continued long enough to bring the old mercantile thoughtfulness and the infused daring into conjunction in a few individuals who then find courage for all kinds of inventive innovation against the resistance of established thought and practice. Often, however, the cost in fitness of such altruism and sublimated pugnacity to the individuals concerned is by no means metaphorical, and the benefits to fitness, such as they are, go to a mass of individuals whose genetic correlation with the innovator must be slight indeed. Thus civilization probably slowly reduces its altruism of all kinds, including the kinds needed for cultural creativity (see also Eshel 1972).

This paper by Hamilton is also of interest because of the nice discussion of multi-level selection (what is the proper unit of selection in evolutionary biology: the gene, the individual, the group... ?), a subject of recent controversy. In physics we would call this "effective field theory" or "relevant degrees of freedom" :-) That one might specialize to the appropriate unit of selection in a given context is equivalent to noting that we need not consider quarks or quantum mechanics to analyze the aerodynamics of a 747. In some contexts (e.g., turbulence or strongly coupled systems), interactions on many scales are relevant and the corresponding dynamics very complex. The same may be true for human evolution: genes, individuals and groups, even cultures, all interact nontrivially.

Wednesday, October 27, 2010

Maxwell's Demon and genetic engineering

Ronald Fisher on positive alleles for intelligence, in Mendelism and Biometry (1911).

Suppose we knew, for example, 20 pairs of mental characters [loci in the genome]. These would combine in over a million pure mental types; each of these would naturally occur rather less frequently than once in a billion; or in a country like England about once in 20,000 generations; it will give some idea as to the excellence of the best of these types when we consider that the Englishmen from Shakespeare to Darwin have occurred within 10 generations; the thought of a race of men combining the illustrious qualities of these giants, and breeding true to them, is almost too overwhelming, but such a race will inevitably arise in whatever country first sees the inheritance of mental characters elucidated.
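Fisher's numbers can be roughly reproduced with a few lines of Python. This is a sketch under assumptions he does not state explicitly: two alleles per locus, each at frequency 1/2, independent loci, random mating, and "billion" read in the British long-scale sense of 10^12. The variable names are mine.

```python
NUM_LOCI = 20

# A "pure" (homozygous) type picks one of two alleles at each locus.
pure_types = 2 ** NUM_LOCI

# Assumed: allele frequency 1/2 at every locus, so a given homozygous
# genotype at one locus has probability 1/4, and loci are independent.
one_in = 4 ** NUM_LOCI

# Fisher's "once in 20,000 generations" then implies this many people
# per generation (an inferred figure, not one he quotes).
implied_people_per_generation = one_in / 20_000

print(f"pure types: {pure_types:,}")
print(f"a given pure type occurs about once in {one_in:,} individuals")
print(f"implied people per generation: {implied_people_per_generation:,.0f}")
```

With these assumptions the count of pure types is just over a million, and a specific one turns up slightly less often than once in 10^12 births, consistent with the passage.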

The amount of variation in intelligence within a particular family is almost as large as in the population as a whole, mainly due to the diploid nature of our genomes (half of the genes come randomly from each parent). Thus, as Fisher noted, superior characteristics do not "breed true" (see earlier discussion of regression and the residual SD of about 12 even after parental midpoint is fixed; note what is commonly referred to as regression to the mean has a large environmental component, which is distinct from what is discussed here). It is unlikely for a particular descendant to inherit most of the positive variants from both the mother and the father. If loci with positive effect on intelligence were identified, and selection performed on gametes (or zygotes), one could ensure offspring with many more advantageous alleles than normally obtained by chance. We would obtain the best from the set of possible offspring of a given mother and father. A process of this type can be thought of as a Maxwell's Demon of reproduction -- it suppresses fluctuations of the wrong sign.
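The "Maxwell's Demon" idea, keeping the best of several possible offspring, can be illustrated with a small Monte Carlo sketch in Python. It assumes sibling outcomes are normally distributed around the parental midpoint with the SD of about 12 mentioned above, and that the demon can rank candidates perfectly. Both are idealizations: a real genetic predictor would capture only part of that spread. The function name and parameters are hypothetical.

```python
import random


def mean_best_of(num_candidates, within_family_sd=12.0, trials=20000, seed=0):
    """Average advantage (relative to the parental midpoint) of the best
    of `num_candidates` independent draws from the within-family spread."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each candidate deviates from the midpoint by a normal fluctuation;
        # the demon keeps the largest one.
        total += max(rng.gauss(0.0, within_family_sd)
                     for _ in range(num_candidates))
    return total / trials


for k in (1, 2, 10, 100):
    print(k, round(mean_best_of(k), 1))
```

The gain grows only slowly with the number of candidates, since the expected maximum of k normal draws increases roughly like the square root of log k.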

The mathematical theory of additive (linear) genetics was elucidated by Fisher -- he even gave estimates for the non-additive part of heritability (CP9 and CP24 here). It seems likely to me that the first applications on humans (as opposed to selective breeding of plants and animals, which has already taken place) of Fisher's insights will come just over a hundred years later! Technology may finally allow us to explore the most interesting regions of the space of genetic variation :-)

PS I thought about titling this post They Might Be Giants, but the statistical mechanic in me won out :-) Fisher was well aware of the parallels between his use of multivariate statistics in genetics and the corresponding application in the theory of gases.

Monday, October 25, 2010

Acts of creation

Some artwork by my 4 year old kids. It ain't Picasso, but I like it :-)

Le sacre du printemps.

Wolverine.

Eat your Veggies.

Yellow Peril: 2010 and 1920

After watching this commercial, read the excerpt below from The Rising Tide Of Color, a 1920 bestseller (publisher: Charles Scribner) by Lothrop Stoddard (Harvard College class of 1905, Harvard PhD in History 1914).

From The Rising Tide Of Color (Amazon: "Stoddard's arguments were once taken seriously by the American establishment and President Warren G. Harding publicly praised this book at a public speech on 26 October 1922. The introduction to this book was written by Madison Grant, Chairman of the New York Zoological Society, and Trustee of the American Museum of Natural History."). Web version.

[Chapter 2]

... And another French observer, René Pinon, as far back as 1905, found the primary school children of Kiang-Su province chanting the following lines: "I pray that the frontiers of my country become hard as bronze; that it surpass Europe and America; that it subjugate Japan; that its land and sea armies cover themselves with resplendent glory: that over the whole earth float the Dragon Standard; that the universal mastery of the empire extend and progress. May our empire, like a sleeping tiger suddenly awakened, spring roaring into the arena of combats." (René Pinon, "La Lutte pour le Pacifique," p. 152 (Paris, 1906).)

... Nor are the Chinese themselves blind to the advantages of Chino-Japanese co-operation. They have an instinctive assurance in their own capacities, they know how they have ultimately digested all their conquerors, and many Chinese to-day think that from a Chino-Japanese partnership, no matter how framed, the inscrutable "Sons of Han" would eventually get the lion's share. Certainly no one has ever denied the Chinaman's extraordinary economic efficiency. Winnowed by ages of grim elimination in a land populated to the uttermost limits of subsistence, the Chinese race is selected as no other for survival under the fiercest conditions of economic stress. At home the average Chinese lives his whole life literally within a hand's breadth of starvation. Accordingly, when removed to the easier environment of other lands, the Chinaman brings with him a working capacity which simply appalls his competitors. That urbane Celestial, Doctor Wu-Ting-Fang, well says of his own people: "Experience proves that the Chinese as all-round laborers can easily outdistance all competitors. They are industrious, intelligent, and orderly. They can work under conditions that would kill a man of less hardy race; in heat that would kill a salamander, or in cold that would please a polar bear, sustaining their energies, through long hours of unremitting toil with only a few bowls of rice." (Quoted by Alleyne Ireland, "Commercial Aspects of the Yellow Peril," North American Review, September, 1900.)

This Chinese estimate is echoed by the most competent foreign observers. The Australian thinker, Charles E. Pearson, wrote of the Chinese a generation ago in his epoch-making book, "National Life and Character": "Flexible as Jews, they can thrive on the mountain plateaux of Thibet and under the sun of Singapore; more versatile even than Jews, they are excellent laborers, and not without merit as soldiers and sailors; while they have a capacity for trade which no other nation of the East possesses. They do not need even the accident of a man of genius to develop their magnificent future." (Charles H. Pearson, "National Life and Character," p. 118 (2nd edition).)

And Lafcadio Hearn says: "A people of hundreds of millions disciplined for thousands of years to the most untiring industry and the most self-denying thrift, under conditions which would mean worse than death for our working masses -- a people, in short, quite content to strive to the uttermost in exchange for the simple privilege of life." (Quoted by Ireland, supra.)

This economic superiority of the Chinaman shows not only with other races, but with his yellow kindred as well. As regards the Japanese, John Chinaman has proved it to the hilt. Wherever the two have met in economic competition, John has won hands down. Even in Japanese colonies like Korea and Formosa, the Japanese, with all the backing of their government behind them, have been worsted. ...

[Chapter 11]

... Thirty years ago, Professor Pearson forecast China's imminent industrial transformation. "Does any one doubt," he asks, "that the day is at hand when China will have cheap fuel from her coal-mines, cheap transport by railways and steamers, and will have founded technical schools to develop her industries? Whenever that day comes, she may wrest the control of the world's markets, especially throughout Asia, from England and Germany." (Pearson, p. 133.)

... Of course there is another side to the story. Low wages alone do not insure cheap production. As Professor Ross remarks: "For all his native capacity, the coolie will need a long course of schooling, industrial training, and factory atmosphere before he inches up abreast of the German or American working man." (Ross, p. 119.) In the technical and directing staffs there is the same absence of the modern industrial spirit, resulting in chronic mismanagement, while Chinese industry is further handicapped by traditional evils like "squeeze," nepotism, lust for quick profits, and incapacity for sustained business team-play. These failings are not peculiar to China; they hamper the industrial development of other Asiatic countries, notably India. Still, the way in which Japanese industry, with all its faults, is perfecting both its technic and its methods shows that these failings will be gradually overcome ...

Saturday, October 23, 2010

Can I play, too?

Why am I in Shenzhen right now? Because I read an article a few months ago suggesting that BGI was going to attempt an IQ GWAS (genome wide association study). I've been thinking about this topic since I was a kid, waiting impatiently for the required technology to develop. (The usual situation in my main field of research, theoretical physics...) After reading the article I did a few calculations and realized that we are on the cusp of being able to find much of the genetic variance related to intelligence. I had to get involved!

The situation reminds me of the story of Caltech grad Eugene Myers and the Human Genome Project. Myers was one of the proposers of the whole genome shotgun sequencing technique. When he read in the Wall Street Journal that Celera might attempt the shotgun technique, he called them up to ask Can I play, too? I'm told there were a lot of techers on the Celera informatics team :-)

Even if our planned study fails, it's clear on the basis of trends in cost and capacity of sequencing that massive GWAS involving 10^5 or 10^6 individuals are right around the corner. It is of primary importance that phenotype data (including IQ!) on this first group of sequenced individuals be collected in a systematic fashion. Assuming this is the case, then under reasonable assumptions a significant fraction of the .6 or so of additive genetic variance will have been discovered within the next 5 (perhaps 10) years.

PS If you are interested in the original Human Genome Project, I highly recommend the book linked to above: The Genome War by James Shreeve.

Friday, October 22, 2010

g, math ability and their population distribution

I've been meaning to post about this report on high level math competitions which appeared in the Notices of the American Mathematical Society. The table below is a good summary of data in the paper.

My guess is that this population is roughly +4 SD in math ability and +(3-4) SD in g. For detecting useful (very) high end ability I trust these competitions more than tests of g, with the proviso that training has a strong effect on performance.
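For a sense of scale, the rarity implied by these SD figures can be computed with Python's standard library. This sketch assumes the trait is exactly normally distributed, which is questionable this far into the tail; the helper name is mine.

```python
from statistics import NormalDist


def one_in_n(sd_above_mean):
    """How many people, on average, contain one person above the cutoff,
    assuming a perfectly normal distribution."""
    tail_fraction = 1.0 - NormalDist().cdf(sd_above_mean)
    return 1.0 / tail_fraction


for cutoff in (3.0, 3.5, 4.0):
    print(f"+{cutoff} SD: about 1 in {one_in_n(cutoff):,.0f}")
```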

Here's something about what psychometricians would call the "validity" of the test :-)

The skill sets necessary to excel in mathematical problem solving and mathematics research are not identical. Research requires the stamina to work on problems over extended periods of time without knowing whether solutions even exist; the competitions discussed here require the ability to solve difficult problems known to be solvable under timed conditions. Thus, some world-class research mathematicians exist who attempted, but did not excel in the Putnam, IMO, or its qualifying examinations. Nevertheless, a high correlation exists between exceptional ability in mathematical research and problem solving since both require outstanding mathematical intuition and creativity along with the interest in devoting considerable time and effort toward acquiring extensive knowledge in the field. Numerous Putnam Fellows have gone on to receive the Fields Medal (the so-called Nobel Prize of Mathematics) or the Nobel Prize in Physics. Some who never quite achieved Fellow status have also been awarded Nobel Prizes. Eight of the eighteen Fields medalists from 1990 through 2006 were IMO gold or silver medalists in their youth, with Grigorij Perelman, who recently resolved the Poincaré Conjecture, having achieved a perfect forty-two in the 1982 IMO.

Thursday, October 21, 2010

News from the future

I was floored today when the director of BGI told me they would soon reach a sequencing rate of 1000 (human) genomes per day (so, 10^5 to 10^6 genomes per year is right on the horizon). According to him, they can be profitable at a price of $5k per genome! [Clarification: I later learned this might mean at 10x coverage ... not exactly sure, although I tried to get a more precise statement.]

Sequencing costs are roughly 1/3 reagents, 1/3 computing and capex, and 1/3 labor. No one can compete with BGI on the last factor, and they've made truly original progress on assembly of short reads with their SOAP software package. BGI's ambition, which I think is realistic, is to be THE sequencing and analysis center for the entire world. There are significant advantages to scale in this business (think of the cloud computing and storage issues alone), and BGI currently has the lead.

BGI's main buildings are located in a gritty area of Yantian, which is the site of a major container port. The second building, just recently occupied, is a converted shoe factory. Trucks delivering containers to the port crowd dusty streets just a few blocks away. It's amazing to find world class technology and brains concentrated in a place like this -- the contrast could not be greater. But that's China today.

The view from building 2; note the container port cranes in the distance.

Here's a picture of the room where I gave a lecture yesterday. On the wall are Nature and Science covers featuring BGI research results.

Lecture 2.

Tuesday, October 19, 2010

Dwelling in Dameisha

It's been said for the last decade or more that most of the interesting architectural projects are happening in China. Here's an example close to my hotel in Dameisha (near Shenzhen), which sits at the edge of a man-made lake not far from the beach. The building is apparently as long as the Empire State Building is tall. More here.

Looking up at the hills surrounding this area, I can see all kinds of modernist houses that would look at home on the cover of Dwell magazine. Unfortunately I don't have a telephoto lens so I don't have any shots of them.

Monday, October 18, 2010

BGI photos

Below are some photos from my first day at BGI Shenzhen. They are actually located in Dameisha, a small beach town over the mountains from the main part of Shenzhen. [Correction: I later learned that while our hotel is in Dameisha, BGI is in Yantian, just over a smaller mountain from Dameisha. You can see the tunnel in one of the pictures below.] BGI has been undergoing hypergrowth in the last year, with the number of software developers shooting from 200 to about 1000, and the total staff population nearing 5000. They recently took over a second building in Dameisha and are already running out of space.

These were taken from our hotel, about a kilometer from the ocean.

From BGI you can see the major shipping port at Yantian.

These gentlemen are standing next to crates of recently delivered Illumina sequencing machines.

These are stuffed cloned pigs (another BGI project).

The BGI building.

Friday, October 15, 2010

Wen Jiabao, Adam Smith and Marcus Aurelius

I was amused to read the following in a recent Fareed Zakaria interview of Chinese Premier Wen Jiabao. ***

Is there a book you've read in the past few months that has impressed you?

The books that are always on my shelves are books about history, because I believe history is like a mirror, and I like to read both Chinese history and history of foreign countries. There are two books that I often travel with. One is The Theory of Moral Sentiments, by Adam Smith. The other is The Meditations [of Marcus Aurelius].

Below is my favorite quote from Marcus Aurelius, which I've mentioned before on the blog, here and here.

Marcus Aurelius

"Or does the bubble reputation distract you? Keep before your eyes the swift onset of oblivion, and the abysses of eternity before us and behind; mark how hollow are the echoes of applause, how fickle and undiscerning the judgments of professed admirers, and how puny the arena of human fame. For the entire earth is but a point, and the place of our own habitation but a minute corner in it; and how many are therein who will praise you, and what sort of men are they?"

It has often been remarked that there is some tension between Adam Smith's Theory of Moral Sentiments (TMS) and Wealth of Nations (WN). Here are the key excerpts:

WN: It is not from the benevolence of the butcher, the brewer, or the baker, that we expect our dinner, but from their regard to their own interest ... This division of labor ... is not originally the effect of any human wisdom, which foresees and intends that general opulence to which it gives occasion. It is the necessary, though very slow and gradual, consequence of a certain propensity in human nature which has in view no such extensive utility; the propensity to truck, barter, and exchange one thing for another.

TMS: How selfish soever man may be supposed, there are evidently some principles in his nature, which interest him in the fortune of others, and render their happiness necessary to him, though he derives nothing from it except the pleasure of seeing it.

See this lecture by our man Vernon Smith for a nice discussion of this tension. (I highly recommend the whole lecture!)

The juxtaposition of these two statements lays bare what would appear to be directly contradictory views of human nature held by Adam Smith. This has long been noted and perhaps helps to account for the greater notoriety of the Wealth of Nations in both popular and academic discourse. Thus, as observed by Jacob Viner, “Many writers, including the present author at an early stage of his study of Smith, have found these two works in some measure basically inconsistent.” (Viner 1991, 250).

These two views are not inconsistent, however, if we recognize that a universal propensity for social exchange is a fundamental distinguishing feature of the hominid line, and that it finds expression in both personal exchange in small-group social transactions, and in impersonal trade through large-group markets. Thus, Smith had but one behavioral axiom, “the propensity to truck, barter, and exchange one thing for another,” where the objects of trade I will interpret to include not only goods, but also gifts, assistance and favors out of sympathy, that is, “generosity, humanity, kindness, compassion, mutual friendship and esteem” (Smith 1759; 1976, 38). As can be seen in both the ethnographic record, and in laboratory experiments, whether it is goods or favors that are exchanged, they bestow gains from trade that humans seek relentlessly in all social transactions. Thus, Adam Smith’s single axiom, broadly interpreted to include the social exchange of goods and favors across time, as well as the simultaneous trade of goods for money or other goods, is sufficient to characterize a major portion of the human social and cultural enterprise. It explains why human nature appears to be simultaneously self-regarding and other-regarding.

*** Of course, I am old enough to remember that before KGB director Yuri Andropov succeeded Brezhnev as leader of the USSR, Soviet intelligence planted fabricated information about him with the Western press, portraying him as a cosmopolitan intellectual. I doubt Wen is playing this game. However, he did, probably, deliberately mention books that would be familiar to Westerners as opposed to whatever Chinese books he travels with.

Thursday, October 14, 2010

Wigner recollections

It is always a pleasure to browse the library when visiting another research institute. Although some books are found in every physics library, one often makes esoteric discoveries. I was less than impressed by the Collected Works of Theodore Von Karman, but charmed by The Recollections of Eugene P. Wigner, from which I quote below.

On John von Neumann. Why is there no definitive biography of this man?

I have known a great many intelligent people in my life. I knew Planck, von Laue and Heisenberg. Paul Dirac was my brother-in-law; Leo Szilard and Edward Teller have been among my closest friends; and Albert Einstein was a good friend, too. But none of them had a mind as quick and acute as Jancsi [John] von Neumann. I have often remarked this in the presence of those men and no one ever disputed me.

... But Einstein's understanding was deeper even than von Neumann's. His mind was both more penetrating and more original than von Neumann's. And that is a very remarkable statement. Einstein took an extraordinary pleasure in invention. Two of his greatest inventions are the Special and General Theories of Relativity; and for all of Jancsi's brilliance, he never produced anything as original.

On Einstein and quantum mechanics.

Einstein plainly saw that the statistical view was a quite novel way of interpreting physical events; he realized, perhaps even before many of its backers, that accepting the statistical view implied a need to reexamine a great many things, including human volition and desire. Einstein did not want to reexamine all that. So he made light of the statistical view. "How about the sun," he would say, "Is that also a probability amplitude?"

[Contrast with Hawking's observation that many worlds is "trivially true" once we assume that quantum mechanics applies to each and every component of the universe.]

On quantum mechanics and the limits of human intelligence.

Until 1925, most great physicists, including Einstein and Planck, had doubted that man could truly grasp the deepest implications of quantum theory. They really felt that man might be too stupid to properly describe quantum phenomena. ...the men at the weekly colloquium in Berlin wondered "Is the human mind gifted enough to extend physics into the microscopic domain ...?" Many of those great men doubted that it could.

On specialization.

But it is sad to lose touch with whole branches of physics, to see scientists cut off from each other. Dispersion theorists do not know axiomatic field theory; cosmologists do not know nuclear physics. Quantum mechanics is hard to explain to a chemist ... and yet the best theoretical chemists really ought to know quantum mechanics.

Specialization of science also robbed us of much of our passion. We wanted to grasp science whole, but by then the whole was something far too vast and complex to master. Only rarely could we ask the deep questions that had first drawn us to science.

Tuesday, October 12, 2010

BGI visit

Next week I'll be a visitor at BGI (formerly Beijing Genomics Institute; see earlier posts here). I'm involved in a GWAS (genome wide association study) of IQ involving a very high end sample with a case-control design. More details (perhaps) after my visit.

Long ago I sketched out a science fiction story involving two Junior Fellows, one a bioengineer (a former physicist, building the next generation of sequencing machines) and the other a mathematician. The latter, an eccentric, was known for collecting signatures -- signed copies of papers and books authored by visiting geniuses (Nobelists, Fields Medalists, Turing Award winners) attending the Society's Monday dinners. He would present each luminary with an ornate (strangely sticky) fountain pen and a copy of the object to be signed. Little did anyone suspect the real purpose: collecting DNA samples to be turned over to his friend for sequencing! The mathematician is later found dead under strange circumstances. Perhaps he knew too much! ...

Thanks to recent technological progress, this story is no longer science fiction.

Homework problems: (1) given a high IQ threshold (e.g., +4 SD), what is the most efficient way of collecting thousands of samples from individuals above that threshold? (2) Assuming M alleles, each with equal additive effect on IQ, what will their frequencies in the high group be compared to the general population? How large a population is necessary to resolve the frequency difference beyond statistical error? (Perhaps these problems explain the motivation behind a certain subset of my ruminations on this blog ;-)
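One possible way to set up problem (2) is sketched below in Python. It is an illustration under strong assumptions, not a worked solution: M loci with equal per-allele effect and equal frequency p, a normally distributed phenotype, a linear-regression approximation for the allele frequency among people above the threshold, equal numbers of cases and controls, and a z-test at a genome-wide p-value of 5e-8 with no allowance for statistical power. All function names, and the values M = 1000 to 10000 and additive variance 0.6, are chosen for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def selection_intensity(threshold_sd):
    """Mean phenotype (in SD units) of people above the threshold."""
    tail = 1.0 - STD_NORMAL.cdf(threshold_sd)
    return STD_NORMAL.pdf(threshold_sd) / tail


def per_allele_effect(num_loci, allele_freq, additive_variance):
    """Effect of one allele copy in phenotype SD units, chosen so that
    num_loci equal loci together explain additive_variance."""
    per_locus_variance = 2.0 * allele_freq * (1.0 - allele_freq)
    return math.sqrt(additive_variance / (num_loci * per_locus_variance))


def frequency_shift(num_loci, allele_freq, additive_variance, threshold_sd):
    """Approximate excess allele frequency in the high group."""
    beta = per_allele_effect(num_loci, allele_freq, additive_variance)
    return (beta * allele_freq * (1.0 - allele_freq)
            * selection_intensity(threshold_sd))


def cases_needed(num_loci, allele_freq, additive_variance, threshold_sd,
                 p_value=5e-8):
    """Cases (with as many controls) for the shift to reach the z cutoff.
    Ignores power and the change in allele variance among cases."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - p_value / 2.0)
    delta = frequency_shift(num_loci, allele_freq, additive_variance,
                            threshold_sd)
    return math.ceil(z_crit ** 2 * allele_freq * (1.0 - allele_freq)
                     / delta ** 2)


for m in (1000, 10000):
    print(m, frequency_shift(m, 0.5, 0.6, 4.0), cases_needed(m, 0.5, 0.6, 4.0))
```

In this simplified model the required sample scales linearly with M and inversely with the square of the selection intensity, and the allele frequency drops out, which is one way to see why an extreme-threshold sample is so much more efficient than a random one.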

Below are some recent articles about BGI. They intend to achieve a sequencing rate of 10^4 human genomes per annum by 2011.

MIT-Harvard Broad Institute vs BGI:

... The Broad’s perch as the largest genome center in the world is getting crowded, as BGI fills its Hong Kong facility with more than twice as many HiSeqs as the Broad (see p. 44). Nusbaum, however, says the Broad and BGI enjoy a friendly (if slightly competitive) relationship. “We’re building ongoing collaborations with them. Ideally we want them to be a sister center with us,” he says. “There’s so much sequencing in the world that needs to be done, right now, I don’t see any need to compete with them.”

While Nusbaum concedes the emergence of BGI “upsets the balance of power,” he thinks the added sequencing capacity is a positive trend. Of course, the spread of sequencing democracy in countless small labs also tilts the balance of power, perhaps even more disruptively.

Sequencing the Human Secret:

... Wong says the Illumina machines are currently producing 200 gigabases per run, and expects a higher throughput by the end of 2010. At 40 gigabases per day per machine, he expects to be generating 3-4 terabases daily by year’s end.

The BGI Hong Kong supercomputer currently has 2,400 cores, but Wong and a colleague seem uncertain about the storage capacity. After some back and forth in Cantonese, they settle on a current total of almost three petabytes (PB) of on site storage in Hong Kong (Shenzhen has a little over 7 PB). The calculation capacity is about 25 teraflops. With the scope of work BGI is hoping for, that may not suffice for long.

... BGI is privately held and employs 3,000 people now across five centers in mainland China (Shenzhen, Beijing, Hangzhou, Shanghai, and Guangzhou), and the three existing international centers. In addition to sequencing and bioinformatics, other areas of focus include diagnostics, biofuels, and agriculture.

BGI Americas:

... With 3,000 employees currently rising to an expected 5,000 by the end of this year, and a fleet of more than 150 Illumina and Life Technologies next-generation sequencing instruments, most of which are being installed in a former printing factory in Hong Kong (see, p. 44), BGI is poised (if it isn’t already) to become the world’s largest genome sequencing center. And it wants to share its extraordinary resources and expertise with, well, everybody.

Last April, BGI Americas was officially incorporated in Delaware as the official interface for BGI in North America. BGI Europe followed suit the next month (See, “European Union”).

... By the time the Hong Kong facility is fully operational at the end of 2010, BGI will have a total sequencing output of 5 terabytes/day—the equivalent of 1500x human genome/day (see, “Lucky Numbers”). The data center now boasts 50,000 CPUs, 200 terabytes of RAM and will reach a whopping 1,000 petabytes—1 exabyte—of data storage within the next 2-3 years. “It’s an awesome machine to play games on,” jokes Tu.

... The average age of the BGI staff is just 24.7. Tu calls the legions of bioinformatics workers “the young and the brightest,” drawn from the top tiers of mathematicians and scientists, supplemented with operations people who have worked abroad. “If they come to BGI, they get to work on real projects. Plus you get to program all day, with these toys in the background! It’s like a video game; they love it!” New recruits cannot rest on their laurels however: every month for the first six months, there’s a test. Fail it, and it’s bye-bye BGI.
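The throughput figures quoted in these articles are easier to compare after converting raw output into genomes. The Python sketch below assumes a haploid human genome of about 3.1 billion bases, reads the quoted "5 terabytes/day" as 5 trillion bases per day, and uses 30x as an illustrative coverage depth; none of these choices comes from the articles themselves.

```python
HUMAN_GENOME_BASES = 3.1e9  # assumed approximate haploid genome size


def genome_equivalents_per_day(bases_per_day):
    """Raw daily output expressed as 1x copies of the human genome."""
    return bases_per_day / HUMAN_GENOME_BASES


def genomes_per_day(bases_per_day, coverage):
    """Genomes finished per day if each is sequenced to `coverage`x depth."""
    return genome_equivalents_per_day(bases_per_day) / coverage


daily_bases = 5e12  # the quoted figure, read here as bases rather than bytes
print(round(genome_equivalents_per_day(daily_bases)))
print(round(genomes_per_day(daily_bases, coverage=30), 1))
print(round(genomes_per_day(daily_bases, coverage=30) * 365))
```

On these assumptions the quoted output is on the order of 1500 genome-equivalents per day at 1x, but far fewer finished genomes once realistic coverage is required.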

Les Grandes Ecoles Chinoises

Note Added (2022): Wikipedia has a long entry on this topic.

The American intellectual elite are endlessly fascinated by the French Grandes Ecoles, which employ a rigorous examination system for admissions. See example below from today's NY Times.

In the past I'd read that the British and French based their civil service and educational examination systems on the much older Chinese model, but I was not sure to what extent this was true. See here for an interesting discussion.

... Brunetiere believed that French education was really based on the Chinese system of competitive literary examinations, and that the idea of a civil service recruited by competitive examinations undoubtedly owed its origins to the Chinese system which was popularized in France by the philosophers, especially Voltaire. This definite conclusion that the French civil service examination system came from China is adopted by several authors ...

A summary of the case of Britain and colonial India can be found here. Amusingly, 19th century British writers opposed to the new system of exams referred to it as "... an adopted Chinese culture" (pp. 304-305).

NYTimes: ... Born out of the French Enlightenment, the grandes écoles have long been the cradle of the governing class. “Normaliens” (graduates from École Normale Supérieure, whose 12 Nobel laureates include Henri Bergson and Jean-Paul Sartre), “Gadzarts” (from École Nationale Supérieure d’Arts et Métiers, like Jean-Lou Chameau, president of the California Institute of Technology), “X-iens” (from École Polytechnique, including the physicist Sadi Carnot, the philosopher Auguste Comte and the mathematician Benoît B. Mandelbrot), and “Enarques” (from École Nationale d’Administration, including almost all recent prime ministers) occupy a place in French national life similar to Oxbridge graduates in England or the Ivy League in the United States.

Internationally, however, these institutions have far less clout than their Anglo-American counterparts.

... The grandes écoles run along very different lines. Admission is selective, with candidates generally required to complete a grueling two-year preparatory course. This “prépa” includes intensive study in mathematics, economics, philosophy and literature, plus at least two foreign languages. Of 1,079 candidates who took the entrance exam for the business school HEC in 2009, only 50 were offered places, and most of those already held master’s degrees from other institutions. The competition for science places is even tougher.

... Yet he, too, alluded to the new reality of global competition: “When I was a student we spoke of ‘le défi américain’ — the American challenge. Now we speak of ‘le défi asiatique’ — the challenge from Asia.”

How will France face this challenge? Dr. Tapie pointed out that while France “has only 1 percent of the world’s population, we make up 33 percent of Fields medalists,” the mathematics equivalent of Nobel laureates.

It was Cédric Villani, a 37-year-old professor at Lyon who won the 2010 Fields Medal, who gave the most spirited reply to France’s critics. Calling himself “a pure product of the French system,” Mr. Villani, a Normalien who has often taught in the United States, said that while American academic salaries were higher “and it’s easier to make big projects,” France also has particular strengths: “Our tradition, our quality of life, our social cohesion. My big problem in Princeton was finding a place to buy a decent cheese.”

Monday, October 11, 2010

Elite universities and human capital mongering

In an earlier post I discussed the advantages of attending an elite university. A related question is: what fraction of the total population of top students in the US attend an elite university? The answer is a function of where we put the lower cutoff for top students.

One interesting population to consider is the subset of National Merit Semi-Finalists who are awarded scholarships directly funded by the National Merit Corporation. Semi-Finalists are themselves the top percent of PSAT/SAT takers, so this subset is an especially elite group. From my experience I would guess that only the top 10-20 percent (see *** below) of National Merit Semi-Finalists are offered these portable awards, as opposed to other "National Merit Scholarships" that are funded by individual colleges. (Almost all non-elite colleges use these self-funded NMS to increase the enrollment of talented students; any Semi-Finalist is eligible for such an award. The most elite universities do not need to offer self-funded scholarships of this type, as discussed below.)

About 2300 NMS scholarships funded by the corporation are awarded each year. It turns out that just 10 elite universities account for well over half of these awardees. The vast majority of universities in the US have zero students from this select population! Data from this report. (Thanks to a reader of the blog for sending it to me.)

Number of NMS in entering class / size of entering class.

Caltech 42 / 200
Harvard 266 / 1600
Yale 234 / 1300
Princeton 196 / 1300
Stanford 110 / 1600
MIT 110 / 1000
Brown 91 / 1500
Duke 105 / 1600
Penn 125 / 2000
Berkeley 91 / 6000

Total 1370
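
For anyone who wants to play with these numbers, here is a minimal sketch that tallies the counts exactly as listed above and computes what fraction of each entering class holds a corporation-funded award. The figure of 2300 awards per year is the rough number quoted in this post; the function names are just illustrative.

```python
# Entering-class figures as listed above: (corporation-funded NMS, class size).
nms_by_school = {
    "Caltech": (42, 200),
    "Harvard": (266, 1600),
    "Yale": (234, 1300),
    "Princeton": (196, 1300),
    "Stanford": (110, 1600),
    "MIT": (110, 1000),
    "Brown": (91, 1500),
    "Duke": (105, 1600),
    "Penn": (125, 2000),
    "Berkeley": (91, 6000),
}
TOTAL_AWARDS = 2300  # approximate number of corporation-funded awards per year


def nms_density(counts):
    """Fraction of each entering class holding a corporation-funded award."""
    return {school: n / size for school, (n, size) in counts.items()}


def share_of_awards(counts, total_awards=TOTAL_AWARDS):
    """Fraction of all corporation-funded awards captured by the listed schools."""
    return sum(n for n, _ in counts.values()) / total_awards


if __name__ == "__main__":
    ranked = sorted(nms_density(nms_by_school).items(), key=lambda kv: -kv[1])
    for school, frac in ranked:
        print(f"{school:10s} {frac:6.1%}")
    print(f"share of all awards: {share_of_awards(nms_by_school):.0%}")
```

Sorting by density rather than raw count makes the point about concentration clearer, since the class sizes differ by a factor of 30.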

Note, large numbers of Semi-Finalists who are not in this top group (i.e., not among the top 10-20 percent or so) do not receive a National Merit Scholarship because they choose to attend an elite university that does not self-fund additional awards. Just by looking at average SAT scores one would guess that as many as half the students at some of the schools listed above were Semi-Finalists. Apparently, the marginal value to these schools of "just another Semi-Finalist" does not warrant the expenditure of a few thousand dollars in additional scholarship funds. ("If we wanted to, we could fill our entire freshman class with Semi-Finalists!" etc. etc.)

*** If Semi-Finalists are in the top .5 percent of the population (assuming some selection in PSAT takers relative to the general population), the threshold IQ is +2.5 SD (137 or so). The NMS population discussed above is perhaps another SD higher (IQ 150 or so). You can directly estimate the number of students at the Semi-Finalist level: if 4M students graduate from HS each year in the US, that means about 20K above the 99.5th percentile. If the 2.3K NMS are the elite of this population, they would constitute the top 12 percent or so.
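
The estimate above is easy to reproduce. This is a rough sketch, assuming IQ is normally distributed with mean 100 and SD 15 and using the 4M graduates, top 0.5 percent and 2.3K awards quoted in the footnote; the constant and function names are my own. Under the normal model the exact cutoff for the top 0.5 percent comes out at about 2.6 SD, close to the rounded +2.5 SD used above.

```python
from statistics import NormalDist

HS_GRADS_PER_YEAR = 4_000_000       # figure used in the footnote
SEMIFINALIST_TOP_FRACTION = 0.005   # top 0.5 percent, as assumed in the footnote
NMS_AWARDS = 2300                   # corporation-funded awards per year


def iq_threshold(top_fraction, mean=100.0, sd=15.0):
    """IQ above which `top_fraction` of a normal population lies."""
    return NormalDist(mean, sd).inv_cdf(1.0 - top_fraction)


# Number of graduates per year at or above the Semi-Finalist level.
semifinalist_level = HS_GRADS_PER_YEAR * SEMIFINALIST_TOP_FRACTION

# Fraction of that group receiving a corporation-funded award.
nms_share = NMS_AWARDS / semifinalist_level

if __name__ == "__main__":
    print(f"students at Semi-Finalist level: {semifinalist_level:,.0f}")
    print(f"NMS share of that group: {nms_share:.1%}")
    cutoff = iq_threshold(SEMIFINALIST_TOP_FRACTION)
    print(f"IQ cutoff for top 0.5 percent: {cutoff:.1f}")
```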

Sunday, October 10, 2010

Taipei photos 5: National Day

We were invited to a reception by the Foreign Ministry to commemorate the 99th National Celebration Day.

The red carpet!

I was a bit underdressed.

The handsome guy on the right is the President of Taiwan.

The view from the garden.

I later learned this is not a real Picasso (it's a copy).

We left while things were still hopping.

Saturday, October 09, 2010

Taipei photos 4

Gondola ride.

View of Taipei from a temple.

The temple.

Bodhidharma has come to the east.

Thursday, October 07, 2010

Luis Alvarez quotes

Someone posted the first quote below as a comment and I thought I'd share it, as well as some others. Alvarez was one of the greatest experimentalists of all time. He and Shockley both missed the Terman cut by a few points, which just goes to show you what a noisy predictor any test administered to kids is.

The world of mathematics and theoretical physics is hierarchical. That was my first exposure to it. There's a limit beyond which one cannot progress. The differences between the limiting abilities of those on successively higher steps of the pyramid are enormous. I have not seen described anywhere the shock a talented man experiences when he finds, late in his academic life, that there are others enormously more talented than he. I have personally seen more tears shed by grown men and women over this discovery than I would have believed possible. Most of those men and women shift to fields where they can compete on more equal terms. [I still shed the occasional tear today! Trivia question: what was Larry Summers' major when he entered MIT?]

My observations of the young physicists who seem to be most like me and the friends I describe in this book tell me that they feel as we would if we had been chained to those same oars. Our young counterparts aren't going into nuclear or particle physics (they tell me it's too unattractive); they are going into condensed-matter physics, low-temperature physics, or astrophysics, where important work can still be done in teams smaller than ten and where everyone can feel that he has made an important contribution to the success of the experiment that every other member of the collaboration is aware of. Most of us do physics because it's fun and because we gain a certain respect in the eyes of those who know what we've done. Both of those rewards seem to me to be missing in the huge collaborations that now infest the world of particle physics.

Most of us who become experimental physicists do so for two reasons; we love the tools of physics because to us they have intrinsic beauty, and we dream of finding new secrets of nature as important and as exciting as those uncovered by our scientific heroes. But we walk a narrow path with pitfalls on either side. If we spend all our time developing equipment, we risk the appellation of 'plumber', and if we merely use the tools developed by others, we risk the censure of our peers for being parasitic.

With modern weapons-grade uranium, the background neutron rate is so low that terrorists, if they had such material, would have a good chance of setting off a high-yield explosion simply by dropping one half of the material onto the other half. Most people seem unaware that if separated U-235 is at hand, it's a trivial job to set off a nuclear explosion, whereas if only plutonium is available, making it explode is the most difficult technical job I know.

Dirac politely refused Robert's [Robert Oppenheimer] two proffered books: reading books, the Cambridge theoretician announced gravely, 'interfered with thought'.

Here is a review of Alvarez's memoir Alvarez: Adventures of a Physicist, by the professor who first taught me quantum mechanics in 1982-3:

Luis Alvarez was likely the best American experimental physicist of the 20th century. ... He invented radar for instrument flight landing, discovered new elements, invented new accelerators, invented clever optical devices, built the big bubble chambers of particle physics, x-rayed the pyramid at Giza with cosmic rays, used elementary physics to conclusively solve all the unknowns of the Kennedy assassination in Dallas in 1963, looked in Moon rocks for magnetic monopoles, and hypothesized the cause of the disappearance of the dinosaurs 65 million years ago. He was an observer on the plane that dropped the first nuclear bomb on Japan, kept a few Moon rocks on the mantel in his living room where he hosted weekly gatherings of students and physicists, and "did physics" until he died of esophageal cancer. This book is a joy for a physicist to read, but for anyone who is curious about what physicists really do and how they approach problems both large and small, this book is a treasure.

Wednesday, October 06, 2010

Some data on regression

See previous discussion here.

I grew up in a university town in the midwest. The population of the town was about 40-50K and that of the university about 30K. There was only one high school with a graduating class of just under 400, and about a quarter to a third of each class were faculty kids or children of administrators or people who worked at the nearby government labs (i.e., generally kids of PhDs).

About 20 kids each year score high enough on the PSAT/SAT to be National Merit Semi-Finalists (top 0.5 percent). If our school were typical of the US as a whole, this number would be ten times smaller. Almost all of these 20 kids are children of people associated with the university or the labs. The cutoff for semi-finalist is about +2.5 SD (say, IQ 137), and since about 15% of the faculty kids are above this threshold one can obtain an average for the group of about +1.6 SD (IQ 124). Note I am assuming an SD of 13 for the faculty kids (rough estimate of residual variance given known parental midpoint; see link above), but stating everything in terms of the overall population SD of 15. I think a reasonable parental midpoint one can assign to the parents of this group is just over +2 SD (IQ=130) (e.g., the average of 135 + 125 or 140 + 120; remember -- this is the era before assortative mating). At the most extreme I suppose you could argue for midpoint as high as IQ 135 (e.g., the average of 145 + 125), or as low as 125 (135 + 115). I'd say at 95 percent confidence the parental midpoint is between 125 and 135.

These estimates are consistent with an additive heritability of at least .7, possibly much higher.
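
Here is a rough sketch of the arithmetic behind these estimates, for readers who want to vary the inputs. It assumes the faculty kids are normally distributed with SD 13, backs out their mean from the fraction above the Semi-Finalist cutoff (which reproduces the group average of about IQ 124), and then takes the ratio of the kids' deviation from the population mean to the parental midpoint's deviation. Reading that ratio as an additive heritability is the simple regression-to-the-mean model assumed in this post, and the function names are hypothetical.

```python
from statistics import NormalDist

POP_MEAN, POP_SD = 100.0, 15.0


def group_mean_from_tail(threshold, frac_above, group_sd):
    """Mean of a normal group given the fraction of it above `threshold`."""
    z = NormalDist().inv_cdf(1.0 - frac_above)  # cutoff in group SD units
    return threshold - z * group_sd


def implied_heritability(child_mean, parental_midpoint, pop_mean=POP_MEAN):
    """Ratio of the children's deviation to the parental midpoint's deviation."""
    return (child_mean - pop_mean) / (parental_midpoint - pop_mean)


if __name__ == "__main__":
    threshold = POP_MEAN + 2.5 * POP_SD            # Semi-Finalist cutoff
    kids = group_mean_from_tail(threshold, 0.15, 13.0)
    print(f"faculty-kid mean: IQ {kids:.1f} "
          f"({(kids - POP_MEAN) / POP_SD:+.2f} population SD)")
    for midpoint in (125, 130, 135):
        ratio = implied_heritability(kids, midpoint)
        print(f"parental midpoint {midpoint}: ratio {ratio:.2f}")
```

The lower end of the range comes from the highest parental midpoint, so the "at least" in the estimate is only as good as the upper bound on the midpoint.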

You could argue the kids are getting a boost from the environmental effect of being raised by eggheads, but adoption data suggests that shared environmental effects are relatively small. I suppose that environmental effects might reduce the additive heritability by .1 or .2 from the range given above.

Homework problem: how smart is the smartest kid at my high school at any given time? ;-)

If someone knows the figures for other similar places (e.g., Los Alamos has only one high school near the lab), please comment.

If you are from my high school and reading this on FaceBook, please don't be offended :-)

Tuesday, October 05, 2010

How the world works

I posted this as a comment on a GNXP thread about whether it is worthwhile to attend an elite university. I suppose you can get the main points by watching The Social Network (which I haven't seen yet) and thinking about what Zuckerberg's life would be like had he attended U Mass instead of Harvard. BTW, rumor has it that Zuckerberg's buddy from Exeter, who went to Caltech, is the main guy responsible for the ability of FaceBook to scale without collapsing (this is a nontrivial technical feat which Friendster -- remember them? -- failed to accomplish). Anyone know more details about this? Does the techer appear in the movie or in Mezrich's book?

Go to the web sites of venture capital, private equity or hedge funds, or of Goldman Sachs, and you’ll find that HYPS alums, plus a few Ivies, plus MIT and Caltech, are grossly overrepresented. (Equivalently, look at the founding teams of venture funded startups.)

Most top firms only recruit at a few schools. A kid from a non-elite UG school has very little chance of finding a job at one of these places unless they first go to grad school at, e.g., HBS, HLS, or get a PhD from a top place. (By top place I don’t mean “gee US News says Ohio State’s Aero E program is top 5!” — I mean, e.g., a math PhD from Berkeley or a PhD in computer science from MIT — the traditional top dogs in academia.)

This is just how the world works. I won’t go into detail, but it’s actually somewhat rational for elite firms to operate this way — a Harvard guy knows how the filtering works at his alma mater and at similar places so he trusts it. Plus, at the far tail of ability I would guess the top 10-20 UG schools grab almost 50 percent of the pool.

I teach at U Oregon and out of curiosity I once surveyed the students at our Honors College, which has SAT-HSGPA characteristics similar to Cornell or Berkeley. Very few of the kids knew what a venture capitalist or derivatives trader was. Very few had the kinds of life and career aspirations that are *typical* of HYPS or techer kids. At the time I took the survey almost 50 percent of the graduating class at Harvard was heading into finance. You can bet that the average senior at Harvard knows what Goldman Sachs is (and even what it means to make partner there), that McKinsey is so over (relative to careers in finance), what the difference is between a Rhodes, Marshall and Churchill scholarship, etc. etc. Very few state school kids do … Last year a physics student at Oregon won a Marshall to go to Cambridge. The administrators were happy about the PR. I had a conversation with a vice-provost about how to ensure a steady pipeline of such candidates — but there are not the resources, institutional understanding of the process, etc. (let alone pool of able kids) to turn UO into a Rhodes/Marshall/… machine like Harvard.

Now tell me that peer or network effects don’t matter. Controlling for SAT may account for much of the variance in well-established careers like medicine or even law, but for the very top jobs (which contribute disproportionately toward income inequality), kids at elite schools have huge advantages. Guess where I will send my kids (assuming they can get in)?

To see the elite / non-elite divide most starkly, look at the probability of reaching an (earned) net worth of, say, $5-10M by age 40. This cuts out almost all doctors and lawyers and leaves finance, startups and entertainment (i.e., movies or television; let’s ignore sports). Even after controlling for SAT, I would guess elite grads are 3 or maybe even 10 times more likely to achieve this milestone.

Controlling for IQ doesn't account for differences in drive or life expectations or naked ambition, let alone social networks, signaling or information flow among elites.

Ferguson, Summers and the Inside Job

Academic turned tech entrepreneur turned documentary filmmaker Charles Ferguson takes on Larry Summers and academic economists' support of corporate and financial interests in the Chronicle of Higher Education.

Larry Summers and the Subversion of Economics:

... Summers is unquestionably brilliant, as all who have dealt with him, including myself, quickly realize. And yet rarely has one individual embodied so much of what is wrong with economics, with academe, and indeed with the American economy. For the past two years, I have immersed myself in those worlds in order to make a film, Inside Job, that takes a sweeping look at the financial crisis. And I found Summers everywhere I turned.

Consider: As a rising economist at Harvard and at the World Bank, Summers argued for privatization and deregulation in many domains, including finance. Later, as deputy secretary of the treasury and then treasury secretary in the Clinton administration, he implemented those policies. Summers oversaw passage of the Gramm-Leach-Bliley Act, which repealed Glass-Steagall, permitted the previously illegal merger that created Citigroup, and allowed further consolidation in the financial sector. He also successfully fought attempts by Brooksley Born, chair of the Commodity Futures Trading Commission in the Clinton administration, to regulate the financial derivatives that would cause so much damage in the housing bubble and the 2008 economic crisis. He then oversaw passage of the Commodity Futures Modernization Act, which banned all regulation of derivatives, including exempting them from state antigambling laws.

After Summers left the Clinton administration, his candidacy for president of Harvard was championed by his mentor Robert Rubin, a former CEO of Goldman Sachs, who was his boss and predecessor as treasury secretary. Rubin, after leaving the Treasury Department—where he championed the law that made Citigroup's creation legal—became both vice chairman of Citigroup and a powerful member of Harvard's governing board.

Over the past decade, Summers continued to advocate financial deregulation, both as president of Harvard and as a University Professor after being forced out of the presidency. During this time, Summers became wealthy through consulting and speaking engagements with financial firms. Between 2001 and his entry into the Obama administration, he made more than $20-million from the financial-services industry. (His 2009 federal financial-disclosure form listed his net worth as $17-million to $39-million.)

Summers remained close to Rubin and to Alan Greenspan, a former chairman of the Federal Reserve. When other economists began warning of abuses and systemic risk in the financial system deriving from the environment that Summers, Greenspan, and Rubin had created, Summers mocked and dismissed those warnings. In 2005, at the annual Jackson Hole, Wyo., conference of the world's leading central bankers, the chief economist of the International Monetary Fund, Raghuram Rajan, presented a brilliant paper that constituted the first prominent warning of the coming crisis. Rajan pointed out that the structure of financial-sector compensation, in combination with complex financial products, gave bankers huge cash incentives to take risks with other people's money, while imposing no penalties for any subsequent losses. Rajan warned that this bonus culture rewarded bankers for actions that could destroy their own institutions, or even the entire system, and that this could generate a "full-blown financial crisis" and a "catastrophic meltdown."

When Rajan finished speaking, Summers rose up from the audience and attacked him, calling him a "Luddite," dismissing his concerns, and warning that increased regulation would reduce the productivity of the financial sector. (Ben Bernanke, Tim Geithner, and Alan Greenspan were also in the audience.) ...

I particularly like this comment on the Chronicle site:

The profession is not being hypocritical -- indeed, the problem is that the profession is not being hypocritical! These professors actually BELIEVE what they spout. It is true that they would not likely have turned their attention to Iceland without the money but if they had, they would likely have said exactly what they said. Thus the problem is not a conflict of interests but a lack of accountability -- if the economics profession wants to argue that only predictions matter, not the assumptions (the classic [tenet] of the Friedman positivist agenda), then the results better match what the theory suggests will occur. Uh oh . . . they don't.

This interview with Ferguson is also worth a look.

Q. Did it feel weird to approach, somewhat confrontationally, government, academic, and business pooh-bahs when you've been an insider yourself in each of those arenas?

A. It didn't feel weird, although it certainly wasn't enjoyable, either. I tried to keep the interviews calm and substantive even when they became extremely confrontational, and I often felt that my interviewee was being dishonest or evasive.

Here's what I wrote in a post from 2008 on Ferguson's book about his startup Vermeer (acquired by Microsoft):

The best in-depth account of a startup I've read is High Stakes, No Prisoners by Charles Ferguson, who doesn't leave out any of the key details. I read it before I started my first company, and I'm very glad I did. Ferguson is a very interesting character (Times profile); I can't wait to see his new movie on Iraq.

An investment banker I knew once offered to introduce me to Ferguson (an old friend), and I regret not taking him up on the offer!

See related posts Expert Predictions and Intellectual honesty. I've been a Raghuram Rajan fan for some time.

No singularity here, move along please

Another dispatch from the long, hard road to AI :-)

This is how I see it going: machine learning with corrective input from mechanical turks (humans) will get us pretty far, at least as far as very useful tools that can amplify our human intelligence (prime example so far: Google).

But building an actual AI will be much harder. Is the ontology that NELL is populating general enough? Was it hard coded, or does it grow in an automated way? What's the right general structure within which all this guided learning should occur?

I suggest the researchers build an "Ask NELL?" web interface, which also allows users to submit corrections.

NYTimes: ... With NELL, the researchers built a base of knowledge, seeding each kind of category or relation with 10 to 15 examples that are true. In the category for emotions, for example: “Anger is an emotion.” “Bliss is an emotion.” And about a dozen more.

Then NELL gets to work. Its tools include programs that extract and classify text phrases from the Web, programs that look for patterns and correlations, and programs that learn rules. For example, when the computer system reads the phrase “Pikes Peak,” it studies the structure — two words, each beginning with a capital letter, and the last word is Peak. That structure alone might make it probable that Pikes Peak is a mountain. But NELL also reads in several ways. It will mine for text phrases that surround Pikes Peak and similar noun phrases repeatedly. For example, “I climbed XXX.”

NELL, Dr. Mitchell explains, is designed to be able to grapple with words in different contexts, by deploying a hierarchy of rules to resolve ambiguity. This kind of nuanced judgment tends to flummox computers. “But as it turns out, a system like this works much better if you force it to learn many things, hundreds at once,” he said.

For example, the text-phrase structure “I climbed XXX” very often occurs with a mountain. But when NELL reads, “I climbed stairs,” it has previously learned with great certainty that “stairs” belongs to the category “building part.” “It self-corrects when it has more information, as it learns more,” Dr. Mitchell explained.
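
To make the "I climbed XXX" example concrete, here is a toy sketch of pattern-based classification with a check against prior knowledge. It is not NELL's actual code: the seed facts, the tiny corpus and the function names are all made up for illustration, and the real system uses many coupled learners rather than a simple vote.

```python
from collections import Counter

# Hypothetical seed facts, in the spirit of the 10 to 15 seeds per category.
SEEDS = {
    "Pikes Peak": "mountain",
    "Mount Rainier": "mountain",
    "stairs": "building part",
    "doorway": "building part",
}

# Toy corpus of (context pattern, noun phrase) pairs; XXX marks the slot.
CORPUS = [
    ("I climbed XXX", "Pikes Peak"),
    ("I climbed XXX", "Mount Rainier"),
    ("I climbed XXX", "stairs"),
    ("I climbed XXX", "Longs Peak"),
    ("snow fell on XXX", "Mount Rainier"),
    ("snow fell on XXX", "Longs Peak"),
]


def pattern_votes(corpus, known):
    """Count, for each context pattern, the categories of known phrases seen in it."""
    votes = {}
    for pattern, phrase in corpus:
        if phrase in known:
            votes.setdefault(pattern, Counter())[known[phrase]] += 1
    return votes


def classify(phrase, corpus, known):
    """Return a category for `phrase`, or None if no pattern gives evidence."""
    if phrase in known:
        # Prior knowledge wins over pattern evidence: "stairs" stays a
        # building part even though it appears after "I climbed".
        return known[phrase]
    votes = pattern_votes(corpus, known)
    tally = Counter()
    for pattern, candidate in corpus:
        if candidate == phrase:
            tally.update(votes.get(pattern, Counter()))
    return tally.most_common(1)[0][0] if tally else None


if __name__ == "__main__":
    for phrase in ("Longs Peak", "stairs", "bliss"):
        print(phrase, "->", classify(phrase, CORPUS, SEEDS))
```

The point of the early return is the one Dr. Mitchell makes: a phrase the system already places confidently in one category does not get reassigned just because it shows up in a pattern typical of another.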

NELL, he says, is just getting under way, and its growing knowledge base of facts and relations is intended as a foundation for improving machine intelligence. Dr. Mitchell offers an example of the kind of knowledge NELL cannot manage today, but may someday. Take two similar sentences, he said. “The girl caught the butterfly with the spots.” And, “The girl caught the butterfly with the net.”

A human reader, he noted, inherently understands that girls hold nets, and girls are not usually spotted. So, in the first sentence, “spots” is associated with “butterfly,” and in the second, “net” with “girl.”

“That’s obvious to a person, but it’s not obvious to a computer,” Dr. Mitchell said. “So much of human language is background knowledge, knowledge accumulated over time. That’s where NELL is headed, and the challenge is how to get that knowledge.”

A helping hand from humans, occasionally, will be part of the answer. For the first six months, NELL ran unassisted. But the research team noticed that while it did well with most categories and relations, its accuracy on about one-fourth of them trailed well behind. Starting in June, the researchers began scanning each category and relation for about five minutes every two weeks. When they find blatant errors, they label and correct them, putting NELL’s learning engine back on track.

When Dr. Mitchell scanned the “baked goods” category recently, he noticed a clear pattern. NELL was at first quite accurate, easily identifying all kinds of pies, breads, cakes and cookies as baked goods. But things went awry after NELL’s noun-phrase classifier decided “Internet cookies” was a baked good. (Its database related to baked goods or the Internet apparently lacked the knowledge to correct the mistake.)

NELL had read the sentence “I deleted my Internet cookies.” So when it read “I deleted my files,” it decided “files” was probably a baked good, too. “It started this whole avalanche of mistakes,” Dr. Mitchell said. He corrected the Internet cookies error and restarted NELL’s bakery education.
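
The avalanche is easy to reproduce in a deliberately naive bootstrapper. The sketch below is only an illustration of this kind of drift, not NELL's algorithm: it trusts any context pattern that contains a known baked good and promotes everything else seen in a trusted pattern. The corpus and names are hypothetical.

```python
# Toy corpus of (context pattern, noun phrase) pairs; XXX marks the slot.
CORPUS = [
    ("I baked XXX", "apple pie"),
    ("I baked XXX", "sourdough bread"),
    ("I deleted my XXX", "Internet cookies"),
    ("I deleted my XXX", "files"),
    ("I backed up my XXX", "files"),
    ("I backed up my XXX", "photos"),
]


def bootstrap(facts, corpus, rounds=5):
    """Grow a set of 'baked goods' by trusting every pattern a known one appears in."""
    known = set(facts)
    for _ in range(rounds):
        # A pattern is trusted as soon as one known baked good fills its slot.
        trusted = {pattern for pattern, phrase in corpus if phrase in known}
        # Everything else seen in a trusted pattern gets promoted.
        promoted = {phrase for pattern, phrase in corpus if pattern in trusted}
        if promoted <= known:
            break  # nothing new was learned this round
        known |= promoted
    return known


if __name__ == "__main__":
    clean = bootstrap({"apple pie"}, CORPUS)
    drifted = bootstrap({"apple pie", "Internet cookies"}, CORPUS)
    print("clean run:  ", sorted(clean))
    print("drifted run:", sorted(drifted))
```

Removing the one bad fact and rerunning from the clean seeds is the analogue of the correction described above; a single wrong entry is enough to pull in "files" and then everything that co-occurs with "files".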

His ideal, Dr. Mitchell said, was a computer system that could learn continuously with no need for human assistance. “We’re not there yet,” he said. “But you and I don’t learn in isolation either.”

Sunday, October 03, 2010

Taiwan photos 3

Dessert at the Sogo department store.

An arts and design festival we attended.

Performance art for kids -- yes, that's a ballerina inside the plastic bubble.

Inside the house of glass and mirrors.

I drew the Feynman diagram.

Mr. Ferocious has finished his noodles and pork cutlet.

This is the medium-small smile.

Solar panels on the roof of the AS physics institute. (Not as fancy as this thing, but we're at pretty low latitude.)
