Pessimism of the Intellect, Optimism of the Will     Twitter: @steve_hsu

Tuesday, June 17, 2008

Asian-White IQ variance from PISA results

The vexing question of average differences between groups of humans has been the subject of scrutiny for a very long time. Differences in variance or standard deviation (SD) are less well understood, but have important implications as well. This point was emphasized during the Larry Summers debacle, in which he posited that the variance in male intelligence might be larger than that in female intelligence, even though the averages are similar (so that there are more very dumb and very bright men than women). Summers argued that even a very small difference in SD might explain the preponderance of males in science and engineering.

Summers NBER speech:
...If one supposes, as I think is reasonable, that if one is talking about physicists at a top twenty-five research university, one is not talking about people who are two standard deviations above the mean. And perhaps it's not even talking about somebody who is three standard deviations above the mean. But it's talking about people who are three and a half, four standard deviations above the mean in the one in 5,000, one in 10,000 class. Even small differences in the standard deviation will translate into very large differences in the available pool substantially out [on the tail].

I did a very crude calculation, which I'm sure was wrong and certainly was unsubtle, twenty different ways. I looked at the Xie and Shauman paper--looked at the book, rather--looked at the evidence on the sex ratios in the top 5% of twelfth graders. If you look at those--they're all over the map, depends on which test, whether it's math, or science, and so forth--but 50% women, one woman for every two men, would be a high-end estimate from their estimates. From that, you can back out a difference in the implied standard deviations that works out to be about 20%. And from that, you can work out the difference out several standard deviations. If you do that calculation--and I have no reason to think that it couldn't be refined in a hundred ways--you get five to one, at the high end.
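As a rough check on the "about 20%" figure, here is a minimal sketch of one way to back out an implied SD ratio from a two-to-one sex ratio in the top 5%. It rests on assumptions of mine, not on any method Summers states: normal distributions, equal means, equal numbers of men and women, and "top 5%" taken to mean the top 5% of the combined population. The function name is hypothetical.

```python
from statistics import NormalDist


def implied_sd_ratio(top_fraction, men_per_woman):
    """Return sigma_men / sigma_women implied by a sex ratio above a cutoff.

    Assumes equal means, equal group sizes, and normal distributions.
    """
    # The combined tail mass is split between two equal-sized groups:
    #   top_fraction = 0.5 * p_men + 0.5 * p_women,  p_men = ratio * p_women
    p_women = 2.0 * top_fraction / (1.0 + men_per_woman)
    p_men = men_per_woman * p_women

    std_normal = NormalDist()
    z_men = std_normal.inv_cdf(1.0 - p_men)
    z_women = std_normal.inv_cdf(1.0 - p_women)

    # Both groups face the same raw cutoff, measured from the shared mean:
    #   z_men * sigma_men = z_women * sigma_women
    return z_women / z_men


ratio = implied_sd_ratio(top_fraction=0.05, men_per_woman=2.0)
print(f"implied SD ratio (men / women): {ratio:.2f}")
```

Under these assumptions the ratio comes out a little above 1.2, in line with the roughly 20% difference quoted. How that translates into a ratio further out on the tail depends strongly on where the cutoff is placed.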

I've occasionally heard a variant of the Summers argument applied to Europeans vs Asians (specifically, NE Asians such as Japanese, Koreans and Chinese): although NE Asians exhibit higher averages than whites in psychometric tests (SAT, IQ, etc.), some suspect a smaller variance, leading to fewer "geniuses" per capita, despite the higher mean. See, e.g., this article in National Review:

...The two populations also differ in the variability of their scores. A representative sample of Americans or Europeans will show more variability than will an East Asian sample. In the familiar bell-shaped distribution curve, the bell is much narrower for the Japanese--which is what you would expect from such a homogeneous population.

This difference is a major matter, and it is worth focusing hard on the data. Just about all Western populations report a standard deviation of 15 IQ points. (The SD, a basic measure of variability, quantifies the extent to which a series of figures deviates from its mean.) But the SD for the Japanese and other East Asian populations appears to be a shade under 13 IQ points. That difference does not sound like a big deal, and, in fact, it does not change things much in the center of the distribution. ...

...but it does make a big difference at the high end, and it affects estimates of elite human capital availability in different countries.

I've never seen any data to support the smaller NE Asian SD claim. Looking at SAT data shows a larger variance for the Asian-Pacific Islander category, but that is not surprising since it's a catch-all category that includes S. Asians, SE Asians, NE Asians and Pacific Islanders. I've found very little analysis specific to NE Asians, so I decided to produce some myself. I took the 2006 PISA (OECD Program for International Student Assessment) data, which is painstakingly assembled every 3 years by a huge team of psychologists and educators (400k students from 57 countries tested). The samples are supposed to be statistically representative of the various countries, and the tests are carefully translated into different languages. Most studies of national IQ are quite crude, and subject to numerous methodological uncertainties, although the overall results tend to correlate with PISA results.

Below is what I obtained from the 2006 PISA mathematics exam data (overall rankings by average score here). To get the data, scroll down this page and download the chapter 6 data in .xls spreadsheet format. Level 6 is the highest achievement category listed in the data. For most OECD countries, e.g., France, Germany, UK, only a few percent of students attained this level of performance. In NE Asian countries as many as 11% of students performed at this level. Using these percentages and the country averages, one can extract the SD. (Level 6 = raw score 669, or +1.88SD for OECD, +1.28SD for NE Asians.)

OECD AVG=500 SD=90

NE Asia (HK, Korea, Taiwan) AVG=548 SD=95
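The extraction can be sketched in a few lines. This is a minimal illustration, assuming scores are normally distributed within each group. The function name is hypothetical, and the Level 6 fractions used here (3% and 10%) are illustrative round numbers chosen to match the z-scores quoted above, not values read from the spreadsheet.

```python
from statistics import NormalDist

LEVEL_6_CUTOFF = 669  # raw score at the Level 6 threshold (from the text)


def implied_sd(mean, frac_above, cutoff=LEVEL_6_CUTOFF):
    """SD of a normal distribution with the given mean and tail fraction.

    frac_above is the fraction of students scoring above the cutoff.
    """
    # z-score of the cutoff: how many SDs above the mean it must sit
    z = NormalDist().inv_cdf(1.0 - frac_above)
    return (cutoff - mean) / z


# Illustrative inputs (assumed, see the note above)
oecd_sd = implied_sd(mean=500, frac_above=0.03)
ne_asia_sd = implied_sd(mean=548, frac_above=0.10)

print(f"OECD implied SD:    {oecd_sd:.0f}")
print(f"NE Asia implied SD: {ne_asia_sd:.0f}")
```

With these inputs the implied SDs land near 90 and 95 respectively. The estimate is sensitive to the input fraction, so the actual percentages from the spreadsheet should be used for anything beyond a sanity check.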

The NE Asians performed about 0.5 SD better on average (consistent with IQ test results), and exhibited similar (somewhat higher) variance. (After doing my calculations I realized that the spreadsheet actually contains a table of means and SDs, which more or less agree with my results. The standard error for the given SDs is only 1-2 points, so I guess a gap of 5 or 10 points is statistically significant.)

Interestingly, the Finns performed quite well on the exam, posting a very high average, but their SD is smaller. The usual arguments about a (slightly) "narrow bell curve" might apply to the Finns, but apparently not to the NE Asians.

Finland AVG=548 SD=80

Returning to Summers' calculation, and boldly extrapolating the normal distribution to the far tail (not necessarily reliable, but let's follow Larry a bit further), the fraction of NE Asians at +4SD (relative to the OECD avg) is about 1 in 4k, whereas the fraction of Europeans at +4SD is 1 in 33k. So the relative representation is about 8 to 1. (This assumed the same SD=90 for both populations. The Finnish numbers might be similar, although it depends crucially on whether you use the smaller SD=80.) Are these results plausible? Have a look at the pictures here of the last dozen or so US Mathematical Olympiad teams (the US Asian population percentage is about 3 percent; the most recent team seems to be about half Asians). The IMO results from 2007 are here. Of the top 15 countries, half are East Asian (including tiny Hong Kong, which outperformed Germany, India and the UK).
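The tail extrapolation above can be reproduced with a short sketch. It assumes exactly what the text assumes, a normal distribution all the way out to the far tail, which may well not hold. The helper name is hypothetical; the means and SDs are the ones quoted in this post.

```python
from statistics import NormalDist

OECD_MEAN, NE_ASIA_MEAN, FINLAND_MEAN = 500, 548, 548
COMMON_SD = 90

# +4SD relative to the OECD average, as a raw score
cutoff = OECD_MEAN + 4 * COMMON_SD


def tail_fraction(mean, sd, cutoff):
    """Fraction of a normal(mean, sd) population scoring above cutoff."""
    return 1.0 - NormalDist(mean, sd).cdf(cutoff)


p_oecd = tail_fraction(OECD_MEAN, COMMON_SD, cutoff)
p_ne_asia = tail_fraction(NE_ASIA_MEAN, COMMON_SD, cutoff)
# Same mean as NE Asia, but with the narrower Finnish SD of 80
p_finland_narrow = tail_fraction(FINLAND_MEAN, 80, cutoff)

print(f"OECD:            1 in {1 / p_oecd:,.0f}")
print(f"NE Asia:         1 in {1 / p_ne_asia:,.0f}")
print(f"Finland (SD=80): 1 in {1 / p_finland_narrow:,.0f}")
print(f"NE Asia / OECD:  {p_ne_asia / p_oecd:.1f} to 1")
```

The results should land close to the rounded figures quoted above. The Finland line shows how sharply the far-tail fraction drops when the smaller SD is used with the same mean.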

Incidentally, again assuming a normal distribution, there are only about 10k people in the US who perform at +4SD (and a similar number in Europe), so this is quite a select population (roughly, the top few hundred high school seniors each year in the US). If you extrapolate the NE Asian numbers to the 1.3 billion population of China you get something like 300k individuals at this level, which is pretty overwhelming.
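For the headcounts, a minimal sketch under the same normal-tail assumption. The US population figure of 300 million is my own rough assumption for 2008, and placing the US at the OECD mean with SD=90 is likewise an assumption made for illustration. The function name is hypothetical.

```python
from statistics import NormalDist

CUTOFF = 500 + 4 * 90  # +4SD relative to the OECD average

US_POPULATION = 300e6     # assumption: rough 2008 figure, not from the text
CHINA_POPULATION = 1.3e9  # figure used in the text


def count_above(population, mean, sd, cutoff=CUTOFF):
    """Expected number of people above cutoff in a normal(mean, sd) population."""
    frac = 1.0 - NormalDist(mean, sd).cdf(cutoff)
    return population * frac


us_count = count_above(US_POPULATION, mean=500, sd=90)
china_count = count_above(CHINA_POPULATION, mean=548, sd=90)

print(f"US at +4SD:    about {us_count:,.0f}")
print(f"China at +4SD: about {china_count:,.0f}")
```

These come out on the order of 10k and a few hundred thousand, consistent with the rough estimates above.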

As for verbal abilities, we have the following 2006 PISA results: the OECD reading avg is about 490; France 488, Germany 495, UK 495, Italy 469, Spain 461. NE Asian scores: Japan 498, Korea 556, HK 536, Taiwan 496. Again, slightly higher scores for NE Asians. Some interesting US data here shows that on 1995 SATs, low-income Asians have lower verbal scores than whites, but Asians have caught up by a family income of $60k, and Asians with family income of >$70k outscore whites from families of similar affluence. This strikes me as an immigrant / bilingual family effect: children raised in immigrant families, where the parents do not speak English at home, tend to score lower on the verbal part of the SAT.

Although it's all there in the data set, I didn't have time to examine the male-female variances in mathematical ability (and don't want to deal with the abuse that might be heaped on me based on what I might find), but I encourage any interested readers to have a look. The authors of the PISA report wisely only reported that male-female averages are similar ;-)

Note: as often happens with this kind of topic, a related discussion has broken out at GNXP.
