Saturday, February 27, 2010

Gender differences in "extreme" mathematical ability, part 2

We had Andrew Penner of UC Irvine here last week to discuss his paper Gender Differences in Extreme Mathematical Achievement: An International Perspective on Biological and Social Factors. PDF.

I posted back in 2007 on some earlier research of Penner's which showed an 8 percent larger variance in male math ability already at the beginning of kindergarten. (This is not so different from the adult difference in variance.)

For a relatively balanced overview of this topic see Women's underrepresentation in science: Sociocultural and biological considerations; Ceci, Stephen J.; Williams, Wendy M.; Barnett, Susan M. Psychological Bulletin. Vol 135(2), Mar 2009, 218-261. Abstract, PDF.

In the more recent paper Penner claims that national variation in gender gaps in mathematical ability implies that the effect is culturally moderated. While I don't doubt that culture affects development of mathematical ability, and perhaps in such a way as to favor males, I question whether his paper or other recent papers relying on international tests like TIMSS and PISA really have the statistical power to investigate this issue very well. It is already hard to capture national differences in average ability level from tests of only a few thousand students (ensuring that these students are representative of the whole population is difficult); gender gaps are even smaller effects and therefore more sensitive to statistical and systematic error. See here (figure 4) for a convincing demonstration that PISA data on country by country gender gaps is noise dominated: the gaps are not stable between the 2003 and 2006 results. Only by aggregating the data over many countries do we arrive at a stable gap. This makes me suspicious of TIMSS results because PISA has significantly larger statistics. A meta-analysis suggests that cultural effects, while perhaps non-zero, are relatively small.

Andrew and I had an interesting discussion about his paper; my side is summarized in the message and two figures below.


Sorry I had to leave early from your talk and didn't get to discuss this in person. As I mentioned yesterday and in my earlier email, country level gender gaps are not stable between PISA 2003 and 2006, whereas the meta-analysis gap, averaging over all countries, is stable. This to me is clearly a signal that the PISA country level data on gender gaps is dominated by statistical error, and makes me strongly suspect the same is true for TIMSS.

In your talk you said that a biological model would imply the same gender gap in every country, and that country by country variation would undermine the biological model. However, you neglected to mention that statistical error would lead to country by country variation (of measured gaps) even in the biological model.
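The point about statistical error can be illustrated with a small simulation. This is only a sketch under stated assumptions: every country has the same true gap, each measured gap carries independent Gaussian sampling error, and the numbers (`TRUE_GAP`, `STANDARD_ERROR`, `N_COUNTRIES`) are hypothetical values chosen for illustration, not PISA or TIMSS figures.

```python
import random
import statistics

# All values below are hypothetical, chosen only for illustration.
TRUE_GAP = 10.0        # identical true gender gap (score points) in every country
STANDARD_ERROR = 5.0   # assumed sampling error of each country's measured gap
N_COUNTRIES = 40       # assumed number of participating countries


def simulate_wave(rng):
    """One survey wave: each measured gap = common true gap + independent noise."""
    return [rng.gauss(TRUE_GAP, STANDARD_ERROR) for _ in range(N_COUNTRIES)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)          # fixed seed so the run is repeatable
wave_a = simulate_wave(rng)     # stands in for an earlier survey year
wave_b = simulate_wave(rng)     # stands in for a later survey year

spread_a = statistics.stdev(wave_a)            # country-to-country spread in one wave
between_wave_corr = pearson(wave_a, wave_b)    # stability of country gaps across waves
pooled_a = statistics.fmean(wave_a)            # gap averaged over all countries
pooled_b = statistics.fmean(wave_b)

print(f"spread of measured gaps in one wave: {spread_a:.1f}")
print(f"correlation of country gaps between waves: {between_wave_corr:.2f}")
print(f"pooled gap, wave A vs wave B: {pooled_a:.1f} vs {pooled_b:.1f}")
```

In this toy model the measured gaps scatter by roughly the assumed standard error even though the true gap is identical everywhere, the country-level gaps in the two waves are uncorrelated in expectation, and only the pooled average is expected to be stable, since its error shrinks like 1/sqrt(N).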

In the 1995 TIMSS table below there are 8 "gold standard" countries that complied with the statistical procedures. The data from the remaining countries would be suspect, since, as I mentioned, getting a representative sample for a country of millions is not an easy task. In particular, the standard error for countries outside the first group of 8 is likely to be much larger than quoted. (See the column labeled "Difference" in the table. The number in parentheses is the standard error for the gender gap.)

For the gold standard countries, it appears that all gender gaps are within roughly 1-2 standard deviations (using the standard error given) of the group average, with the exception of Hungary which is an outlier. This suggests that the variation within this group could be entirely statistical. That is, if one formulated a "null model" with constant gender gap across countries, and asked whether TIMSS disfavors that model, the answer might be no, at least not in a statistically significant way. (Actually I suspect that the standard error given is an underestimate, because of systematic errors in the sampling procedures even in the gold standard countries.) Note within this set of countries there is a lot of variation on your societal indicators.
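One way such a null-model check could be set up is sketched here. It computes a weighted homogeneity statistic (Cochran's Q) and estimates a p-value by simulating the constant-gap model. The `gaps` and `standard_errors` lists are made-up placeholders, not the TIMSS values, and the function names are hypothetical; the real table entries would have to be substituted before drawing any conclusion.

```python
import random


def weighted_mean(gaps, standard_errors):
    """Inverse-variance weighted mean: the best-fit common gap under the null."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    return sum(w * g for w, g in zip(weights, gaps)) / sum(weights)


def cochran_q(gaps, standard_errors):
    """Sum of squared deviations from the common gap, in units of standard error."""
    common = weighted_mean(gaps, standard_errors)
    return sum(((g - common) / se) ** 2 for g, se in zip(gaps, standard_errors))


def null_p_value(gaps, standard_errors, n_sims=20000, seed=0):
    """Fraction of simulated constant-gap datasets whose Q is at least the observed Q."""
    rng = random.Random(seed)
    observed = cochran_q(gaps, standard_errors)
    common = weighted_mean(gaps, standard_errors)
    exceed = 0
    for _ in range(n_sims):
        # Under the null, every country shares one gap; only sampling noise differs.
        fake = [rng.gauss(common, se) for se in standard_errors]
        if cochran_q(fake, standard_errors) >= observed:
            exceed += 1
    return exceed / n_sims


# PLACEHOLDER numbers for 8 countries, NOT taken from the TIMSS table.
gaps = [8.0, 12.0, 5.0, 15.0, 9.0, 11.0, 7.0, 30.0]
standard_errors = [6.0] * 8

q_stat = cochran_q(gaps, standard_errors)
p_value = null_p_value(gaps, standard_errors)
print(f"Q = {q_stat:.2f} on {len(gaps) - 1} degrees of freedom, p = {p_value:.3f}")
```

A large p-value would mean the data do not disfavor a constant gap. If the quoted standard errors are underestimates, as suspected, inflating `standard_errors` shrinks Q and makes the null model even harder to reject.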

To summarize, I think the claim that TIMSS data supports country level variation in gender gaps has to be considered carefully for statistical significance. As I mentioned, I doubt one can really trust the TIMSS quoted standard errors, so a real test would be time stability of (measured) gender gaps -- a test which PISA fails.

One final comment on your talk: it seems to me that all of the societal variables you listed (labor force participation, wage gap, etc.) have changed significantly in the last 40 years in the US. Nevertheless, I believe gender gaps on the SAT-M (a truly large statistics measurement) have not narrowed during that time. (See second figure below.)




Anonymous said...

I wonder what the PISA mathematics test even measures. In Finland, it has been noted that in the TIMSS test Finland was below average in Europe, whereas in the PISA mathematics test it was at the top. See this critique of the results by some Finnish mathematicians. Excerpt:

Out of the 85 assignments in the survey about 20 have been published. The assignments are simple numerical calculations, minor problems or deductions, interpretation of statistical graphics and evaluation of situations where text comprehension is an essential part. However, hardly any algebra or geometry is included. Nevertheless, the assignments are well in agreement with the goals of the survey; in fact, the goal was to study everyday mathematical knowledge.

The PISA-survey leaves us, thus, with unanswered questions regarding many skills, like computing with fractions, solving elementary equations, making geometrical deductions, computing volumes of solid objects, and handling algebraic expressions. Still algebra is perhaps the most important subtopic in mathematical studies after the compulsory comprehensive school.

ben g said...

Also, even if social factors do cause variance between countries, that doesn't mean that they'll cause similar variance within countries. E.g., there may be larger and highly immalleable disparities in countries where people are free to pursue their interests and vocations.

Ian Smith said...

There have been environmental changes since 1979, but it may be there are environmental causes of the gap which haven't changed.

Hungary is an outlier. So it can be ignored?

The Polgar sisters are outliers too.

The contradictory results in Finland are an example of the frequently made mistake of thinking that because there is a word or phrase there is a reality to which it refers. Here that phrase is "mathematical ability".

Whatever the mathematics test, it measures acquired abilities.

To say the gap is biological, what does that mean really?

ben g said...

"To say the gap is biological, what does that mean really?"

It means that genetic variation independently predicts variation in math abilities.

Ian Smith said...

In what environment ben_g? All environments, everywhere, at all times past and present, and always to the same extent?

ziel said...

Hendrik - consider this statement:

"Men are taller than women."

In what environment? All environments, everywhere, at all times past and present, and always to the same extent?

Women basketball players are taller than male soccer players - does that disprove the above contention? Dutch women are taller than Mayan men - does that disprove it?

You also seemed to question the very concept of mathematical ability (or "mathematical ability", as you put it). Would you deny that Ramanujan had greater mathematical ability than, say, George W Bush? If so, do you feel that this trait happens purely by chance? Or do you think it occurs in utero but is not genetic?

Just trying to get an idea where your head is at on this matter.

Ian Smith said...

My head isn't anywhere, because there is nowhere for it to be.

I've heard the "It's like height" BS before.

No psychological trait is like height. I can see height. I can measure height with a tape measure. A metaphor isn't an argument.

But to use this absurd ideological mantra, consider the statement:

"Chinese people are shorter than Norwegians."

What's a Chinese person? If you're talking about a Manchu who isn't lactose intolerant, move him to Norway and feed him a Norwegian diet.

ben g said...


The environment of significance here is the normal range of environments found in first world countries.

You could pose the question this way: to what extent is female math ability in the first world worse because of the environments they receive vs. their genetic differences?

Now, if we were to talk about fundamentally restructuring society, then we'd have to consider other environmental ranges and we can ask how much genes matter vs. environments under those circumstances.

Ian Smith said...

"Now, if we were to talk about fundamentally restructuring society, then we'd have to consider other environmental ranges and we can ask how much genes matter vs. environments under those circumstances."

Good on ya ben.

But douches like Steve think that the way things are is pretty much the way they should be and that any non-biological change is pointless.

Wait did I say "think"? I meant print-out.

Steve is a college professor and therefore in a moral position similar to:

1. tobacco company execs
2. pornographers
3. fast food execs
4. drug dealers.

The whole point of his blog is that physics professors are the master race.

M said...

*** think that the way things are is pretty much the way they should be and that any non-biological change is pointless.***

Hendrik Verwoerd,

You mean the naturalistic fallacy?

Where does Steve say that non-biological change is pointless?

If you're going to accuse Steve of the naturalistic fallacy, I think I should suggest you might be prone to the moralistic fallacy.

"The moralistic fallacy, coined by the Harvard microbiologist Bernard Davis in the 1970s, is the opposite of the naturalistic fallacy. It refers to the leap from ought to is, the claim that the way things should be is the way they are. This is the tendency to believe that what is good is natural; that what ought to be, is. For example, one might commit the error of the moralistic fallacy and say, “Because everybody ought to be treated equally, there are no innate genetic differences between people.” The science writer extraordinaire Matt Ridley calls it the reverse naturalistic fallacy...

Since academics, and social scientists in particular, are overwhelmingly left-wing liberals, the moralistic fallacy has been a much greater problem in academic discussions of evolutionary psychology than the naturalistic fallacy. Most academics are above committing the naturalistic fallacy, but they are not above committing the moralistic fallacy. The social scientists’ stubborn refusal to accept sex and race differences in behavior, temperament, and cognitive abilities, and their tendency to be blind to the empirical reality of stereotypes, reflect their moralistic fallacy driven by their liberal political convictions."

Ian Smith said...

That was the case when I was in school, but I don't think it is anymore. It is still a faux pas to speak in public as if you aren't adhering to the moralistic fallacy though.

I am guilty of neither fallacy.

But I do believe that psychology is a pseudoscience.

I should know. Look me up on google. Read my bio.

M said...

"I should know. Look me up on google. Read my bio."


Ian Smith said...

I do think the way things should be is the way they will be eventually.

Call that the "eschatological fallacy".

At the Eschaton Steve and Jacob Zuma will be licking each other's asses.

Ian Smith said...

Butt which one will be in Heaven?
