Monday, July 07, 2008

Annals of psychometry: IQs of eminent scientists

[ See this 2016 post for the original papers / book and the identities of the 64 scientists. ]

I recently came across a 1950s study of eminent scientists by Harvard psychologist Anne Roe: The Making of a Scientist (1952). Her study is by far the most systematic and sophisticated that I am aware of. She selected 64 eminent scientists -- well known, but not quite at the Nobel level -- in a more or less random fashion, using, e.g., membership lists of scholarly organizations and expert evaluators in the particular subfields. Roughly speaking, there were three groups: physicists (divided into experimental and theoretical subgroups), biologists (including biochemists and geneticists) and social scientists (psychologists, anthropologists).

The Making of a Scientist devotes only one chapter to psychometrics. The other chapters describe the motivation for the study, how the 64 scientists were selected, interviews with the scientists, details of their family history, work life, etc.

Roe devised her own high-end intelligence tests as follows: she obtained difficult problems in verbal, spatial and mathematical reasoning from the Educational Testing Service, which administers the SAT, but also performs bespoke testing research for, e.g., the US military. Using these problems, she created three tests (V, S and M), which were administered to the 64 scientists, and also to a cohort of PhD students at Teachers College, Columbia University. The PhD students also took standard IQ tests and the results were used to norm the high-end VSM tests using an SD = 15. Most IQ tests are not good indicators of true high level ability (e.g., beyond +3 SD or so).

Average ages of subjects: mid-40s for physicists, somewhat older for other scientists

Overall normed scores:

Test (Low / Median / High)

V 121 / 166 / 177

S 123 / 137 / 164

M 128 / 154 / 194

Roe comments: (1) The V test was too easy for some takers, so the top score reflects the test's ceiling rather than the taker's ability. (2) S scores tend to decrease with age (correlation .4). Peak (younger) performance would have been higher. (3) The M test was found to be too easy for the physicists, so it was only administered to the other groups.

It is unlikely that any single individual obtained all of the low scores, so each of the 64 would have been strongly superior in at least one area.

Median scores (raw) by group:

group (V / S / M)

Biologists 56.6 / 9.4 / 16.8
Exp. Physics 46.6 / 11.7 / *
Theo. Physics 64.2 / 13.8 / *
Psychologists 57.7 / 11.3 / 15.6
Anthropologists 61.1 / 8.2 / 9.2

The lowest score in each category among the 12 theoretical physicists would have been roughly V 160 (!) S 130 M >> 150. (Ranges for all groups are given, but I'm too lazy to reproduce them here.) It is hard to estimate the M scores of the physicists since when Roe tried the test on a few of them they more or less solved every problem modulo some careless mistakes. Note the top raw scores (27 out of 30 problems solved) among the non-physicists (obtained by 2 geneticists and a psychologist), are quite high but short of a full score. The corresponding normed score is 194!

The lowest V scores in the 120-range were only obtained by 2 experimental physicists, all other scientists scored well above this level -- note the median is 166.

My comments:

The results strongly suggest that high IQ provides a significant advantage in science. It is sometimes claimed that IQ is irrelevant beyond a threshold: more precisely, that the advantage conferred by IQ above some threshold (e.g., 120) decreases significantly as other factors like drive or creativity take precedence. But, if that were the case it would be unlikely to have found such high scores in this group. The average IQ of a science PhD is roughly 130, and individuals with IQs in the higher range described above constitute a tiny fraction of all scientists. If IQ were irrelevant above 130 we would expect the most eminent group to have an average similar to the overall population of scientists.
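To make the overrepresentation argument concrete, here is a minimal Bayes' rule sketch. The input fractions are hypothetical placeholders chosen for illustration, not figures from Roe, and `relative_odds` is a made-up helper name.

```python
def relative_odds(frac_high_among_eminent: float, frac_high_among_all: float) -> float:
    """Ratio P(eminent | high IQ) / P(eminent | not high IQ).

    By Bayes' rule, P(eminent | group) is proportional to
    P(group | eminent) / P(group), so the unknown base rate of
    eminence cancels when the two groups are compared.
    """
    high = frac_high_among_eminent / frac_high_among_all
    rest = (1.0 - frac_high_among_eminent) / (1.0 - frac_high_among_all)
    return high / rest


# Hypothetical inputs (assumptions, not data): 90% of the eminent group
# falls in the very high range, versus 1% of all scientists.
ratio = relative_odds(0.90, 0.01)
print(f"high-range scientists are roughly {ratio:.0f}x as likely to be eminent")
```

If the two fractions were equal (IQ irrelevant above the threshold), the ratio would be 1; the more the eminent group's composition departs from that of scientists overall, the larger the ratio.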

Conversely, I think one should be impressed that a simple test which can be administered in a short period of time (e.g., 30 minutes for Roe's high-end exams) offers significant predictive power. While it is not true that anyone with a high IQ can or will become a great scientist (certainly other factors like drive, luck, creativity play a role), one can nevertheless easily identify the 95 percent (even 99 percent) of the population for whom success in science is highly improbable. Psychometrics works!

The scores for theoretical physicists confirm an estimate made to me by a famous colleague many years ago, that only 1 in 100,000 people could do high level theoretical physics.
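As a rough consistency check, a rarity of 1 in 100,000 can be translated into an IQ threshold. The sketch assumes scores are normally distributed with mean 100 and SD 15 all the way into the far tail, which is a strong assumption; `iq_for_rarity` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumed model: IQ ~ Normal(mean=100, SD=15), including the far tail.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def iq_for_rarity(one_in_n: float) -> float:
    """IQ exceeded by about 1 person in `one_in_n` under the assumed model."""
    return IQ_SCALE.inv_cdf(1.0 - 1.0 / one_in_n)


threshold = iq_for_rarity(100_000)
print(f"1 in 100,000 corresponds to an IQ of about {threshold:.0f}")
```

Under these assumptions the threshold lands in the mid-160s, the same general range as the high-end scores reported above.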

Feynman's 124: in this context one often hears of Feynman's modest grade school IQ score of 124. To understand this score we have to remember that typical IQ tests (e.g., administered to public school children) tend to have low ceilings. They are not of the kind that Roe used in her study. One can imagine that the ceiling on Feynman's exam was roughly 135 (say, 99th percentile). If Feynman received the highest score on the mathematical portion, and a modest score of 115 on the verbal, we can easily understand the resulting average of 124. However, it is well known that Feynman was extremely strong mathematically. He was asked on short notice to take the Putnam exam for MIT as a senior, and received the top score in the country that year! On Roe's test Feynman's math score would presumably have been > 190, with a correspondingly higher composite IQ.

I thought I should put this post up now, as the new book by Malcolm Gladwell, Outliers: Why Some People Succeed and Some Don’t is out soon and will surely handicap the discourse on this subject for years to come :-)


Anonymous said...


I am afraid I don't get the point of all this.
If you want to know if somebody is good at physics see what papers she or he produced. I don't think more or less of Feynman if I know what his IQ was...
And in most countries (almost) everybody who wants to get into physics can try out if she or he is smart enough...

Steve Hsu said...

Imagine there was a 3 * 30 minute exam you could give which, at the 99% confidence level, would exclude 99% of the population from consideration for a particular job category.

Isn't that (a) an interesting result and (b) an indication that the little test is measuring something real? (Some IQ skeptics doubt even that.)

Finally, access to training in science is a scarce resource even in developed countries. I'd prefer it if the most effective filters were used, as long as some alternative paths are kept in place for people who fall through the cracks.

If you've ever sat on a grad school admissions committee, you know that departments spend a lot of effort trying to figure out which kids to admit. But I've seen very few systematic studies trying to determine whether our methodologies are effective or not.

Steve Hsu said...

So, for example, an interesting result is that a low V score doesn't preclude success as an experimenter, but most eminent theoreticians have high V scores. If there were more data like this we might be able to make better admissions decisions.

Anonymous said...

Interesting that my grade school [middle school?] IQ score, my SAT score four years later, and my GRE score twelve years later result in IQ estimates within a point of each other. It also surprises me that the average physics Ph.D. only has an IQ of 130.

My IQ would qualify me for phd-study, but I was a TERRIBLE physics student.

RE: The social value problem. If you create a special class, a privileged class, at the cost of the collective, then you should also instill in them a feeling of indebtedness for the privilege, and a sense that they should give back in a more tangible way than The Advancement of Knowledge [because in a lot of circles, that's just an aesthetic value].

Anonymous said...


>> If you've ever sat on a grad school admissions committee

but do you think that computing IQ scores would be a good way of improving this?

>> access to training in science is a scarce resource even in developed countries

but at the entry level that is what SAT scores etc. are for?

Steve Hsu said...

In the grad school context, here is a simple question: what relative weights should I assign to GRE M,V, physics exams and course grades? A simple longitudinal study might answer the question.

On some committees people want to ignore the GRE general (M, V) scores entirely, some want to ignore V, etc. I think it's an interesting finding that probability of success might fall off significantly for a theorist with lower V, but not for an experimentalist. I'm not saying this is true, although it might be hinted at in the Roe data. Need more studies of this type!

I think the general subject of psychometrics is under-studied. We in science are obviously users of the tools but we don't really try to optimize.

I should add that (as David mentions above) IQ scores tend to be relatively stable over time, so they really do have predictive power if you can show, e.g., that all the successful people in a particular area have a certain IQ profile.

Anonymous said...

I'd like to see these tests to take myself. It'd be fun to compare to eminent scientists.

I sympathize with the importance of developing the best criteria for evaluating prospective students/employees, but I suspect a lot of this is just a way for us in physics to feel good about ourselves. Not that it's unwarranted -- it's very hard to survive in academia in physics and one should be very proud if s/he does -- meanwhile it doesn't seem the public at large appreciates what it takes.

But, I think when you get serious about these issues you decide the best measure of future success at something is past success at the same sort of thing. Rather than devise lots of different IQ-type tests, perhaps it is better to develop grading techniques that more clearly distinguish the top students in class, and focus on curricula and tests that emphasize skills important to research.

Anonymous said...

Ummmm.... Surely all this is showing is correlation not causation (I'm not saying other studies don't show causation)? Perhaps it is working on all those hard problems that makes you smarter - and the variance by occupation is explained by the relative difficulty of said occupations?

Anonymous said...

Sorry, just scanned upwards and saw your reply alluding to a consistency of IQ score over time. One could postulate that this was due to the fact that most people just pootle along in their lives and never put a serious effort into raising their IQ scores once they've made it into adulthood (or even before), so it's not really surprising that mean variations in IQ are small. The exceptions would obviously be people like theoretical physicists who spend a lot of time on seriously original thinking. I think it would be interesting to find out (a) variation over lifetime of IQ in serious thinking professions (if I could think of a way to define that!) and (b) data on people who have tried to raise their IQ by concerted effort, i.e., is it possible to go from IQ 100 to IQ 150?

Steve Hsu said...

G: the data suggests that IQ does not increase drastically due to stimulation. Have a look at the Terman study, which followed a large group of gifted kids through late adulthood. They were tested again as adults, and no increase in IQ was found for, e.g., the scientists in the group. The Terman study is a good one for demonstrating that high IQ is not enough to assure success. However, Roe's study seems to imply that it is a *necessary* precondition for eminence in science.

Anonymous said...

Hello Steve.

I think that an interested human animal can train itself to do anything another normal human animal can do. If this is true, considering the fact that correlation does not imply causality, the IQ/profession prediction business is false, and should be left to some dystopia like Gattaca.

Consider the following:
In most domains of expertise, individuals begin in
childhood a regimen of effortful activities (deliberate practice)
designed to optimize improvement. Individual differences, even among
elite performers, are closely related to assessed amounts of amassed
deliberate practice. The superior performance is not general, but
domain specific and transfer outside their narrow area of expertise is
surprisingly limited. Many characteristics once believed to reflect
innate talent (including those of idiot savants and child
prodigies) are actually the result of intense practice extended for a
minimum of 10 years. This seems to be true for a wide range of domains
like mathematics, swimming and chess. Deliberate practice designed to
improve performance is different from work and play. Its results are
enjoyable, not the process itself, and its daily duration is limited.
and [Reference]


Anonymous said...


If I were a theoretical physicist, (I'm actually into computational complexity theory and such) I would pour myself into theoretical physics and not bother about the IQ stuff - which simply appears to be a symptom of a condition, rather than the condition itself.


Steve Hsu said...

V: that quote sounds like it could be from Anders Ericsson's research on expertise. I disagree with his conclusions. His studies only show that effortful practice (about 10 years worth) is typically required to reach the highest level of capability. But he then confuses the logic and asserts that practice alone is *sufficient*, when in fact it is only necessary. You need raw ability *and* lengthy practice to reach expertise.

Of course it is appealing for most people to think that Ericsson's model is correct and that effort is all that is required to produce capability, but this claim is very controversial in the psychology community, and I think implausible to anyone who has been around gifted kids/adults.

The Roe study, combined with other studies showing the age stability of IQ (certainly once adulthood is reached), also serves to refute Ericsson. There's clearly some measurable quality, usually present already at an early age, that is advantageous for intellectual achievement. Most people don't have it.

Anders is refuted quite well in papers by leading psychologists like Sternberg (Yale) and in Eysenck's book Genius.

By the way, also contra Ericsson, there are many credible examples of supreme raw talent that didn't require development through 10 years of practice (e.g., Mozart).

It wasn't just the Freako guys who reported on Ericsson-like research; Gladwell wrote about it too! It's one of those appealing but incorrect worldviews (i.e., like socialist economics).

Anonymous said...

As someone who is into computational complexity and such, you should appreciate Steve's argument for what it is.

Deliberate practice is how an expert is trained, but rarely is an expert trained by themself. They are guided and coached. I think that his argument is basically that if you put together a naive bayes classification scheme [exercise left to readers] for picking grad school candidates out of the applicant pool, IQ would be a useful feature.

I think that he acknowledges in a previous comment that IQ doesn't cause successful academic outcomes [and, uhh, my example is illustrative]. We could argue all day about what people can and can't do, but what is important, is finding the population that can 'get it' in the finite time that is grad school [as governed by funding constraints].

Classification of things is a fundamentally information theoretic idea. [This blog is called infoproc, after all; I do vote for a name change, though, to infoprof, a blog on Info Professing]. On top of that, he has a vested interest in finding a better way to skin a grad student.

The Gattaca reference is cheeseball. For someone who professes to be interested in computational complexity, you appear to be very well stuck in a linear programming mindset.

Anonymous said...


Sorry to be a contrarian, but I had a look around for some data on that Terman study you mentioned, and I don't really see that it supports what you're saying at all. As I understand it, Terman selected children who had high IQ's, and followed their progress through life and showed that in general they were more successful than people with lower IQ's, that the higher the IQ the higher the average disparity in achievement and that the IQ's of people from his group who had achieved less were broadly similar to those who had achieved more, and hence your conclusion that IQ is to a large degree immutable (the extra stimulation of their more successful careers not having raised their IQ relative to the rest of the group). From what I read there are all sorts of issues with his methodology, but they're probably beside the point in this instance. So I'll stick to one point - who's to say that the raised IQ's of the original group were not raised because of stimulation that they had already received? The most obvious candidate being parental encouragement, but there are plenty of others you could speculate about.

I also read that Terman rejected 2 Nobel prize winners because their IQ's were too low! I can't really see how that sits with the two pillars of your argument that high IQ is necessary for a career in science and IQ does not vary with age.

In regard to Feynman, I find it highly dubious that his verbal IQ was so (relatively) low by the time he was writing his lecture series - I suspect if it had been measured then it would have been a lot higher. So if your argument to explain his childhood IQ of 124 is correct, in my judgement it is likely that you cannot also say that his verbal IQ was constant.

I should probably confess that I am a bit of an IQ skeptic, however it is not an article of faith for me , so I am persuadable!

Steve Hsu said...

G, Re: Terman

The point is not how the kids got their high IQs in the first place. The question is whether knowing a kid (or, say, a high school senior) has a high IQ allows one to predict anything about their future.

You were asking earlier whether the effort of becoming a scientist might actually increase your IQ. The answer, from Terman, is no. The late adulthood IQs were consistent with what was measured earlier in life, and there is no reported increase for people who became scientists rather than, e.g., administrators. i.e., adult career path did not seem to systematically increase or decrease IQ.

So, achievement does not cause high IQ, but high IQ might increase the chances for achievement.

More generally, the stability of IQ through adulthood has been established in many studies -- actually, it only goes down as we get older :-) IQ is, roughly speaking, an unchanging psychometric variable in adults.

The Terman study shows that IQ is not *by itself* sufficient for scientific achievement. It doesn't tell you how much your odds of success improve with higher IQ, but that is something that Roe addresses. If I randomly sample from the set of eminent scientists, and find that almost all have IQ >> 130 (130 is typical of the overall pool of scientists), then I can conclude that higher IQ increases the chances of success. The fact that two kids (Shockley and Alvarez) who didn't make the Terman cut went on to win Nobels means nothing statistically. I am sure if you tested a sample of Nobelists they would resemble Roe's sample.

Re: Feynman, yes IQs can change during childhood. The correlation of score at age X with score at (say) age 35 increases as X approaches 35. When X=8 (or whenever Feynman had his grade school test) the correlation with IQ at age 35 is significant, but certainly not 1. But by age 20 or so X is approaching 1.

BTW, if you are seriously interested in learning more about this subject, there is a ton of research worth looking at. Many people have strong opinions about IQ, but very few know what they are talking about. There is too much confirmation bias in subjects like this -- people believe what they want to believe, not what the data tells them.

Anonymous said...


Yikes! I imagine either you or I will get bored of this at some stage, so feel free to tell me you've got more important things to do :)

I would be interested in a reference for some of your statements ie. "But by age 20 or so X [sic] is approaching 1" - however there are of course other explanations for this in the general population; what do most people stop doing when they get to age 20?

Re: Terman and Roe - my interpretation would be that Terman does not even show that it is a *necessary* condition, and Roe doesn't show anything except a correlation.

>>If I randomly sample from the set of eminent scientists, and find that almost all have IQ >> 130 (130 is typical of the overall pool of scientists), then I can conclude that higher IQ increases the chances of success.

Without wishing to labour the point, that's like saying all blacksmiths have dirty hands, therefore having dirty hands increases your chances of being a successful blacksmith.

>>The point is not how the kids got their high IQs in the first place.

Ummm to my mind that is exactly the point - if you want to use an IQ test to rule out 99% of the population from a particular job category, you'd better make damn sure that you're not discarding the wrong 99%!

My IQ skepticism is basically founded (somewhat shakily I'll admit!) on two things; one is some experience I had of tutoring underperforming school kids a while ago. It went something like this.... oh my god, kid knows nothing, this is going to be hard! Ok , ok start again....nope this is hopeless...ok start again...wait! you didn't know that? Explain the crucial piece of information, everything goes much better! In short, their supposed lack of ability was really down to a missing foundation somewhere so the whole intellectual structure they were building was architecturally unsound. I really feel that this extends to most realms of logical thought - the idea that pretty much anyone can understand something if they're taught in the correct way for them. All very waffly and unscientific I know, but at least informed by some personal experience.

The second is my experience of looking at some IQ test papers (I've never actually taken a test - scary - maybe I am stupid!) - and frankly the idea that you can't improve your score, even as an adult, with a little practice I find _highly_ dubious. Perhaps I should try... I'd certainly place a reasonably sized bet that I could coach someone with an IQ of 100 to get up to say 115 (ok I might have to shift that down depending on my own IQ!).

Anyway, that's enough from me for now - it's a little late where I am. I hope you don't think I'm being facetious; I'm genuinely interested in what the right answer is (even if it's not the answer I want); I'm just not convinced by these arguments.

All the best

ps. Where can I get hold of the questions from Roe - are they in the book? I'd like to have a go if only for my own vanity :)

Steve Hsu said...


Some of what you are asking about is standard information about IQ, which you could look up if sufficiently interested. Have a look at Jensen's The G Factor or Eysenck's Structure and Measurement of Intelligence (those are just off the top of my head). A simple question is: how do repeated IQ measurements on the same individual over time correlate with each other? The results show adult stability (albeit with eventual declines with aging), and even pretty good correlations between childhood and adult measurements. I've even seen correlations between IQ at 5 and adult IQ. The correlation is significant, although nowhere near 1. (For adult IQ correlations are very high, like .9 or .95 when measured a year apart, and for kids I think Jensen gives .8 but that is just from memory.)

Of course there might be some person whose IQ increases dramatically over a short period of time -- especially during childhood, puberty, etc. However, *averaged over populations* one doesn't see it.

I doubt whether coaching lends more than a *short term* effect. If you have some method of boosting someone's IQ for a week or two due to coaching, great. Kaplan's would love to have it. That would just mean that occasionally a *particular measurement* on a *particular person* is off because of a short term boost. (Like a weight survey being off because one guy hides lead blocks in his pockets.) It doesn't impact statistical predictions about populations, which is what this discussion is about. (There is also lots of data on how much family characteristics like income, education, etc. affect the adult IQs of twins adopted into different families -- answer: not much -- but that gets at heritability, which isn't central to this discussion.)

I think you still miss the main point about Terman: it shows that there is no systematic boost from, e.g., choosing a scientific career over something else. That was a specific question you brought up, and I mentioned Terman's study to address it. Ericsson and other "practice makes ability" proponents might claim that the high IQs of scientists are caused by their doing science. But a long term study like Terman's shows that IQs of individuals didn't go up as a result of choosing science over another career. *Terman recorded their IQs measured at different stages of life and also their career choices.* I don't know how much more simply I can explain this point to you.

All of my other conclusions can be obtained using Roe once you accept stability of IQ in adults. From her data you can conclude that scientists with IQ >> 130 are many times more likely to be "eminent" than the average scientist with IQ=130. This follows from the fact that IQ >> 130 people dominate the ranks of the eminent, whereas they (the ones with IQs as in Roe's sample) are, say, less than 1% of the population of scientists.

Sure, I can't prove causation. It might be that having an IQ >> 130 is correlated with having ESP, and it's actually ESP that makes you a better scientist. But I don't believe that, and I can't see any other plausible confounding factor. Most people would agree that having IQ=160 at age 18 is a big advantage to pursuing a career in science.

BTW, I don't think you understand how big the gap is between 130 and 160. If you can get someone who consistently scores 130 to score 160 consistently on a well validated IQ test (one with a high g loading), it would be a miracle. (A guy who ordinarily scores at the 98th percentile suddenly beats 99.997 percent of all test takers; like a 6'3" guy suddenly growing to 7 feet tall.) If you aren't familiar with "g loading", you should look into that before having a serious discussion about IQ.
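The percentile figures for 130 and 160 can be checked with a few lines of Python, assuming a normal distribution with mean 100 and SD 15 (an idealization, especially at +4 SD); `percentile` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumed model: IQ ~ Normal(mean=100, SD=15).
IQ_SCALE = NormalDist(mu=100, sigma=15)


def percentile(iq: float) -> float:
    """Percent of the population scoring below `iq` under the assumed model."""
    return 100.0 * IQ_SCALE.cdf(iq)


for iq in (130, 160):
    print(f"IQ {iq}: beats about {percentile(iq):.3f} percent of test takers")
```

Under this model 130 sits at +2 SD and 160 at +4 SD, which is where the roughly 98 and 99.997 percent figures come from.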

ps Roe only gives one or two sample questions from the tests and they aren't that hard. I wouldn't worry about her tests unless you have a tendency to max out other tests like the WAIS!

Anonymous said...

If I remember from Gleick’s book correctly, Feynman took the Stanford-Binet. The ceiling for the children's version of that test is way over 135 (I thought it was like 160 at least).

I'd also like to see how professional philosophers stack up against physicists on the GRE. Getting into grad school for philosophy is much more selective than for physics. The competition is absolutely fierce.

It is common nowadays for top ten philosophy programs to receive 300 applications and only admit around 8 applicants. So those that get in tend to be very good (though I am told that the GRE is the least important of the factors under consideration in the admission process, I know that median GRE scores are exceptionally high for admitted applicants). So, as you've pointed out, average GRE for those taking the test favors physics majors slightly over philosophy majors, that might not tell the whole story for those who are in grad school in the two respective disciplines.

Here are some numbers I found searching the internet on grad admissions for both physics and philosophy. Of course, I don't want to start some rivalry between the two groups and the GRE is only a moderate (if not somewhat weak) proxy for IQ, it makes for an interesting discussion.

For the University of TX at Austin's PhD philosophy program, the median GRE combined for verbal and Quantitative was 1470. Median GPA was 3.86.

This program is ranked #13 on the philosophicalgourmet's list of top philosophy programs (overall PhD ranking).

U. of Chicago's phil department is ranked 20th. Their median GRE for accepted students was 710 verbal and 740 quantitative and a median GPA of 3.9.

Now I've been told by some phil professors at top ten institutions that their median combined GRE for admitted applicants was 1500 or slightly higher.

Compare that with Caltech's (one of the top physics departments in the world) admission for physics majors. Average verbal is 600 and 780 for quantitative. I am uncertain how much, if at all, non-native speakers contributed to that score. Sometimes the non-native speakers' scores are not factored in (see below).

They've also admitted a far greater percentage of their applicants than any top 20 or maybe even top 50 philosophy program I'm aware of.

For Upenn, averages were:

Verbal: 626 (for native English speakers) and 780 for quantitative.

Anonymous said...


The Terman study is flawed in that it had a flagrant selection bias. We cannot tell how much the achievement of the Termites was due to their IQs or to some other factor. The way the children were selected was biased towards students who were liked by their teachers. Terman asked teachers across CA to name who their best student was. From this named group, he gave IQ tests. Those that scored above 135 qualified for the study. The methodological question is, perhaps the teachers only selected students that they liked or were good students in class as opposed to the really intelligent ones. So the process might have pre-selected for other traits that were responsible for their success (agreeableness, diligence, conscientiousness, sociability, etc) besides intelligence.

>> More generally, the stability of IQ through adulthood has been established in many studies<<

There is some evidence that you can raise IQ into adulthood.

Also, we don’t know enough about how much a good intellectually stimulating environment affects a child’s IQ.

>> actually, it only goes down as we get older <<

Actually, that’s false. IQ increases to about age 35. It remains stable until age 65. Then it goes down steadily. Only one or two aspects of IQ decrease after about 18: Gf and calculation speed. Gc actually steadily increases as we age to age 65. See the Seattle Longitudinal Study which is the largest and most comprehensive study on age and IQ ever done.

>> Ericsson and other "practice makes ability" proponents might claim that the high IQs of scientists are caused by their doing science. But a long term study like Terman's shows that IQs of individuals didn't go up as a result of choosing science over another career. *Terman recorded their IQs measured at different stages of life and also their career choices.* <<

I think there’s a Bayesian fallacy in there somewhere. It’s perfectly possible that those with high IQs have a tendency to go into science. These people do not show much increase in their IQs because they were already pretty much maxed out before. That would not contradict findings that for many scientists, their IQs increased dramatically by doing science (or other intellectually stimulating stuff).

Steve Hsu said...

Re: Feynman, Gleick doesn't say what test it was. This would have been in the 1920s in Far Rockaway, Long Island. I doubt they had anything more than a simple written test given to all students. But who knows.

Re: Terman, I'm not citing the success of Termites in any of my arguments. I'm only citing the fact that the ones who went into science as opposed to some other, perhaps less stimulating, careers didn't seem to have a secular increase in adult IQ. Keep in mind many of the highest scoring Termites did anything but science.

If there existed results from Terman that show massive adult IQ increases for people who went into science, that would be relevant to my use of Roe. Otherwise, Terman is irrelevant. Roe finds that IQ >> 130 scientists are grossly overrepresented among the eminent. The most natural explanation of this is that their high IQ was advantageous. I have never seen any data indicating a possible 30 point increase in adult IQ, so it seems likely that their IQs when, e.g., college students were *already* highly exceptional. At correlation > .9 year to year, measured adult IQs are only fluctuating a fraction of a standard deviation.
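To put a rough number on "a fraction of a standard deviation", here is a minimal sketch. It assumes a bivariate normal model with equal variance on both testings and zero mean change, and it treats the quoted r = .9 as the correlation between the two testings being compared. Over a span of decades the correlation would presumably be lower, so this is an illustration of the arithmetic, not an estimate from Terman or Roe. The function names are made up for the example.

```python
import math

def retest_change_sd(r, sd=15.0):
    """SD of (retest - test) when both scores have the same SD and correlation r."""
    return sd * math.sqrt(2.0 * (1.0 - r))

def prob_gain_at_least(gain, r, sd=15.0):
    """P(retest - test >= gain), assuming normally distributed changes with mean 0."""
    z = gain / retest_change_sd(r, sd)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

change_sd = retest_change_sd(0.9)   # typical size of a retest fluctuation, in IQ points
p30 = prob_gain_at_least(30, 0.9)   # chance of a 30-point (2 SD) jump from noise alone

print(f"SD of change: {change_sd:.1f} points")
print(f"P(gain >= 30 points): {p30:.1e}")
```

Under these assumptions the typical fluctuation is under half a population SD, and a 30 point jump is a more-than-4-sigma event in the distribution of changes.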

Anyone who advocates Ericsson's claims in light of Roe's data would probably need to claim that huge increases in adult IQ (2 or more SD) are possible. I have never seen any evidence of that.

I don't think I'll respond further unless someone makes a substantive criticism of my interpretation of Roe.

Steve Hsu said...

Oh, regarding philosophy programs, it's possible they are quite selective. I can't imagine they have a lot of funds to support grad students.

Note the math GRE has a low ceiling -- the avg you quoted of 780 means the test is too easy. If you want to do a meaningful comparison of elite groups you probably need a better test.

Also note that theoreticians and experimentalists have different profiles and that most physicists are experimentalists.

You won't like this, but here is something on philosophy:

Anonymous said...

>>Oh, regarding philosophy programs, it's possible they are quite selective. I can't imagine they have a lot of funds to support grad students.<<

Yes, funding is definitely a problem.

>>Note the math GRE has a low ceiling -- the avg you quoted of 780 means the test is too easy. If you want to do a meaningful comparison of elite groups you probably need a better test.<<

I think this low ceiling works against both philosophers and physicists, seeing that top institutions in phil probably accept those with comparable Quantitative scores.

I read the article on philosophy and I think it misses the main reasons to study philosophy. There are personal and ethical reasons (which were provided by Wittgenstein, ironically). But aside from all that, the usefulness of philosophy is obvious to me. In fact, other than perhaps math, I don't think a single discipline has contributed to the changing of society as much as philosophy.

Anonymous said...

Hi Steve,

I haven't done my homework on g-loading yet but I found this critique of Roe, which may be of some interest. If you google it, the cached version is more readable.

Were you referring to the Melita Oden study of the Terman data above when you talked about IQ's not varying due to stimulation by occupation? Do you have a reference to the data? What little I've been able to find doesn't really break things down into intelligence by function e.g. variation of mathematical IQ for mathematicians.

Re: your comment about the gap between IQ 130 and 160 - I understand the normal distribution well enough, I just don't think (a) that IQ's are normally distributed (a priori I would expect a log normal distribution but that's beside the point) and (b) that they're not malleable. I'm not saying going from IQ 130 to 160 would be easy, I could only imagine it would take years of practice, kind of like doing a PhD; now where does that take me .... :)

On a slight tangent and back to the anecdotal evidence, I do find it slightly curious that we've got 3 Nobel prize winners with measured IQ's, all in childhood admittedly, which are relatively unremarkable (the 2 Terman, plus Feynman) but none (that I can find a reference for) with a ~160 IQ that you seem to be saying is necessary. Do you know of any named examples of famous scientists (or other professions where you might expect a high IQ) that have verifiably high IQ's? Obviously there are lots of social reasons why people might not want to disclose their IQ's, but I haven't been able to find any. On the other hand you can find plenty of people in these ultra high IQ societies who don't seem to have achieved a whole lot (again relatively speaking). I wonder if there is some kind of sweet spot for IQ in relation to academic achievement, and that if you stray too far above, you kind of lose interest at a young age because school is too easy. More idle speculation!

Steve Hsu said...

Re: Roe, I don't see why the Noesis writer is confused about how she normed the VSM tests. Once she (or her colleague) extracted the mean and SD of the Teacher's College group on the two tests she can get an estimate for the raw score to IQ conversion for the scientists. It's very possible that the tail isn't Gaussian. It has been claimed, although not explored rigorously, that there is an excess of high scorers, which eliminates the Noesis writer's complaint that 194 is too high.
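The kind of conversion described here can be sketched as simple linear equating: match the mean and SD of the norming group's raw scores on the special test to the mean and SD of the same group's scores on a standard IQ test. This is only an illustration of the idea. Roe's book does not spell out the exact procedure, and the norming-group numbers below are hypothetical, invented purely for the example.

```python
from statistics import mean, stdev

def linear_equate(raw, norm_raw, norm_iq):
    """Map a raw score to the IQ scale by matching the norming group's
    mean and SD on the two tests (linear equating)."""
    z = (raw - mean(norm_raw)) / stdev(norm_raw)
    return mean(norm_iq) + z * stdev(norm_iq)

# Hypothetical norming group: raw scores on the special test and
# the same people's scores on a standard IQ test (not Roe's data).
norm_raw = [10, 12, 14, 16, 18]
norm_iq = [115, 120, 125, 130, 135]

# A hypothetical scientist whose raw score is far above the norming group.
scientist_iq = linear_equate(30, norm_raw, norm_iq)
print(f"Estimated IQ-scale score: {scientist_iq:.0f}")
```

Note that scores far above the norming group's range are extrapolations along a straight line, which is exactly where the shape of the tail matters most.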

Re: Terman. Many years ago I found in the library a multivolume set of books on the Terman study, and as I recall they contained (anonymized) case by case analyses of individuals in the study. I don't recall the precise reference, but it is probably one from the following list. There are also papers which you might find via google that describe the scores of adult Termites. (I've looked at some of those more recently.) The surprising thing is that the adult scores are on average somewhat *lower* than the childhood scores. No +2 SD jumps, although lots of slightly lower scores :-) If an IQ researcher found a +2 SD jump from late youth to adulthood I imagine they would remark on it quite prominently, since it violates the widely held belief that adult IQ is stable.

Terman, L.M. (1930). The promise of youth, follow-up studies of a thousand gifted children: Genetic studies of genius, III. Stanford, CA: Stanford University Press.

Terman, L.M. (1947). The gifted child grows up, twenty-five years follow up of a superior group: Genetic studies of genius, IV. Stanford, CA: Stanford University Press.

Terman, L.M., & Oden, M.H. (1959). The gifted group at mid-life, thirty-five years follow-up of the superior child: Genetic studies of genius, V.3. Stanford, CA: Stanford University Press.

Re: Nobel Laureates and high IQs, there are actually studies (I think one is by someone named Cox in 1926; referenced in Eysenck's book Genius) of childhood accomplishments of famous scientists and mathematicians that lead to IQ estimates. I don't find these very reliable as numerical estimates, but it is pretty clear that they typically exhibited precocity far in excess of persons with IQ=130 (i.e. people in the top few percent).
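For a sense of how different IQ 130 and IQ 160 are in terms of rarity, here is a minimal sketch that assumes scores are exactly normal with mean 100 and SD 15. That assumption is doubtful far out in the tail, so the IQ 160 figure in particular should be read as an order of magnitude at best.

```python
import math

def fraction_above(iq, mean=100.0, sd=15.0):
    """Fraction of a normal(mean, sd) population scoring above iq."""
    z = (iq - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

for cutoff in (130, 145, 160):
    frac = fraction_above(cutoff)
    print(f"IQ > {cutoff}: about 1 in {round(1 / frac):,}")
```

On the normal model, IQ 130 is the top two percent or so, while IQ 160 is a few people per hundred thousand.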

From my own experience, I can tell you that the dozen or so people I've known from their early 20s or before who went on to become established scientists (often theoretical physicists) all had IQs much higher than 130 as measured by SAT, GRE and other standardized tests. (Admittedly this is anecdotal.)

Oh, here is some Cox data:

Steve Hsu said...

PS Flynn effect is irrelevant to this discussion, as we are asking whether IQ >> 130 is advantageous to someone competing against others in his or her *own generation* in the field of science.

Steve Hsu said...

G: I found the following in a NYTimes article.

Please send me an email if you want to continue corresponding. You aren't by any chance Malcolm Gladwell? :-)


In 1968 Melita Oden, a research associate of Dr. Terman's, published a study of 100 Termites who at midlife had attained the most success and 100 whose careers had foundered. The successes, whom she called A's were in professions like law and medicine, or were university professors or business executives. The other group, the C's, were in occupations like sales clerks, far below their intellectual potential. One, who had earned an advanced degree in engineering, was working as a technician.

The A's, to be sure, on average had I.Q.'s seven points higher than the C's: 157 versus 150. But small differences in scores at the extreme high end of the I.Q. curve translate into little actual difference in ability. Such a difference is "meaningless," said Dr. Hastorf, the current shepherd of the Terman data and a psychologist retired from Stanford University.

But other differences were telling. The A's were more motivated from the start; they skipped more grades in grammar school, and went further in their education than the C's. As youngsters, the A's were rated as more lively and engaged than the C's, taking part in more extracurricular activities in school and, throughout their lives, in more sports.

Perhaps most significant in explaining the difference in career success, said Dr. Hastorf, were character traits. From childhood on, the C's showed a lack of persistence in pursuing their goals, whether in school or work; the A's, at an average age of 11, already showed greater "will power, perseverance and desire to excel."

Anonymous said...


Thanks for the references. I'll have to go away and have a dig through them before I can say anything that will carry any more weight.

>>You aren't by any chance Malcolm Gladwell? :-)

Nope. Was that an insult? :) Oh well.

Steve Hsu said...

G: Here's a good reference from the book Constancy and Change in Human Development, Brim et al., p.400 and p.445

It discusses the Terman follow up tests (p.400) and malleability of adult IQ (p.445). There is little evidence for malleability.

They developed several categories for adult Termites. The A group had achieved significant success (e.g., National Academy of Science) whereas the C group were undistinguished. (Many in the C group were probably women who had married and not pursued challenging careers. There were also firemen, policemen, etc.)

The childhood IQs of the C group averaged about 5 points higher than the A group, and the adult IQs of the two groups (on the Concept Mastery test) were similar. There is some question as to how to convert from the Stanford Binet administered to young Termites and the CMT administered later. If you google around you can find the CMT scores (see links below). The adult scores were not very different from the childhood scores, modulo the conversion uncertainty.

So, the A group, which had much more mental stimulus and successfully pursued challenging careers, did not have an enhancement of adult IQ relative to the C group. I think this clearly refutes the Ericsson "practice makes ability" hypothesis with regard to IQ. No +2 SD changes reported :-)

There were interesting differences in the A and C group with regard to parental background (A group parents more likely to be professionals) and personality factors (A group kids more achievement focused). I think the most plausible hypothesis is that IQ>>130 is a positive factor for success as a scientist, but certainly not sufficient. Personality factors also play an important role.

see also:

This first link has a description of A and C groups and also a 1961 Science article with IQ profiles of PhDs in various fields. You can see that Roe's group of eminent scientists tested well beyond the average PhD. Lacking a mechanism for drastically raising adult IQ, one would suspect that the IQ was at least partially causative of success.

Anonymous said...

I'm new to this blog and love it; your work is cited at Sailer's and other blogs that discuss I.Q.

I'm curious about the tack you take with those who deny the meaning of I.Q. Let's look at the comment which was posted above:
"G said...


Thanks for the references. I'll have to go away and have a dig through them before I can say anything that will carry any more weight."

Translation: Give me time to come up with counterarguments.

What I wonder is, "What's the point?". In my experience of reading through any kind of thread: social, political, scientific expert level, etc., discussion is hampered, not advanced, when much time and energy is devoted to trying to influence the impervious men. We do learn some from the debate, but I think we can learn the same things and then some from a free flowing discussion. For example, I gleaned from this discussion that an I.Q. test is more imperfect at the margins where you may have someone who is a math or physics savant. The result of a bogged down thread is frustratingly dull and it's assaulting to my senses to see fools suffered gladly.

In a political sense, I believe it is more effective to establish the premise and discuss implications of the topic at hand, but not the premise itself. I am an admirer of the handling of the blog, Gene Expression. It is extremely productive and thus informative. I.Q. deniers find it extremely intimidating and the few who are foolish enough to venture there and spout get ignored. Now what about changing minds? These people won't change their minds, or at least not suddenly, because the belief system that I.Q. challenges isn't rational, but more religious in nature. They need to be on the sidelines listening to discussions, whenever they choose, and allowing what they read and hear to penetrate their minds (along with real world experience) at whatever pace is psychologically best for them; for some, especially older people, this may mean never allowing their views to be challenged and staying away.

Steve Hsu said...


Thanks for the compliments!

I think the internet has a tendency to segment discussion into like-minded channels. This has the advantage (ex: GNXP) of allowing in-depth discussion and avoiding covering basic issues again and again. For the opposite extreme, see, where all genetic influences are more or less assumed to be zero and all left- hypotheses are valid :-) I force myself to go there once in a while to check my basic assumptions against their perspectives. (Note I'm an Obama supporter so I'm not exactly your typical Steve Sailer guy :-)

Since my blog is not (primarily) an "IQ blog" and since my readers tend to be pretty smart (quite a lot of PhDs in engineering and science, lots of quants), I don't mind addressing fundamental questions like What is IQ good for? As long as the commenter is sincerely trying to understand, and is willing to address the arguments and data on their basic merits, I think it's a worthwhile exercise once in a while.

It's also important that the conversations are archived and searchable for future readers/users. The arguments and data can be used in discussions that will take place elsewhere.

If I had had access to Google and the wealth of internet information when I was a kid I could have saved myself a lot of time :-)

Anonymous said...


Nice reading your blog. The studies were conducted in the 1950s; the findings might still be true, but the numbers probably need some adjustment as environmental factors come in and education is more accessible.

It would be interesting if a new study added a category of "prominent computer scientists", as there was no such group in the 1950s. The geeks might perform similarly to or better than the physicists. :)

Michael's Resume said...


The Roe study is often cited as rebuttal to the general conclusion that elites, including scientists, are of moderately high, but not exceptional, intelligence. I direct you to 'The Empty Promise', Grady Towers for a summary of the topic. Generally, elites test with a mean IQ around 126 and a SD of about 6.5. (15 point deviation). The SD is the surprise. It suggests that the probability of membership in an elite peaks at 132 IQ and at about 150 IQ, members are more common in the general population than in the elites. I direct you to for a rather clever way to estimate Nobel Laureates at 144. This would suggest that the average Nobel Prize winner is at +2.8 SD in the group from which they are selected. So, to Roe. The 152 IQ is ambiguous. First, it is almost certainly a 16 point score. So, it should immediately be revised to 149. Next, it seems the test was normed against the results of college students. As you are probably aware, the IQ distribution has a significantly fat right tail. Consequently, applying the college students' results to the eminent scientists would tend to mimic a ratio, rather than deviation IQ. Fully ratio would translate 149 -> 142. However, that is probably extreme. The general conclusion would be that these eminent scientists probably had a mean IQ, as generally measured today, of between 142 and 149. This is generally consistent with the 144 of Nobel Prize winners. I applaud the interest in the measurement and distribution of intelligence.
Hope that clarifies.

Michael Ferguson
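The arithmetic in the comment above can be checked with a short sketch. It takes the commenter's figures at face value (elite mean 126 and SD 6.5, general population mean 100 and SD 15, a score of 152 on an SD 16 scale) and assumes both distributions are normal. It checks only the internal consistency of those numbers, not whether they are right. The function names are made up for the example.

```python
import math

def rescale_sd(score, from_sd, to_sd, mean=100.0):
    """Convert a deviation score from one SD convention to another."""
    return mean + (score - mean) * to_sd / from_sd

def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def elite_to_general_ratio(x, elite_mu=126.0, elite_sd=6.5, pop_mu=100.0, pop_sd=15.0):
    """Density of the assumed elite distribution relative to the general population."""
    return normal_pdf(x, elite_mu, elite_sd) / normal_pdf(x, pop_mu, pop_sd)

converted = rescale_sd(152, from_sd=16, to_sd=15)        # 152 on SD 16 -> SD 15 scale
peak = max(range(100, 171), key=elite_to_general_ratio)  # IQ where the ratio is largest

print(f"152 (SD 16) on an SD 15 scale: {converted:.2f}")
print(f"Ratio peaks at IQ {peak}")
print(f"Ratio at 145: {elite_to_general_ratio(145):.2f}, at 150: {elite_to_general_ratio(150):.2f}")
```

With these inputs the density ratio does peak near 132 and drops below 1 a little under 150, so the stated conclusions follow from the assumed mean and SD. Everything therefore hinges on whether an SD as small as 6.5 is a real feature of elites or an artifact of test ceilings.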

Michael's Resume said...


I note in rereading that you claim that the Roe test was 15 point SD. I'd like the reference. I read the book a long time ago and I don't remember that. I also recall that Linus Pauling was one of the subjects. Well known, but not quite Nobel level?

Now, as to Cox. This set of estimates was made in 1926, long before the introduction of either the 15 point SD or deviation IQs (Wechsler ~1944). If you adjust for these factors, nearly all first tier geniuses had IQs between 159 and 168. Goethe (179), Leibniz (176), Grotius and Wolsey (174), Pascal and Sarpi (171) were somewhat higher. Einstein's IQ was estimated several times from 160 to 165, compared with Cox's estimate of Newton at 168. Huh... I would have placed them about the same.

All of this is intended to reinforce the basic conclusion that, while a moderately high IQ is useful for success in scientific careers, as IQ increases above 132, it appears to be detrimental and above 160, statistically, it is comprehensively exclusionary.

Michael Ferguson

Steve Hsu said...


Thanks for your comments.

I have a copy of Roe (from the UO science library). The relevant chapter is 12. She says explicitly the group used to norm her special test is a group of PhD students at Columbia Teachers College. Their particular identity isn't so important, though -- she notes that this group is also required to take a battery of other standardized tests, so that the norming of the special test can be done via cross comparison. She also uses examples of 85, 115, 130 when giving a layman's description of the meaning of IQ, so I suspect she uses SD=15, but I don't think she says so explicitly anywhere. She is obviously not an expert on psychometrics and relies on a colleague (Dr. Irving Lorge) for statistical analysis.

Scientists are not a monolithic group, as her results show. Theoretical work generally requires more abstract ability than experimental work. I suspect if you tested Fields medalists and Nobelists in theoretical physics you would find an average much higher than 144. Success as an experimentalist depends on many other factors, including the ability to secure funding, manage a team, etc.

Finally, luck plays a huge role in success, so even if success as an elite scientist is highly correlated with intelligence you should find that the most successful (e.g., Nobelists) do not necessarily have the highest IQs in the group. See here:

On a broader note I suspect the predictive power of IQ or g is if anything more tenuous at the tail than within a few SD of the mean.
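The point about luck can be illustrated with a toy simulation. This is a sketch under strong assumptions: ability and luck are independent standard normals, and "success" is a weighted sum whose correlation with ability is r = 0.7. The value of r is arbitrary and not taken from any study.

```python
import random
import statistics

def simulate(n=20000, r=0.7, top_frac=0.01, seed=0):
    """Return (mean ability of the top group by success,
               mean ability of the top group by ability), in SD units."""
    rng = random.Random(seed)
    luck_weight = (1.0 - r * r) ** 0.5  # keeps success at unit variance
    people = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        success = r * ability + luck_weight * rng.gauss(0.0, 1.0)
        people.append((ability, success))
    k = int(n * top_frac)
    top_by_success = sorted(people, key=lambda p: p[1], reverse=True)[:k]
    top_by_ability = sorted(people, key=lambda p: p[0], reverse=True)[:k]
    return (statistics.mean(a for a, _ in top_by_success),
            statistics.mean(a for a, _ in top_by_ability))

most_successful, most_able = simulate()
print(f"Mean ability, top 1% by success: {most_successful:.2f} SD")
print(f"Mean ability, top 1% by ability: {most_able:.2f} SD")
```

Even with a strong correlation, the most successful group is well above average in ability but clearly below the most able group, which is the regression-to-the-mean effect described above.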

Anonymous said...

Didn't James Watson get a 125?

Anonymous said...

For those that commented on philosophers and how they stack against physicists, I actually blogged on this myself (I was a Phil undergrad; take that as a bias if you wish), and concluded that philosophy undergrads are as smart as physics and math undergrads.

Link here: Just how smart are philosophers?

Stephen said...


I wonder if you could comment on the distinction between g and the theory of multiple intelligences. People who support the g argument will say that the three main aptitudes, math, spatial and verbal, correlate with each other. They will often hasten to add that there will sometimes be a significant gap between sub test scores, but the pattern will be something like well above average in math and spatial but only modestly above average in verbal. However, at least one group, the Ashkenazi Jews, has a lot of members who score well above average in the verbal and math subtests, but below average in spatial. I am not Jewish, but I have the same pattern. (and I will note that I am a male) How would you reconcile these substantial imbalances, that include one below average aptitude, with the concept of g?

Yay4me said...

IQ tests are for kids. Once you've been institutionalized as all these people have, there are too many variables.

Pablo said...

Steve, I didn't remember the work of Ericsson being discussed in Eysenck's Genius, so I went to my copy and confirmed that my recollection was right: Ericsson's name does not appear in the index, and no work of his is listed in the bibliography.

steve hsu said...

Eysenck does not refer specifically to Ericsson but he presents data that contradicts Ericsson's hypothesis. Perhaps the largest data set that undermines Ericsson is the SMPY longitudinal study in which top 1% individuals can be compared to top 0.01% individuals over an extended period of time. These individuals are selected before age 13, and the g measurement at that early stage in life has clear predictive power for achievement later in life.

Pablo said...

Hi Steve. Thanks for your reply. Can you recall the chapter or section where Eysenck discusses the SMPY longitudinal study? There is a brief mention of Benbow & Lubinski's papers in the context of a discussion about gender differences, but the data he presents there don't bear on Ericsson's thesis, since they concern personal values and vocational interests.

RJ said...

The counterpart of the 'ceiling effect' is the 'floor effect'. A test composed exclusively of very hard questions, and scored with this taken into account, would tend to overstate the consistency of the competence of those scoring at the low end. If these lowest scorers managed, by some strange alchemy, to answer even a few questions in a given section correctly they would instantly be awarded high scores (relative to the general population). But one can easily see how these high scores might fade away were they to take a test designed to capture a lower ability level. It's hard to say whether this is a problem for those scoring nearer to the high end on tests subject to the floor effect, but it could be.

Richard Seiter said...

Steve, I know this is a really old post, but watching the video of your talk you posted last week motivated me to order a copy (should be here in a week or two). Do you know if anyone has dug into Anne Roe's papers:
Many (most/all?) of her subjects have been deceased for 10 years so it seems like there should be some wonderful biographical material available in those papers. In particular, I think reading about Luis Alvarez and Linus Pauling would be fascinating.

gwern said...

> This would have been in the 1920s in Far Rockaway, Long Island. I doubt they had anything more than a simple written test given to all students.

Worse, if you parse the passage in Gleick closely, you can infer that the test was done in middle school or earlier, and probably done in elementary school, given the placement of the IQ anecdote and how Feynman apparently had recently learned algebra but had not yet learned calculus.

Unknown said...

Where can I buy this book?

ztech said...

Some individuals see significantly improved performance on tests which are more g-loaded (though perhaps less working-memory-loaded, or truly untimed). More recent evidence suggests that high associative memory is important for solving inductive problems, and it has also been associated with creativity measures. A recent analysis of the Raven's Matrices revealed loadings on associative memory, even exceeding working memory. Furthermore, a couple of studies show that there is a distinct math factor, which is a better indicator of g than even concept formation. However, considering Feynman's 'modest' (as Mr. Hsu suggests) verbal score, it is impossible that any conventional test of intelligence would put Feynman's IQ as high as 190. Verbal loadings on conventional IQ tests are very high, so Feynman's IQ would be dragged down considerably. And concept-loaded tests like the Raven's Matrices will not necessarily reflect the math factor either, which seems to rely more on complex working memory than on simple working memory (which the RPM seems loaded on). All said, conventional tests will, in fact, underestimate intelligence throughout the entire IQ spectrum. A good test of inductive mathematical problem solving ability is known as the SIGMA test, for anyone who wants to give it a shot......

ztech said...

Of course all of this IQ talk relies on the assumption of the constancy of intelligence. But it is pretty clear that the associative/heuristic network may provide its (relational) boost at differential rates. It certainly may be the case that people hit inductive peaks at different times (though likely all in their late teens to early 20's). Associative memory (even on the simplest tasks) has already been linked to g, and more specifically to fluid intelligence. And there is no reason why one should assume rates of mental development, and peak ages, are standard.

Anonymous said...

My IQ is 47 and I shove sharpened objects up my nose, but I won't let that stop my pursuit of science. Just because my IQ is low and I frequently have difficulty counting up to four does not give anyone the right to cap my capabilities. So there.
