Friday, October 21, 2011

IQ malleability

This paper got a lot of attention in the press. I don't have time to discuss it in detail, but it is worth emphasizing that the correlation between scores on WISC and WAIS for the n=33 subjects was around .8 -- typical for two different IQ tests administered years apart. (Note that the same test -- e.g., the SAT -- administered twice with a gap of 1 year might have a correlation of .9 or even .95.) In other words, knowing an individual's WISC score gives you good power to predict their eventual WAIS score, but with a big chunk of variance (roughly 1 - r^2, so 30-40% of total variance for r near .8) still unaccounted for. In light of this, a 1 SD shift in score is to be expected for a reasonable fraction of participants, as found in the study. What is a bit more novel is that the authors correlated the score shifts with actual MRI observations of the brain.

(One might also interpret this as saying that the residual uncertainty in "true g score" is a good chunk of an SD even after an individual is carefully tested using WAIS or WISC; personally I don't think "g" is much better defined than at this level of accuracy.)
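
To put rough numbers on the argument above, here is a minimal Python sketch. It assumes (a simplification of mine, not something taken from the paper) that the two scores are bivariate normal with the same SD of 15, no mean shift between testing points, and test-retest correlation r. The function names are purely illustrative.

```python
import math

def unexplained_variance(r):
    """Fraction of variance in the second score not predicted by the first."""
    return 1.0 - r * r

def shift_sd(r, sd=15.0):
    """SD of (score2 - score1), assuming both scores have SD `sd` and correlation r."""
    return sd * math.sqrt(2.0 * (1.0 - r))

def frac_shift_at_least(threshold, r, sd=15.0):
    """Two-sided normal tail: expected fraction with |score2 - score1| >= threshold."""
    z = threshold / shift_sd(r, sd)
    return math.erfc(z / math.sqrt(2.0))

if __name__ == "__main__":
    # Assumed correlations for illustration; only r near .8 comes from the post.
    for r in (0.8, 0.9, 0.95):
        print(
            f"r={r:.2f}  unexplained={unexplained_variance(r):.2f}  "
            f"shift SD={shift_sd(r):.1f}  "
            f"P(|shift| >= 15)={frac_shift_at_least(15.0, r):.3f}"
        )
```

Under these assumptions, r = .8 leaves about 36% of variance unexplained, gives score shifts an SD of roughly 9.5 points, and puts something like 11% of people at a shift of 1 SD or more. A lower r (as might hold for the VIQ and PIQ subscales, whose retest correlations are not quoted here) pushes that fraction up.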

I am very jet-lagged right now, so I hope what I wrote makes sense :-)

Verbal and non-verbal intelligence changes in the teenage brain (Nature):

... Our participants were 33 healthy and neurologically normal adolescents with a deliberately wide and heterogeneous mix of abilities (see Supplementary Information for details and the implications of our sampling for the generalizability of our conclusions). They were first tested in 2004 (‘time 1’) when they were 12–16 yr old (mean, 14.1 yr). Testing was repeated in 2007/2008 (‘time 2’) when the same individuals were 15–20 yr old (mean, 17.7 yr). See Table 1 for further details of the participants. During the intervening years, there were no testing sessions, and participants and their parents had no knowledge that they would be invited back for further testing. On both test occasions, each participant had a structural brain scan using magnetic resonance imaging (MRI) and had their IQ measured using the Wechsler Intelligence Scale for Children (WISC-III) at time 1 and the Wechsler Adult Intelligence Scale (WAIS-III) at time 2 (see Supplementary Information for details). These two widely used, age-appropriate assessments [5] produce strongly correlated results at a given time point, consistent with them measuring highly similar constructs [6]. Scores on individual subtests are standardized against age-specific norms and then grouped to produce separate measures of verbal IQ (VIQ) and performance IQ (PIQ), with VIQ encompassing those tests most related to verbal skills and PIQ being more independent of verbal skills. Nevertheless, VIQ and PIQ scores are very significantly correlated with each other across participants: in our sample, the correlations between VIQ and PIQ were r = 0.51 at time 1 and r = 0.55 at time 2 (in both cases, n = 33; P < 0.01). Full-scale IQ (FSIQ) is the composite of VIQ and PIQ and is regarded as the best measure of general intellectual capacity (the g factor) that has previously been shown to correlate with brain size and cortical thickness in a wide variety of frontal, parietal and temporal brain regions [7,8].

The wide range of abilities in our sample was confirmed as follows: FSIQ ranged from 77 to 135 at time 1 and from 87 to 143 at time 2, with averages of 112 and 113 at times 1 and 2, respectively, and a tight correlation across testing points (r = 0.79; P < 0.001). Our interest was in the considerable variation observed between testing points at the individual level, which ranged from −20 to +23 for VIQ, −18 to +17 for PIQ and −18 to +21 for FSIQ. Even if the extreme values of the published 90% confidence intervals are used on both occasions, 39% of the sample showed a clear change in VIQ, 21% in PIQ and 33% in FSIQ. In terms of the overall distribution, 21% of our sample showed a shift of at least one population standard deviation (15) in the VIQ measure, and 18% in the PIQ measure. However, only one participant had a shift of this magnitude in both measures, and, for that participant, one measure showed an increase and the other a decrease. This pattern is reflected in the absence of a significant correlation between the change in VIQ and the change in PIQ. The independence of changes in these two measures allows us to investigate the effect of each without confounding influences from the other.

... Using regression analysis, we studied the brain changes associated with a change in VIQ, PIQ or FSIQ (see Methods Summary for details). The results (Fig. 1) showed that changes in VIQ were positively correlated with changes in grey matter density (and volume) in a region of the left motor cortex that is activated by the articulation of speech [10]. Conversely, changes in PIQ were positively correlated with grey matter density in the anterior cerebellum (lobule IV), which is associated with motor movements of the hand


Steve Sailer said...

They should give both different kinds of IQ tests at the beginning and at the end, then take the average scores. There's a fair amount of random variation. 

LaurentMelchiorTellier said...

[[ IQ is generally considered to be stable across the lifespan.... Neuroimaging allows us to test whether unexpected longitudinal fluctuations in measured IQ are related to brain development. Here we show that verbal and non-verbal IQ can rise or fall in the teenage years ]]

I'd be happier if they'd removed the word "unexpected". N-space of 33 developing teen-brains, a high degree of development/random flux should not be unexpected... 

The formulation confuses two take-home messages: 
a) IQ is unexpectedly unstable
b) IQ instability (which is to be expected...) is correlated with changes in brain structure (good science)

The title and first paragraph lend themselves unfortunately to a), enough that I've now had the study forwarded to me twice by people who took home message a), rather than message b). Inferring somewhat from a) that IQ is a bit pseudo-measurement-ish, HBD is wobbly, etc. 

As Steve mentions, minimizing the random variation would have minimized a).

Justin Loe said...

This is an interesting paper but an n = 33 simply isn't very credible for me. Too many MRI and fMRI results with this sample size have not been replicated (including one of my own projects). The results are worth pondering, but until they're replicated I think they should be viewed with caution.
