Monday, August 01, 2011

Predictive power of early childhood IQ

In the comments of this earlier post a father wondered to what extent one can predict adult IQ from measurements at age 5. The answer is that predictive power is fairly weak -- the correlation between a score obtained at 5 and the eventual adult score is probably no more than .5 or so. However, the main limitation seems to be unreliability of any single administration of the test to a child that young. Scores averaged over several administrations are a very good predictor already at a fairly young age. The average of three scores obtained at age 5, 6 and 7 correlates about .85 with adult score. This suggests that while it is difficult to measure a child's IQ in any single sitting, the IQ itself is relatively predictable already by age 7 or so! Of course there are the usual caveats concerning range of environments, etc. I would like to see results from larger sample sizes.

From fig 4.7 in Eysenck's Structure and Measurement of Intelligence. This is using data in which the IQ was tested *three times* over the interval listed and the results averaged. A single measurement at age 5 would probably do worse than what is listed below. Unfortunately there are only 61 kids in the study.

age range (three tests averaged)    correlation with adult score

42, 48, 54 months                   .55
5, 6, 7 years                       .85
8, 9, 10 years                      .87
11, 12, 13 years                    .95
14, 15, 16 years                    .95

The results do suggest that g is fixed pretty early and the challenge is actually in the measuring of it as opposed to secular changes that occur as the child grows up. That is consistent with the Fagan et al. paper cited above. But it doesn't remove the uncertainty that a parent has over the eventual IQ of their kid when he/she is only 5 years old.

Note added: I asked a psychometrician colleague about these results. He thought the correlations seemed a bit high. He looked up another study of 80 kids that appears in Bias In Mental Testing. They found a .7 correlation between scores at 7 and 17. If the score at 7 is noisy (looks like just a single measurement in this study) then the repeat measurement used above might raise the correlation slightly (e.g., by 10 percent?), so I think these results are not entirely inconsistent with each other. Also note I read the numbers above from a graph in a small figure, so there is some uncertainty in the values I reported.

48 comments:

  1. MtMoru 1:51 PM

    What was the test? Should 3 administrations make such a difference? If there is a normal error distribution around a "true score", its SD is reduced by only a factor of sqrt(3).

    How about a result that isn't from Hans. I think he's a zoophile.

  2. reservoir_dogs 1:56 PM

    This opens up another question. It has been said that one cannot fake an IQ test, so someone who tests at 150 is really demonstrating the ability of an IQ of 150. How is that different for kids?

  3. It could simply be a fluctuation (luck). That's not the same as faking.

  4. IIRC it wasn't Eysenck's research. He just quotes the result in the book.

  5. MtMoru 2:18 PM

    "It could be the kid is learning how to focus on the test and so the score better reflects ability."

    So "ability" isn't just naive ability, it's also the ability to improve?

    I'm sure you've had the same thought regarding athletics or MMA. Some may respond to training so well that they surpass their naive betters.

  6. The point is that "ability" is even less well defined when you are talking about a 6 year old kid whose mind easily wanders off from the test he/she is taking. This problem goes away as they get older, but repeat administrations might also help.

  7. MtMoru 2:36 PM

    "...whose mind easily wanders off from the test he/she is taking. This problem goes away as they get older..."

    But if there is improvement in focus, scores should improve for EVERYBODY. Did they?

  8. I don't have the paper so I can't say.

  9. reservoir_dogs 3:28 PM

    Would not the fluctuation apply to adults as well as kids? What is special about kids that gives a higher standard deviation?

  10. Have you ever tried getting a young kid (e.g., 6 years old) to do something you want them to do? Say for a full hour?

  11. reservoir_dogs 3:53 PM

    I concede that if a score is low, it could be due to such noise. If a kid scores a high one, shouldn't that indicate a true measure of ability?

  12. I would add that "the problem goes away as they get older ..." is optimistic.  The development of executive function is distinct from intellectual development, and highly variable.  Executive function development outcomes are a spectrum.

  13. Dawg_from_Hell 2010 4:28 PM

    High and low are relative. The noise from smart, inattentive kids inflates the IQs of others.

  14. Sometimes people get the correct answer by guessing. So, for example, some amount of luck is involved in any multiple choice test.

  15. MtMoru 6:37 PM

    I think the multiple testing should have the following effect:

    rho increases with more tests up to the limit of the rho for the "true score" and the adult score (or were there three of those too?).

    sqrt(sigma(T)^2+sigma(error)^2) / sqrt(sigma(T)^2+(sigma(error)^2)/(number of admins)) 

    It's not much of an effect, so maybe it's an example of low M.
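
The ratio in the comment above can be derived from a classical true-score model. This is only a sketch under assumptions that are not stated in the thread: each childhood score is X_i = T + E_i, the errors E_i are independent with equal variance, and they are uncorrelated with the adult score A.

```latex
\begin{align*}
\bar X_n &= T + \frac{1}{n}\sum_{i=1}^{n} E_i, \qquad
\operatorname{Var}(\bar X_n) = \sigma_T^2 + \frac{\sigma_E^2}{n},\\
\rho(\bar X_n, A) &= \frac{\operatorname{Cov}(T, A)}
  {\sigma_A \sqrt{\sigma_T^2 + \sigma_E^2/n}},\\
\frac{\rho(\bar X_n, A)}{\rho(X_1, A)} &=
  \frac{\sqrt{\sigma_T^2 + \sigma_E^2}}{\sqrt{\sigma_T^2 + \sigma_E^2/n}}
  \;\xrightarrow[\,n\to\infty\,]{}\; \sqrt{1 + \sigma_E^2/\sigma_T^2}.
\end{align*}
```

Under these assumptions the gain from averaging is capped at 1/sqrt(reliability) of a single test, so it is large only when a single administration is very noisy.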
     

  16. ben_g 7:00 PM

    An argument for early interventions? From what I've read, most interventions have a temporary effect, but it's definitely worth experimenting to see if a lasting effect can be made in this very malleable time.

  17. What values do you have in mind for sigma(T) and sigma(error)? If the latter term dominates then the increase in correlation is big = sqrt(3).

  18. MtMoru 9:00 PM

    For reliability .5, rho would go from .69 (one test) to .85 (three tests), and to .98 in the limit of many administrations.
    For reliability .7, from .76 to .85, with a limit of .91.
    For reliability 1, it stays at .85 throughout.

    This might not make sense, but I don't know how to say it. For greater reliability, less of the .85's not being 1 can be explained by error, so more is explained by the true childhood score not being the true adult score; hence the limit of rho decreases as reliability goes up.
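
The three rows above can be reproduced with a few lines of Python. This is a sketch rather than the commenter's own calculation: it assumes the three administrations are parallel tests (so the Spearman-Brown formula gives the reliability of their average) and that measurement error is the only thing attenuating the correlation. The function names are made up for illustration.

```python
import math

def averaged_reliability(r1: float, n: int) -> float:
    """Spearman-Brown: reliability of the mean of n parallel tests."""
    return n * r1 / (1 + (n - 1) * r1)

def implied_correlations(rho_n: float, r1: float, n: int = 3):
    """Given the observed n-test correlation, return (single-test rho, limiting rho)."""
    r_n = averaged_reliability(r1, n)
    rho_limit = rho_n / math.sqrt(r_n)      # correlation for an error-free child score
    rho_single = rho_limit * math.sqrt(r1)  # what one administration would give
    return rho_single, rho_limit

for r1 in (0.5, 0.7, 1.0):
    single, limit = implied_correlations(0.85, r1)
    print(f"reliability {r1:.1f}: single {single:.2f}, three-test 0.85, limit {limit:.2f}")
```

With the observed three-test correlation fixed at .85, a less reliable single test implies a higher error-free correlation, which is the point the comment is making.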

  19. Test re-test reliability is just over .95, so it seems that var(T) = .05 or a bit less is reasonable. On the other hand for a young kid the error in testing could easily be .5 SD or var(E) = .25 (not sure if I am following your notation properly).

    Then, decreasing var(E) by 3 causes a big change in rho, which increases by

    sqrt[ (.05 + .25) / (.05 + .25/3) ] = sqrt [ .3 / .13 ] = 1.5

    So the averaging of 3 results could plausibly (using my numbers) increase a correlation of .5 to .75 or so. If var(E) were larger (say .7 SD or var(E) = .49) you could get .8 or so.

  20. MtMoru 10:03 PM

    I think either I didn't explain or don't understand.

    T is the variable for all true scores, so its SD will be like 15 points. Error is for individual scores.

    When the "global" T variance equals the error variance, reliability is .5.

  21. Sorry, I think I was confused about the terminology.

    Take var(T) = 1 (in units of population SDs) and, just for convenience, var(E) = 1 (not completely crazy for a little kid). Then reliability is 1/2. If I test 3 times it decreases var(E) to 1/3, so reliability increases to 3/4 (i.e. 1/(1 + 1/3)). Then the correlation goes up by a factor of sqrt((3/4) / (1/2)) = sqrt(3/2), which is about 22 percent. (Unless I am still confused.) Not as dramatic, but what is observed is probably a combination of the averaging and maturation of the kid.

    If var(E) = 2 for a little kid then reliability is 1/3 and the averaging increases it to 3/5. That increases the correlation by a factor of sqrt(9/5) = 1.34.
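
The arithmetic in this comment can be checked with a short Python sketch. It assumes the simple model used in the comment (score = true score + independent error, with the error variance divided by the number of tests averaged). The function names are illustrative only.

```python
import math

def reliability(var_true: float, var_error: float, n_tests: int = 1) -> float:
    """Share of the variance in the mean of n_tests scores that is true-score variance."""
    return var_true / (var_true + var_error / n_tests)

def correlation_gain(var_true: float, var_error: float, n_tests: int) -> float:
    """Factor by which the correlation with an outside criterion grows after averaging."""
    return math.sqrt(reliability(var_true, var_error, n_tests)
                     / reliability(var_true, var_error, 1))

for var_error in (1.0, 2.0):
    single = reliability(1.0, var_error)
    averaged = reliability(1.0, var_error, 3)
    gain = correlation_gain(1.0, var_error, 3)
    print(f"var(E)={var_error}: reliability {single:.3f} -> {averaged:.3f}, gain {gain:.3f}")
```

For var(E) = 1 the gain is sqrt(3/2), and for var(E) = 2 it is sqrt(9/5), matching the figures quoted above.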

  22. MtMoru 10:48 PM

    Yeah. I think that's all right. I just think reliability less than .5 is too low. But maybe not.

  23. Keep in mind that Blacks tend to mature the fastest, Whites, and then lastly Asians for any given age; this affects brain development. 

  24. lovehorrorfilms 12:34 AM

    But IQ tests given to kids often measure different parts of intelligence than IQ tests given to adults. The correlation might be even higher still if it were the same type of IQ test given at age 6 and at adulthood. 

  25. lovehorrorfilms 12:41 AM

    It's certainly possible to train people to perform better on IQ tests, but it might not be possible to increase the level of g through stimulation, education or any psychological means. According to the book "The g Factor", the preponderance of evidence suggests g is an entirely physiological variable. That's not to deny that environment plays a role, but only the biological environment.

  26. Anonymous_IV 12:44 AM

    Luck is involved in practically *any* test, including all that appear here.  Even ignoring fluctuations in the mental state of the testee, a test can do no more than sample your ability, knowledge, etc. at a (hopefully) well-distributed subset.  So there's inherent noise.  The only exceptions are test topics so circumscribed that a test can literally cover all the material (e.g. the six phrases I learned in the first week of grade-school Spanish, or the 50 US State capitals, or memorizing a poem or the first umpteen digits of π).

  27. lovehorrorfilms 12:52 AM

    It makes sense that intelligence would be stable by about age 6 since I believe the brain by this age has reached 90% of its adult size.  Also, if intelligence is stable by age 6, why does the heritability of IQ continue to increase?   How can shared environment be so relevant to IQ during childhood, but not relevant at all during adulthood when adult and childhood IQ are so correlated?

  28. lovehorrorfilms 12:55 AM

    A high score can be just as inaccurate as a low score.  A kid might overachieve on an IQ test because he is unusually persistent,  more practiced at sitting still and following instructions than other kids,  or the test may just happen to sample intellectual abilities he's good at or vocabulary words he just happens by chance to know.  

  29. lovehorrorfilms 12:59 AM

    Who says kids give a higher standard deviation???

  30. Apparently IQ stabilizes at age 6 only if you use multiple testing to beat down the error rate. If you didn't (and I suspect most twin/adoption studies do not), then there is an initially large error term that decreases with age. (Single measurements of IQ are much noisier at an early age.) This would look like increasing heritability but actually it's just a reduction in the "other" error term (usually ascribed to non-shared environment as it doesn't correlate with other variables like SES).

    If what I just wrote is nuts it's because I just spent two hours buying a new bicycle, helmet, booster seat, and assorted other stuff for my kids 8-/

  31. lovehorrorfilms 1:12 AM

    You're assuming they are taking the same test three times.  It could be they are taking three very different tests, and averaging the three IQ tests which all measure very different parts of intelligence gives an especially accurate measure of g.

  32. lovehorrorfilms 1:14 AM

    Isn't executive function a type of intellectual ability? Arguably the most important type, since it's executive and thus manages the other types.

  33. lovehorrorfilms 1:23 AM

    So you're arguing that IQ tests become more heritable with age simply because they become more reliable with age. So it's only IQ that becomes more heritable with age because it becomes a better measure of g, but g itself is not so much becoming more heritable. This is a logical theory; however, I'm aware of no evidence that IQ tests become more g loaded with age. I also think there might be a more general explanation for the rising heritability of IQ. Height and weight also become more heritable with age, and I doubt these traits are less reliably measured in kids. It would be interesting to look at brain size.

  34. Not arguing that's the whole effect, just that if you have a noisier measure of g early on it will depress the calculated heritability. There might be lots of other stuff going on...

  35. MtMoru 1:45 AM

    In case you didn't get the edit: a reliability below .465 is impossible given a three-test rho of .85, because the limit of rho would then exceed 1.

    .465 is the minimum reliability for 3 admins, which means the minimum single test adult retest rho is sqrt(.465) = .68. Is that the low rho you had in mind?
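
The .465 bound can be recovered by requiring that the error-free correlation not exceed 1 and then inverting the Spearman-Brown formula. The sketch below assumes parallel tests and pure measurement-error attenuation, which is my reading of the comment rather than something it spells out. The function name is hypothetical.

```python
import math

def min_single_test_reliability(rho_n: float, n: int = 3) -> float:
    """Smallest single-test reliability for which the error-free correlation stays <= 1."""
    r_n_min = rho_n ** 2                      # the averaged score needs at least this reliability
    return r_n_min / (n - (n - 1) * r_n_min)  # Spearman-Brown solved for the single test

r_min = min_single_test_reliability(0.85)
print(round(r_min, 3))             # minimum single-test reliability
print(round(math.sqrt(r_min), 2))  # matching single-test correlation with the adult score
```

Both printed values agree with the .465 and .68 given in the comment.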

  36. What'd you get?  We used iBert seats for those fleeting times when the kids were small.  Now we have a pair of WeeHoo's.  Pricey but worth it.

  37. I quote from the intelligence wiki:

    Many of the broad, recent IQ tests have been greatly influenced by the Cattell-Horn-Carroll theory. It is argued to reflect much of what is known about intelligence from research. A hierarchy of factors is used. g is at the top. Under it there are 10 broad abilities that in turn are subdivided into 70 narrow abilities. The broad abilities are:

    Fluid Intelligence (Gf): includes the broad ability to reason, form concepts, and solve problems using unfamiliar information or novel procedures.
    Crystallized Intelligence (Gc): includes the breadth and depth of a person's acquired knowledge, the ability to communicate one's knowledge, and the ability to reason using previously learned experiences or procedures.
    Quantitative Reasoning (Gq): the ability to comprehend quantitative concepts and relationships and to manipulate numerical symbols.
    Reading & Writing Ability (Grw): includes basic reading and writing skills.
    Short-Term Memory (Gsm): is the ability to apprehend and hold information in immediate awareness and then use it within a few seconds.
    Long-Term Storage and Retrieval (Glr): is the ability to store information and fluently retrieve it later in the process of thinking.
    Visual Processing (Gv): is the ability to perceive, analyze, synthesize, and think with visual patterns, including the ability to store and recall visual representations.
    Auditory Processing (Ga): is the ability to analyze, synthesize, and discriminate auditory stimuli, including the ability to process and discriminate speech sounds that may be presented under distorted conditions.
    Processing Speed (Gs): is the ability to perform automatic cognitive tasks, particularly when measured under pressure to maintain focused attention.
    Decision/Reaction Time/Speed (Gt): reflect the immediacy with which an individual can react to stimuli or a task (typically measured in seconds or fractions of seconds; not to be confused with Gs, which typically is measured in intervals of 2–3 minutes). See Mental chronometry.

  38. lovehorrorfilms 11:40 AM

    I'm sure executive functioning is fairly g loaded too.  g by definition influences ALL mental abilities.

  39. Allan Folz 12:32 PM

    One thing I will add, since it seems there might be more non-parents than parents on this thread, is that kids don't grow in a nice, neat linear fashion. We are all familiar with growth spurts for height. Well, the same occurs cognitively, albeit it's a much more subtle effect.

    Since IQ tests for kids are normed in comparison to the average for the child's physical age, and physical age already being a small number growing quickly in percentage terms, a kid that happens to be a month or two late in a cognitive growth spurt versus the mean is going to score lower than what their true adult score eventually will become; conversely, a kid that hits a touch earlier will score considerably higher. So I think that's one source of noise parents should be aware of when dealing with their N=1 and testing at particularly young ages.

    Also, anyone else note the coincidence that the correlation jumps right when school starts? Did our grandparents know something when deciding school should start at 5? Or, does the rigor of school focus and structure the kids' minds? If the latter, the wide-spread pre-schooling that occurs today should show the correlation jump creeping down in age.

  40. MtMoru 3:02 PM

    Because the adult test-retest reliability isn't 1 AND because there should be some change even if very small between 5, 6, 7 true score and adult true score (was everyone tested at the same "adult" age?) the 5, 6, 7 true score adult score rho should be no greater than .95.

    If it were .95 this would mean the single 5, 6, 7 score adult score rho would be AT LEAST .72 (and that for a test-retest reliability of only .55) ASSUMING that the nth test is no closer to the true score than the first.

    It's clear that:

    1. Learning the test and learning to focus on the test (what's the difference?) leads to scores closer to the true score T.

    OR

    2. This study is crap.

    I vote for 2.

    One explanation is that IQ tests aren't necessarily like the SAT, LSAT, whatever. The WISC and WAIS are partly subjective. That is, two different examiners may give two different scores. If the study wasn't at least single blind and such a test was used it's crap.

  41. It's possible the study I quoted is nuts and the correlations are too high. The second study I mention in the note added gets .7 for single measurements at 7 and 17. I'm still a bit surprised that you can predict adult IQ from a measurement at age 7 so well. I thought the correlation would be even lower.

  42. lovehorrorfilms 10:48 PM

    The study you quoted sounds more or less correct. On page 714 of "The Bell Curve" there's a simple formula for estimating the stability of IQ at different ages. This formula only works up to age 10, but the book claims that beyond age 10, the stability of IQ falls between the product of the reliabilities of the two testings and the square root of that product. Translation: if you could find a test that was perfectly reliable, IQ would be perfectly stable after age 10.

    The formula is as follows:

    correlation between IQ at different ages = (the square root of the product of the reliabilities of both tests) multiplied by (the square root of the ratio of age at the first test to age at the second test).

    So assuming two tests with perfect reliability, IQ at age 7 correlates 0.84 with IQ at age 10, and since IQ at age 10 correlates perfectly with adult IQ (assuming perfect reliability), true IQ at age 7 correlates 0.84 with true adult IQ
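
The formula as quoted in this comment is easy to evaluate. The sketch below takes the commenter's wording at face value; it has not been checked against the book, and the function name, the default reliabilities and the age check are assumptions added for illustration.

```python
import math

def iq_stability(age_first: float, age_second: float,
                 rel_first: float = 1.0, rel_second: float = 1.0) -> float:
    """Correlation between IQ at two ages, per the formula as quoted in the comment."""
    # The comment says the formula only applies up to age 10.
    if not 0 < age_first <= age_second <= 10:
        raise ValueError("formula is quoted as valid only for ages up to 10")
    return math.sqrt(rel_first * rel_second) * math.sqrt(age_first / age_second)

print(round(iq_stability(7, 10), 2))            # perfectly reliable tests
print(round(iq_stability(7, 10, 0.9, 0.9), 2))  # hypothetical reliabilities of .9
```

The first call reproduces the 0.84 in the comment; the second shows how quickly imperfect reliability pulls the expected correlation down.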

  43. MtMoru 2:32 AM

    "This formula only works up to age 10, but the book claims that beyond age 10, the stability of IQ falls between the product of the reliabilities of the two testings and the square root of the reliabilities.  Translation:  if you could find a test that was perfectly reliable, IQ would be perfectly stable after age 10."

    The under-age-10 formula is an approximate fit to data which may change. It's purely empirical.

    The square root of the product of the reliabilities IS the correlation between a test and a much later test only if it is ASSUMED that the true score is fixed at age 10.

  44. lovehorrorfilms 4:12 AM

    "The square root of the product of the reliabilities IS the test much
    later test correlation only if it is ASSUMED that the true score is
    fixed at age 10."

    It is not assumed according to this source; they are saying it's a fact that the correlation between IQ at different ages is between the product of the reliabilities and its square root once people are older than 10. That means that if a test at two ages has a perfect reliability of 1.0, then the product is 1.0, and the square root is 1.0. Stability falls between 1.0 and 1.0. In other words, perfect.

  45. MtMoru 4:44 PM

    That fact is impossible unless the true score is absolutely fixed.

    "IQ at different ages is between the product of the reliabilities and its square root once people are older than 10."

    That's either redundant or wrong. The only way the correlation between testings 1 and 2 can be the square root of one or the other is if one of them has perfect reliability, presumably testing 2.

    Derive the formula yourself. It's just plug and chug.

  46. lovehorrorfilms 12:56 AM

    "That fact is impossible unless the true score is absolutely fixed."

    Yes, that's the point. True IQ is fixed after age 10 or so, "The Bell Curve" claims. I imagine it destabilizes again in old age, however, but I don't know what the data would show. The stability of IQ should not be that surprising when one considers the fact that your IQ doesn't just measure your current ability, but your past ability as well (acquired vocabulary, for example).
