Thursday, February 27, 2014

Correlation and Variance

In social science a correlation of R = 0.4 between two variables is typically considered a strong result. For example, both high school GPA and SAT score predict college performance with R ~ 0.4. Combining the two, one can achieve R ~ 0.5 to 0.6, depending on major. See Table 2 in my paper Data Mining the University.

It's easy to understand why SAT and college GPA are not more strongly correlated: some students work harder than others in school, and effort level is largely independent of SAT score. (For psychometricians, Conscientiousness and Intelligence are largely uncorrelated.) Also, it's typically students in the upper half or quarter of cognitive ability relative to the general population that earn college degrees. If the entire range of students were enrolled in college the SAT-GPA correlation would be higher. Finally, there is, of course, inherent randomness in grading.

The figure below, from the Wikipedia entry on correlation, helps to visualize the meaning of various R values.

I often hear complaints of the type: "R = 0.4 is negligible! It only accounts for 16% of the total variance, leaving 84% unaccounted for!" (The fraction of variance unaccounted for is 1 - R^2.) This kind of remark even finds its way into quantitative genetics and genomics: "But the alleles so far discovered only account for 20% of total heritability! OMG GWAS is a failure!"

This is a misleading complaint. Variance is the average of squared deviations, so it does not even carry the same units as the quantity of interest. Variance is a convenient quantity because it is additive for uncorrelated variables, but it leads to distorted intuition for effect size: SDs are the natural unit, not SD^2!

A less misleading way to think about the correlation R is as follows: given X,Y from a standardized bivariate distribution with correlation R, an increase in X leads to an expected increase in Y: dY = R dX. In other words, students with +1 SD SAT score have, on average, roughly +0.4 SD college GPAs. Similarly, students with +1 SD college GPAs have, on average, SAT scores of about +0.4 SD.
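The dY = R dX rule can be checked with a quick simulation. This is only a sketch: it assumes the standardized pair is bivariate normal, and the helper names (simulate_standardized_pair, least_squares_slope, mean_y_near) are made up for illustration.

```python
import math
import random

def simulate_standardized_pair(r, n, seed=0):
    """Draw n (x, y) pairs from a standardized bivariate normal with correlation r."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)  # independent standardized noise
        y = r * x + math.sqrt(1.0 - r * r) * e  # keeps Var(y) = 1 and Corr(x, y) = r
        pairs.append((x, y))
    return pairs

def least_squares_slope(pairs):
    """Ordinary least-squares slope of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

def mean_y_near(pairs, x0, half_width=0.1):
    """Average y among points whose x lies within half_width of x0."""
    ys = [y for x, y in pairs if abs(x - x0) <= half_width]
    return sum(ys) / len(ys)

pairs = simulate_standardized_pair(r=0.4, n=200_000, seed=1)
slope = least_squares_slope(pairs)
y_at_plus_one = mean_y_near(pairs, 1.0)
print(round(slope, 2), round(y_at_plus_one, 2))
```

Both printed numbers should land close to 0.4: the fitted slope, and the average Y among people sitting near +1 SD on X.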

Alternatively, if we assume that Y is the sum of (standardized) X and a noise term (the sum rescaled so that Y remains standardized), the standard deviation of the noise term is given by sqrt(1 - R^2)/R ~ 1/R for modest correlations. That is, the standard deviation of the noise is about 1/R times that of the signal X. When the correlation is 1/sqrt(2) ~ 0.7 the signal and noise terms have equal SD and variance. ("Half of the variance is accounted for by the predictor X"; see for comparison the figure above with R = 0.8.)
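The noise-to-signal ratio quoted above follows in a few lines. A sketch, assuming X and the noise E are independent and each standardized; the symbol s for the noise SD relative to X is introduced here for illustration.

```latex
\begin{align}
Y &= \frac{X + sE}{\sqrt{1+s^2}}, \qquad
  \operatorname{Var}(X) = \operatorname{Var}(E) = 1, \quad \operatorname{Cov}(X,E) = 0 \\
\operatorname{Var}(Y) &= \frac{1 + s^2}{1 + s^2} = 1, \qquad
  R = \operatorname{Cov}(X,Y) = \frac{1}{\sqrt{1+s^2}} \\
1 + s^2 &= \frac{1}{R^2}
  \quad\Longrightarrow\quad
  s = \frac{\sqrt{1-R^2}}{R}
\end{align}
```

For R = 1/sqrt(2) this gives s = 1 (equal signal and noise SD). For R = 0.4 it gives s of about 2.3, against the rough estimate 1/R = 2.5.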

As another example, test-retest correlations of SAT or IQ are pretty high, R ~ 0.9 or more. What fluctuations in score does this imply? In the model above the noise SD = sqrt(1 - 0.81)/0.9 ~ 0.5, so we'd expect the test score of an individual to fluctuate by about half a population SD (i.e., ~7 points for IQ or ~50 points per SAT section). This is similar to what is observed in the SAT data of Oregon students.
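The arithmetic in the test-retest example is easy to reproduce. A minimal sketch, assuming an IQ scale with SD 15 and an SAT section with SD of about 100 (both scale SDs are my assumptions; the function name is made up).

```python
import math

def noise_sd_relative_to_signal(r):
    """Noise SD relative to the signal X in the post's model: sqrt(1 - r^2) / r."""
    return math.sqrt(1.0 - r * r) / r

ratio = noise_sd_relative_to_signal(0.9)
iq_points = ratio * 15    # assumed IQ scale: SD = 15
sat_points = ratio * 100  # assumed SAT section scale: SD ~ 100
print(round(ratio, 2), round(iq_points, 1), round(sat_points))
# prints: 0.48 7.3 48
```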

I worked this out during a boring meeting. It was partially stimulated by this article in the New Yorker about training for the SAT (if you go there, come back and read this to unfog your brain), and activist nonsense like this. Let me know if I made mistakes ...  8-)

tl;dr Go back to bed. Big people are talking.

Peter Connor said...

There is also the fact that GPA at less selective colleges is not comparable to top 50 schools.

chartreuse1737 said...

"...some students work harder than others in school, and effort level is largely independent of SAT score..."

proving once again that those whom "the system" prefers (http://www.diclib.com/dignify/show/en/soulesynonyms/D/1331/600/0/0/4387#.Uw_3_M7vjB0 sense 2), believe in the system, OR are the sort who believe whatever is most expedient to believe. (btw, the system isn't the result of any conspiracy. its rules haven't been determined by an elite group. and the system of one country is never the same as another's.)

Jay Ham said...

"When the correlation is sqrt(2) ~ 0.7 the signal and noise terms have equal SD and variance."

I'm confused. sqrt(2) !~ 0.7

chartreuse1737 said...

steve is confused. the fraction of variance explained OVERESTIMATES the quantity he's interested in. if 16% of the variance is explained this means only sqrt(.84) of the sd is "explained".

"omg! omg!" he said, "as giddy as a school girl", has steve realized that pearson's rho is merely the slope of the line defined by the conditional expectation for a bivariate elliptical distribution? that's math stats 400.

chartreuse1737 said...

this guy's got steve pegged.

http://www.karger.com/Article/PDF/96532

Dane said...

That can't be right. sqrt(.84) > 90%... Which means 16% of explained variance explains >90% of std deviation?
I think you meant sqrt(.16) = 40% which is more reasonable and exactly what Steve was getting at.

chartreuse1737 said...

one minus sqrt(.84).

Dane said...

No fair changing your original post!
Anyway, I see what you're getting at, but can we agree that "percent of sd explained" is an incoherent measure? By your logic, the noise term explains 1-sqrt(.16) = 60% of the sd... so together that only explains 68.3% of the sd... even though the signal and the noise put together get you back to the variable you're trying to explain.

chartreuse1737 said...

of course, if one subscribes to the true score plus error theory of iq tests then the twin-twin correlation needn't be squared to give the % of variance explained.

maybe steve would like to "hold forth" on true score vs error gpa.

chartreuse1737 said...

right. it was steve who seemed to be insisting on % of sd explained. btw, i changed the post before i left the page. maybe it showed up for me before it showed up on the blog?

if what you want is how much the sd is reduced when a certain variable is controlled for then this is:

1 - sqrt(1 - rho^2)
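To put numbers on the two measures being argued over in this thread, here is a small sketch comparing the fraction of variance explained (rho^2) with the fractional reduction in residual SD (the formula just above). The function names are made up for illustration.

```python
import math

def fraction_variance_explained(rho):
    """rho^2: share of the variance of Y accounted for by X."""
    return rho * rho

def fraction_sd_reduction(rho):
    """1 - sqrt(1 - rho^2): fractional drop in the SD of Y once X is controlled for."""
    return 1.0 - math.sqrt(1.0 - rho * rho)

for rho in (0.4, 0.7, 0.9):
    print(rho,
          round(fraction_variance_explained(rho), 2),
          round(fraction_sd_reduction(rho), 3))
```

For rho = 0.4 the residual SD drops by only about 8%, even though the expected shift in Y per SD of X is 0.4 SD. The two numbers answer different questions.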

steve hsu said...

That was a typo! Thanks for catching it.

steve hsu said...

Y = college GPA, RX = SAT score, both regarded as standardized variables. (That's the natural way to organize the student data: z-score both GPA and SAT.) Then the relative SD of E (i.e., the noise for someone wanting to use SAT to predict GPA) compared to the standardized variable RX is what I wrote in the post.

nooffensebut said...

"’But the alleles so far discovered only account for 20% of total heritability! OMG GWAS is a failure!’"

GWAS Jihadists are victims of the monster they created. Let’s review. Sample-size funnel charts assume that exclusion criteria are never value-added. Increasing sample size is not the only way to improve research fidelity, and relatively low sample-size lines of research (like fMRI research) are not worthless. Assuming all human behaviors are “complex” was a mistake. It does not follow that expression levels of all molecules have complex etiologies just because HMG-CoA reductase does. Some people are making the leap from “there is no gene for” to “there is no molecule for,” despite neurotransmitters and psychiatry. There is nothing unscientific about raising a hypothesis without extraordinary evidence. Lastly, if you are going to attack an extensive line of specific candidate-gene research, at least get your facts straight (instead of spreading a racist typo like Steven Pinker, Adrian Raine, and super-genius John Horgan did in their books).

Richard Seiter said...

Does anyone actually believe either nature or nurture exists in isolation? I'm having trouble recalling a case where someone has said that (excepting strawman arguments, where it is common).

ben_g said...

Interesting Stack Exchange (Cross Validated) question: Does Causation Imply Correlation? http://stats.stackexchange.com/questions/26300/does-causation-imply-correlation

chartreuse1737 said...

take a deep breath rs. this is an implicit assumption behind behavioral genetics. if there were always an environment, barring the pathological cases, where a given genome would be a world beater, behavioral genetics would just be stupid.

i don't doubt that every human culture has a word which can be translated approximately into "intelligence", but what is that intelligence in that context? do the same alleles which reduce or boost this trait in ottawa reduce or boost it in port moresby?

but traits like height and eye color mean the same thing everywhere and everywhen.

perhaps you missed my post on how the most heritable psychiatric condition, scz, simply doesn't exist in melanesia. http://www.greenmedinfo.com/article/schizophrenia-prevalence-correlated-gluten-grain-consumption-0.

the way it really is: every nation has its national game. it used to be in the 50s that all sports other than baseball were like dodgeball. but the ability to play baseball or rugby or association football or ski jump etc. are likely heritable and likely nearly unrelated. but within the particular nation/society the best of these are "sportsmen of the year".

David Coughlin said...

I don't really understand what motivated this: "Alternatively, if we assume that Y is the sum of (standardized) X and a noise term (the sum rescaled so that Y remains standardized), the standard deviation of the noise term is given by sqrt(1- R^2)/R ~ 1/R for modest correlations."

Usually when I see something like Y = X + w, the assumption is that X is exact, and all of the variation of Y is due to w. Can you point me to a reference for your use of this model?

Pincher Martin said...

Is this idiot Jorgeous Jorge? He sure quacks like him.

Pincher Martin said...

" i don't doubt that every human culture has a word which can be translated approximately into "intelligence", but what is that intelligence in that context? do the same alleles which reduce or boost this trait in ottawa reduce or boost it in port moresby? that there is a word does not mean there is a thing. that there is a translation does not mean the extension is the same only the intension (not misspelled, btw).

but traits like height and eye color mean the same thing everywhere and every when."

Blah, blah, blah. When Ottawa aspires to be more like Port Moresby, then the definition for intelligence among New Guineans can govern the discussion.

But since it is Port Moresby which aspires to modern life, which wants more "cargo" - to use Diamond's definition of what New Guineans aspired to achieve - the genes which are likely to boost those traits matter for both cultures.

Ryne Sherman said...

Steve, I think you will find this article relevant. I'm sure it took Ozer much longer than your boring meeting to do all this work though (but in his defense, it was almost 30 years ago): http://psycnet.apa.org/index.cfm?fa=search.displayRecord&uid=1985-19147-001

dxie48 said...

"I often hear complaints of the type: "R = 0.4 is negligible! It only accounts for 16% percent of the total variance, leaving 84% unaccounted for!""

The OECD PISA 2012 results include 15-year-olds' self-reported familiarity with the concept of the arithmetic mean (figures below). Some of those in categories 1 to 4 might get into university. I wonder how many of them will struggle with the concepts of correlation and variance there; see, e.g., this remark from a political science professor:

http://hawaii.edu/powerkills/UC.HTM

"I wrote this book for my students at the University of Hawaii who, as I was when I was an undergraduate, were puzzled by this strange animal called the correlation coefficient and the meaning of all those numbers called correlations."

Thinking about mathematical concepts: how familiar are you with the following terms? - Arithmetic Mean
Category:
1 Never heard of it
2 Heard of it once or twice
3 Heard of it a few times
4 Heard of it often
5 Know it well, understand the concept

Variable ST62Q17, % by category:

Category   United States of America   Massachusetts (USA)   OECD Total
1          41.00                      37.79                 29.28
2          14.33                      13.90                 12.32
3          11.83                      12.39                 12.28
4          11.36                       9.54                 14.82
5          18.03                      23.09                 28.93
m           3.46                       3.28                  2.38

chartreuse1737 said...

you're the blatherer, the blah, blah, blaher. have you ever heard of diogenes (of sinope) or perhaps jorge videla?

what has "modern life" going for it other than longer life? NOTHING. nor will ANY mode of living have anything to recommend it other than, "we who live this way live longer."

life expectancy in the solomon islands is now > 70, but how many of them have the interwebs? how many have tv? they can't watch the academy awards or the superbowl. how many can eat fried coke or kfc? those poor savages.

Pincher Martin said...

what has "modern life" going for it other than longer life? NOTHING. nor has ANY mode of living anything to recommend it other than, "we who live this way live longer."

Funny. Most hunter-gatherers don't look at it that way.

life expectancy in the solomon islands is now > 70 http://en.wikipedia.org/wiki/D...

Sounds like a dream. Why don't you go live there? (You might discover those life expectancy figures are off.)

but how many of them have the interwebs? how many have tv? they can't watch the academy awards or the superbowl. how many can eat fried coke or kfc? how many can afford adhd medication for their kids? how many are obese?

You might add, how many of them have the pleasure of your intellectual company?

steve hsu said...

Y = realized GPAs, X = cognitive ability, W = effort (conscientiousness). We only measure Y and X, and W is uncorrelated with X.

Richard Seiter said...

I've taken to thinking of him as the man of many aliases. The aliases change, but the discussions remain the same. It's unfortunate that some of the alias accounts seem to be deleted which makes some of the conversations in older threads hard to follow.

Richard Seiter said...

Just because some interpreters of behavioral genetics forget that their conclusions apply only (reliably) to the specific populations researched doesn't mean that is an implicit assumption of the field. Your take on this brings to mind the cliche about babies and bathwater. The fuzzier the category the bigger the problems, and athletic ability and intelligence are both very fuzzy. If you break either of those concepts into smaller and more universal (i.e. not society dependent) concepts I think behavioral genetics has useful insights to offer. We can still squabble endlessly about which specific concepts/abilities are most important in a given environment though ;-)

chartreuse1737 said...

"idiot" has pejorated. we now prefer the term "rotter" or "roteur".

chartreuse1737 said...

no. there's no need to squabble endlessly. if there is a universal extension of "intelligence" and a gwas finds hits for it that's the end of the story. it should be clear even to the hereditist klavern that the broader the sample the fewer the hits. it's a question of how many fewer. and it's my guess, intuition, that aside from the most deleterious retard-making alleles that number will be zero however large the sample. or if there are any reproducible hits, their total effect will be near zero.

panjoomby said...

as i go up 1 std dev. on X, i go up "r" std deviations on Y.

botti said...

Richard Seiter said...

I enjoy deconstructing how people create their strawmen. In this paper: 'The hereditarian position, recall, is that the so-called "IQ gap" between Black and White Americans is due at least in large part to differences in the "races" average genetic endowments directly relevant to the development of the sorts of abilities tested on IQ tests (and related performance measures). The exact details of the proposed pathways between the average genetic differences between the so called "races" and the average differences in performance on IQ tests (and related measures) are rarely explored in the literature, but, negatively, the hereditarian hypothesis demands that racism not be a major mediating factor.'

How is that last "demands" justified? That seems like another of those claims most often found in strawman arguments (with the classic example being that a belief in the importance of nature or nurture implies a belief that the other doesn't matter).

P.S. At least the earlier paper was published under the heading "Commentary."

Richard Seiter said...

I'm curious, do you think GWAS will fail to eventually find reproducible hits for things like nerve conduction velocity, neuronal branching, and myelination? FWIW, my money is on the Tay-Sachs alleles (or one of the others in Cochran et al. 2006) as an early hit. I believe this because the heterozygote advantage would result in balancing selection (i.e. it would not go to fixation). The big question IMHO is about the effect sizes. I'm having trouble finding a reference for population prevalences of the alleles (I can't find the usual frequency tables in SNPedia), does anyone have a link?

http://www.snpedia.com/index.php/Tay-Sachs_disease
http://www.snpedia.com/index.php/Rs28940871
http://omim.org/entry/606869#606869_AllelicVariant0056
http://web.mit.edu/fustflum/documents/papers/AshkenaziIQ.jbiosocsci.pdf

chartreuse1737 said...

i would expect that genes affecting/effecting total brain volume would be hits for all populations but given that i have a big head it can't make that much of a difference ;)

klochner said...

"A less misleading way to think about the correlation R is as follows:
given X,Y from a standardized bivariate distribution with correlation R,
an increase in X leads to an expected increase in Y: dY = R dX."

R=1 means perfect correlation, not that the slope is 1. You're treating R as a regression coefficient rather than a measure of co-linearity.

Y = mX + b
dY = m dX

No?

(sorry if this is a double-post, Disqus is having issues with my account)

steve hsu said...

Note we are not talking about general X,Y but specifically: X,Y from a standardized bivariate distribution ...
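One way to spell out why standardization matters here is the following sketch. It assumes the conditional expectation of Y given X is linear (as it is for a bivariate normal) and uses m and b as in the comment above.

```latex
\begin{align}
m &= \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)} = R\,\frac{\sigma_Y}{\sigma_X},
  \qquad b = \mu_Y - m\,\mu_X \\
\mu_X = \mu_Y = 0,\ \sigma_X = \sigma_Y = 1
  \;&\Longrightarrow\; m = R,\quad b = 0,\quad \mathrm{E}[Y \mid X] = R\,X
\end{align}
```

For unstandardized variables the slope picks up the factor sigma_Y / sigma_X, so R and m differ in general; they coincide only once both variables are z-scored.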

Brian said...

I doubt many Solomon Islanders have heard of Diogenes or Videla, nor have the Islands produced people like them.

Emil Kirkegaard said...

>I'm impressed by the rebuttal of years of statistical research by people like Arthur Jensen without presenting any equations or data. Do I need a PhD after my name to do that?

Yes, a common thing among 'people of words', especially philosophers. Sesardic covers it in depth with Jensen's enemies in his great book "Making Sense of Heritability". https://www.goodreads.com/book/show/2105959.Making_Sense_of_Heritability

Emil Kirkegaard said...

Another useful way of comparing groups is to convert the point-biserial correlation into an effect size (d). A seemingly small point-biserial r, e.g. .3, corresponds to quite a large effect size.
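A sketch of that conversion, using the standard large-sample relation between a point-biserial r and Cohen's d. The function name and the default of equal group sizes (p = 0.5) are assumptions made for illustration.

```python
import math

def d_from_point_biserial(r, p=0.5):
    """Convert a point-biserial r to Cohen's d.

    p is the proportion of cases in one of the two groups (assumed 0.5 by default).
    Large-sample relation: r = d / sqrt(d^2 + 1 / (p * (1 - p))).
    """
    return r / math.sqrt((1.0 - r * r) * p * (1.0 - p))

d = d_from_point_biserial(0.3)
print(round(d, 2))
```

With equal group sizes this reduces to d = 2r / sqrt(1 - r^2), so r = .3 corresponds to a group difference of roughly 0.6 SD.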

chartreuse1737 said...

i was reading jensen the other day. he's a moron.

Richard Seiter said...

Where have I disputed the "independent of the environment" part? It seems clear that genetics and environment both have an effect. I find it hard to justify asserting no effect from either one. I can't decide if you are being intentionally obtuse and argumentative or if there really is some portion of your argument I am missing.

chartreuse1737 said...

if for every (and different) tabulae rases there is an environment peculiarly suited such that it is a world-beater why would anyone give a damn about the outcome in a particular place and time? what use would it be, if any? it's just marginalia, triviality, academic, or ideology.

i think you may be confusing the fact that environment and genes make a difference within a prevailing environment with the fact that the effect of some set of genes within one prevailing environment may be negligible in another.

Emil Kirkegaard said...

You can find a free copy of the book here: http://gen.lib.rus.ec/book/index.php?md5=3D0E1F5A4D7E49657E273A6045ACDAF0&open=0

This might also interest you: http://openpsych.net/

ronthehedgehog said...

In the model above the noise SD = sqrt(1 - 0.81)/0.9 ~ 0.5, so we'd expect the test score of an individual to fluctuate by about half a population SD (i.e., ~7 points for IQ or ~50 points per SAT section).

The test-retest mean absolute difference (md) is actually 5.35 points for a test-retest correlation of .9. It would be smallest for a score at the mean, where it would be sqrt((2/pi)*(1-.9^2))*15 = 5.22 points. It would increase to 9.5 points for a score +/- 6 SDs from the mean.

An md of 7.5 points would correspond to a correlation of .80.
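These figures can be reproduced with a short calculation. A sketch only: it assumes test and retest scores are bivariate normal with equal SD of 15, and the function names are made up for illustration.

```python
import math

SD = 15.0  # assumed IQ scale SD

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mean_abs_normal(mu, sigma):
    """E|D| for D ~ Normal(mu, sigma^2), i.e. the folded-normal mean."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu * mu / (2.0 * sigma * sigma))
            + mu * (1.0 - 2.0 * normal_cdf(-mu / sigma)))

def overall_mean_abs_diff(r, sd=SD):
    """Mean |test - retest| over the whole population: difference ~ N(0, 2(1 - r) sd^2)."""
    return mean_abs_normal(0.0, sd * math.sqrt(2.0 * (1.0 - r)))

def mean_abs_diff_given_score(r, z, sd=SD):
    """Mean |retest - test| for someone whose first score is z SDs from the mean.

    Given the first score, the retest regresses toward the mean:
    retest - test ~ N((r - 1) * z * sd, (1 - r^2) * sd^2).
    """
    return mean_abs_normal((r - 1.0) * z * sd, sd * math.sqrt(1.0 - r * r))

print(round(overall_mean_abs_diff(0.9), 2))           # whole population, r = .9
print(round(mean_abs_diff_given_score(0.9, 0.0), 2))  # first score at the mean
print(round(mean_abs_diff_given_score(0.9, 6.0), 1))  # first score 6 SDs out
print(round(overall_mean_abs_diff(0.8), 1))           # whole population, r = .8
```

Under these assumptions the four values come out near 5.35, 5.22, 9.5 and 7.6 points, in line with the numbers quoted in this comment.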