Saturday, April 03, 2010

Data Mining the University

Here is a draft of my paper with Jim Schombert on University of Oregon GPA and SAT statistics. I posted previously on this research: Cognitive thresholds, The value of hard work. Introductory slides on g, SAT and all that.

Much of our data is available in the plots here.

Why did I get interested in this stuff? Obviously, I have a long-standing interest in psychometrics. In teaching 100-level courses, both Schombert and I have been flummoxed at the large population of students who have trouble with what we would consider elementary concepts (e.g., "scaling", or even "area" or "volume"!), yet seem to be successful in their chosen major ("I just can't seem to do these problems, but I need this class to graduate (fulfill a science requirement). I always get A's in English/History/Sociology/ ..."!). I'm sure every physics professor hears these things. Because I invest a lot of time in helping students, e.g., solving lots of problems during office hours, I have a pretty close view of their learning abilities. I began to wonder how students could have trouble with these basic concepts ("Didn't you have to know that for the SAT?"), yet have high GPAs in their major. In talking to Jim, who is director of the General Science program, we realized the data was actually available to investigate these questions further.

Data Mining the University: College GPA Predictions from SAT Scores


From the Conclusions:

1. SATs predict upper GPA with correlations in the 0.35 -- 0.50 range.

2. Overachievers exist in most majors, with low SAT scores but very high GPAs. These overachievers are disproportionately female.

3. Underachievers exist in all majors, with high SAT scores but very low GPAs. These underachievers are disproportionately male.

4. Some majors, like math and physics, may exhibit a cognitive threshold -- mastery of the material is unlikely below an ability threshold (as measured by SAT-M), no matter how hard the student works.

5. Students at public universities, like UO, with high upper GPA (e.g., 3.7 or greater) likely have subject mastery similar to graduates of elite universities. Elite college students who transferred to a state university would likely average upper division GPAs of 3.7 or greater.

Figure 8 caption: Underachievers and Overachievers (red = females, blue = males) isolated by SAT and upper GPA (1.25 standard deviations from the green ridgeline). The overachievers are mostly female students (64%) and the underachievers are mostly male students (79%).
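
The selection described in the Figure 8 caption can be sketched roughly as follows. This is only an illustration, not the procedure used in the paper: it assumes the "ridgeline" can be approximated by an ordinary least-squares line of upper GPA on SAT, and that "1.25 standard deviations" refers to the spread of the GPA residuals around that line. The function names and the synthetic data are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit y = a + b*x (assumed stand-in for the ridgeline)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def flag_outliers(sat, gpa, k=1.25):
    """Return indices of students more than k residual SDs above/below the line."""
    a, b = fit_line(sat, gpa)
    resid = [g - (a + b * s) for s, g in zip(sat, gpa)]
    # Residuals of a least-squares fit average to zero, so this is their SD.
    sd = (sum(r * r for r in resid) / len(resid)) ** 0.5
    over = [i for i, r in enumerate(resid) if r > k * sd]    # GPA well above the line
    under = [i for i, r in enumerate(resid) if r < -k * sd]  # GPA well below the line
    return over, under

# Synthetic data for illustration only; these are NOT the UO records.
rng = random.Random(0)
sat = [rng.uniform(800, 1500) for _ in range(500)]
gpa = [min(4.0, max(0.0, 1.5 + 0.0015 * s + rng.gauss(0, 0.45))) for s in sat]

over, under = flag_outliers(sat, gpa)
print(len(over), "overachievers and", len(under), "underachievers flagged")
```

With real records one would then tabulate gender within each flagged group to get percentages like those in the caption.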

24 comments:

  1. kakarot 3:39 AM

    There's a huge "wait man remember this" component.

    Chicks often enter bullshit GPA inflation studies such as liberal arts courses, where everyone is handed a 4.0 for kissing ass, trying to tell the teacher what they want to hear, not being pissed off about it, and avoiding any math fields like the plague.

    Guys don't do that.

    That probably explains half of this overachiever/GPA stuff. The other half, yeah I get it, guys are slackers, girls are suckups who do what they are told, and like the "sit down and study" part of school more.

  2. Dave Backus 10:19 AM

    This is great stuff -- both the results and the open discussion and debate it generates. Are there similar studies for other schools?

  3. Dave,

    I have not seen similar studies but they probably exist. Until recently such studies would have been big projects requiring significant cooperation from the administration. Now that so much data is web-accessible, a couple of enterprising professors with some programming skills can just grab it and analyze it in their spare time! The biggest data sets I've seen analyzed are from projects in which ETS cooperates with a bunch of universities (i.e., validity studies of the SAT) or from UC's recent study of its own admissions practices. The papers by Kuncel that we reference make use of that data. But in the literature it is still often claimed that the SAT has little predictive power -- e.g., correlations as low as .2 or .25 are sometimes quoted for freshman GPA (visit the site http://fairtest.org for this kind of propaganda). Obviously, freshmen self-select into easier and harder classes so freshman GPA is hardly the best indicator for the SAT. But I've never seen studies done subject by subject, focusing on upper division courses.

    If I were to continue in this direction I would be interested in surveying the under/overachievers to understand them better, and I would like a lot more data to get at cognitive thresholds for different subjects. The initial response from real psychometricians is positive -- apparently there isn't much good data on thresholds!

  4. catperson 2:02 PM

    I don’t believe in thresholds. I believe that g (or more specific aptitudes) has predictive validity in all domains and that the lower your g the less likely you are to succeed in anything (especially g loaded activities), but I believe such relationships tend to be linear for the full range of ability and achievements. I’m sure there are exceptions but I’ve never seen any convincing evidence, so why violate Occam’s razor by asserting a threshold when a simple linear model explains all that needs explaining? Even with a simple linear model you’ll eventually reach a point on the graph where the probability of failure exceeds 90%, but to call this point a threshold is arbitrary & superfluous.
    Further, anyone with an IQ of 100 (2010 norms) is a genius by the standards of world history. The world average is only 90 (2010 norms) and that’s only because 20th century nutrition has driven human brain size and development to stratospheric heights. A century ago the world average was about 70, and 200,000 years ago (before modern evolution) it was in the 40s.
    It’s not that average IQ people can’t learn university physics, it’s that tiny differences in IQ translate into ENORMOUS differences in learning speed, but this is true at all IQ levels; it doesn’t suddenly become true above certain thresholds.

  5. catperson 2:17 PM

    On the subject of thresholds, Jensen once stated that there were four thresholds from which IQ gets almost all of its significance.

    1. Can or can not attend a regular school (IQ 50)
    2. Can or can not master the traditional subject matter of elementary school (IQ 75)
    3. Can or can not handle college prep courses (IQ 105)
    4. Can or can not get university grades high enough for grad school (IQ 115)

    The problem is Jensen never explained how these numbers were arrived at, or more importantly, WHEN they were arrived at, and with the Flynn Effect moving the IQ scale at a rate of 2 (or 3) points per decade, an IQ of 95 today equals an IQ of about 115 in the early 20th century. That's why we shouldn't be surprised when we routinely discover entire universities with an average IQ of 85 with only the math students averaging 100.

  6. Catperson,

    I'm not sure the existence of a threshold (as we define it) implies the nonlinearity in learning rate with g that you don't like. If the work factor for learning physics or pure math increases linearly with lower g, eventually mastery may require more than, e.g., 60 hours per week for the standard course load. Students in this situation, even the most dedicated, would probably then switch majors or end up with a lower GPA (< 3.5). The probability of mastery (GPA > 3.5) as a function of SAT-M might vary linearly but hit zero (or close to it) around SAT-M = 600. The point is that there is an upper cutoff on the number of hours per week that someone can invest.
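
The time-cap argument above can be made concrete with a toy model. This is a minimal sketch under stated assumptions, not a fit to any data: the function names are hypothetical, and the parameters (20 hours per week needed at SAT-M 800, 20 additional hours for every 100 points below that) are made up, chosen only so that the cutoff lands near the SAT-M = 600 figure quoted above.

```python
def hours_needed(sat_m, hours_at_800=20.0, extra_hours_per_100=20.0):
    """Weekly hours needed for mastery, rising linearly as SAT-M falls.

    Both parameters are illustrative assumptions, not measured values.
    """
    return hours_at_800 + extra_hours_per_100 * (800 - sat_m) / 100.0

def mastery_feasible(sat_m, weekly_cap=60.0):
    """True if the required hours fit under the cap on weekly study time."""
    return hours_needed(sat_m) <= weekly_cap

def threshold_sat_m(weekly_cap=60.0, hours_at_800=20.0, extra_hours_per_100=20.0):
    """SAT-M at which the required hours exactly equal the weekly cap."""
    return 800 - 100.0 * (weekly_cap - hours_at_800) / extra_hours_per_100

for score in (750, 650, 550):
    print(score, hours_needed(score), mastery_feasible(score))
print("cutoff:", threshold_sat_m())
```

The work required is linear in SAT-M everywhere, yet a sharp cutoff still appears because the number of hours available per week is bounded. Under these made-up parameters it falls at 600.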

    Note, though, as someone who has tried to teach basic physics 101 concepts to a variety of students, I would be surprised if learning rates weren't nonlinear in g.

  7. catperson 3:21 PM

    I think I see what you're saying. Two variables could have a linear relationship in theory, but in practice there are limitations to the full expression of this relationship. In any event, linear was probably not the best word. What I really object to is the notion of thresholds, mostly because they're so vaguely defined, there are always freak exceptions (e.g., an NBA player who is only 5'5"), and because the Flynn Effect implies that even people of mediocre intelligence are very bright by the standards of history.

    With respect to learning rates not being linear, there was a fascinating article in the journal Prometheus which hypothesized that the ability to learn or solve g loaded problems doubles every 10 IQ points (the author later revised it to every 5 IQ points) so only a 1 SD difference in IQ will allow one to learn physics nearly TEN TIMES faster. The explanation given was that the human brain operates in parallel. I wonder if this also explains the enormous non-linear inequality in wealth.
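
The arithmetic behind the doubling claim is easy to check. The sketch below assumes 1 SD = 15 IQ points (the usual convention, though it is not stated in the comment) and uses a hypothetical helper name. It simply evaluates 2 raised to the number of doubling intervals contained in the IQ gap.

```python
def speed_ratio(iq_gap, doubling_interval):
    """Learning-speed ratio if speed doubles every `doubling_interval` IQ points."""
    return 2.0 ** (iq_gap / doubling_interval)

ONE_SD = 15  # assumption: IQ scale with a standard deviation of 15 points

print(speed_ratio(ONE_SD, 10))  # original hypothesis: about 2.8x per SD
print(speed_ratio(ONE_SD, 5))   # revised hypothesis: 8x per SD
```

Under the revised 5-point doubling, a 1 SD gap gives a factor of 8, which is the "nearly ten times" figure; under the original 10-point doubling the factor is a little under 3.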

  8. catperson 3:37 PM

    Another interesting point your data may shed light on is whether the (math) SAT predicts higher GPA for the entire range of SAT scores or whether there is a point beyond which the SAT is not predictive. It has been argued (though I personally am a little skeptical of this argument) that the SAT is a good measure of g up to about IQ 135, but beyond IQ 135 the SAT does not discriminate very well with respect to g, and this may or may not be reflected in its predictive validity, depending on the g loading of the criterion at the highest levels.

  9. catperson 3:43 PM

    From a psychometric perspective, the existence of a disproportionate number of female overachievers implies that the math SAT is biased against women. When a test systematically underpredicts the achievement of an entire group, that is the technical definition of bias in the psychometric literature, or at least one of the major criteria for concluding that a test is biased. Other criteria include the tendency for test items to have a different rank order of difficulty in one group than in another.

  10. The simplest reasonable model for college performance has *two* input factors: ability and conscientiousness. Your claim of SAT bias would require that we know that the conscientiousness distribution is the same for men and women. I suspect it is not... (Women who are having trouble with physics 101 are more likely to come to office hours than men.)

  11. Jose Farrentes 7:43 PM

    Steve, in none of your posts do you address a critical hypothesis: could it be that, above a certain moderate threshold, increases in IQ have no influence on one's likelihood of getting laid? This is the true issue at hand.

  12. In your paper, you claim that SATs fluctuate very little. How do you explain the fact that I raised my SAT score from 1990/2400 to 2120/2400? My math score soared from a 620 to a 740. My Critical Reading score was constant at 690.

  13. You can see exactly how rare a 120 point gain on SAT-M was in our data by looking at figure 1. It happens, but it's rare.

    Tools like SAT can only be used *statistically* -- i.e., applied to large groups of people. For a large group of people who take the test twice the average gain would be roughly 20-40 points.

  14. Jirka Lahvicka 2:08 AM

    Re: Female overachievers
    Another explanation could be that many females underperform during high-pressure competitive tests like SAT, so they ARE actually smarter than their SAT scores show.
    See for example "Gender Gap in Admission Performance Under Competitive Pressure": http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1483802

  15. 3. Underachievers exist in all majors, with high SAT scores but
    very low GPAs. These underachievers are disproportionately male.


    I deal with one of these guys every day. Slowly but surely I will get him to be as hardworking and achievement-oriented as he is capable of being.

  16. Dave Kane 1:50 PM

    Where do you plan to submit this for publication?

  17. Any suggestions? One well-known psychometrician thought we might be able to get the threshold result into Science or Nature!

  18. Dave Kane 3:42 PM

    I am too far away from academics to offer sensible advice, except that you would want to make your (nice) graphics even prettier for Science/Nature. They love them pretty pictures!

    By the way, did the university offer any pushback on this work? That is, aren't there regulations about how professors can use the data that they can access in this way?

    Related, did you consider looking at race?

  19. We blinded ourselves to gender and race until the very end. As we were completing the paper we thought we should have a look just in case something startling was going on. Re: race there isn't anything that surprising to the cognoscenti.

    The university is OK with this as long as there is no personally identifying information in the results.

  20. Some Guy 11:47 PM

    My oldest daughter increased her SAT score by 130 points on the older 1600 point SAT test. My youngest daughter got 800 on more than one of the SAT II math tests ...

    Both are half Chinese ... Their verbal skills and some of their math skills come from their father, the math boost comes from their mother who has a very male 2D/4D ratio

  21. Some Guy 11:53 PM

    I work in the software industry. Women tend to be under represented in the high-tech end of this (which is the only part of it that I am familiar with).

    However, the patterns are interesting.

    The largest group of females in the high-tech software industry is Chinese females. The next largest group is Indian females. Then Caucasian females, but they tend to be Jews.

    Anecdotal, I am sure, but it suggests less male/female differentiation among some groups, although I still suspect that Chinese males have a greater SD than Chinese females. It would be good to see some data on that, though.

  22. catperson 12:16 AM

    Yes, I've noticed the same thing. There are very few white women in high-tech, and the few caucasoid women tend to be non-white caucasoids (East Indians & Jews). I'm pretty sure males of all races have a slightly higher mean and larger SD than their female counterparts, but it's hard to quantify in exact terms because it's different on different tests.

  23. catperson 12:21 AM

    The math boost may have also come from hybrid vigour.

  24. You claim that SATs fluctuate very little. How do you explain the fact that I raised my SAT score from 1990/2400 to 2120/2400? My math score soared from a 620 to a 740. My Critical Reading score was constant at 690.
