Holistic evaluation of applicants = noise? (Or worse?)
Why no evidence-based admissions?
Is there any serious study of whether the subjective criteria used to judge applicants actually predict success? In psychometrics this property is referred to as test validity. In the article below, it is not even clear that the evaluation method satisfies the weaker criteria of consistency or stability: the same applicants, passed through the system a second time, might receive significantly different scores. "Expert" evaluation often reduces predictive power relative to simple algorithms.
See Data mining the university and Nonlinear psychometric thresholds for physics and mathematics for plenty of evidence of validity, consistency and stability of traditional measures of intellectual ability.
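To make the distinction between consistency and validity concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the data are synthetic, the function names (`pearson`, `simulate`) and the noise levels are hypothetical, and nothing here is drawn from Berkeley's actual process. Each applicant has a latent ability; a reader's score is that ability plus noise; the later outcome is ability plus independent noise. Consistency is the correlation between two independent readings of the same applicants, and validity is the correlation between one reading and the outcome.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(noise_sd, n=5000, seed=0):
    """Return (consistency, validity) for a reader with the given noise level.

    All quantities are synthetic: ability ~ N(0, 1), each reading is
    ability + N(0, noise_sd), and the outcome is ability + N(0, 1).
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    first_read = [a + rng.gauss(0, noise_sd) for a in ability]
    second_read = [a + rng.gauss(0, noise_sd) for a in ability]
    outcome = [a + rng.gauss(0, 1) for a in ability]
    consistency = pearson(first_read, second_read)  # same applicants, read twice
    validity = pearson(first_read, outcome)         # reading vs. later success
    return consistency, validity


# Hypothetical noise levels: a fairly precise measure vs. a very noisy one.
precise = simulate(noise_sd=0.5)
noisy = simulate(noise_sd=2.0)

print("precise measure: consistency=%.2f validity=%.2f" % precise)
print("noisy measure:   consistency=%.2f validity=%.2f" % noisy)
```

In this toy model the noisy score is both less consistent and less predictive. That is the sense in which consistency is the weaker criterion: in classical test theory, the validity of a score cannot exceed the square root of its reliability, so an evaluation that does not reproduce itself cannot predict much of anything.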
NYTimes: A highly qualified student, with a 3.95 unweighted grade point average and 2300 on the SAT, was not among the top-ranked engineering applicants to the University of California, Berkeley. He had perfect 800s on his subject tests in math and chemistry, a score of 5 on five Advanced Placement exams, musical talent and, in one of two personal statements, had written a loving tribute to his parents, who had emigrated from India.
Why was he not top-ranked by the “world’s premier public university,” as Berkeley calls itself? Perhaps others had perfect grades and scores? They did indeed. Were they ranked higher? Not necessarily. What kind of student was ranked higher? Every case is different.
The reason our budding engineer was a 2 on a 1-to-5 scale (1 being highest) has to do with Berkeley’s holistic, or comprehensive, review, an admissions policy adopted by most selective colleges and universities. In holistic review, institutions look beyond grades and scores to determine academic potential, drive and leadership abilities. Apparently, our Indian-American student needed more extracurricular activities and engineering awards to be ranked a 1.
Now consider a second engineering applicant, a Mexican-American student with a moving, well-written essay but a 3.4 G.P.A. and SATs below 1800. His school offered no A.P. He competed in track when not at his after-school job, working the fields with his parents. His score? 2.5.
Both students were among “typical” applicants used as norms to train application readers like myself. And their different credentials yet remarkably close rankings illustrate the challenges, the ambiguities and the agenda of admissions at a major public research university in a post-affirmative-action world.
[ Despite Prop. 209, the nearly equal scores of these "typical" training cases suggest outcomes very similar to those produced by explicit affirmative action. ]
... I could see the fundamental unevenness in this process both in the norming Webinars and when alone in a dark room at home with my Berkeley-issued netbook, reading assigned applications away from enormously curious family members. First and foremost, the process is confusingly subjective, despite all the objective criteria I was trained to examine.
In norming sessions, I remember how lead readers would raise a candidate’s ranking because he or she “helped build the class.”
... After the next training session, when I asked about an Asian student who I thought was a 2 but had only received a 3, the officer noted: “Oh, you’ll get a lot of them.” She said the same when I asked why a low-income student with top grades and scores, and who had served in the Israeli army, was a 3. ...