Physicist, Startup Founder, Blogger, Dad

Saturday, June 16, 2018

Harvard discrimination lawsuit: data show penalization of Asian-Americans on subjective personality evaluation

Harvard and Students For Fair Admissions (SFFA), which is suing Harvard over discrimination against Asian-American applicants, have released a large set of documents related to the case, including statistical analysis of records of more than 160,000 applicants who applied for admission over six cycles from 2000 to 2015.

Documents here and here. NYTimes coverage.

The following point does not require any sophisticated modeling (with inherent assumptions) or statistical expertise to understand.

Harvard admissions evaluators -- staffers who are likely under pressure to deliver a target mix of ethnicities each year -- rate Asian-American applicants far lower on subjective personality traits than do alumni interviewers who actually meet the applicants. The easiest way to limit the number of A-A admits each year would be to penalize them on the most subjective aspects of the evaluation...
SFFA Memorandum: Professor Arcidiacono found that Harvard’s admissions system discriminates against Asian-American applicants in at least three respects. First, he found discrimination in the personal rating. Asian-American applicants are significantly stronger than all other racial groups in academic performance. They also perform very well in non-academic categories and have higher extracurricular scores than any other racial group. Asian-American applicants (unsurprisingly, therefore) receive higher overall scores from alumni interviewers than all other racial groups. And they receive strong scores from teachers and guidance counselors—scores that are nearly identical to white applicants (and higher than African-American and Hispanic applicants). In sum, Professor Arcidiacono found that “Asian-American applicants as a whole are stronger on many objective measures than any other racial/ethnic group including test scores, academic achievement, and extracurricular activities.”

Yet Harvard’s admissions officials assign Asian Americans the lowest score of any racial group on the personal rating—a “subjective” assessment of such traits as whether the student has a “positive personality” and “others like to be around him or her,” has “character traits” such as “likability ... helpfulness, courage, [and] kindness,” is an “attractive person to be with,” is “widely respected,” is a “good person,” and has good “human qualities.” Importantly, Harvard tracks two different personal ratings: one assigned by the Admissions Office and another by alumni interviewers. When it comes to the score assigned by the Admissions Office, Asian-American applicants are assigned the lowest scores of any racial group. ... By contrast, alumni interviewers (who actually meet the applicants) rate Asian Americans, on average, at the top with respect to personal ratings—comparable to white applicants and higher than African-American and Hispanic applicants.
From the Crimson:
The report found that Asian American applicants performed significantly better in rankings of test scores, academics, and overall scores from alumni interviews. Of 10 characteristics, white students performed significantly better in only one—rankings of personal qualities, which are assigned by the Admissions Office. [italics added]
See also Too Many Asian Americans: Affirmative Discrimination in Elite College Admissions. (Source of figure at top; the peak in A-A representation at Harvard, in the early 1990s, coincides with external pressure from an earlier DOJ investigation of the university for discrimination.)

A very sad tweet:

For the statistically sophisticated, see Duke Professor Arcidiacono's rebuttal to David Card's analysis for Harvard. If these entirely factual and easily verified characterizations of Card's modeling (see below) are correct, the work is laughable.
Professor Card’s models are distorted by his inclusion of applicants for whom there is no reason to believe race plays any role.

As my opening report noted, there are several categories of applicants to whom Harvard extends preferences for reasons other than race: recruited athletes, children of faculty and staff, those who are on the Dean’s List or Director’s List [i.e., Big Donors], legacies, and those who apply for early admission.1 Because of the significant advantage that each of these categories confers on applicants, my report analyzed the effect of race on an applicant pool without these special categories of applicants (the baseline dataset), which allowed me to test for the effect of race on the bulk of the applicant pool that did not fall into one of these categories.2

Professor Card, however, includes all of these applicants in his model, taking the remarkable position that there is no penalty against Asian-American applicants unless Harvard imposes a penalty on every Asian-American applicant. But this is an untenable position. I do not assert that Harvard uses race to penalize Asian-American applicants who are recruited athletes, children of donors (or others identified on the Dean’s List), legacies, or other preferred categories. By including these special recruiting categories in his models, Professor Card obscures the extent to which race is affecting admissions decisions for all other applicants.

Professor Card further exacerbates this problem by including in his calculations the large majority of applicants whose characteristics guarantee rejection regardless of their race. Harvard admits a tiny fraction of applicants – only five or six percent in recent years. This means that a huge proportion of applicants have no realistic chance of admission. If an applicant has no chance of admission, regardless of his race, then Harvard obviously does not “discriminate” based on race in rejecting that applicant. Professor Card uses this obvious fact to assert that Harvard does not consider race at all in most of its admissions decisions. Further, he constructs his models in ways that give great weight to these applicants, again watering down the effect of race in Harvard’s decisions where it clearly does matter. (To put it in simple terms, it is akin to reducing the value of a fraction by substantially increasing the size of its denominator.)
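Arcidiacono's "denominator" point is easy to see with toy numbers (illustrative only, not figures from the case record): pooling in applicants who would be rejected regardless of race leaves the numerators unchanged but inflates the denominator, shrinking the measured gap.

```python
# Illustrative toy numbers (not from the case record): among competitive
# applicants, suppose group A admits at 12% and group B at 18%, a 6-point gap.
competitive = 10_000            # competitive applicants per group
sure_rejects = 30_000           # applicants with no realistic chance, per group

admits_a = int(0.12 * competitive)   # group A admits
admits_b = int(0.18 * competitive)   # group B admits

# Gap measured on the competitive pool only:
gap_competitive = admits_b / competitive - admits_a / competitive

# Gap measured after pooling in sure-rejects (same admits, bigger denominator):
total = competitive + sure_rejects
gap_pooled = admits_b / total - admits_a / total

print(f"gap among competitive applicants: {gap_competitive:.3f}")  # 0.060
print(f"gap in pooled sample:             {gap_pooled:.3f}")       # 0.015
```

The same admissions decisions, measured over the padded pool, show a gap one quarter the size.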

Professor Card removes interaction terms, which has the effect of understating the penalty Harvard imposes on Asian-American applicants.

As Professor Card notes, his model differs from mine in that he removes the interaction terms. An interaction term allows the effects of a particular factor to vary with another distinct factor. In the context of racial discrimination, interaction terms are especially helpful (and often necessary) in revealing where certain factors operate differently for subgroups within a particular racial or ethnic group. For example, if a law firm singled out African-American women for discriminatory treatment but treated African-American males and other women fairly, a regression model would probably not pick up the discrimination unless it included an interaction between African-American and female.

Professor Card rightly recognizes that interaction terms should be included in a model when there is evidence that racial preferences operate differently for particular groups of applicants; yet he nonetheless removes interaction terms for variables that satisfy this condition. The most egregious instance of this is Professor Card’s decision not to interact race with disadvantaged status—even though the data clearly indicate that Harvard treats disadvantaged students differently by race.
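The law-firm example above can be made concrete with a toy regression (hypothetical outcome rates, chosen only to illustrate the point): when a penalty falls on a single subgroup, an additive model smears it across both main effects, while a model with the interaction term locates it exactly.

```python
import numpy as np

# Hypothetical outcome rates (illustrative only): a -0.2 penalty applied ONLY
# to the (group1, female) subgroup; all other cells treated identically.
#                 group1  female  rate
cells = np.array([[0,     0,      0.5],
                  [0,     1,      0.5],
                  [1,     0,      0.5],
                  [1,     1,      0.3]])   # the only subgroup penalized
g, f, y = cells[:, 0], cells[:, 1], cells[:, 2]

# Additive model (no interaction): y ~ 1 + g + f
X_add = np.column_stack([np.ones(4), g, f])
beta_add, *_ = np.linalg.lstsq(X_add, y, rcond=None)

# Model including the g*f interaction: y ~ 1 + g + f + g:f
X_int = np.column_stack([np.ones(4), g, f, g * f])
beta_int, *_ = np.linalg.lstsq(X_int, y, rcond=None)

print("additive fit:    ", beta_add.round(3))  # penalty smeared:  [0.55 -0.1 -0.1]
print("with interaction:", beta_int.round(3))  # penalty located:  [0.5 0. 0. -0.2]
```

Without the interaction, the fitted model reports a -0.1 effect on everyone in group 1 and everyone female, understating the true -0.2 penalty and misattributing it to subgroups that were treated fairly.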


Professor Card’s report changes none of my conclusions; to the contrary, given how easy it is to alter the results of his models and that my own models report the same results even incorporating a number of his controls, my opinions in this case have only been strengthened: Harvard penalizes Asian-American applicants; Harvard imposes heavy racial preferences in favor of Hispanic and African-American applicants; and Harvard has been manipulating its admission of single-race African-American applicants to ensure their admission rate approximates or exceeds the overall admission rate. Professor Card has demonstrated that it is possible to mask the true effects of race in Harvard’s admission process by changing the scope of the analysis in incorrect ways and choosing inappropriate combinations of control variables. But Professor Card cannot reach these results by applying accepted statistical methods and treating the data fairly.

Tuesday, June 12, 2018

Big Ed on Classical and Quantum Information Theory

I'll have to carve out some time this summer to look at these :-) Perhaps on an airplane...

When I visited IAS earlier in the year, Witten was sorting out Lieb's (nontrivial) proof of strong subadditivity. See also Big Ed.
A Mini-Introduction To Information Theory

This article consists of a very short introduction to classical and quantum information theory. Basic properties of the classical Shannon entropy and the quantum von Neumann entropy are described, along with related concepts such as classical and quantum relative entropy, conditional entropy, and mutual information. A few more detailed topics are considered in the quantum case.
Notes On Some Entanglement Properties Of Quantum Field Theory

These are notes on some entanglement properties of quantum field theory, aiming to make accessible a variety of ideas that are known in the literature. The main goal is to explain how to deal with entanglement when – as in quantum field theory – it is a property of the algebra of observables and not just of the states.
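The basic quantities in the first abstract above (Shannon entropy, relative entropy, mutual information) take only a few lines to compute. A minimal sketch, in bits:

```python
from math import log2

def entropy(p):
    """Shannon entropy H(p) in bits."""
    return -sum(x * log2(x) for x in p if x > 0)

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p||q) in bits."""
    return sum(x * log2(x / y) for x, y in zip(p, q) if x > 0)

def mutual_information(joint):
    """I(X;Y) = D(p(x,y) || p(x)p(y)); joint is a dict {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# A fair coin carries one bit of entropy:
print(entropy([0.5, 0.5]))                               # 1.0

# Two perfectly correlated bits share one bit of mutual information:
print(mutual_information({(0, 0): 0.5, (1, 1): 0.5}))    # 1.0
```

The von Neumann entropy generalizes the first function by replacing the probability vector with the eigenvalues of a density matrix; the subtleties Witten addresses arise when, as in field theory, no density matrix for a subregion exists.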
Years ago at Caltech, I was walking back to Lauritsen after a talk on quantum information, with John Preskill and a famous string theorist who shall remain unnamed. When I asked the latter what he thought of the talk, he laughed and said, Well, after all, it's just linear algebra :-)

Sunday, June 10, 2018

The Life of this World

From this 2011 post:
I've been a fan of the writer James Salter (see also here) since discovering his masterpiece A Sport and a Pastime. Salter evokes Americans in France as no one since Hemingway in A Moveable Feast. The title comes from the Koran: Remember that the life of this world is but a sport and a pastime ... :-)

I can't think of higher praise than to say I've read every bit of Salter's work I could get my hands on.
For true Salter fans, a new (2017; he passed in 2015) collection of previously uncollected nonfiction: Don't Save Anything: Uncollected Essays, Articles, and Profiles. I especially liked the essay Younger Women, Older Men, originally published in Esquire in 1992.

From A Sport and a Pastime.
“When did you get out of Yale?”
“I didn’t,” he says. “I quit.”

He describes it casually, without stooping to explain, but the authority of the act overwhelms me. If I had been an underclassman he would have become my hero, the rebel who, if I had only had the courage, I might have also become. ... Now, looking at him, I am convinced of all I missed. I am envious. Somehow his life seems more truthful than mine, stronger, even able to draw mine to it like the pull of a dark star.

He quit. It was too easy for him, his sister told me, and so he refused it. He had always been extraordinary in math. He had a scholarship. He knew he was exceptional. Once he took the anthropology final when he hadn’t taken the course. He wrote that at the top of the page. His paper was so brilliant the professor fell in love with him. Dean was disappointed, of course. It only proved how ridiculous everything was. ... He lived with various friends in New York and began to develop a style. ... in the end he quit altogether. Then he began educating himself.


She stoops with the match, inserts it, and the heater softly explodes. A blue flame rushes across the jets, then burns with a steady sound. There’s no other light in the room but this, which reflects from the floor. She stands up again. She drops the burnt match on the table and begins to arrange clothing on the grill of the heater, pajamas, spreading them out so they can be warmed. Dean helps her a bit. The silk, if it’s that, is quite cold. And there, back from the Vox opposite the Citroen garage, its glass doors now closed, they stand in the roaring dark. In a fond, almost brotherly gesture, he puts his arms around her. They hardly know one another. She accepts it without a word, without a movement, and they wait in a pure silence, the faint sweetness of gas in the air. After a while she turns the pajamas over. Her back is towards him. In a single move she pulls off her sweater and then, reaching behind herself in that elbow-awkward way, unfastens her brassiere. Slowly he turns her around.

From a message to a friend, who knew Salter, and asked me to articulate what I most admire about his work.
About 5 years ago I became friends with the writer Richard Ford, who offered to introduce me to his friend Salter. I was less enthusiastic to meet him than I would have been when he was younger. I did not go out of my way, and we never met.

Since he lived in Aspen, and I was often there in the summers at the Physics institute, I have sometimes imagined that we crossed paths without knowing it.

I admire, of course, his prose style. Sentence for sentence, he is the master.

But perhaps even more I admire his view of the world -- of courage, honor, daring to attempt the impossible, men and women, what is important in life.

Saturday, June 09, 2018

The Rise of AI (Bloomberg Hello World documentary)

Great profile of Geoff Hinton, Yoshua Bengio, etc., but covers many other topics.

Note to readers: I'll be at the 35th International Conference on Machine Learning (ICML 2018) in Stockholm, Sweden (July 10-15, 2018), giving a talk at the Reproducibility in ML Workshop.

Let me know if you want to meet up!

Wednesday, May 30, 2018

Deep Learning as a branch of Statistical Physics

Via Jess Riedel, an excellent talk by Naftali Tishby given recently at the Perimeter Institute.

The first 15 minutes is a very nice summary of the history of neural nets, with an emphasis on the connection to statistical physics. In the large network (i.e., thermodynamic) limit, one observes phase transition behavior -- sharp transitions in performance, and also a kind of typicality (concentration of measure) that allows for general statements that are independent of some detailed features.

Unfortunately I don't know how to embed video from Perimeter so you'll have to click here to see the talk.

An earlier post on this work: Information Theory of Deep Neural Nets: "Information Bottleneck"

Title and Abstract:
The Information Theory of Deep Neural Networks: The statistical physics aspects

The surprising success of learning with deep neural networks poses two fundamental challenges: understanding why these networks work so well and what this success tells us about the nature of intelligence and our biological brain. Our recent Information Theory of Deep Learning shows that large deep networks achieve the optimal tradeoff between training size and accuracy, and that this optimality is achieved through the noise in the learning process.

In this talk, I will focus on the statistical physics aspects of our theory and the interaction between the stochastic dynamics of the training algorithm (Stochastic Gradient Descent) and the phase structure of the Information Bottleneck problem. Specifically, I will describe the connections between the phase transition and the final location and representation of the hidden layers, and the role of these phase transitions in determining the weights of the network.

About Tishby:
Naftali (Tali) Tishby נפתלי תשבי

Physicist, professor of computer science and computational neuroscientist
The Ruth and Stan Flinkman professor of Brain Research
Benin School of Engineering and Computer Science
Edmond and Lilly Safra Center for Brain Sciences (ELSC)
Hebrew University of Jerusalem, 96906 Israel

I work at the interfaces between computer science, physics, and biology which provide some of the most challenging problems in today’s science and technology. We focus on organizing computational principles that govern information processing in biology, at all levels. To this end, we employ and develop methods that stem from statistical physics, information theory and computational learning theory, to analyze biological data and develop biologically inspired algorithms that can account for the observed performance of biological systems. We hope to find simple yet powerful computational mechanisms that may characterize evolved and adaptive systems, from the molecular level to the whole computational brain and interacting populations.

Saturday, May 26, 2018

Vinyl Sounds

Vinyl + Vacuum Tubes ... Still unsurpassed for warmth and richness of sound.

When I lived in New Haven in the 90s I took the train in on weekends to visit old friends from physics and mathematics, most of whom worked in finance. One Sunday morning in the spring I found myself with a friend of a friend, a big fixed income trader and devoted audiophile. His apartment in the Village had a large room with a balcony surrounded by leafy trees. In the room he kept only two things: a giant divan next to the balcony, on which several people at a time could recline, and the most expensive audio system I have ever seen. We spent hours listening to jazz and eating fresh cannoli with his actress girlfriend.

Off Grid Tiny Homes

This is the kind of thing I fantasize about doing after I retire :-)

Friday, May 25, 2018

Too Many Asian Americans: Affirmative Discrimination in Elite College Admissions

An updated analysis of discrimination against Asian-American applicants at elite universities. Figures below are from the paper. See also The Content of their Character: Ed Blum and Jian Li.
Too Many Asian Americans: Affirmative Discrimination in Elite College Admissions

Althea Nagai, Ph.D.

Asian Americans are “overrepresented” in certain elite schools relative to their numbers in the U.S. population. In pursuit of racial and ethnic diversity, these schools will admit some Asian American applicants but not as many as their academic qualifications would justify. As a case study, I examine three private universities and Asian American enrollment in those universities over time.

No “Ceiling” on Asian Americans at Caltech But One at MIT and Harvard.
Some basic facts: Caltech has race-blind admissions. The fraction of Asian-Americans enrolled there tends to track the growth in the overall applicant pool in recent decades. Harvard does use race as a factor, and is being sued for discrimination against Asian-Americans. The peak in A-A representation at Harvard, in the early 1990s, coincides with external pressure from an earlier DOJ investigation of the university for discrimination (dramatic race-based adjustments, revealing the craven subjectivity of holistic admissions!). Despite the much stronger and larger pool of applicants today (second figure below), A-A representation at Harvard has never recovered to those 1990s levels.

Wednesday, May 23, 2018

Dominic Cummings on Fighting, Physics, and Learning from tight feedback loops

Another great post from Dom.

Once something has become widely understood, it is difficult to recreate or fully grasp the mindset that prevailed before. But I can attest to the fact that until the 1990s and the advent of MMA, even "experts" (like boxing coaches, karate and kung fu instructors, Navy SEALs) did not know how to fight -- they were deeply confused as to which techniques were most effective in unarmed combat.

Soon our ability to predict heritable outcomes using DNA alone (i.e., Genomic Prediction) will be well-established. Future generations will have difficulty understanding the mindset of people (even, scientists) today who deny that it is possible.

The same will be true of AGI... For example, see the well-known "Chinese Room" argument against AGI, advanced by Berkeley Philosopher John Searle (discussed before in The Mechanical Turk and Searle's Chinese Room). Searle's confusion as to where, exactly, the understanding resides inside a complex computation seems silly to us today given recent developments with deep neural nets and, e.g., machine translation (the very problem used in his thought experiment). Understanding doesn't exist in any sub-portion of the network, it is embodied in the network. (See also Thought vectors and the dimensionality of the space of concepts :-)
Effective action #4a: ‘Expertise’ from fighting and physics to economics, politics and government

Extreme sports: fast feedback = real expertise

In the 1980s and early 1990s, there was an interesting case study in how useful new knowledge jumped from a tiny isolated group to the general population with big effects on performance in a community. Expertise in Brazilian jiu-jitsu was taken from Brazil to southern California by the Gracie family. There were many sceptics but they vanished rapidly because the Gracies were empiricists. They issued ‘the Gracie challenge’.

All sorts of tough guys, trained in all sorts of ways, were invited to come to their garage/academy in Los Angeles to fight one of the Gracies or their trainees. Very quickly it became obvious that the Gracie training system was revolutionary and they were real experts because they always won. There was very fast and clear feedback on predictions. Gracie jiujitsu quickly jumped from an LA garage to TV. At the televised UFC 1 event in 1993 Royce Gracie defeated everyone and a multi-billion dollar business was born.

People could see how training in this new skill could transform performance. Unarmed combat changed across the world. Disciplines other than jiu jitsu have had to make a choice: either isolate themselves and not compete with jiu jitsu or learn from it. If interested watch the first twenty minutes of this documentary (via professor Steve Hsu, physicist, amateur jiu jitsu practitioner, and predictive genomics expert).


[[ On politics, a field in which Dom has few peers: ]]

... The faster the feedback cycle, the more likely you are to develop a qualitative improvement in speed that destroys an opponent’s decision-making cycle. If you can reorient yourself faster to the ever-changing environment than your opponent, then you operate inside their ‘OODA loop’ (Observe-Orient-Decide-Act) and the opponent’s performance can quickly degrade and collapse.

This lesson is vital in politics. You can read it in Sun Tzu and see it with Alexander the Great. Everybody can read such lessons and most people will nod along. But it is very hard to apply because most political/government organisations are programmed by their incentives to prioritise seniority, process and prestige over high performance and this slows and degrades decisions. Most organisations don’t do it. Further, political organisations tend to make too slowly those decisions that should be fast and too quickly those decisions that should be slow — they are simultaneously both too sluggish and too impetuous, which closes off favourable branching histories of the future.

See also Kosen Judo and the origins of MMA.

Choking out a Judo black belt in the tatami room at the Payne Whitney gymnasium at Yale. My favorite gi choke is Okuri eri jime.

Training in Hawaii at Relson Gracie's and Enson Inoue's schools. The shirt says Yale Brazilian Jiujitsu -- a club I founded. I was also the faculty advisor to the already existing Judo Club :-)

Saturday, May 19, 2018

Deep State Update

It's been clear for well over a year now that the Obama DOJ-FBI-CIA used massive surveillance powers (FISA warrant, and before that, national security letters and illegal contractor access to intelligence data) against the Trump campaign. In addition to SIGINT (signals intelligence, such as email or phone intercepts), we now know that HUMINT (spies, informants) was also used.

Until recently one could still be called a conspiracy theorist by the clueless for stating the facts in the paragraph above. But a few days ago the NYTimes and WaPo finally gave up (in an effort to shape the narrative in advance of DOJ Inspector General report(s) and other document releases that are imminent) and admitted that all of these things actually happened. The justification advanced by the lying press is that this was all motivated by fear of Russian interference -- there was no partisan political motivation for the Obama administration to investigate the opposition party during a presidential election.

If the Times and Post were dead wrong a year ago, what makes you think they are correct now?

Here are the two recent NYTimes propaganda articles:

F.B.I. Used Informant to Investigate Russia Ties to Campaign, Not to Spy, as Trump Claims

Code Name Crossfire Hurricane: The Secret Origins of the Trump Investigation

Don't believe in the Deep State? Here is a 1983 Times article about dirty tricks HUMINT spook Stefan Halper (he's the CIA-FBI informant described in the recent articles above). Much more at the left-of-center Intercept.

Why doesn't Trump just fire Sessions/Rosenstein/Mueller or declassify all the docs?

For example, declassifying the first FISA application would show, as claimed by people like Chuck Grassley and Trey Gowdy, who have read the unredacted original, that it largely depends on the fake Steele Dossier, and that the application failed to conform to the required Woods procedures.

The reason for Trump's restraint is still not widely understood. There is and has always been strong GOP opposition to his candidacy and presidency ("Never Trumpers"). The anti-Trump, pro-immigration wing of his party would likely support impeachment under the right conditions. To their ends, the Mueller probe keeps Trump weak enough that he will do their bidding (lower taxes, help corporations and super-wealthy oligarchs) without straying too far from the bipartisan globalist agenda (pro-immigration, anti-nativism, anti-nationalism). If Trump were to push back too hard on the Deep State conspiracy against him, he would risk attack from his own party.

I believe Trump's strategy is to let the DOJ Inspector General process work its way through this mess -- there are several more reports coming, including one on the Hillary email investigation (draft available for DOJ review now; will be public in a few weeks), and another on FISA abuse and surveillance of the Trump campaign. The OIG is working with a DOJ prosecutor (John Huber, Utah) on criminal referrals emerging from the investigation. Former Comey deputy Andrew McCabe has already been referred for possible criminal charges due to the first OIG report. I predict more criminal referrals of senior DOJ/FBI figures in the coming months. Perhaps they will even get to former CIA Director Brennan (pictured at top), who seems to have lied under oath about his knowledge of the Steele dossier.

Trump may be saving his gunpowder for later, and if he has to expend some, it will be closer to the midterm elections in the fall.

Note added: For those who are not tracking this closely, one of the reasons the Halper story is problematic for the bad guys is explained in The Intercept:
... the New York Times reported in December of last year that the FBI investigation into possible ties between the Trump campaign and Russia began when George Papadopoulos drunkenly boasted to an Australian diplomat about Russian dirt on Hillary Clinton. It was the disclosure of this episode by the Australians that “led the F.B.I. to open an investigation in July 2016 into Russia’s attempts to disrupt the election and whether any of President Trump’s associates conspired,” the NYT claimed.

But it now seems clear that Halper’s attempts to gather information for the FBI began before that. “The professor’s interactions with Trump advisers began a few weeks before the opening of the investigation, when Page met the professor at the British symposium,” the Post reported. While it’s not rare for the FBI to gather information before formally opening an investigation, Halper’s earlier snooping does call into question the accuracy of the NYT’s claim that it was the drunken Papadopoulos ramblings that first prompted the FBI’s interest in these possible connections. And it suggests that CIA operatives, apparently working with at least some factions within the FBI, were trying to gather information about the Trump campaign earlier than had been previously reported.
Hmm... so what made the CIA/FBI assign Halper to probe Trump campaign staffers in the first place? It seems the cover story for the start of the anti-Trump investigation needs some reformulation...

Friday, May 18, 2018

Digital Cash in China

WSJ: "Are they ahead of us here?"

UK Expat in Shenzhen: "It's a strange realization, but Yes."

Thursday, May 17, 2018

Exponential growth in compute used for AI training

Chart shows the total amount of compute, in petaflop/s-days, used in training (e.g., optimizing an objective function in a high dimensional space). This exponential trend is likely to continue for some time -- leading to qualitative advances in machine intelligence.
AI and Compute (OpenAI blog): ... since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.5 month-doubling time (by comparison, Moore’s Law had an 18-month doubling period). Since 2012, this metric has grown by more than 300,000x (an 18-month doubling period would yield only a 12x increase). Improvements in compute have been a key component of AI progress, so as long as this trend continues, it’s worth preparing for the implications of systems far outside today’s capabilities.

... Three factors drive the advance of AI: algorithmic innovation, data (which can be either supervised data or interactive environments), and the amount of compute available for training. Algorithmic innovation and data are difficult to track, but compute is unusually quantifiable, providing an opportunity to measure one input to AI progress. Of course, the use of massive compute sometimes just exposes the shortcomings of our current algorithms. But at least within many current domains, more compute seems to lead predictably to better performance, and is often complementary to algorithmic advances.

...We see multiple reasons to believe that the trend in the graph could continue. Many hardware startups are developing AI-specific chips, some of which claim they will achieve a substantial increase in FLOPS/Watt (which is correlated to FLOPS/$) over the next 1-2 years. ...
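The arithmetic behind the quoted 300,000x figure is easy to check: count the doublings it implies, then ask what Moore's-law-style 18-month doubling would have produced over the same span.

```python
from math import log2

growth = 300_000             # total growth in largest-run training compute (OpenAI)
doublings = log2(growth)     # ~18.2 doublings
months = doublings * 3.5     # at a 3.5-month doubling time: ~64 months

# An 18-month doubling period over the same span yields only:
moore_growth = 2 ** (months / 18)

print(f"{doublings:.1f} doublings over ~{months / 12:.1f} years")
print(f"18-month doubling over the same span: {moore_growth:.0f}x")  # ~12x
```

This reproduces the blog's comparison: the same interval that produced 300,000x of AI training compute would have produced only about 12x under Moore's law.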

Tuesday, May 15, 2018

AGI in the Alps: Schmidhuber in Bloomberg

A nice profile of AI researcher Jurgen Schmidhuber in Bloomberg. I first met Schmidhuber at SciFoo some years ago. See also Deep Learning in Nature.
Bloomberg: ... Schmidhuber’s dreams of an AGI began in Bavaria. The middle-class son of an architect and a teacher, he grew up worshipping Einstein and aspired to go a step further. “As a teenager, I realized that the grandest thing that one could do as a human is to build something that learns to become smarter than a human,” he says while downing a latte. “Physics is such a fundamental thing, because it’s about the nature of the world and how the world works, but there is one more thing that you can do, which is build a better physicist.”

This goal has been Schmidhuber’s all-consuming obsession for four decades. His younger brother, Christof, remembers taking long family drives through the Alps with Jürgen philosophizing away in the back seat. “He told me that you can build intelligent robots that are smarter than we are,” Christof says. “He also said that you could rebuild a brain atom by atom, and that you could do it using copper wires instead of our slow neurons as the connections. Intuitively, I rebelled against this idea that a manufactured brain could mimic a human’s feelings and free will. But eventually, I realized he was right.” Christof went on to work as a researcher in nuclear physics before settling into a career in finance.

... AGI is far from inevitable. At present, humans must do an incredible amount of handholding to get AI systems to work. Translations often stink, computers mistake hot dogs for dachshunds, and self-driving cars crash. Schmidhuber, though, sees an AGI as a matter of time. After a brief period in which the company with the best one piles up a great fortune, he says, the future of machine labor will reshape societies around the world.

“In the not-so-distant future, I will be able to talk to a little robot and teach it to do complicated things, such as assembling a smartphone just by show and tell, making T-shirts, and all these things that are currently done under slavelike conditions by poor kids in developing countries,” he says. “Humans are going to live longer, healthier, happier, and easier lives, because lots of jobs that are now demanding on humans are going to be replaced by machines. Then there will be trillions of different types of AIs and a rapidly changing, complex AI ecology expanding in a way where humans cannot even follow.” ...
Schmidhuber has annoyed many of his colleagues in AI by insisting on proper credit assignment for groundbreaking work done in earlier decades. Because neural networks languished in obscurity through the 1980s and 1990s, a lot of theoretical ideas that were developed then do not today get the recognition they deserve.

Schmidhuber points out that machine learning is itself based on accurate credit assignment. Good learning algorithms assign higher weights to features or signals that correctly predict outcomes, and lower weights to those that do not. His analogy between science itself and machine learning is often lost on critics.
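The credit-assignment point can be made concrete with a toy model (my illustration, not Schmidhuber's): train a linear predictor on two features, one that actually determines the label and one that is pure noise, and watch the learning rule concentrate weight on the predictive signal.

```python
import random

def train_weights(samples, lr=0.1, epochs=200):
    """Simple linear model trained by stochastic gradient descent on
    squared error. Features that predict the label earn larger weights."""
    n = len(samples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, y in samples:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            for i in range(n):
                w[i] -= lr * err * x[i]
    return w

# Feature 0 equals the label (perfectly predictive); feature 1 is noise.
random.seed(0)
data = [((y, random.choice([0.0, 1.0])), y) for y in [0.0, 1.0] * 50]
w = train_weights(data)
# After training, the weight on the predictive feature dominates the noise weight.
```

This is credit assignment in miniature: the gradient update rewards exactly those inputs that reduce prediction error, which is the analogy Schmidhuber draws with assigning scientific credit.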

What is still missing on the road to AGI:
... Ancient algorithms running on modern hardware can already achieve superhuman results in limited domains, and this trend will accelerate. But current commercial AI algorithms are still missing something fundamental. They are not self-referential general-purpose learning algorithms. They improve some system’s performance in a given limited domain, but they are unable to inspect and improve their own learning algorithm. They do not learn the way they learn, and the way they learn the way they learn, and so on (limited only by the fundamental limits of computability). As I wrote in the earlier reply: "I have been dreaming about and working on this all-encompassing stuff since my 1987 diploma thesis on this topic." However, additional algorithmic breakthroughs may be necessary to make this a practical reality.

Sunday, May 13, 2018

Feynman 100 at Caltech


AI, AGI, and ANI in The New Yorker

A good long read in The New Yorker on AI, AGI, and all that. Note the article appears in the section "Dept. of Speculation" :-)
How Frightened Should We Be of A.I.?

Precisely how and when will our curiosity kill us? I bet you’re curious. A number of scientists and engineers fear that, once we build an artificial intelligence smarter than we are, a form of A.I. known as artificial general intelligence, doomsday may follow. Bill Gates and Tim Berners-Lee, the founder of the World Wide Web, recognize the promise of an A.G.I., a wish-granting genie rubbed up from our dreams, yet each has voiced grave concerns. Elon Musk warns against “summoning the demon,” envisaging “an immortal dictator from which we can never escape.” Stephen Hawking declared that an A.G.I. “could spell the end of the human race.” Such advisories aren’t new. In 1951, the year of the first rudimentary chess program and neural network, the A.I. pioneer Alan Turing predicted that machines would “outstrip our feeble powers” and “take control.” In 1965, Turing’s colleague Irving Good pointed out that brainy devices could design even brainier ones, ad infinitum: “Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.” It’s that last clause that has claws.

Many people in tech point out that artificial narrow intelligence, or A.N.I., has grown ever safer and more reliable—certainly safer and more reliable than we are. (Self-driving cars and trucks might save hundreds of thousands of lives every year.) For them, the question is whether the risks of creating an omnicompetent Jeeves would exceed the combined risks of the myriad nightmares—pandemics, asteroid strikes, global nuclear war, etc.—that an A.G.I. could sweep aside for us.

The assessments remain theoretical, because even as the A.I. race has grown increasingly crowded and expensive, the advent of an A.G.I. remains fixed in the middle distance. In the nineteen-forties, the first visionaries assumed that we’d reach it in a generation; A.I. experts surveyed last year converged on a new date of 2047. A central tension in the field, one that muddies the timeline, is how “the Singularity”—the point when technology becomes so masterly it takes over for good—will arrive. Will it come on little cat feet, a “slow takeoff” predicated on incremental advances in A.N.I., taking the form of a data miner merged with a virtual-reality system and a natural-language translator, all uploaded into a Roomba? Or will it be the Godzilla stomp of a “hard takeoff,” in which some as yet unimagined algorithm is suddenly incarnated in a robot overlord?

A.G.I. enthusiasts have had decades to ponder this future, and yet their rendering of it remains gauzy: we won’t have to work, because computers will handle all the day-to-day stuff, and our brains will be uploaded into the cloud and merged with its misty sentience, and, you know, like that. ...

Thursday, May 10, 2018

Google Duplex and the (short) Turing Test

Click this link and listen to the brief conversation. No cheating! Which speaker is human and which is a robot?

I wrote about a "strong" version of the Turing Test in this old post from 2004:
When I first read about the Turing test as a kid, I thought it was pretty superficial. I even wrote some silly programs which would respond to inputs, mimicking conversation. Over short periods of time, with an undiscerning tester, computers can now pass a weak version of the Turing test. However, one can define the strong version as taking place over a long period of time, and with a sophisticated tester. Were I administering the test, I would try to teach the second party something (such as quantum mechanics) and watch carefully to see whether it could learn the subject and eventually contribute something interesting or original. Any machine that could do so would, in my opinion, have to be considered intelligent.
AI isn't ready to pass the strong Turing Test, yet. But humans will become increasingly unsure about the machine intelligences proliferating in the world around them.

The key to all AI advances is to narrow the scope of the problem so that the machine can deal with it. Optimization and learning in low-dimensional spaces are much easier than in high-dimensional ones. In sufficiently narrow situations (specific tasks, abstract games of strategy, etc.), machines are already better than humans.
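A quick numerical sketch of why dimensionality matters (purely illustrative, with arbitrary choices of region size and sample count): uniform random search easily finds a fixed "good" region in one dimension, and almost never finds it in ten.

```python
import random

def hit_rate(dim, trials=20000, radius=0.9):
    """Fraction of uniform random points in [-1,1]^dim that land within
    `radius` of the origin -- a stand-in for stumbling on a good solution."""
    hits = 0
    for _ in range(trials):
        p = [random.uniform(-1, 1) for _ in range(dim)]
        if sum(x * x for x in p) <= radius * radius:
            hits += 1
    return hits / trials

random.seed(0)
low = hit_rate(1)    # in 1D, the vast majority of samples land in the region
high = hit_rate(10)  # in 10D, almost none do
```

The hit rate collapses by roughly three orders of magnitude between 1D and 10D, which is one way to see why narrowing the problem helps the machine so much.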

Google AI Blog:
Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone

...Today we announce Google Duplex, a new technology for conducting natural conversations to carry out “real world” tasks over the phone. The technology is directed towards completing specific tasks, such as scheduling certain types of appointments. For such tasks, the system makes the conversational experience as natural as possible, allowing people to speak normally, like they would to another person, without having to adapt to a machine.

One of the key research insights was to constrain Duplex to closed domains, which are narrow enough to explore extensively. Duplex can only carry out natural conversations after being deeply trained in such domains. It cannot carry out general conversations.

Here are examples of Duplex making phone calls (using different voices)...
I switched from iOS to Android in the last year because I could see that Google Assistant was much better than Siri and was starting to have very intriguing capabilities!

Friday, May 04, 2018

FT podcasts on US-China competition and AI

Two recent FT podcasts:

China and the US fight for AI supremacy (17min)
In the race to develop artificial intelligence technology, American engineers have long had an edge but access to vast amounts of data may prove to be China's secret weapon. Louise Lucas and Richard Waters report on the contest for supremacy in one of this century’s most important technologies.

Gideon Rachman: The dawn of the Chinese century (FT Big Picture podcast, 25min)

See also Machine intelligence threatens overpriced aircraft carriers.

Tuesday, May 01, 2018

Gary Shteyngart on Mike Novogratz and Wesley Yang on Jordan Peterson

Two excellent longform articles. Both highly recommended.

One lesson from Jordan Peterson's recent meteoric rise: the self-help market will never saturate.
Wesley Yang profile of Jordan Peterson (Esquire):

...The encouragement that the fifty-five-year-old psychology professor offers to his audiences takes the form of a challenge. To “take on the heaviest burden that you can bear.” To pursue a “voluntary confrontation with the tragedy and malevolence of being.” To seek a strenuous life spent “at the boundary between chaos and order.” Who dares speak of such things without nervous, self-protective irony? Without snickering self-effacement?

“It’s so sad,” he says. “Every time I go to these talks, guys come up and say, ‘Wow, you know, it’s working.’ And I think, Well, yeah. No kidding! Nobody ever fucking told you that.”

"...When he says, ‘Life is suffering,’ that resonates very deeply. You can tell he’s not bullshitting us."
This is a profile of a guy I happen to have met recently at a fancy event (thx for cigars, Mike!), but it's also a reflection on the evolution (or not) of finance over the last few decades.
Novelist Gary Shteyngart on Mike Novogratz (New Yorker):

... And yet the majority of the hedge funders I befriended were not living happier or more interesting lives than my friends who had been exiled from the city. They had devoted their intellects and energies to winning a game that seemed only to diminish the players. One book I was often told to read was “Reminiscences of a Stock Operator,” first published in 1923. Written by Edwin Lefèvre, the novel follows a stockbroker named Lawrence Livingston, widely believed to be based on Jesse Livermore, a colorful speculator who rose from the era of street-corner bucket shops. I was astounded by how little had changed between the days of ticker tape and our own world of derivatives and flash trading, but a facet that none of the book’s Wall Street fans had mentioned was the miserableness of its protagonist. Livingston dreams of fishing off the Florida coast, preferably in his new yacht, but he keeps tacking back up to New York for one more trade. “Trading is addictive,” Novogratz told me at the Princeton reunion. “All these guys get addicted.” Livermore fatally shot himself in New York’s Sherry-Netherland Hotel in 1940.

... Novogratz had described another idea to me, one several magnitudes more audacious—certainly more institutional, and potentially more durable—than a mere half-a-billion-dollar hedge fund. He wanted to launch a publicly traded merchant bank solely for cryptocurrencies, which, with characteristic immodesty, he described as “the Goldman Sachs of crypto,” and was calling Galaxy Digital. “I’m either going to look like a genius or an idiot,” he said.

... On the day we met at his apartment, a regulatory crackdown in China, preceded by one announced in South Korea, was pushing the price of bitcoin down. (It hasn’t returned to its December high, and is currently priced at around seven thousand dollars.) Meanwhile, it appeared that hedge funds, many of which had ended 2016 either ailing or dead, were reporting their best returns in years. After six years of exploring finance, I concluded that, despite the expertise and the intelligence on display, nobody really knows anything. “In two years, this will be a big business,” Novogratz said, of Galaxy Digital. “Or it won’t be.”

Saturday, April 28, 2018

A Brief History of the (Near) Future: How AI and Genomics Will Change What It Means To Be Human

I'll be giving the talk below to an audience of oligarchs in Los Angeles next week. This is a video version I made for fun. It cuts off at 17min even though the whole talk is ~25min, because my team noticed that I gave away some sensitive information :-( 

The slides are here.

A Brief History of the (Near) Future: How AI and Genomics Will Change What It Means To Be Human

AI and Genomics are certain to have huge impacts on markets, health, society, and even what it means to be human. These are not two independent trends; they interact in important ways, as I will explain. Computers now outperform humans on most narrowly-defined tasks, such as face recognition, voice recognition, Chess, and Go. Using AI methods in genomic prediction, we can, for example, estimate the height of a human based on DNA alone, plus or minus an inch. Almost a million babies are born each year via IVF, and it is possible now to make nontrivial predictions about them (even, about their cognitive ability) from embryo genotyping. I will describe how AI, Genomics, and AI+Genomics will evolve in the coming decades.
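For the curious: a genomic predictor of the kind described is, at bottom, a weighted sum over SNP genotypes, with the per-allele weights learned from training data (for example by penalized regression). The SNP ids and effect sizes below are invented purely for illustration.

```python
def polygenic_score(genotype, weights):
    """Predicted trait value as a weighted sum of allele counts.
    `genotype` maps SNP id -> minor-allele count (0, 1, or 2);
    `weights` maps SNP id -> per-allele effect size from a trained model.
    SNPs absent from the genotype contribute nothing."""
    return sum(weights[snp] * genotype.get(snp, 0) for snp in weights)

# Toy example with made-up SNP ids and effect sizes (say, in cm of height).
weights = {"rs0001": 0.4, "rs0002": -0.2, "rs0003": 0.1}
person = {"rs0001": 2, "rs0002": 1, "rs0003": 0}
score = polygenic_score(person, weights)  # 0.4*2 - 0.2*1 + 0.1*0 = 0.6
```

Real predictors sum over hundreds of thousands of SNPs, and the hard part is estimating the weights from large training samples, but the prediction step itself is this simple.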

Short Bio: Stephen Hsu is VP for Research and Professor of Theoretical Physics at Michigan State University. He is also a researcher in computational genomics and founder of several Silicon Valley startups, ranging from information security to biotech.

Friday, April 27, 2018

Keepin' it real with UFC fighter Kevin Lee (JRE podcast)

A great ~20 minutes starting at ~1:01 with UFC lightweight (155 lb) contender Kevin Lee. Lee talks about self-confidence, growing up in an all-black part of Detroit, not knowing any white people his age until attending college, and getting started in wrestling and MMA. If you don't believe early environment affects life outcomes you are crazy...

They also discuss Ability vs Practice: 10,000 hour rule is BS, in wrestling and MMA as with anything else. Lee was a world class fighter by his early twenties, having had no martial arts training until starting wrestling at age 16. He has surpassed other athletes who have had intensive training in boxing, kickboxing, wrestling, jiujitsu since childhood. It will be interesting to see him face Khabib Nurmagomedov, who has been trained, almost since birth, in wrestling, judo, and combat sambo. (His father is a famous coach and former competitor in Dagestan.)

Here are some highlights from Lee's recent domination of Edson Barboza.

Wednesday, April 18, 2018

New Statesman: "like it or not, the debate about whether genes affect intelligence is over"

Science writer Philip Ball, a longtime editor at Nature, writes a sensible article about the implications of rapidly improving genomic prediction for cognitive ability.
Philip Ball is a freelance science writer. He worked previously at Nature for over 20 years, first as an editor for physical sciences (for which his brief extended from biochemistry to quantum physics and materials science) and then as a Consultant Editor. His writings on science for the popular press have covered topical issues ranging from cosmology to the future of molecular biology.

Philip is the author of many popular books on science, including works on the nature of water, pattern formation in the natural world, colour in art, the science of social and political philosophy, the cognition of music, and physics in Nazi Germany.

... Philip has a BA in Chemistry from the University of Oxford and a PhD in Physics from the University of Bristol.
I recommend the whole article -- perhaps it will stimulate a badly needed discussion of this rapidly advancing area of science.
The IQ trap: how the study of genetics could transform education (New Statesman)

The study of the genes which affect intelligence could revolutionise education. But, haunted by the spectre of eugenics, the science risks being lost in a political battle.

... Researchers are now becoming confident enough to claim that the information available from sequencing a person’s genome – the instructions encoded in our DNA that influence our physical and behavioural traits – can be used to make predictions about their potential to achieve academic success. “The speed of this research has surprised me,” says the psychologist Kathryn Asbury of the University of York, “and I think that it is probable that pretty soon someone – probably a commercial company – will start to try to sell it in some way.” Asbury believes “it is vital that we have regulations in place for the use of genetic information in education and that we prepare legal, social and ethical cases for how it could and should be used.”

... Some kids pick things up in a flash, others struggle with the basics. This doesn’t mean it’s all in their genes: no one researching genes and intelligence denies that a child’s environment can play a big role in educational attainment. Of course kids with supportive, stimulating families and motivated peers have an advantage, while in some extreme cases the effects of trauma or malnutrition can compromise brain development.

... Robert Plomin of King’s College London, one of the leading experts on the genetic basis of intelligence, and his colleague Sheila Walker surveyed almost 2,000 primary school teachers and parents about their perceptions of genetic influence on a number of traits, including intelligence, and found that on the whole, both teachers and parents rated genetics as being just as important as the environment. This was despite the fact that 80 per cent of the teachers said there was no mention of genetics in their training. Plomin and Walker concluded that educators do seem to accept that genes influence intelligence.

Kathryn Asbury supports that view. When her PhD student Madeline Crosswaite investigated teachers’ beliefs about intelligence, Asbury says she found that “teachers, on average, believe that genetic factors are at least as important as environmental factors” and say they are “open to a role for genetic information in education one day, and that they would like to know more”.

... But now it’s possible to look directly at people’s genomes: to read the molecular code (sequence) of large proportions of an individual’s DNA. Over the past decade the cost of genome sequencing has fallen sharply, making it possible to look more directly at how genes correlate with intelligence. The data both from twin studies and DNA analysis are unambiguous: intelligence is strongly heritable. Typically around 50 per cent of variations in intelligence between individuals can be ascribed to genes, although these gene-induced differences become markedly more apparent as we age. As Ritchie says: like it or not, the debate about whether genes affect intelligence is over.

... Genome-wide polygenic scores (GPSs) can now be used to make such predictions about intelligence. They’re not really reliable at the moment, but will surely become better as the sample sizes for genome-wide studies increase. They will always be about probabilities, though: “Mrs Larkin, there is a 67 per cent chance that your son will be capable of reaching the top 10 per cent of GCSE grades.” Such exam results were indeed the measure Plomin and colleagues used for one recent study of genome-based prediction. They found that there was a stronger correlation between GPS and GCSE results for extreme outcomes – for particularly high or low marks.

... Using GPSs from nearly 5,000 pupils, the report assesses how exam results from different types of school – non-selective state, selective state grammar, and private – are correlated with gene-based estimates of ability for the different pupil sets. The results might offer pause for thought among parents stumping up eyewatering school fees: the distribution of exam results at age 16 could be almost wholly explained by heritable differences, with less than 1 per cent being due to the type of schooling received. In other words, as far as academic achievement is concerned, selective schools seem to add next to nothing to the inherent abilities of their pupils. ...

Monday, April 16, 2018

The Genetics of Human Behavior (The Insight podcast)

Intelligence researcher Stuart Ritchie interviewed by genomicists Razib Khan and Spencer Wells. Highly recommended! Thanks to a commenter for the link.

Sunday, April 15, 2018

Sweet Tweet Treats

For mysterious reasons, this old tweet has attracted almost 200k impressions in the last day or so:

If you like that tweet, this one might be of interest as well:

I'm always amazed that so many people have strong opinions on topics like Nature vs Nurture, How the World Works, How Civilization Advances (or does not), without having examined the evidence.

Friday, April 13, 2018

Evolution of Venture Capital: SV + Asia dominate

The comparison to the dot-com bubble of 2000 is probably not appropriate, as the global pool of startup innovation is an order of magnitude larger now.
WSJ: Silicon Valley Powered American Tech Dominance—Now It Has a Challenger

An exclusive WSJ analysis shows how venture-capital investment from Asia is skyrocketing, threatening to shift power over innovation ...

Blog Archive