Sunday, April 18, 2021

Francois Chollet - Intelligence and Generalization, Psychometrics for Robots (AI/ML)


If you have thought a lot about AI and deep learning you may find much of this obvious. Nevertheless I enjoyed the discussion. Apparently Chollet's views (below) are controversial in some AI/ML communities but I do not understand why. 

Chollet's Abstraction and Reasoning Corpus (ARC) = Raven's Matrices for AIs :-)
Show Notes: 
...Francois has a clarity of thought that I've never seen in any other human being! He has extremely interesting views on intelligence as generalisation, abstraction and an information conversion ratio. He wrote on the measure of intelligence at the end of 2019 and it had a huge impact on my thinking. He thinks that NNs can only model continuous problems, which have a smooth learnable manifold, and that many "type 2" problems which involve reasoning and/or planning are not suitable for NNs. He thinks that many problems have type 1 and type 2 enmeshed together. He thinks that the future of AI must include program synthesis to allow us to generalise broadly from a few examples, but the search could be guided by neural networks because the search space is interpolative to some extent. 
Tim Intro [00:00:00]
Manifold hypothesis and interpolation [00:06:15]
Yann LeCun skit [00:07:58]
Discrete vs continuous [00:11:12]
NNs are not Turing machines [00:14:18]
Main show kick-off [00:16:19]
DNN models are locally sensitive hash tables and only efficiently encode some kinds of data well [00:18:17]
Why do natural data have manifolds? [00:22:11]
Finite NNs are not "Turing complete" [00:25:44]
The dichotomy of continuous vs discrete problems, and abusing DL to perform the former [00:27:07]
Reality really annoys a lot of people, and ...GPT-3 [00:35:55]
There are type 1 problems and type 2 problems, but...they are enmeshed [00:39:14]
Chollet's definition of intelligence and how to construct analogy [00:41:45]
How are we going to combine type 1 and type 2 programs? [00:47:28]
Will topological analogies be robust and escape the curse of brittleness? [00:52:04]
Are type 1 and type 2 two different physical systems? Is there a continuum? [00:54:26]
Building blocks and the ARC Challenge [00:59:05]
Solve ARC == intelligent? [01:01:31]
Measure of intelligence formalism -- it's a whitebox method [01:03:50]
Generalization difficulty [01:10:04]
Let's create a marketplace of generated intelligent ARC agents! [01:11:54]
Mapping ARC to psychometrics [01:16:01]
Keras [01:16:45]
New backends for Keras? JAX? [01:20:38]
Intelligence Explosion [01:25:07]
Bottlenecks in large organizations [01:34:29]
Summing up the intelligence explosion [01:36:11]
Post-show debrief [01:40:45]
This is Chollet's paper which is the focus of much of the discussion.
On the Measure of Intelligence 
François Chollet 
To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to "buy" arbitrary levels of skills for a system, in a way that masks the system's own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
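To make the ARC setup and the program-synthesis idea a bit more concrete, here is a minimal sketch, not Chollet's method and not a serious ARC solver. It assumes a task is a handful of input/output demonstration grids (lists of small integers), and it brute-force searches short compositions of four hand-picked grid primitives for a program consistent with every demonstration. The primitive set, the names `PRIMITIVES`, `run`, and `synthesize`, and the depth limit are all illustrative assumptions.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Grid = List[List[int]]

# Hypothetical primitive set; each function returns a new grid.
def identity(g: Grid) -> Grid:
    return [row[:] for row in g]

def flip_h(g: Grid) -> Grid:
    return [row[::-1] for row in g]      # mirror left-right

def flip_v(g: Grid) -> Grid:
    return [row[:] for row in g[::-1]]   # mirror top-bottom

def transpose(g: Grid) -> Grid:
    return [list(col) for col in zip(*g)]

PRIMITIVES: Dict[str, Callable[[Grid], Grid]] = {
    "identity": identity,
    "flip_h": flip_h,
    "flip_v": flip_v,
    "transpose": transpose,
}

def run(program: Sequence[str], grid: Grid) -> Grid:
    """Apply the named primitives to the grid, left to right."""
    for name in program:
        grid = PRIMITIVES[name](grid)
    return grid

def synthesize(pairs: Sequence[Tuple[Grid, Grid]],
               max_len: int = 2) -> Optional[Tuple[str, ...]]:
    """Return the first program (shortest first) that reproduces every
    demonstration pair, or None if none exists within max_len steps."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in pairs):
                return program
    return None

# Two demonstrations of a 90-degree clockwise rotation.
demos = [
    ([[1, 2], [3, 4]], [[3, 1], [4, 2]]),
    ([[1, 2, 3]], [[1], [2], [3]]),
]
program = synthesize(demos)
if program is not None:
    print(program, run(program, [[5, 6], [7, 8]]))
```

The exhaustive loop is the part that blows up combinatorially as the primitive set and program length grow. That is where, on the view described in the show notes, a neural network could guide the search rather than replace it.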
Notes on the paper by Robert Lange (TU-Berlin), including illustrations like the ones below.

Friday, April 16, 2021

Academic Freedom in Crisis: Punishment, Political Discrimination, and Self-Censorship

Last week MSU hosted a virtual meeting on Freedom of Speech and Intellectual Diversity on Campus. I particularly enjoyed several of the talks, including the ones by Randall Kennedy (Harvard), Conor Friedersdorf (The Atlantic), and Cory Clark (UPenn). Clark had some interesting survey data I had never seen before. I hope the video from the meeting will be available soon. 

In the meantime, here are some survey results from Eric Kaufmann (University of London). The full report is available at the link.

In this recent podcast interview Kaufmann discusses the woke takeover of academia and other institutions.

Stylized facts:

1. Academia has always been predominantly left, but has become more and more so over time. This imbalance is stronger in Social Science and Humanities (SSH) than in STEM, but even in STEM the faculty are predominantly left of center relative to the general population.

2. Leftists are becoming more and more intolerant of opposing views.

3. Young academics (PhD students and junior faculty) are the least tolerant of all.

In my opinion the unique importance of research universities originates from their commitment to the search for Truth. This commitment is being supplanted by a focus on social justice, with extremely negative consequences.

Figure 1. Note: Excludes STEM academics. Labels refer to hypothetical scenarios in which respondents are asked whether they would support a campaign to dismiss a staff member who found the respective conclusions in their research. Brackets denote sample size.


Figure 2. Note: Includes STEM academics. Based on a direct question rather than a concealed list technique.


Figure 3. Note: SSH refers to social sciences and humanities. Sample size in brackets. STEM share of survey responses: US and Canada academic: 10%; UK mailout: zero; UK YouGov SSH active: zero; UK YouGov All: 53%; UK PhDs: 55%; North American PhDs: 63%.

Thursday, April 08, 2021

Freedom of Speech and Intellectual Diversity on Campus (MSU virtual conference)

The LeFrak Forum On Science, Reason, and Modern Democracy 
Department of Political Science 
Michigan State University 

Register here!

Thursday, April 8 -- Saturday, April 10; on ZOOM 
Conference Program: 
Keynote Address - Thursday, April 8, 
5:00-6:30pm EST 
Randall Kennedy, "The Race Question and Freedom of Expression." 
Randall Kennedy is the Michael R. Klein Professor at Harvard Law School, preeminent authority on the First Amendment in its relation to the American struggle for civil rights.


Day One: Intellectual Diversity - Friday, April 9  
11:30am - 1:00pm EST 
Panel 1: What are the empirical facts about lack of intellectual diversity in academia and what are the causes of existing imbalances? 
Paper: Lee Jussim, Distinguished Professor and Chair, Department of Psychology, Rutgers University, author of The Politics of Social Psychology. 
Discussant: Philip Tetlock, Annenberg University Professor, University of Pennsylvania, author of “Why so few conservatives and should we care?” and Cory Clark, Visiting Scholar, Department of Psychology, University of Pennsylvania, author of “Partisan Bias and its Discontents.” 
2:00pm - 3:30pm EST 
Panel 2: In what precise ways and to what degree is this imbalance a problem? 
Paper: Joshua Dunn, Professor and Chair, Department of Political Science, University of Colorado, co-author of Passing on the Right: Conservative Professors in the Progressive University. 
Discussant: Amna Khalid, Associate Professor of History, Carleton College, author of “Not A Vast Right-Wing Conspiracy: Why Left-Leaning Faculty Should Care About Threats to Free Expression on Campus." 
4:00pm - 5:45pm EST 
Panel 3: What is To Be Done? 
Paper: Musa Al-Gharbi, Paul F. Lazarsfeld Fellow in Sociology, Columbia University and Managing Editor, Heterodox Academy, author of “Why Care About Ideological Diversity in Social Research? The Definitive Response.” 
Paper: Conor Friedersdorf, Staff writer at The Atlantic and frequent contributor to its special series “The Speech Wars,” author of “Free Speech Will Survive This Moment.”


Day Two: Freedom of Speech - Saturday, April 10 
11:30am - 1:00pm EST 
Panel 1: An empirical accounting of the recent challenges to free speech on campus from left and right. What is the true character of the problem or problems here and do they constitute a “crisis”? 
Paper: Jonathan Marks, Professor and Chair, Department of Politics and International Relations, Ursinus College, author of Let's Be Reasonable: A Conservative Case for Liberal Education. 
Respondent: April Kelly-Woessner, Dean of the School of Public Service and Professor of Political Science at Elizabethtown College, author of The Still Divided Academy 
2:00pm - 3:45pm EST 
Panel 2: But is Free speech, as traditionally interpreted, even the right ideal? -- a Debate 
Ulrich Baer, University Professor of Comparative Literature, German, and English, NYU, author of What Snowflakes Get Right: Free Speech and Truth on Campus 
Keith Whittington, Professor of Politics, Princeton University, author of Speak Freely: Why Universities Must Defend Free Speech. 
4:30pm - 6:15pm EST  
Panel 3: What is To Be Done? 
Paper: Nancy Costello, Associate Clinical Professor of Law, MSU. Founder and Director of the First Amendment Law Clinic -- the only law clinic in the nation devoted to the defense of student press rights. Also, Director of the Free Expression Online Library and Resource Center. 
Paper: Jonathan Friedman, Project Director for campus free speech at PEN America – “a program of advocacy, analysis, and outreach in the national debate around free speech and inclusion at colleges and universities.”

Monday, April 05, 2021

Machine Learning Prediction of Biomarkers from SNPs and of Disease Risk from Biomarkers in the UK Biobank

These new results arose from initial investigations of blood biomarker predictions from DNA. The lipoprotein A predictor we built correlates almost 0.8 with the measured result, and this agreement would probably be even stronger if day to day fluctuations were averaged out. It is the most accurate genomic predictor for a complex trait that we are aware of.

We then became interested in the degree to which biomarkers alone could be used to predict disease risk. Some of the biomarker-based disease risk predictors we built (e.g., for kidney or liver problems) do not, as far as we know, have widely used clinical counterparts. Further research may show that predictors of this kind have broad utility. 

Statistical learning in a space of ~50 biomarkers is considered a "high dimensional" problem from the perspective of medical diagnosis. Compared to genomic prediction using a million SNP features, however, it is rather straightforward. 
Machine Learning Prediction of Biomarkers from SNPs and of Disease Risk from Biomarkers in the UK Biobank  
Erik Widen, Timothy G. Raben, Louis Lello, Stephen D.H. Hsu 
We use UK Biobank data to train predictors for 48 blood and urine markers such as HDL, LDL, lipoprotein A, glycated haemoglobin, ... from SNP genotype. For example, our predictor correlates ~ 0.76 with lipoprotein A level, which is highly heritable and an independent risk factor for heart disease. This may be the most accurate genomic prediction of a quantitative trait that has yet been produced (specifically, for European ancestry groups). We also train predictors of common disease risk using blood and urine biomarkers alone (no DNA information). Individuals who are at high risk (e.g., odds ratio of > 5x population average) can be identified for conditions such as coronary artery disease (AUC ~ 0.75), diabetes (AUC ~ 0.95), hypertension, liver and kidney problems, and cancer using biomarkers alone. Our atherosclerotic cardiovascular disease (ASCVD) predictor uses ~10 biomarkers and performs in UKB evaluation as well as or better than the American College of Cardiology ASCVD Risk Estimator, which uses quite different inputs (age, diagnostic history, BMI, smoking status, statin usage, etc.). We compare polygenic risk scores (risk conditional on genotype: (risk score | SNPs)) for common diseases to the risk predictors which result from the concatenation of learned functions (risk score | biomarkers) and (biomarker | SNPs).
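The "concatenation of learned functions" and the AUC figures in the abstract can be illustrated with a small sketch. This is not the code or the model class used in the paper: the linear form for (biomarker | SNPs), the logistic form for (risk score | biomarkers), every weight, and the function names are made-up assumptions chosen only to show how the two stages chain together and how an AUC is computed from scores and case/control labels.

```python
import math
from typing import List, Sequence, Tuple

def predict_biomarker(snps: Sequence[int],
                      weights: Sequence[float],
                      intercept: float) -> float:
    """(biomarker | SNPs): assumed linear in allele counts (0, 1, or 2)."""
    return intercept + sum(w * s for w, s in zip(weights, snps))

def predict_risk(biomarkers: Sequence[float],
                 weights: Sequence[float],
                 intercept: float) -> float:
    """(risk score | biomarkers): assumed logistic, returns a probability."""
    z = intercept + sum(w * b for w, b in zip(weights, biomarkers))
    return 1.0 / (1.0 + math.exp(-z))

def composed_risk(snps: Sequence[int],
                  biomarker_models: Sequence[Tuple[Sequence[float], float]],
                  risk_weights: Sequence[float],
                  risk_intercept: float) -> float:
    """Chain the two stages: SNPs -> predicted biomarkers -> risk."""
    predicted: List[float] = [
        predict_biomarker(snps, w, b) for (w, b) in biomarker_models
    ]
    return predict_risk(predicted, risk_weights, risk_intercept)

def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0); ties count half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Toy numbers only: one biomarker predicted from three SNPs.
toy_models = [([0.5, 0.0, 0.25], 1.0)]
print(composed_risk([0, 1, 2], toy_models, [2.0], -3.0))
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Read this way, an AUC of ~0.75 means the predictor ranks a random case above a random control about three times out of four.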

Sunday, April 04, 2021

Inside Huawei, and Wuhan after the pandemic

The first three videos below are episodes of Japanese director Takeuchi Ryo's ongoing series on Huawei. 

Ryo lives in Nanjing and speaks fluent Mandarin. He became famous for his coverage of the lockdown and pandemic in Wuhan. The fourth video below tells the stories of 10 families: how they survived, and how their lives have changed.

The general consensus seems to be that Huawei is 2+ years ahead of other competitors in 5G technology, and has a very deep IP position (patent portfolio) as well. In AI applications my impression is that they are also strong, but not world leaders at the research frontier like Google Brain or DeepMind. Like most Chinese companies their strength is in practical deployment of systems at scale, not in publishing papers. In smartphones and laptops they compete head to head with Samsung, Apple, etc. in all areas, including chip design. Their HiSilicon subsidiary has designed Kirin CPUs that are on par with the best Qualcomm and Apple competitors used in flagship handsets. However, all three rely on TSMC to fabricate these designs.
