Thursday, June 03, 2021

Macroscopic Superpositions in Isolated Systems (talk video + slides)


This is a video of a talk based on the paper
Macroscopic Superpositions in Isolated Systems 
R. Buniy and S. Hsu 
arXiv:2011.11661, to appear in Foundations of Physics 
For any choice of initial state and weak assumptions about the Hamiltonian, large isolated quantum systems undergoing Schrodinger evolution spend most of their time in macroscopic superposition states. The result follows from von Neumann's 1929 Quantum Ergodic Theorem. As a specific example, we consider a box containing a solid ball and some gas molecules. Regardless of the initial state, the system will evolve into a quantum superposition of states with the ball in macroscopically different positions. Thus, despite their seeming fragility, macroscopic superposition states are ubiquitous consequences of quantum evolution. We discuss the connection to many worlds quantum mechanics.
Slides for the talk.

See this earlier post about the paper:
It may come as a surprise to many physicists that Schrodinger evolution in large isolated quantum systems leads generically to macroscopic superposition states. For example, in the familiar Brownian motion setup of a ball interacting with a gas of particles, after sufficient time the system evolves into a superposition state with the ball in macroscopically different locations. We use von Neumann's 1929 Quantum Ergodic Theorem as a tool to deduce this dynamical result. 

The natural state of a complex quantum system is a superposition ("Schrodinger cat state"!), absent mysterious wavefunction collapse, which has yet to be fully defined either in logical terms or explicit dynamics. Indeed wavefunction collapse may not be necessary to explain the phenomenology of quantum mechanics. This is the underappreciated meaning of work on decoherence dating back to Zeh and Everett. See talk slides linked here, or the introduction of this paper.

We also derive some new (sharper) concentration of measure bounds that can be applied to small systems (e.g., fewer than 10 qubits). 
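
As a rough numerical illustration of what concentration of measure means for small systems (this is only a sketch, not the bounds derived in the paper), the code below samples Haar-random pure states of n qubits and measures how much the expectation value of a fixed observable fluctuates from state to state. The choice of observable (Pauli Z on the first qubit), the sample counts, and all function names are assumptions made for illustration.

```python
import numpy as np

def random_state(dim, rng):
    """Haar-random pure state: a normalized complex Gaussian vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def z_first_qubit_expectation(psi):
    """<psi| Z on qubit 1 |psi>, taking qubit 1 as the most significant bit (a convention)."""
    half = psi.size // 2
    probs = np.abs(psi) ** 2
    # Basis states in the first half have qubit 1 = 0 (Z = +1), second half Z = -1.
    return probs[:half].sum() - probs[half:].sum()

def spread(n_qubits, n_samples=2000, seed=0):
    """Standard deviation of <Z_1> over randomly sampled states of n_qubits."""
    rng = np.random.default_rng(seed)
    dim = 2 ** n_qubits
    vals = [z_first_qubit_expectation(random_state(dim, rng))
            for _ in range(n_samples)]
    return float(np.std(vals))

if __name__ == "__main__":
    for n in (2, 4, 6, 8, 10):
        print(n, "qubits: spread of <Z_1> =", spread(n))
```

For this traceless observable the state-to-state spread is 1/sqrt(d+1) for Hilbert space dimension d = 2^n, so even at 10 qubits almost every state gives nearly the same expectation value.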


Wednesday, May 26, 2021

How Dominic Cummings And The Warner Brothers Saved The UK

Photo above shows the white board in the Prime Minister's office which Dominic Cummings and team (including the brothers Marc and Ben Warner) used to convince Boris Johnson to abandon the UK government COVID herd immunity plan and enter lockdown. Date: March 13 2020. 

Only now can the full story be told. In early 2020 the UK government had a COVID herd immunity plan in place that would have resulted in disaster. The scientific experts (SAGE) advising the government strongly supported this plan -- there are public, on the record briefings to this effect. These are people who are not particularly good at order of magnitude estimates and first-principles reasoning. 

Fortunately Dom was advised by the brothers Marc and Ben Warner (both physics PhDs, now working in AI and data science), DeepMind founder Demis Hassabis, Fields Medalist Tim Gowers, and others. In the testimony (see ~23m, ~35m, ~1h02m, ~1h06m in the video below) he describes the rather dramatic events that led to a switch from the original herd immunity plan to a lockdown Plan B. More details in this tweet thread.

I checked my emails with Dom during February and March, and they confirm his narrative. I wrote the March 9 blog post Covid-19 Notes in part for Dom and his team, and I think it holds up over time. Tim Gowers' document reaches similar conclusions.


Seven hours of riveting Dominic Cummings testimony from earlier today. 

Shorter summary video (Channel 4). Summary live-blog from the Guardian.

This is a second white board used in the March 14 meeting with Boris Johnson:

Saturday, May 22, 2021

Feynman Lectures on the Strong Interactions (Jim Cline notes)

Professor James Cline (McGill University) recently posted a set of lecture notes from Feynman's last Caltech course, on quantum chromodynamics. Cline, then a graduate student, was one of the course TAs and the notes were meant to be assembled into a monograph. Thanks to Tim Raben for pointing these out to me.

The content seems a bit more elementary than in John Preskill's Ph230abc, a special topics course on QCD taught in 1983-4. I still consider John's notes to be one of the best overviews of nonperturbative aspects of QCD, which is a rather deep subject. However, as Cline remarks, there is unsurprisingly something special about the lectures: Feynman was an inspiring teacher, presenting everything in an incisive and fascinating way, that obviously had his own mark on it.

The material on QFT in non-integer spacetime dimensions is, as far as I know, original to Feynman. Dimensional regularization of gauge theory was popularized by 't Hooft and Veltman, but the analytic continuation to d = 4 - ε is specific to the loop integrals (i.e., concrete mathematical expressions) that appear in perturbation theory. Here Feynman is, more ambitiously, exploring whether the quantum gauge theory itself can be meaningfully extended to a non-integer number of spacetime dimensions. 
Feynman Lectures on the Strong Interactions  
Richard P. Feynman, James M. Cline 
These twenty-two lectures, with exercises, comprise the extent of what was meant to be a full-year graduate-level course on the strong interactions and QCD, given at Caltech in 1987-88. The course was cut short by the illness that led to Feynman's death. Several of the lectures were finalized in collaboration with Feynman for an anticipated monograph based on the course. The others, while retaining Feynman's idiosyncrasies, are revised similarly to those he was able to check. His distinctive approach and manner of presentation are manifest throughout. Near the end he suggests a novel, nonperturbative formulation of quantum field theory in D dimensions. Supplementary material is provided in appendices and ancillary files, including verbatim transcriptions of three lectures and the corresponding audiotaped recordings.
The image below is from some of Feynman's handwritten notes (in this case, about the Gribov ambiguity in Faddeev-Popov gauge fixing) that Cline included in the manuscript. There are also links to audio from some of the lectures. As in some earlier notebooks, Feynman sometimes writes "guage" instead of gauge.

Sunday, May 16, 2021

Ditchley Foundation meeting: China Today and Tomorrow

China Today and Tomorrow 
20 MAY 2021 - 21 MAY 2021 
This Ditchley conference will focus on China, its internal state and sense of self today, its role in the region and world, and how these might evolve in years to come. 
There are broadly two current divergent narratives about China. The first is that China’s successful response to the pandemic has accelerated China’s ascent to be the world’s pre-eminent economic power. The Made in China 2025 strategy will also see China take the lead in some technologies beyond 5G, become self-sufficient in silicon chip production and free itself largely of external constraints on growth. China’s internal market will grow, lessening dependence on exports and that continued growth will maintain the bargain between the Chinese people and the Chinese Communist Party through prosperity and stability. Retaining some elements of previous Chinese strategy though, this confidence is combined with a degree of humility: China is concerned with itself and its region, not becoming a global superpower or challenging the US. Economic supremacy is the aim but military strategy remains focused on defence, not increasing international leverage or scope of action. 
The second competing narrative is that China’s position is more precarious than it appears. The Belt and Road Initiative will bring diplomatic support from client countries but not real economic gains. Human rights violations will damage China abroad. Internally the pressures on natural resources will prove hard to sustain. Democratic and free-market innovation, combined with a bit more industrial strategy, will outstrip China’s efforts. Careful attention to supply chains in the West will meanwhile reduce critical reliance on China and curb China’s economic expansion. This perceived fragility is often combined though with a sense of heightened Chinese ambition abroad, not just through the Belt and Road Initiative but in challenging the democratic global norms established since 1989 by presenting technologically-enabled and effective authoritarian rule as an alternative model for the world, rather than just a Chinese solution. 
What is the evidence today for where we should settle between these narratives? What trends should we watch to determine likely future results? ...
[Suggested background reading at link above.] 
Unfortunately this meeting will be virtual. The video below gives some sense of the unique charm of in-person workshops at Ditchley.

See also this 2020 post about an earlier Ditchley meeting I attended: World Order Today
... analysis by German academic Gunnar Heinsohn. Two of his slides appear below.
1. It is possible that by 2050 the highly able STEM workforce in PRC will be ~10x larger than in the US and comparable to or larger than the rest of the world combined. Here "highly able" means roughly top few percentile math ability in developed countries (e.g., EU), as measured by PISA at age 15. 
[ It is trivial to obtain this kind of estimate: PRC population is ~4x US population and fraction of university students in STEM is at least ~2x higher. Pool of highly able 15 year olds as estimated by PISA or TIMSS international testing regimes is much larger than in US, even per capita. Heinsohn's estimate is somewhat high because he uses PISA numbers that probably overstate the population fraction of Level 6 kids in PRC. Current PISA studies disproportionately sample from more developed areas of China. At bottom (asterisk) he uses results from Taiwan/Macau that give a smaller ~20x advantage of PRC vs USA. My own ~10x estimate is quite conservative in comparison. ]

2. The trajectory of international patent filings shown below is likely to continue. 

Wednesday, May 12, 2021

Neural Tangent Kernels and Theoretical Foundations of Deep Learning

A colleague recommended this paper to me recently. See also earlier post Gradient Descent Models Are Kernel Machines.
Neural Tangent Kernel: Convergence and Generalization in Neural Networks 
Arthur Jacot, Franck Gabriel, Clément Hongler 
At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: during gradient descent on the parameters of an ANN, the network function fθ (which maps input vectors to output vectors) follows the kernel gradient of the functional cost (which is convex, in contrast to the parameter cost) w.r.t. a new kernel: the Neural Tangent Kernel (NTK). This kernel is central to describe the generalization features of ANNs. While the NTK is random at initialization and varies during training, in the infinite-width limit it converges to an explicit limiting kernel and it stays constant during training. This makes it possible to study the training of ANNs in function space instead of parameter space. Convergence of the training can then be related to the positive-definiteness of the limiting NTK. We prove the positive-definiteness of the limiting NTK when the data is supported on the sphere and the non-linearity is non-polynomial. We then focus on the setting of least-squares regression and show that in the infinite-width limit, the network function fθ follows a linear differential equation during training. The convergence is fastest along the largest kernel principal components of the input data with respect to the NTK, hence suggesting a theoretical motivation for early stopping. Finally we study the NTK numerically, observe its behavior for wide networks, and compare it to the infinite-width limit.
The results are remarkably well summarized in the wikipedia entry on Neural Tangent Kernels:

For most common neural network architectures, in the limit of large layer width the NTK becomes constant. This enables simple closed form statements to be made about neural network predictions, training dynamics, generalization, and loss surfaces. For example, it guarantees that wide enough ANNs converge to a global minimum when trained to minimize an empirical loss. ...

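An Artificial Neural Network (ANN) with scalar output consists in a family of functions $f(\cdot;\theta): \mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$ parametrized by a vector of parameters $\theta \in \mathbb{R}^{P}$.

The Neural Tangent Kernel (NTK) is a kernel $\Theta: \mathbb{R}^{n_{\mathrm{in}}} \times \mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$ defined by

$$\Theta(x, y; \theta) = \sum_{p=1}^{P} \partial_{\theta_p} f(x;\theta)\, \partial_{\theta_p} f(y;\theta).$$

In the language of kernel methods, the NTK $\Theta$ is the kernel associated with the feature map $\left(x \mapsto \partial_{\theta_p} f(x;\theta)\right)_{p=1,\ldots,P}$.

For a dataset $(x_i)_{i=1,\ldots,n}$ with scalar labels $(z_i)_{i=1,\ldots,n}$ and a loss function $c: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, the associated empirical loss, defined on functions $f: \mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$, is given by

$$\mathcal{C}(f) = \sum_{i=1}^{n} c\left(f(x_i), z_i\right).$$

When the ANN $f(\cdot;\theta)$ is trained to fit the dataset (i.e. minimize $\mathcal{C}$) via continuous-time gradient descent, the parameters $(\theta(t))_{t \ge 0}$ evolve through the ordinary differential equation:

$$\partial_t \theta(t) = -\nabla_{\theta}\, \mathcal{C}\left(f(\cdot;\theta(t))\right).$$

During training the ANN output function follows an evolution differential equation given in terms of the NTK:

$$\partial_t f(x;\theta(t)) = -\sum_{i=1}^{n} \Theta(x, x_i;\theta(t))\, \partial_w c(w, z_i)\Big|_{w = f(x_i;\theta(t))}.$$

This equation shows how the NTK drives the dynamics of $f(\cdot;\theta(t))$ in the space of functions $\mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$ during training.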
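
To make the definition concrete, here is a minimal sketch that computes the empirical NTK (the sum over parameters of products of parameter gradients of the network output at two inputs) for a toy two-layer tanh network, and compares the kernel at two independent random initializations. The architecture, the 1/sqrt(width) output scaling with standard normal initialization, and all function names are my own assumptions for illustration; they are not taken from the paper or the Wikipedia entry.

```python
import numpy as np

def param_gradient(x, W, a):
    """Gradient of f(x) = a . tanh(W x) / sqrt(m) with respect to all parameters."""
    m = a.size
    h = np.tanh(W @ x)                                   # hidden activations, shape (m,)
    grad_a = h / np.sqrt(m)                              # df/da_j
    grad_W = np.outer(a * (1.0 - h ** 2), x) / np.sqrt(m)  # df/dW_jk
    return np.concatenate([grad_a, grad_W.ravel()])

def empirical_ntk(X, W, a):
    """Theta[i, j] = sum over parameters p of d_p f(x_i) * d_p f(x_j)."""
    J = np.stack([param_gradient(x, W, a) for x in X])   # Jacobian, shape (n, P)
    return J @ J.T

def sample_ntk(X, width, seed):
    """Empirical NTK at a random initialization (standard normal weights, assumed)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(width, X.shape[1]))
    a = rng.normal(size=width)
    return empirical_ntk(X, W, a)

if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(5, 3))     # 5 toy inputs in 3 dimensions
    for width in (10, 100, 1000, 10000):
        diff = np.linalg.norm(sample_ntk(X, width, 0) - sample_ntk(X, width, 1))
        print(f"width {width:>6}: ||K(seed 0) - K(seed 1)|| = {diff:.4f}")
```

The kernel is a Gram matrix of gradients, so it is symmetric and positive semidefinite by construction. In this toy setup the difference between two random initializations shrinks as the width grows, which is the sense in which the initial kernel becomes deterministic at large width.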

This is a very brief (3 minute) summary by the first author:

This 15 minute IAS talk gives a nice overview of the results, and their relation to fundamental questions (both empirical and theoretical) in deep learning. Longer (30m) version: On the Connection between Neural Networks and Kernels: a Modern Perspective.  

I hope to find time to explore this in more depth. Large width seems to provide a limiting case (analogous to the large-N limit in gauge theory) in which rigorous results about deep learning can be proved.

Some naive questions:

What is the expansion parameter of the finite width expansion?

What role does concentration of measure play in the results? (See 30m video linked above.)

Simplification seems to be a consequence of overparametrization. But the proof method seems to apply to a regularized (but still convex, e.g., using L1 penalization) loss function that imposes sparsity. It would be interesting to examine this specific case in more detail.

Notes to self:

The overparametrized network (width w, so ~ w^2 parameters) starts in a random state and by concentration of measure this initial kernel K is just the expectation, which is the NTK. Because of the large number of parameters the effect of training (i.e., gradient descent) on any individual parameter is 1/w, and the change in the eigenvalue spectrum of K is also 1/w. It can be shown that the eigenvalue spectrum is positive and bounded away from zero, and this property does not change under training. Also, the evolution of f is linear in K up to corrections which are suppressed by 1/w. Hence evolution follows a convex trajectory and can achieve global minimum loss in a finite (polynomial) time. 

The parametric 1/w expansion may depend on quantities such as the smallest NTK eigenvalue k: the proof might require  k >> 1/w  or  wk large.

In the large w limit the function space has such high dimensionality that any typical initial f is close (within a ball of radius 1/w?) to an optimal f. 

These properties depend on specific choice of loss function.
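
A small sketch of the constant-kernel picture in these notes, assuming squared loss specifically: if the kernel K is fixed and positive definite, the outputs on the training points obey df/dt = -K (f - z), which is linear and can be solved in closed form. The matrix K below is just a random positive definite matrix standing in for the limiting NTK (it is not computed from any network), and the function names are hypothetical.

```python
import numpy as np

def kernel_gradient_flow(K, f0, z, t):
    """Closed-form solution of df/dt = -K (f - z) for a constant symmetric kernel K."""
    evals, evecs = np.linalg.eigh(K)
    # Each eigen-component of the residual f - z decays independently as exp(-lambda t).
    coeffs = evecs.T @ (f0 - z)
    return z + evecs @ (np.exp(-evals * t) * coeffs)

def euler_gradient_flow(K, f0, z, t, steps=20000):
    """The same dynamics integrated with small explicit Euler steps, as a cross-check."""
    f = f0.copy()
    dt = t / steps
    for _ in range(steps):
        f = f - dt * (K @ (f - z))
    return f

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(6, 6))
    K = A @ A.T / 6 + 0.1 * np.eye(6)      # stand-in kernel, smallest eigenvalue >= 0.1
    f0, z = rng.normal(size=6), rng.normal(size=6)
    lam_min = np.linalg.eigvalsh(K).min()
    for t in (0.0, 1.0, 5.0, 20.0):
        resid = np.linalg.norm(kernel_gradient_flow(K, f0, z, t) - z)
        bound = np.exp(-lam_min * t) * np.linalg.norm(f0 - z)
        print(f"t = {t:>4}: residual {resid:.3e} <= bound {bound:.3e}")
```

The residual is bounded by exp(-k t) times its initial size, where k is the smallest eigenvalue of K, and components along the largest eigenvalues converge fastest. This is why a spectrum bounded away from zero matters in the argument above.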

[ Strangely, this post was flagged for Blogger review for violating their virus and malware policy (!?!) and so disappeared temporarily. After further review by their content team the post has been restored. Thanks to readers who pointed out that I could also have recovered it from the Internet Archive. ]

Saturday, May 08, 2021

Three Thousand Years and 115 Generations of 徐 (Hsu / Xu)

Over the years I have discussed economic historian Greg Clark's groundbreaking work on the persistence of social class. Clark found that intergenerational social mobility was much less than previously thought, and that intergenerational correlations on traits such as education and occupation were consistent with predictions from an additive genetic model with a high degree of assortative mating. 

See Genetic correlation of social outcomes between relatives (Fisher 1918) tested using lineage of 400k English individuals, and further links therein. Also recommended: this recent podcast interview Clark did with Razib Khan. 

The other day a reader familiar with Clark's work asked me about my family background. Obviously my own family history is not a scientific validation of Clark's work, being only a single (if potentially illustrative) example. Nevertheless it provides an interesting microcosm of the tumult of 20th century China and a window into the deep past...

I described my father's background in the post Hsu Scholarship at Caltech:
Cheng Ting Hsu was born December 1, 1923 in Wenling, Zhejiang province, China. His grandfather, Zan Yao Hsu was a poet and doctor of Chinese medicine. His father, Guang Qiu Hsu graduated from college in the 1920's and was an educator, lawyer and poet. 
Cheng Ting was admitted at age 16 to the elite National Southwest Unified University (Lianda), which was created during WWII by merging Tsinghua, Beijing, and Nankai Universities. This university produced numerous famous scientists and scholars such as the physicists C.N. Yang and T.D. Lee. 
Cheng Ting studied aerospace engineering (originally part of Tsinghua), graduating in 1944. He became a research assistant at China's Aerospace Research Institute and a lecturer at Sichuan University. He also taught aerodynamics for several years to advanced students at the air force engineering academy. 
In 1946 he was awarded one of only two Ministry of Education fellowships in his field to pursue graduate work in the United States. In 1946-1947 he published a three-volume book, co-authored with Professor Li Shoutong, on the structures of thin-walled airplanes. 
In January 1948, he left China by ocean liner, crossing the Pacific and arriving in San Francisco. ...
My mother's father was a KMT general, and her family was related to Chiang Kai Shek by marriage. Both my grandfather and Chiang attended the military academy Shinbu Gakko in Tokyo. When the KMT lost to the communists, her family fled China and arrived in Taiwan in 1949. My mother's family had been converted to Christianity in the 19th century and became Methodists, like Sun Yat Sen. (I attended Methodist Sunday school while growing up in Ames IA.) My grandfather was a partner of T.V. Soong in the distribution of bibles in China in the early 20th century.

My father's family remained mostly in Zhejiang and suffered through the communist takeover, Great Leap Forward, and Cultural Revolution. My father never returned to China and never saw his parents again. 

When I met my uncle (a retired Tsinghua professor) and some of my cousins in Hangzhou in 2010, they gave me a four volume family history that had originally been printed in the 1930s. The Hsu (Xu) lineage began in the 10th century BC and continued to my father, in the 113th generation. His entry is the bottom photo below.
Wikipedia: The State of Xu (Chinese: 徐) (also called Xu Rong (徐戎) or Xu Yi (徐夷)[a] by its enemies)[4][5] was an independent Huaiyi state of the Chinese Bronze Age[6] that was ruled by the Ying family (嬴) and controlled much of the Huai River valley for at least two centuries.[3][7] It was centered in northern Jiangsu and Anhui. ...

Generations 114 and 115:

Four volume history of the Hsu (Xu) family, beginning in the 10th century BC. The first 67 generations are covered rather briefly, only indicating prominent individuals in each generation of the family tree. The books are mostly devoted to generations 68-113 living in Zhejiang. (Earlier I wrote that it was two volumes, but it's actually four. The printing that I have is two thick books.)

Sunday, May 02, 2021

40 Years of Quantum Computation and Quantum Information

This is a great article on the 1981 conference which one could say gave birth to quantum computing / quantum information.
Technology Review: Quantum computing as we know it got its start 40 years ago this spring at the first Physics of Computation Conference, organized at MIT’s Endicott House by MIT and IBM and attended by nearly 50 researchers from computing and physics—two groups that rarely rubbed shoulders. 
Twenty years earlier, in 1961, an IBM researcher named Rolf Landauer had found a fundamental link between the two fields: he proved that every time a computer erases a bit of information, a tiny bit of heat is produced, corresponding to the entropy increase in the system. In 1972 Landauer hired the theoretical computer scientist Charlie Bennett, who showed that the increase in entropy can be avoided by a computer that performs its computations in a reversible manner. Curiously, Ed Fredkin, the MIT professor who cosponsored the Endicott Conference with Landauer, had arrived at this same conclusion independently, despite never having earned even an undergraduate degree. Indeed, most retellings of quantum computing’s origin story overlook Fredkin’s pivotal role. 
Fredkin’s unusual career began when he enrolled at the California Institute of Technology in 1951. Although brilliant on his entrance exams, he wasn’t interested in homework—and had to work two jobs to pay tuition. Doing poorly in school and running out of money, he withdrew in 1952 and enlisted in the Air Force to avoid being drafted for the Korean War. 
A few years later, the Air Force sent Fredkin to MIT Lincoln Laboratory to help test the nascent SAGE air defense system. He learned computer programming and soon became one of the best programmers in the world—a group that probably numbered only around 500 at the time. 
Upon leaving the Air Force in 1958, Fredkin worked at Bolt, Beranek, and Newman (BBN), which he convinced to purchase its first two computers and where he got to know MIT professors Marvin Minsky and John McCarthy, who together had pretty much established the field of artificial intelligence. In 1962 he accompanied them to Caltech, where McCarthy was giving a talk. There Minsky and Fredkin met with Richard Feynman ’39, who would win the 1965 Nobel Prize in physics for his work on quantum electrodynamics. Feynman showed them a handwritten notebook filled with computations and challenged them to develop software that could perform symbolic mathematical computations. ... 
... in 1974 he headed back to Caltech to spend a year with Feynman. The deal was that Fredkin would teach Feynman computing, and Feynman would teach Fredkin quantum physics. Fredkin came to understand quantum physics, but he didn’t believe it. He thought the fabric of reality couldn’t be based on something that could be described by a continuous measurement. Quantum mechanics holds that quantities like charge and mass are quantized—made up of discrete, countable units that cannot be subdivided—but that things like space, time, and wave equations are fundamentally continuous. Fredkin, in contrast, believed (and still believes) with almost religious conviction that space and time must be quantized as well, and that the fundamental building block of reality is thus computation. Reality must be a computer! In 1978 Fredkin taught a graduate course at MIT called Digital Physics, which explored ways of reworking modern physics along such digital principles. 
Feynman, however, remained unconvinced that there were meaningful connections between computing and physics beyond using computers to compute algorithms. So when Fredkin asked his friend to deliver the keynote address at the 1981 conference, he initially refused. When promised that he could speak about whatever he wanted, though, Feynman changed his mind—and laid out his ideas for how to link the two fields in a detailed talk that proposed a way to perform computations using quantum effects themselves. 
Feynman explained that computers are poorly equipped to help simulate, and thereby predict, the outcome of experiments in particle physics—something that’s still true today. Modern computers, after all, are deterministic: give them the same problem, and they come up with the same solution. Physics, on the other hand, is probabilistic. So as the number of particles in a simulation increases, it takes exponentially longer to perform the necessary computations on possible outputs. The way to move forward, Feynman asserted, was to build a computer that performed its probabilistic computations using quantum mechanics. 
[ Note to reader: the discussion in the last sentences above is a bit garbled. The exponential difficulty that classical computers have with quantum calculations has to do with entangled states which live in Hilbert spaces of exponentially large dimension. Probability is not really the issue; the issue is the huge size of the space of possible states. Indeed quantum computations are strictly deterministic unitary operations acting in this Hilbert space. ] 
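
To put numbers on the size of the state space: a general pure state of n qubits has 2^n complex amplitudes, so just storing it classically costs memory that doubles with every added qubit. The sketch below assumes 16 bytes per amplitude (one double-precision complex number); the function name is made up for illustration.

```python
def state_vector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store a general n-qubit pure state: 2**n complex amplitudes.

    bytes_per_amplitude=16 assumes one double-precision complex number per amplitude.
    """
    return (2 ** n_qubits) * bytes_per_amplitude

if __name__ == "__main__":
    for n in (10, 30, 50, 300):
        print(f"{n:>3} qubits: 2^{n} amplitudes, about {state_vector_bytes(n):.3e} bytes")
```

At 30 qubits this is already 16 GiB. The cost comes entirely from the dimension of the Hilbert space, not from any randomness in the evolution.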

Feynman hadn’t prepared a formal paper for the conference, but with the help of Norm Margolus, PhD ’87, a graduate student in Fredkin’s group who recorded and transcribed what he said there, his talk was published in the International Journal of Theoretical Physics under the title “Simulating Physics with Computers.” ...

Feynman's 1981 lecture Simulating Physics With Computers.

Fredkin was correct about the (effective) discreteness of spacetime, although he probably did not realize this is a consequence of gravitational effects: see, e.g., Minimum Length From First Principles. In fact, Hilbert Space (the state space of quantum mechanics) itself may be discrete.


My paper on the Margolus-Levitin Theorem in light of gravity: 

We derive a fundamental upper bound on the rate at which a device can process information (i.e., the number of logical operations per unit time), arising from quantum mechanics and general relativity. In Planck units a device of volume V can execute no more than the cube root of V operations per unit time. We compare this to the rate of information processing performed by nature in the evolution of physical systems, and find a connection to black hole entropy and the holographic principle. 
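
For a sense of scale, the bound in the abstract can be converted to ordinary units. This is only an order-of-magnitude sketch: it takes the statement "cube root of V operations per unit time in Planck units" at face value, drops all factors of order one, and uses approximate values of the Planck length and Planck time. The function name is hypothetical.

```python
PLANCK_LENGTH_M = 1.616e-35   # metres (approximate)
PLANCK_TIME_S = 5.391e-44     # seconds (approximate)

def max_ops_per_second(volume_m3):
    """Order-of-magnitude bound: V**(1/3) operations per Planck time, V in Planck units."""
    volume_planck = volume_m3 / PLANCK_LENGTH_M ** 3      # volume in Planck units
    ops_per_planck_time = volume_planck ** (1.0 / 3.0)    # the cube-root bound
    return ops_per_planck_time / PLANCK_TIME_S            # convert to operations per second

if __name__ == "__main__":
    for volume in (1e-6, 1.0, 1e3):                       # cubic metres
        print(f"V = {volume:g} m^3: at most ~{max_ops_per_second(volume):.1e} ops/s")
```

For a one cubic metre device this gives a rate of order 10^78 operations per second. Because the bound scales as the cube root of V, making the device 8 times larger only doubles the maximum rate.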

Participants in the 1981 meeting:

Physics of Computation Conference, Endicott House, MIT, May 6–8, 1981. 1 Freeman Dyson, 2 Gregory Chaitin, 3 James Crutchfield, 4 Norman Packard, 5 Panos Ligomenides, 6 Jerome Rothstein, 7 Carl Hewitt, 8 Norman Hardy, 9 Edward Fredkin, 10 Tom Toffoli, 11 Rolf Landauer, 12 John Wheeler, 13 Frederick Kantor, 14 David Leinweber, 15 Konrad Zuse, 16 Bernard Zeigler, 17 Carl Adam Petri, 18 Anatol Holt, 19 Roland Vollmar, 20 Hans Bremerman, 21 Donald Greenspan, 22 Markus Buettiker, 23 Otto Floberth, 24 Robert Lewis, 25 Robert Suaya, 26 Stand Kugell, 27 Bill Gosper, 28 Lutz Priese, 29 Madhu Gupta, 30 Paul Benioff, 31 Hans Moravec, 32 Ian Richards, 33 Marian Pour-El, 34 Danny Hillis, 35 Arthur Burks, 36 John Cocke, 37 George Michaels, 38 Richard Feynman, 39 Laurie Lingham, 40 P. S. Thiagarajan, 41 Marin Hassner, 42 Gerald Vichnaic, 43 Leonid Levin, 44 Lev Levitin, 45 Peter Gacs, 46 Dan Greenberger. (Photo courtesy Charles Bennett)

Wednesday, April 28, 2021

Let The Bodies Pile High In Their Thousands (Boris Johnson)

In the UK:
Recording a conversation in secret is not a criminal offence and is not prohibited. As long as the recording is for personal use you don’t need to obtain consent or let the other person know.
The security man in the foyer of No 10 Downing Street asks that you turn off your phone and deposit it in a wooden cubby shelf built into the wall. I sometimes wondered what the odds were that someone might walk out with my phone -- a disaster, obviously.

But it is not difficult to keep your phone, as close attention is not paid. (Or, one could enter with more than one phone.) I'm not saying I have ever disobeyed the rules but I know that it is possible. 

Of course the No 10 staffers all have their phones, which are necessary for their work throughout the day. Thus every meeting at the heart of British government is in danger of being surreptitiously but legally recorded.
Dominic Cummings 'has audio recordings of key government conversations', ally claims (Daily Mail)
Dominic Cummings 'has audio recordings of key government conversations' and 'can back up a lot of his claims', ally of the former chief adviser says. 
Dominic Cummings kept audio recordings of key conversations, an ally claims. 
Former chief adviser is locked in an explosive war of words with Boris Johnson. 
Whitehall source said officials did not know extent of material Mr Cummings has. 
Dominic Cummings kept audio recordings of key conversations in government, an ally claimed last night. The former chief adviser is locked in an explosive war of words with Boris Johnson after Downing Street accused him of a string of damaging leaks. 
No 10 attempted to rubbish his claims on Friday night, saying it was not true that the Prime Minister had discussed ending a leak inquiry after a friend of his fiancée Carrie Symonds was identified as the likely suspect. But an ally of Mr Cummings said the PM's former chief adviser had taken a treasure trove of material with him when he left Downing Street last year, including audio recordings of discussions with senior ministers and officials. 
'Dom has stuff on tape,' the ally said. 'They are mad to pick a fight with him because he will be able to back up a lot of his claims.'
Dom is an admirer of Bismarck. Never underestimate him.
"With a gentleman I am always a gentleman and a half, and when I have to do with a pirate, I try to be a pirate and a half."
Tories scramble to defend Johnson: Politics Weekly podcast (Guardian)

Note the media have no idea what is really going on, as usual.

Friday, April 23, 2021

How a Physicist Became a Climate Truth Teller: Steve Koonin


I read an early draft of Koonin's new book discussed in the WSJ article excerpted below, and I highly recommend it. 

Video above is from a 2019 talk discussed in this earlier post: Certainties and Uncertainties in our Energy and Climate Futures: Steve Koonin.
My own views (consistent, as far as I can tell, with what Steve says in the talk): 
1. Evidence for recent warming (~1 degree C) is strong. 
2. There exist previous eras of natural (non-anthropogenic) global temperature change of similar magnitude to what is happening now. 
3. However, it is plausible that at least part of the recent temperature rise is due to increase of atmospheric CO2 due to human activity. 
4. Climate models still have significant uncertainties. While the direct effect of CO2 IR absorption is well understood, second order effects like clouds, distribution of water vapor in the atmosphere, etc. are not under good control. The increase in temperature from a doubling of atmospheric CO2 is still uncertain to a factor of 2-3 and at the low range (e.g., 1.5 degree C) is not catastrophic. The direct effect of CO2 absorption is modest and at the low range (~1 degree C) of current consensus model predictions. Potentially catastrophic outcomes are due to second order effects that are not under good theoretical or computational control. 
5. Even if a catastrophic outcome is only a low probability tail risk, it is prudent to explore technologies that reduce greenhouse gas production. 
6. A Red Team exercise, properly done, would clarify what is certain and uncertain in climate science. 
Simply stating these views can get you attacked by crazy people.
Buy Steve's book for an accessible and fairly non-technical explanation of these points.
WSJ: ... Barack Obama is one of many who have declared an “epistemological crisis,” in which our society is losing its handle on something called truth. 
Thus an interesting experiment will be his and other Democrats’ response to a book by Steven Koonin, who was chief scientist of the Obama Energy Department. Mr. Koonin argues not against current climate science but that what the media and politicians and activists say about climate science has drifted so far out of touch with the actual science as to be absurdly, demonstrably false. 
This is not an altogether innocent drifting, he points out in a videoconference interview from his home in Cold Spring, N.Y. In 2019 a report by the presidents of the National Academies of Sciences claimed the “magnitude and frequency of certain extreme events are increasing.” The United Nations Intergovernmental Panel on Climate Change, which is deemed to compile the best science, says all such claims should be treated with “low confidence.” 
... Mr. Koonin, 69, and I are of one mind on 2018’s U.S. Fourth National Climate Assessment, issued in Donald Trump’s second year, which relied on such overegged worst-case emissions and temperature projections that even climate activists were abashed (a revolt continues to this day). “The report was written more to persuade than to inform,” he says. “It masquerades as objective science but was written as—all right, I’ll use the word—propaganda.” 
Mr. Koonin is a Brooklyn-born math whiz and theoretical physicist, a product of New York’s selective Stuyvesant High School. His parents, with less than a year of college between them, nevertheless intuited in 1968 exactly how to handle an unusually talented and motivated youngster: You want to go cross the country to Caltech at age 16? “Whatever you think is right, go ahead,” they told him. “I wanted to know how the world works,” Mr. Koonin says now. “I wanted to do physics since I was 6 years old, when I didn’t know it was called physics.” 
He would teach at Caltech for nearly three decades, serving as provost in charge of setting the scientific agenda for one of the country’s premier scientific institutions. Along the way he opened himself to the world beyond the lab. He was recruited at an early age by the Institute for Defense Analyses, a nonprofit group with Pentagon connections, for what he calls “national security summer camp: meeting generals and people in congress, touring installations, getting out on battleships.” The federal government sought “engagement” with the country’s rising scientist elite. It worked. 
He joined and eventually chaired JASON, an elite private group that provides classified and unclassified advisory analysis to federal agencies. (The name isn’t an acronym and comes from a character in Greek mythology.) He got involved in the cold-fusion controversy. He arbitrated a debate between private and government teams competing to map the human genome on whether the target error rate should be 1 in 10,000 or whether 1 in 100 was good enough. 
He began planting seeds as an institutionalist. He joined the oil giant BP as chief scientist, working for John Browne, now Baron Browne of Madingley, who had redubbed the company “Beyond Petroleum.” Using $500 million of BP’s money, Mr. Koonin created the Energy Biosciences Institute at Berkeley that’s still going strong. Mr. Koonin found his interest in climate science growing, “first of all because it’s wonderful science. It’s the most multidisciplinary thing I know. It goes from the isotopic composition of microfossils in the sea floor all the way through to the regulation of power plants.” 
From deeply examining the world’s energy system, he also became convinced that the real climate crisis was a crisis of political and scientific candor. He went to his boss and said, “John, the world isn’t going to be able to reduce emissions enough to make much difference.” 
Mr. Koonin still has a lot of Brooklyn in him: a robust laugh, a gift for expression and for cutting to the heart of any matter. His thoughts seem to be governed by an all-embracing realism. Hence the book coming out next month, Unsettled: What Climate Science Tells Us, What It Doesn’t, and Why It Matters.
Any reader would benefit from its deft, lucid tour of climate science, the best I’ve seen. His rigorous parsing of the evidence will have you questioning the political class’s compulsion to manufacture certainty where certainty doesn’t exist. You will come to doubt the usefulness of centurylong forecasts claiming to know how 1% shifts in variables will affect a global climate that we don’t understand with anything resembling 1% precision. ...

Note Added from comments:

If you're older like Koonin or myself you can remember a time when climate change was entirely devoid of tribal associations -- it was not in the political domain at all. It is easier for us just to concentrate on where the science is, and indeed we can remember where it was in the 1990s or 2000s.

Koonin was MUCH more concerned about alternative energy and climate than the typical scientist and that was part of his motivation for supporting the Berkeley Energy Biosciences Institute, created in 2007. The fact that it was a $500M partnership between Berkeley and BP was a big deal and much debated at the time, but there was never any evidence that the science they did was negatively impacted. 

It is IRONIC that his focus on scientific rigor now gets him labeled as a climate denier (or sympathetic to the "wrong" side). ALL scientists should be sceptical, especially about claims regarding long term prediction in complex systems.

Contrast the uncertainty estimates in the IPCC reports (which are not defensible and did not change for ~20y!) vs the (g-2) anomaly that was in the news recently.

When I was at Harvard the physics department and applied science and engineering school shared a coffee lounge. I used to sit there and work in the afternoon and it happened that one of the climate modeling labs had their group meetings there. So for literally years I overheard their discussions about uncertainties concerning water vapor, clouds, etc. which to this day are not fully under control. This is illustrated in Fig1 at the link: https://infoproc.blogspot.c...

The gap between what real scientists say in private and what the public (or non-specialists) gets second hand through the media or politically-focused "scientific policy reports" is vast...

If you don't think we can have long-lasting public delusions regarding "settled science" (like a decade long stock or real estate bubble), look up nuclear winter, which has a lot of similarities to greenhouse gas-driven climate change. Note, I am not claiming that I know with high confidence that nuclear winter can't happen, but I AM claiming that the confidence level expressed by the climate scientists working on it at the time was absurd and communicated in a grotesquely distorted fashion to political leaders and the general public. Even now I would say the scientific issue is not settled, due to its sheer complexity, which is LESS than the complexity involved in predicting long term climate change! 

Sunday, April 18, 2021

Francois Chollet - Intelligence and Generalization, Psychometrics for Robots (AI/ML)


If you have thought a lot about AI and deep learning you may find much of this familiar. Nevertheless I enjoyed the discussion. Apparently Chollet's views (below) are controversial in some AI/ML communities but I do not understand why. 

Chollet's Abstraction and Reasoning Corpus (ARC) = Raven's Matrices for AIs :-)
Show Notes: 
...Francois has a clarity of thought that I've never seen in any other human being! He has extremely interesting views on intelligence as generalisation, abstraction and an information conversion ratio. He wrote on the measure of intelligence at the end of 2019 and it had a huge impact on my thinking. He thinks that NNs can only model continuous problems, which have a smooth learnable manifold and that many "type 2" problems which involve reasoning and/or planning are not suitable for NNs. He thinks that many problems have type 1 and type 2 enmeshed together. He thinks that the future of AI must include program synthesis to allow us to generalise broadly from a few examples, but the search could be guided by neural networks because the search space is interpolative to some extent. 
Tim Intro [00:00:00]
Manifold hypothesis and interpolation [00:06:15]
Yann LeCun skit [00:07:58]
Discrete vs continuous [00:11:12]
NNs are not turing machines [00:14:18]
Main show kick-off [00:16:19]
DNN models are locally sensitive hash tables and only efficiently encode some kinds of data well [00:18:17]
Why do natural data have manifolds? [00:22:11]
Finite NNs are not "turing complete" [00:25:44]
The dichotomy of continuous vs discrete problems, and abusing DL to perform the former [00:27:07]
Reality really annoys a lot of people, and ...GPT-3 [00:35:55]
There are type one problems and type 2 problems, but...they are enmeshed [00:39:14]
Chollet's definition of intelligence and how to construct analogy [00:41:45]
How are we going to combine type 1 and type 2 programs? [00:47:28]
Will topological analogies be robust and escape the curse of brittleness? [00:52:04]
Is type 1 and 2 two different physical systems? Is there a continuum? [00:54:26]
Building blocks and the ARC Challenge [00:59:05]
Solve ARC == intelligent? [01:01:31]
Measure of intelligence formalism -- it's a whitebox method [01:03:50]
Generalization difficulty [01:10:04]
Lets create a marketplace of generated intelligent ARC agents! [01:11:54]
Mapping ARC to psychometrics [01:16:01]
Keras [01:16:45]
New backends for Keras? JAX? [01:20:38]
Intelligence Explosion [01:25:07]
Bottlenecks in large organizations [01:34:29]
Summing up the intelligence explosion [01:36:11]
Post-show debrief [01:40:45]
This is Chollet's paper which is the focus of much of the discussion.
On the Measure of Intelligence 
François Chollet 
To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to "buy" arbitrary levels of skills for a system, in a way that masks the system's own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
Notes on the paper by Robert Lange (TU-Berlin), including illustrations like the ones below.

Friday, April 16, 2021

Academic Freedom in Crisis: Punishment, Political Discrimination, and Self-Censorship

Last week MSU hosted a virtual meeting on Freedom of Speech and Intellectual Diversity on Campus. I particularly enjoyed several of the talks, including the ones by Randall Kennedy (Harvard), Conor Friedersdorf (The Atlantic), and Cory Clark (UPenn). Clark had some interesting survey data I had never seen before. I hope the video from the meeting will be available soon. 

In the meantime, here are some survey results from Eric Kaufmann (University of London). The full report is available at the link.

In this recent podcast interview Kaufmann discusses the woke takeover of academia and other institutions.

Stylized facts:

1. Academia has always been predominantly left, but has become more and more so over time. This imbalance is stronger in Social Science and Humanities (SSH) than in STEM, but even in STEM the faculty are predominantly left of center relative to the general population.

2. Leftists are becoming more and more intolerant of opposing views.

3. Young academics (PhD students and junior faculty) are the least tolerant of all.

In my opinion the unique importance of research universities originates from their commitment to the search for Truth. This commitment is being supplanted by a focus on social justice, with extremely negative consequences.

Figure 1. Note: Excludes STEM academics. Labels refer to hypothetical scenarios in which respondents are asked whether they would support a campaign to dismiss a staff member who found the respective conclusions in their research. Brackets denote sample size.


Figure 2. Note: Includes STEM academics. Based on a direct question rather than a concealed list technique.


Figure 3. Note: SSH refers to social sciences and humanities. Sample size in brackets. STEM share of survey responses: US and Canada academic: 10%; UK mailout: zero; UK YouGov SSH active: zero; UK YouGov All: 53%; UK PhDs: 55%; North American PhDs: 63%.

Thursday, April 08, 2021

Freedom of Speech and Intellectual Diversity on Campus (MSU virtual conference)

The LeFrak Forum On Science, Reason, and Modern Democracy 
Department of Political Science 
Michigan State University 

Register here!

Thursday, April 8 -- Saturday, April 10; on ZOOM 
Conference Program: 
Keynote Address - Thursday, April 8, 
5:00-6:30pm EST 
Randall Kennedy, "The Race Question and Freedom of Expression." 
Randall Kennedy is the Michael R. Klein Professor at Harvard Law School and a preeminent authority on the First Amendment in its relation to the American struggle for civil rights.


Day One: Intellectual Diversity - Friday, April 9  
11:30am - 1:00pm EST 
Panel 1: What are the empirical facts about lack of intellectual diversity in academia and what are the causes of existing imbalances? 
Paper: Lee Jussim, Distinguished Professor and Chair, Department of Psychology, Rutgers University, author of The Politics of Social Psychology. 
Discussant: Philip Tetlock, Annenberg University Professor, University of Pennsylvania, author of “Why so few conservatives and should we care?” and Cory Clark, Visiting Scholar, Department of Psychology, University of Pennsylvania, author of “Partisan Bias and its Discontents.” 
2:00pm - 3:30pm EST 
Panel 2: In what precise ways and to what degree is this imbalance a problem? 
Paper: Joshua Dunn, Professor and Chair, Department of Political Science, University of Colorado, co-author of Passing on the Right: Conservative Professors in the Progressive University. 
Discussant: Amna Khalid, Associate Professor of History, Carleton College, author of “Not A Vast Right-Wing Conspiracy: Why Left-Leaning Faculty Should Care About Threats to Free Expression on Campus." 
4:00pm - 5:45pm EST 
Panel 3: What is To Be Done? 
Paper: Musa Al-Gharbi, Paul F. Lazarsfeld Fellow in Sociology, Columbia University and Managing Editor, Heterodox Academy, author of “Why Care About Ideological Diversity in Social Research? The Definitive Response.” 
Paper: Conor Friedersdorf, Staff writer at The Atlantic and frequent contributor to its special series “The Speech Wars,” author of “Free Speech Will Survive This Moment.”


Day Two: Freedom of Speech - Saturday, April 10 
11:30am - 1:00pm EST 
Panel 1: An empirical accounting of the recent challenges to free speech on campus from left and right. What is the true character of the problem or problems here and do they constitute a “crisis”? 
Paper: Jonathan Marks, Professor and Chair, Department of Politics and International Relations, Ursinus College, author of Let's Be Reasonable: A Conservative Case for Liberal Education. 
Respondent: April Kelly-Woessner, Dean of the School of Public Service and Professor of Political Science at Elizabethtown College, author of The Still Divided Academy 
2:00pm - 3:45pm EST 
Panel 2: But is Free speech, as traditionally interpreted, even the right ideal? -- a Debate 
Ulrich Baer, University Professor of Comparative Literature, German, and English, NYU, author of What Snowflakes Get Right: Free Speech and Truth on Campus 
Keith Whittington, Professor of Politics, Princeton University, author of Speak Freely: Why Universities Must Defend Free Speech. 
4:30pm - 6:15pm EST  
Panel 3: What is To Be Done? 
Paper: Nancy Costello, Associate Clinical Professor of Law, MSU. Founder and Director of the First Amendment Law Clinic -- the only law clinic in the nation devoted to the defense of student press rights. Also, Director of the Free Expression Online Library and Resource Center. 
Paper: Jonathan Friedman, Project Director for campus free speech at PEN America – “a program of advocacy, analysis, and outreach in the national debate around free speech and inclusion at colleges and universities.”

Monday, April 05, 2021

Machine Learning Prediction of Biomarkers from SNPs and of Disease Risk from Biomarkers in the UK Biobank

These new results arose from initial investigations of blood biomarker predictions from DNA. The lipoprotein A predictor we built correlates almost 0.8 with the measured result, and this agreement would probably be even stronger if day to day fluctuations were averaged out. It is the most accurate genomic predictor for a complex trait that we are aware of.

We then became interested in the degree to which biomarkers alone could be used to predict disease risk. Some of the biomarker-based disease risk predictors we built (e.g., for kidney or liver problems) do not, as far as we know, have widely used clinical counterparts. Further research may show that predictors of this kind have broad utility. 

Statistical learning in a space of ~50 biomarkers is considered a "high dimensional" problem from the perspective of medical diagnosis; however, compared to genomic prediction using a million SNP features, it is rather straightforward. 
Machine Learning Prediction of Biomarkers from SNPs and of Disease Risk from Biomarkers in the UK Biobank  
Erik Widen, Timothy G. Raben, Louis Lello, Stephen D.H. Hsu 
We use UK Biobank data to train predictors for 48 blood and urine markers such as HDL, LDL, lipoprotein A, glycated haemoglobin, ... from SNP genotype. For example, our predictor correlates  ~ 0.76 with lipoprotein A level, which is highly heritable and an independent risk factor for heart disease. This may be the most accurate genomic prediction of a quantitative trait that has yet been produced (specifically, for European ancestry groups). We also train predictors of common disease risk using blood and urine biomarkers alone (no DNA information). Individuals who are at high risk (e.g., odds ratio of > 5x population average) can be identified for conditions such as coronary artery disease (AUC ~ 0.75), diabetes (AUC ~ 0.95), hypertension, liver and kidney problems, and cancer using biomarkers alone. Our atherosclerotic cardiovascular disease (ASCVD) predictor uses ~10 biomarkers and performs in UKB evaluation as well as or better than the American College of Cardiology ASCVD Risk Estimator, which uses quite different inputs (age, diagnostic history, BMI, smoking status, statin usage, etc.). We compare polygenic risk scores (risk conditional on genotype: (risk score | SNPs)) for common diseases to the risk predictors which result from the concatenation of learned functions (risk score | biomarkers) and (biomarker | SNPs).
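
To make the last sentence of the abstract concrete, here is a minimal toy sketch of what "concatenation of learned functions" could look like: a (biomarker | SNPs) function is fed into a (risk score | biomarkers) function, and the resulting SNP-only risk score is compared by AUC with the score computed from the measured biomarker. Everything here is a hypothetical illustration: the SNP weights, the logistic threshold, the six-person cohort, and the function names are invented, and this is not the method or data used in the paper.

```python
import math
from typing import Callable, Sequence

# Hypothetical "learned" parameters; in practice these would come from training.
SNP_WEIGHTS = [0.5, -0.3, 0.8]   # effect per copy of each of three SNPs (assumed)
INTERCEPT = 1.0                  # baseline biomarker level (assumed)


def biomarker_from_snps(genotype: Sequence[int]) -> float:
    """(biomarker | SNPs): linear predictor on allele counts 0, 1 or 2."""
    return INTERCEPT + sum(w * g for w, g in zip(SNP_WEIGHTS, genotype))


def risk_from_biomarker(level: float) -> float:
    """(risk score | biomarkers): logistic score, midpoint 2.0 chosen arbitrarily."""
    return 1.0 / (1.0 + math.exp(-(level - 2.0)))


def compose(outer: Callable, inner: Callable) -> Callable:
    """Concatenate two learned functions: x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random control (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Invented toy cohort: (genotype, measured biomarker, disease label).
COHORT = [
    ((2, 0, 2), 3.9, 1),
    ((0, 2, 0), 0.7, 0),
    ((1, 1, 1), 1.5, 0),
    ((1, 0, 2), 2.4, 1),
    ((2, 2, 0), 1.9, 0),
    ((0, 0, 1), 2.2, 1),
]

risk_from_snps = compose(risk_from_biomarker, biomarker_from_snps)
labels = [y for _, _, y in COHORT]

auc_biomarker = auc([risk_from_biomarker(b) for _, b, _ in COHORT], labels)
auc_snps = auc([risk_from_snps(g) for g, _, _ in COHORT], labels)

print(f"AUC using measured biomarker: {auc_biomarker:.3f}")
print(f"AUC using SNP-predicted biomarker: {auc_snps:.3f}")
```

In this toy cohort the SNP-only score ranks individuals somewhat less well than the score built on the measured biomarker, which is what one would expect when genotype explains only part of the biomarker's variance. A directly trained polygenic risk score would be a third function, (risk score | SNPs), to compare against the composed one.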
