## Monday, June 14, 2021

### Japan and The Quad (Red Line geostrategy podcast)

I recommend this episode of The Red Line geostrategy podcast: Japan and The Quad

Serious analysts from the Asia-Pacific region (e.g., Australia, India, Japan, etc.) are often much better than their US counterparts. US analysts tend to misperceive local political and economic realities, and can be captives of Washington DC groupthink (e.g., about weapons systems like aircraft carriers or the F35 or missile defense).

For example, Australian analysts acknowledged the vulnerability of US aircraft carriers to PRC ASBM and cruise missiles well before it became common for US analysts to openly admit the problem. The earliest technical analysis I could find of PRC satellite capability to track US surface ships in the Pacific came from an Indian military think tank (see maps below), at a time when many US "experts" denied that it was possible.

In this podcast Japan's reliance on sea lanes for energy, food, and raw materials is given proper emphasis. Japan imports ~60% of its food calories and essentially all of its oil. The situation is similar for S. Korea and Taiwan. It is important to note that blocking sea transport to Taiwan and Japan does not require PLAN blue water dominance. ASBM and cruise missiles which threaten aircraft carriers can also hold oil tankers and global shipping at risk from launch sites which are on or near the Asian mainland. Missile + drone + AI/ML technology completely alters the nature of sea blockade, but most strategic planners do not yet realize this. Serious conflict in this region would likely wreak havoc on the economies of Taiwan, S. Korea, and Japan.

Red Line: ... In seeking to counter an ever-expanding China, Tokyo is turning abroad in search of allies. Key to this is the recent revival of "The Quad", a strategic dialogue between the US, Australia, Japan and India. Will it be enough to counter their rising neighbour across the East China Sea? Is this the first step to creating an "Asian NATO", and how will China respond?
Guests:
Owen Swift
Geopolitics and defence analyst specialising in Australian & East Asian Defence. Has written with organisations including The Australian Strategic Policy Institute and Monash University. Senior Producer and resident Asia-Pacific expert at The Red Line.
John Nilsson-Wright
Senior Lecturer on Japanese Geopolitics and International Relations for Cambridge University. Senior Research Fellow for Northeast Asia for Chatham House. Author of the book Unequal Allies about the post-war relationship between Japan and the United States.
John Coyne
Head of the Northern Australia Strategic Policy Centre at the Australian Strategic Policy Institute (ASPI). Head of Strategic Policing and Law Enforcement at ASPI. One of the most trusted experts when it comes to the dynamics of East Asia for Australia and the United States.

Part 1: The Return from Armageddon (02:52) Owen Swift overviews Japan's place in East Asia and the fundamental geographic challenges that inform its geopolitics. We tackle Japan's inability to domestically provide the resources and food that its population needs, and how it has historically dealt with this insecurity. The consequences of World War 2 wreaked havoc on Japan's economy, political system and territorial holdings. We analyse the short and long term consequences of this, and seek to understand why it was that Japan took a positive view of the US occupation, comparing it to the option of a possible partial USSR occupation. In the 1980s some thought Japan was on a path to overtake the US economically. While that hasn't come to pass, we look at what it was that made Japan's economic miracle, and the effect that US involvement has today. We look at domestic issues in Japan, including the drastic demographic decline, their ongoing 'defensive only' posture, and the policy options on the table for balancing against the rise of China. Finally we overview Japan's involvement in the Asia-Pacific region as a whole, analysing who it has the best relations with. We look at the extensive investments and infrastructure development Japan is undertaking in ASEAN states, and its cooperation with India and Australia in recent years.
Part 2: The Grand Dilemma (17:56) John Nilsson-Wright helps us understand the fundamental shifts in Japanese politics and foreign policy, including Article 9, the tension between the Yoshida doctrine, public opinion and US pressure within Japan, and the country's re-entry into the sphere of great power competition. We examine the extent of Japan's military presence in the Indo-Pacific; looking at its exercises with other powers and the concerns its neighbours have, some of whom still bear significant scars from World War 2. South Korea's relationship with Japan is one that, on the surface, seems like it should be closer than it is. We analyse why it is that despite their mutual interest in countering North Korea and China, their close geography and both being under the US umbrella, the two states have been unable to overcome enormous domestic resentment and historic scars. Japan's constitution has very tight constraints on what it can do militarily. Nilsson-Wright helps us understand the details of these restrictions and their history over the past few decades. We look at how the legal interpretation of the article has changed as Japan's needs have changed. We also look at Japan's expanding concept of national interest, which began as a purely defensive, geographically limited concept, but that has continued to expand in recent years. We contrast that with the difficulty the government has had with domestic views of Japan's role on the global stage. We tackle territorial issues including the Kuril and Sakhalin Islands, and look at Japan's role in a potential Taiwanese conflict.
Part 3: A United Front? (43:48) John Coyne takes us through the details of the Quad, and the roles that its constituent members play. We look at Japan's re-examination of their supply chains, their development of strategic depth and the recent news that they are considering abolishing the 1% of GDP cap on military spending. Coyne helps us understand what the Quad actually is - It is not and does not seek to be an Asian NATO - just as ASEAN is not an Asian EU. We look at what the limitations are for each of the states involved, particularly India. We look at the actual relationships and cooperation that has been seen between Quad members. With Japan's newfound willingness to be involved in military operations, we examine how closely they will work with Australia, India and the United States, and the extent to which the Quad is more than just symbolic. We then turn to China's response. Is China likely to seek a grouping like the Quad in opposition to it? Can the Quad actually contain China or its navy in any practical sense? What will China do if cooperation tightens? We look at how China has already sought to hit back, targeting Australia in particular with the "14 Grievances", which were delivered as a consequence of the deterioration of their relationship. Australia's membership and participation in the Quad is a key part of this deterioration. Finally we look at how the Quad members have worked to strategically separate from China, such as Japan's work to defeat China's monopoly on the rare earths industry.

The map below appeared in the 2017 blog post On the military balance of power in the Western Pacific.

Here is another map:

Excerpt below from China’s Constellation of Yaogan Satellites and the Anti-Ship Ballistic Missile – An Update, International Strategic and Security Studies Programme (ISSSP), National Institute of Advanced Studies (NIAS -- India), December 2013. With present technology it is easy to launch LEO (Low Earth Orbit) micro-satellites on short notice to track ships, but PRC has had a much more sophisticated system in place for almost a decade.
Authors: Professor S. Chandrashekar and Professor Soma Perumal
We can state with confidence that the Yaogan satellite constellation and its associated ASBM system provide visible proof of Chinese intentions and capabilities to keep ACG strike groups well away from the Chinese mainland.
Though the immediate purpose of the system is to deter the entry of a hostile aircraft carrier fleet into waters that directly threatens its security interests especially during a possible conflict over Taiwan, the same approach can be adopted to deter entry into other areas of strategic interest
Viewed from this perspective the Chinese do seem to have in place an operational capability for denying or deterring access into areas which it sees as crucial for preserving its sovereignty and security.

Bonus: This political cartoon about the G7 meeting has been widely shared in the sinosphere. Some of the esoteric meaning may be lost on a US audience, but see here for an explanation. Note the device on the table which turns toilet paper into US dollars. The Japanese dog is serving radioactive water to the co-conspirators. Italy, the BRI participant, refuses the drink. France and Germany seem to be thinking about it carefully. Who is the little frog? (Hint: NTD)

## Sunday, June 13, 2021

### An Inconvenient Minority: The Attack on Asian American Excellence and the Fight for Meritocracy (Kenny Xu)

Kenny Xu is a brave young man. His new book An Inconvenient Minority: The Attack on Asian American Excellence and the Fight for Meritocracy expertly documents a number of unpleasant facts about American society that most major media outlets, education leaders, and social justice advocates have been obfuscating or outright suppressing for decades.

1. Asian Americans (not foreign students from Asia, but individuals of Asian heritage who are US citizens or permanent residents) have been discriminated against in admission to elite institutions of higher education for over 30 years.

To put it bluntly, Asian Americans must, on average, outperform all other groups in order to have an equal chance of admission to universities like Harvard or Yale. If one were to replace Asian Americans with Jews in the previous sentence, it would describe the situation in the early 20th century. Looking back, we are rightfully ashamed and outraged at the conduct of elite universities during this period. Future Americans, and observers all over the world, will eventually have the same reaction to how Asian Americans are treated today by these same institutions.

2. Asian American success, e.g., as measured using metrics such as income, wealth, or education, is problematic for simplistic narratives that emphasize race and "white supremacy" over a more realistic and multifaceted analysis of American society.

3. Efforts to guarantee equal outcomes, as opposed to equal opportunities, are anti-meritocratic and corrosive to social cohesion, undermine basic notions of fairness, and handicap the United States in scientific and technological competition with other nations.

The Table of Contents, reproduced below, gives an idea of the important topics covered. Xu had an insider's view of the Students for Fair Admissions v. Harvard trial, now awaiting appeal to the Supreme Court. He also describes the successful effort by a grassroots coalition of Asian Americans to defeat CA Proposition 16, which would have reinstated the racial preferences in the public sector (including college admissions) that were prohibited by Proposition 209 in 1996.

Over the years I have had many conversations on this topic with well-meaning (but often poorly informed) parents of all ethnic and cultural backgrounds. I cannot help but ask these people
Are you OK with discrimination against your child? What did they do to deserve it?
Are you going to let virtue-signaling administrators at the university devalue the hard work and hard-won accomplishments of your son or daughter? Are you going to do anything about it?
and I cannot help but think
If you won't do anything about it, then f*ck you. Your kids deserve better parents.

Kenny calls it a Fight for Meritocracy. That's what it is -- a fight. Don't forget that Meritocracy is just a fancy word for fairness. It's a fight for your kid, and all kids, to be treated fairly.

I highly recommend the book. These issues are of special concern to Asian Americans, but should be of interest to anyone who wants to know what is really happening in American education today.

Related posts: discrimination against Asian Americans at elite US universities, on meritocracy, and UC faculty report on the use of SAT in admissions.

## Thursday, June 03, 2021

### Macroscopic Superpositions in Isolated Systems (talk video + slides)

This is a video of a talk based on the paper
Macroscopic Superpositions in Isolated Systems
R. Buniy and S. Hsu
arXiv:2011.11661, to appear in Foundations of Physics
For any choice of initial state and weak assumptions about the Hamiltonian, large isolated quantum systems undergoing Schrodinger evolution spend most of their time in macroscopic superposition states. The result follows from von Neumann's 1929 Quantum Ergodic Theorem. As a specific example, we consider a box containing a solid ball and some gas molecules. Regardless of the initial state, the system will evolve into a quantum superposition of states with the ball in macroscopically different positions. Thus, despite their seeming fragility, macroscopic superposition states are ubiquitous consequences of quantum evolution. We discuss the connection to many worlds quantum mechanics.
Slides for the talk.

See this earlier post about the paper:
It may come as a surprise to many physicists that Schrodinger evolution in large isolated quantum systems leads generically to macroscopic superposition states. For example, in the familiar Brownian motion setup of a ball interacting with a gas of particles, after sufficient time the system evolves into a superposition state with the ball in macroscopically different locations. We use von Neumann's 1929 Quantum Ergodic Theorem as a tool to deduce this dynamical result.

The natural state of a complex quantum system is a superposition ("Schrodinger cat state"!), absent mysterious wavefunction collapse, which has yet to be fully defined either in logical terms or explicit dynamics. Indeed wavefunction collapse may not be necessary to explain the phenomenology of quantum mechanics. This is the underappreciated meaning of work on decoherence dating back to Zeh and Everett. See talk slides linked here, or the introduction of this paper.

We also derive some new (sharper) concentration of measure bounds that can be applied to small systems (e.g., fewer than 10 qubits).

## Wednesday, May 26, 2021

### How Dominic Cummings And The Warner Brothers Saved The UK

Photo above shows the white board in the Prime Minister's office which Dominic Cummings and team (including the brothers Marc and Ben Warner) used to convince Boris Johnson to abandon the UK government COVID herd immunity plan and enter lockdown. Date: March 13 2020.

Only now can the full story be told. In early 2020 the UK government had a COVID herd immunity plan in place that would have resulted in disaster. The scientific experts (SAGE) advising the government strongly supported this plan -- there are public, on the record briefings to this effect. These are people who are not particularly good at order of magnitude estimates and first-principles reasoning.

Fortunately Dom was advised by the brothers Marc and Ben Warner (both physics PhDs, now working in AI and data science), DeepMind founder Demis Hassabis, Fields Medalist Tim Gowers, and others. In the testimony (see ~23m, ~35m, ~1h02m, ~1h06m in the video below) he describes the rather dramatic events that led to a switch from the original herd immunity plan to a lockdown Plan B. More details in this tweet thread.

I checked my emails with Dom during February and March, and they confirm his narrative. I wrote the March 9 blog post Covid-19 Notes in part for Dom and his team, and I think it holds up over time. Tim Gowers' document reaches similar conclusions.

Seven hours of riveting Dominic Cummings testimony from earlier today.

Shorter summary video (Channel 4). Summary live-blog from the Guardian.

This is a second white board used in the March 14 meeting with Boris Johnson:

## Saturday, May 22, 2021

### Feynman Lectures on the Strong Interactions (Jim Cline notes)

Professor James Cline (McGill University) recently posted a set of lecture notes from Feynman's last Caltech course, on quantum chromodynamics. Cline, then a graduate student, was one of the course TAs and the notes were meant to be assembled into a monograph. Thanks to Tim Raben for pointing these out to me.

The content seems a bit more elementary than in John Preskill's Ph230abc, a special topics course on QCD taught in 1983-4. I still consider John's notes to be one of the best overviews of nonperturbative aspects of QCD, which is a rather deep subject. However, as Cline remarks, there is unsurprisingly something special about the lectures: Feynman was an inspiring teacher, presenting everything in an incisive and fascinating way, that obviously had his own mark on it.

The material on QFT in non-integer spacetime dimensions is, as far as I know, original to Feynman. Dimensional regularization of gauge theory was popularized by 't Hooft and Veltman, but the analytic continuation to d = 4 - ε is specific to the loop integrals (i.e., concrete mathematical expressions) that appear in perturbation theory. Here Feynman is, more ambitiously, exploring whether the quantum gauge theory itself can be meaningfully extended to a non-integer number of spacetime dimensions.
Feynman Lectures on the Strong Interactions
Richard P. Feynman, James M. Cline
These twenty-two lectures, with exercises, comprise the extent of what was meant to be a full-year graduate-level course on the strong interactions and QCD, given at Caltech in 1987-88. The course was cut short by the illness that led to Feynman's death. Several of the lectures were finalized in collaboration with Feynman for an anticipated monograph based on the course. The others, while retaining Feynman's idiosyncrasies, are revised similarly to those he was able to check. His distinctive approach and manner of presentation are manifest throughout. Near the end he suggests a novel, nonperturbative formulation of quantum field theory in D dimensions. Supplementary material is provided in appendices and ancillary files, including verbatim transcriptions of three lectures and the corresponding audiotaped recordings.
The image below is from some of Feynman's handwritten notes (in this case, about the Gribov ambiguity in Faddeev-Popov gauge fixing) that Cline included in the manuscript. There are also links to audio from some of the lectures. As in some earlier notebooks, Feynman sometimes writes "guage" instead of gauge.

## Sunday, May 16, 2021

### Ditchley Foundation meeting: China Today and Tomorrow

China Today and Tomorrow
20 MAY 2021 - 21 MAY 2021
This Ditchley conference will focus on China, its internal state and sense of self today, its role in the region and world, and how these might evolve in years to come.
There are broadly two current divergent narratives about China. The first is that China’s successful response to the pandemic has accelerated China’s ascent to be the world’s pre-eminent economic power. The Made in China 2025 strategy will also see China take the lead in some technologies beyond 5G, become self-sufficient in silicon chip production and free itself largely of external constraints on growth. China’s internal market will grow, lessening dependence on exports and that continued growth will maintain the bargain between the Chinese people and the Chinese Communist Party through prosperity and stability. Retaining some elements of previous Chinese strategy though, this confidence is combined with a degree of humility: China is concerned with itself and its region, not becoming a global superpower or challenging the US. Economic supremacy is the aim but military strategy remains focused on defence, not increasing international leverage or scope of action.
The second competing narrative is that China’s position is more precarious than it appears. The Belt and Road Initiative will bring diplomatic support from client countries but not real economic gains. Human rights violations will damage China abroad. Internally the pressures on natural resources will prove hard to sustain. Democratic and free-market innovation, combined with a bit more industrial strategy, will outstrip China’s efforts. Careful attention to supply chains in the West will meanwhile reduce critical reliance on China and curb China’s economic expansion. This perceived fragility is often combined though with a sense of heightened Chinese ambition abroad, not just through the Belt and Road Initiative but in challenging the democratic global norms established since 1989 by presenting technologically-enabled and effective authoritarian rule as an alternative model for the world, rather than just a Chinese solution.
What is the evidence today for where we should settle between these narratives? What trends should we watch to determine likely future results? ...
Unfortunately this meeting will be virtual. The video below gives some sense of the unique charm of in-person workshops at Ditchley.

... analysis by German academic Gunnar Heinsohn. Two of his slides appear below.
1. It is possible that by 2050 the highly able STEM workforce in PRC will be ~10x larger than in the US and comparable to or larger than the rest of the world combined. Here "highly able" means roughly top few percentile math ability in developed countries (e.g., EU), as measured by PISA at age 15.
[ It is trivial to obtain this kind of estimate: PRC population is ~4x US population and fraction of university students in STEM is at least ~2x higher. Pool of highly able 15 year olds as estimated by PISA or TIMSS international testing regimes is much larger than in US, even per capita. Heinsohn's estimate is somewhat high because he uses PISA numbers that probably overstate the population fraction of Level 6 kids in PRC. Current PISA studies disproportionately sample from more developed areas of China. At bottom (asterisk) he uses results from Taiwan/Macau that give a smaller ~20x advantage of PRC vs USA. My own ~10x estimate is quite conservative in comparison. ]

2. The trajectory of international patent filings shown below is likely to continue.
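
The back-of-envelope estimate in item 1 is just a product of ratios, and can be sketched in a few lines. The function name and the per-capita factor of 1.25 are hypothetical placeholders of mine (the bracketed note only says the per-capita pool is larger, without giving a number); the ~4x and ~2x figures are the ones quoted there.

```python
def stem_workforce_ratio(population_ratio, stem_fraction_ratio,
                         per_capita_ability_ratio=1.0):
    """Ratio of highly able STEM workers (PRC / US) as a product of ratios."""
    return population_ratio * stem_fraction_ratio * per_capita_ability_ratio


# Figures quoted in the note: ~4x population, at least ~2x STEM share.
baseline = stem_workforce_ratio(4.0, 2.0)

# Hypothetical per-capita edge of 1.25 (an assumption, not a number from the post).
with_ability_edge = stem_workforce_ratio(4.0, 2.0, 1.25)

print(baseline, with_ability_edge)
```

With no per-capita advantage at all the product is already 8x, so a ~10x figure only needs a modest additional factor.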

## Wednesday, May 12, 2021

### Neural Tangent Kernels and Theoretical Foundations of Deep Learning

A colleague recommended this paper to me recently. See also earlier post Gradient Descent Models Are Kernel Machines.
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot, Franck Gabriel, Clément Hongler
At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: during gradient descent on the parameters of an ANN, the network function fθ (which maps input vectors to output vectors) follows the kernel gradient of the functional cost (which is convex, in contrast to the parameter cost) w.r.t. a new kernel: the Neural Tangent Kernel (NTK). This kernel is central to describe the generalization features of ANNs. While the NTK is random at initialization and varies during training, in the infinite-width limit it converges to an explicit limiting kernel and it stays constant during training. This makes it possible to study the training of ANNs in function space instead of parameter space. Convergence of the training can then be related to the positive-definiteness of the limiting NTK. We prove the positive-definiteness of the limiting NTK when the data is supported on the sphere and the non-linearity is non-polynomial. We then focus on the setting of least-squares regression and show that in the infinite-width limit, the network function fθ follows a linear differential equation during training. The convergence is fastest along the largest kernel principal components of the input data with respect to the NTK, hence suggesting a theoretical motivation for early stopping. Finally we study the NTK numerically, observe its behavior for wide networks, and compare it to the infinite-width limit.
The results are remarkably well summarized in the Wikipedia entry on Neural Tangent Kernels:

For most common neural network architectures, in the limit of large layer width the NTK becomes constant. This enables simple closed form statements to be made about neural network predictions, training dynamics, generalization, and loss surfaces. For example, it guarantees that wide enough ANNs converge to a global minimum when trained to minimize an empirical loss. ...

An Artificial Neural Network (ANN) with scalar output consists in a family of functions $f(\cdot; \theta): \mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$ parametrized by a vector of parameters $\theta \in \mathbb{R}^{P}$.

The Neural Tangent Kernel (NTK) is a kernel $\Theta: \mathbb{R}^{n_{\mathrm{in}}} \times \mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$ defined by

$\displaystyle \Theta(x, y; \theta) = \sum_{p=1}^{P} \partial_{\theta_p} f(x; \theta)\, \partial_{\theta_p} f(y; \theta).$
In the language of kernel methods, the NTK $\Theta$ is the kernel associated with the feature map $\left(x \mapsto \partial_{\theta_p} f(x; \theta)\right)_{p=1,\ldots,P}$.

For a dataset $(x_i)_{i=1,\ldots,n} \subset \mathbb{R}^{n_{\mathrm{in}}}$ with scalar labels $(z_i)_{i=1,\ldots,n} \subset \mathbb{R}$ and a loss function $c: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, the associated empirical loss, defined on functions $f: \mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$, is given by

$\displaystyle \mathcal{C}(f) = \sum_{i=1}^{n} c\left(f(x_i), z_i\right).$

When the ANN $f(\cdot; \theta): \mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$ is trained to fit the dataset (i.e. minimize $\mathcal{C}$) via continuous-time gradient descent, the parameters $(\theta(t))_{t \geq 0}$ evolve through the ordinary differential equation:

$\displaystyle \partial_t \theta(t) = -\nabla \mathcal{C}\left(f(\cdot; \theta(t))\right).$

During training the ANN output function follows an evolution differential equation given in terms of the NTK:

$\displaystyle \partial_t f(x; \theta(t)) = -\sum_{i=1}^{n} \Theta(x, x_i; \theta(t))\, \partial_w c(w, z_i)\Big|_{w = f(x_i; \theta(t))}.$

This equation shows how the NTK drives the dynamics of $f(\cdot; \theta(t))$ in the space of functions $\mathbb{R}^{n_{\mathrm{in}}} \to \mathbb{R}$ during training.
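
As a concrete check of the definitions quoted above, here is a minimal NumPy sketch that builds the empirical NTK of a tiny network from its parameter gradients and compares one small gradient step against the evolution equation. The two-layer tanh architecture, the 1/sqrt(width) output scaling, the squared loss c(w, z) = (w - z)^2 / 2, and the helper names (`init_params`, `param_grads`, `ntk`) are my own illustrative assumptions, not something taken from the paper or the Wikipedia entry.

```python
import numpy as np


def init_params(n_in, width, rng):
    """Random two-layer network f(x) = a . tanh(W x) / sqrt(width)."""
    W = rng.standard_normal((width, n_in))
    a = rng.standard_normal(width)
    return W, a


def f(X, W, a):
    """Scalar network output for each row of X."""
    return np.tanh(X @ W.T) @ a / np.sqrt(W.shape[0])


def param_grads(X, W, a):
    """Feature map x -> (df/dtheta_p)_p, one row per input, flattened as [a, W]."""
    m = W.shape[0]
    h = np.tanh(X @ W.T)                                  # (n, m)
    d_a = h / np.sqrt(m)                                  # df/da_j
    d_W = (a * (1.0 - h**2))[:, :, None] * X[:, None, :] / np.sqrt(m)  # df/dW_jk
    return np.concatenate([d_a, d_W.reshape(len(X), -1)], axis=1)


def ntk(X, Y, W, a):
    """Empirical NTK: sum over parameters of gradient products."""
    return param_grads(X, W, a) @ param_grads(Y, W, a).T


rng = np.random.default_rng(0)
n_in, width, n = 3, 64, 5
W, a = init_params(n_in, width, rng)
X = rng.standard_normal((n, n_in))        # training inputs x_i
z = rng.standard_normal(n)                # training labels z_i
x_test = rng.standard_normal((2, n_in))   # points where we watch f evolve

# For the squared loss, d c / d w evaluated at w = f(x_i) is the residual.
residual = f(X, W, a) - z
grad_C = residual @ param_grads(X, W, a)  # gradient of the empirical loss

# One small explicit gradient step stands in for continuous-time descent.
eta = 1e-6
a_new = a - eta * grad_C[:width]
W_new = W - eta * grad_C[width:].reshape(width, n_in)

observed = (f(x_test, W_new, a_new) - f(x_test, W, a)) / eta
predicted = -ntk(x_test, X, W, a) @ residual
print(observed)
print(predicted)
```

The two printed vectors should agree up to terms of order `eta`, which is the content of the evolution equation: the change in the output at any point is driven by the kernel between that point and the training inputs.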

This is a very brief (3 minute) summary by the first author:

This 15 minute IAS talk gives a nice overview of the results, and their relation to fundamental questions (both empirical and theoretical) in deep learning. Longer (30m) version: On the Connection between Neural Networks and Kernels: a Modern Perspective.

I hope to find time to explore this in more depth. Large width seems to provide a limiting case (analogous to the large-N limit in gauge theory) in which rigorous results about deep learning can be proved.

Some naive questions:

What is the expansion parameter of the finite width expansion?

What role does concentration of measure play in the results? (See 30m video linked above.)

Simplification seems to be a consequence of overparametrization. But the proof method seems to apply to a regularized (but still convex, e.g., using L1 penalization) loss function that imposes sparsity. It would be interesting to examine this specific case in more detail.

Notes to self:

The overparametrized (width ~ w^2) network starts in a random state and by concentration of measure this initial kernel K is just the expectation, which is the NTK. Because of the large number of parameters the effect of training (i.e., gradient descent) on any individual parameter is 1/w, and the change in the eigenvalue spectrum of K is also 1/w. It can be shown that the eigenvalue spectrum is positive and bounded away from zero, and this property does not change under training. Also, the evolution of f is linear in K up to corrections which are suppressed by 1/w. Hence evolution follows a convex trajectory and can achieve global minimum loss in a finite (polynomial) time.
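
The claim that the initial kernel concentrates on its expectation is easy to probe numerically. The sketch below is only an illustration under my own assumptions (a two-layer tanh network with 1/sqrt(width) output scaling, standard normal initialization, and a hypothetical helper `ntk_entry`); it looks at the spread of a single kernel entry across random initializations and says nothing about how the kernel changes during training.

```python
import numpy as np


def ntk_entry(x, y, width, rng):
    """Empirical NTK Theta(x, y) at random init for f = a . tanh(W x) / sqrt(width)."""
    W = rng.standard_normal((width, x.size))
    a = rng.standard_normal(width)
    hx, hy = np.tanh(W @ x), np.tanh(W @ y)
    from_a = hx * hy                                        # output-weight gradients
    from_W = a**2 * (1 - hx**2) * (1 - hy**2) * (x @ y)     # hidden-weight gradients
    # The 1/sqrt(width) scaling turns the sum over hidden units into a mean.
    return np.mean(from_a + from_W)


x = np.array([1.0, 0.5, -0.3])
y = np.array([0.2, -1.0, 0.7])

spread = {}
for width in (16, 256, 4096):
    rng = np.random.default_rng(0)
    samples = np.array([ntk_entry(x, y, width, rng) for _ in range(200)])
    spread[width] = samples.std()
    print(width, samples.mean(), spread[width])
```

Because the entry is an average over independent hidden units, I would expect the spread to shrink roughly like 1/sqrt(width), while the mean stays near a width-independent value.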

The parametric 1/w expansion may depend on quantities such as the smallest NTK eigenvalue k: the proof might require k >> 1/w or wk large.

In the large w limit the function space has such high dimensionality that any typical initial f is close (within a ball of radius 1/w?) to an optimal f.

These properties depend on specific choice of loss function.
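
For the squared loss in particular, a constant kernel makes the training dynamics on the data points linear: the residual r(t) = f(X) - z obeys dr/dt = -K r, so each eigen-direction of K decays independently at a rate set by its eigenvalue. The sketch below illustrates this with a random positive-definite matrix standing in for the NTK Gram matrix; that stand-in, the helper name `residual_at_time`, and the choice of loss are assumptions made for illustration only.

```python
import numpy as np


def residual_at_time(K, r0, t):
    """Closed-form solution of dr/dt = -K r for a constant symmetric kernel K."""
    lam, V = np.linalg.eigh(K)
    return V @ (np.exp(-lam * t) * (V.T @ r0))


rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
K = A @ A.T / 6 + 0.1 * np.eye(6)   # hypothetical stand-in for the NTK Gram matrix
r0 = rng.standard_normal(6)         # initial residual f_0(X) - z

# Explicit Euler integration of the same ODE as an independent check.
dt, steps = 1e-4, 20000
r_euler = r0.copy()
for _ in range(steps):
    r_euler = r_euler - dt * (K @ r_euler)

t = dt * steps
r_closed = residual_at_time(K, r0, t)

lam, V = np.linalg.eigh(K)          # eigenvalues in ascending order
decay = np.exp(-lam * t)            # surviving fraction of each eigen-mode at time t
print(np.linalg.norm(r0), np.linalg.norm(r_closed))
print(decay)
```

The smallest eigenvalue controls how long the slowest mode survives, which is why the bound on the bottom of the spectrum matters in the argument above.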

[ Strangely, this post was flagged for Blogger review for violating their virus and malware policy (!?!) and so disappeared temporarily. After further review by their content team the post has been restored. Thanks to readers who pointed out that I could also have recovered it from the Internet Archive. ]

## Saturday, May 08, 2021

### Three Thousand Years and 115 Generations of 徐 (Hsu / Xu)

Over the years I have discussed economic historian Greg Clark's groundbreaking work on the persistence of social class. Clark found that intergenerational social mobility was much less than previously thought, and that intergenerational correlations on traits such as education and occupation were consistent with predictions from an additive genetic model with a high degree of assortative mating.

See Genetic correlation of social outcomes between relatives (Fisher 1918) tested using lineage of 400k English individuals, and further links therein. Also recommended: this recent podcast interview Clark did with Razib Khan.

The other day a reader familiar with Clark's work asked me about my family background. Obviously my own family history is not a scientific validation of Clark's work, being only a single (if potentially illustrative) example. Nevertheless it provides an interesting microcosm of the tumult of 20th century China and a window into the deep past...

I described my father's background in the post Hsu Scholarship at Caltech:
Cheng Ting Hsu was born December 1, 1923 in Wenling, Zhejiang province, China. His grandfather, Zan Yao Hsu was a poet and doctor of Chinese medicine. His father, Guang Qiu Hsu graduated from college in the 1920's and was an educator, lawyer and poet.
Cheng Ting was admitted at age 16 to the elite National Southwest Unified University (Lianda), which was created during WWII by merging Tsinghua, Beijing, and Nankai Universities. This university produced numerous famous scientists and scholars such as the physicists C.N. Yang and T.D. Lee.
Cheng Ting studied aerospace engineering (originally part of Tsinghua), graduating in 1944. He became a research assistant at China's Aerospace Research Institute and a lecturer at Sichuan University. He also taught aerodynamics for several years to advanced students at the air force engineering academy.
In 1946 he was awarded one of only two Ministry of Education fellowships in his field to pursue graduate work in the United States. In 1946-1947 he published a three-volume book, co-authored with Professor Li Shoutong, on the structures of thin-walled airplanes.
In January 1948, he left China by ocean liner, crossing the Pacific and arriving in San Francisco. ...
My mother's father was a KMT general, and her family was related to Chiang Kai Shek by marriage. Both my grandfather and Chiang attended the military academy Shinbu Gakko in Tokyo. When the KMT lost to the communists, her family fled China and arrived in Taiwan in 1949. My mother's family had been converted to Christianity in the 19th century and became Methodists, like Sun Yat Sen. (I attended Methodist Sunday school while growing up in Ames IA.) My grandfather was a partner of T.V. Soong in the distribution of Bibles in China in the early 20th century.

My father's family remained mostly in Zhejiang and suffered through the communist takeover, Great Leap Forward, and Cultural Revolution. My father never returned to China and never saw his parents again.

When I met my uncle (a retired Tsinghua professor) and some of my cousins in Hangzhou in 2010, they gave me a four volume family history that had originally been printed in the 1930s. The Hsu (Xu) lineage began in the 10th century BC and continued to my father, in the 113th generation. His entry is the bottom photo below.
Wikipedia: The State of Xu (Chinese: 徐) (also called Xu Rong (徐戎) or Xu Yi (徐夷) by its enemies) was an independent Huaiyi state of the Chinese Bronze Age that was ruled by the Ying family (嬴) and controlled much of the Huai River valley for at least two centuries. It was centered in northern Jiangsu and Anhui. ...

Generations 114 and 115:

Four volume history of the Hsu (Xu) family, beginning in the 10th century BC. The first 67 generations are covered rather briefly, only indicating prominent individuals in each generation of the family tree. The books are mostly devoted to generations 68-113 living in Zhejiang. (Earlier I wrote that it was two volumes, but it's actually four. The printing that I have is two thick books.)

## Sunday, May 02, 2021

### 40 Years of Quantum Computation and Quantum Information

This is a great article on the 1981 conference which one could say gave birth to quantum computing / quantum information.
Technology Review: Quantum computing as we know it got its start 40 years ago this spring at the first Physics of Computation Conference, organized at MIT’s Endicott House by MIT and IBM and attended by nearly 50 researchers from computing and physics—two groups that rarely rubbed shoulders.
Twenty years earlier, in 1961, an IBM researcher named Rolf Landauer had found a fundamental link between the two fields: he proved that every time a computer erases a bit of information, a tiny bit of heat is produced, corresponding to the entropy increase in the system. In 1972 Landauer hired the theoretical computer scientist Charlie Bennett, who showed that the increase in entropy can be avoided by a computer that performs its computations in a reversible manner. Curiously, Ed Fredkin, the MIT professor who cosponsored the Endicott Conference with Landauer, had arrived at this same conclusion independently, despite never having earned even an undergraduate degree. Indeed, most retellings of quantum computing’s origin story overlook Fredkin’s pivotal role.
Fredkin’s unusual career began when he enrolled at the California Institute of Technology in 1951. Although brilliant on his entrance exams, he wasn’t interested in homework—and had to work two jobs to pay tuition. Doing poorly in school and running out of money, he withdrew in 1952 and enlisted in the Air Force to avoid being drafted for the Korean War.
A few years later, the Air Force sent Fredkin to MIT Lincoln Laboratory to help test the nascent SAGE air defense system. He learned computer programming and soon became one of the best programmers in the world—a group that probably numbered only around 500 at the time.
Upon leaving the Air Force in 1958, Fredkin worked at Bolt, Beranek, and Newman (BBN), which he convinced to purchase its first two computers and where he got to know MIT professors Marvin Minsky and John McCarthy, who together had pretty much established the field of artificial intelligence. In 1962 he accompanied them to Caltech, where McCarthy was giving a talk. There Minsky and Fredkin met with Richard Feynman ’39, who would win the 1965 Nobel Prize in physics for his work on quantum electrodynamics. Feynman showed them a handwritten notebook filled with computations and challenged them to develop software that could perform symbolic mathematical computations. ...
... in 1974 he headed back to Caltech to spend a year with Feynman. The deal was that Fredkin would teach Feynman computing, and Feynman would teach Fredkin quantum physics. Fredkin came to understand quantum physics, but he didn’t believe it. He thought the fabric of reality couldn’t be based on something that could be described by a continuous measurement. Quantum mechanics holds that quantities like charge and mass are quantized—made up of discrete, countable units that cannot be subdivided—but that things like space, time, and wave equations are fundamentally continuous. Fredkin, in contrast, believed (and still believes) with almost religious conviction that space and time must be quantized as well, and that the fundamental building block of reality is thus computation. Reality must be a computer! In 1978 Fredkin taught a graduate course at MIT called Digital Physics, which explored ways of reworking modern physics along such digital principles.
Feynman, however, remained unconvinced that there were meaningful connections between computing and physics beyond using computers to compute algorithms. So when Fredkin asked his friend to deliver the keynote address at the 1981 conference, he initially refused. When promised that he could speak about whatever he wanted, though, Feynman changed his mind—and laid out his ideas for how to link the two fields in a detailed talk that proposed a way to perform computations using quantum effects themselves.
Feynman explained that computers are poorly equipped to help simulate, and thereby predict, the outcome of experiments in particle physics—something that’s still true today. Modern computers, after all, are deterministic: give them the same problem, and they come up with the same solution. Physics, on the other hand, is probabilistic. So as the number of particles in a simulation increases, it takes exponentially longer to perform the necessary computations on possible outputs. The way to move forward, Feynman asserted, was to build a computer that performed its probabilistic computations using quantum mechanics.
[ Note to reader: the discussion in the last sentences above is a bit garbled. The exponential difficulty that classical computers have with quantum calculations has to do with entangled states which live in Hilbert spaces of exponentially large dimension. Probability is not really the issue; the issue is the huge size of the space of possible states. Indeed quantum computations are strictly deterministic unitary operations acting in this Hilbert space. ]
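A minimal sketch of how quickly that state space grows, assuming the state of n two-level systems (qubits) is stored classically as one complex amplitude per basis state at 16 bytes each. The function names and the 16-byte figure are illustrative assumptions, not anything taken from the article or from Feynman's talk.

```python
# Hypothetical illustration: memory needed to store a general n-qubit state.
BYTES_PER_AMPLITUDE = 16  # assumption: double-precision complex (two 64-bit floats)


def hilbert_dimension(n_qubits: int) -> int:
    """Dimension of the Hilbert space of n two-level systems: 2**n."""
    if n_qubits < 0:
        raise ValueError("n_qubits must be non-negative")
    return 2 ** n_qubits


def state_vector_bytes(n_qubits: int) -> int:
    """Bytes needed to hold one complex amplitude per basis state."""
    return hilbert_dimension(n_qubits) * BYTES_PER_AMPLITUDE


if __name__ == "__main__":
    for n in (10, 30, 50):
        gib = state_vector_bytes(n) / 2**30
        print(f"{n} qubits: dimension {hilbert_dimension(n)}, {gib:.3g} GiB")
```

Under this assumption thirty qubits already require 16 GiB, and each additional qubit doubles the requirement. The evolution itself remains a deterministic linear map on the vector; it is the vector's length that becomes unmanageable.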

Feynman hadn’t prepared a formal paper for the conference, but with the help of Norm Margolus, PhD ’87, a graduate student in Fredkin’s group who recorded and transcribed what he said there, his talk was published in the International Journal of Theoretical Physics under the title “Simulating Physics with Computers.” ...

Feynman's 1981 lecture Simulating Physics With Computers.

Fredkin was correct about the (effective) discreteness of spacetime, although he probably did not realize this is a consequence of gravitational effects: see, e.g., Minimum Length From First Principles. In fact, Hilbert Space (the state space of quantum mechanics) itself may be discrete.

Related:

My paper on the Margolus-Levitin Theorem in light of gravity:

We derive a fundamental upper bound on the rate at which a device can process information (i.e., the number of logical operations per unit time), arising from quantum mechanics and general relativity. In Planck units a device of volume V can execute no more than the cube root of V operations per unit time. We compare this to the rate of information processing performed by nature in the evolution of physical systems, and find a connection to black hole entropy and the holographic principle.
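A heuristic way to see where a bound of this form can come from (a sketch that treats order-one factors loosely, and not necessarily the derivation used in the paper): the Margolus-Levitin theorem limits a system with average energy E above its ground state to at most 2E/(πħ) transitions between orthogonal states per unit time, and a device of linear size R must keep E below roughly Rc⁴/(2G) to avoid collapsing into a black hole. Here ν denotes the operation rate; both symbols ν and R are introduced only for this illustration.

```latex
\nu \le \frac{2E}{\pi\hbar},
\qquad
E \lesssim \frac{R\,c^{4}}{2G}
\quad\Longrightarrow\quad
\nu \lesssim \frac{R\,c^{4}}{\pi G \hbar}
\;\sim\; R \;\sim\; V^{1/3}
\qquad (\hbar = c = G = 1)
```

Packing in more energy to compute faster eventually makes the device a black hole, so the maximum rate scales with the linear size of the device rather than with its volume.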

Participants in the 1981 meeting:

Physics of Computation Conference, Endicott House, MIT, May 6–8, 1981. 1 Freeman Dyson, 2 Gregory Chaitin, 3 James Crutchfield, 4 Norman Packard, 5 Panos Ligomenides, 6 Jerome Rothstein, 7 Carl Hewitt, 8 Norman Hardy, 9 Edward Fredkin, 10 Tom Toffoli, 11 Rolf Landauer, 12 John Wheeler, 13 Frederick Kantor, 14 David Leinweber, 15 Konrad Zuse, 16 Bernard Zeigler, 17 Carl Adam Petri, 18 Anatol Holt, 19 Roland Vollmar, 20 Hans Bremerman, 21 Donald Greenspan, 22 Markus Buettiker, 23 Otto Floberth, 24 Robert Lewis, 25 Robert Suaya, 26 Stand Kugell, 27 Bill Gosper, 28 Lutz Priese, 29 Madhu Gupta, 30 Paul Benioff, 31 Hans Moravec, 32 Ian Richards, 33 Marian Pour-El, 34 Danny Hillis, 35 Arthur Burks, 36 John Cocke, 37 George Michaels, 38 Richard Feynman, 39 Laurie Lingham, 40 P. S. Thiagarajan, 41 Marin Hassner, 42 Gerald Vichnaic, 43 Leonid Levin, 44 Lev Levitin, 45 Peter Gacs, 46 Dan Greenberger. (Photo courtesy Charles Bennett)

## Wednesday, April 28, 2021

### Let The Bodies Pile High In Their Thousands (Boris Johnson)

In the UK:
Recording a conversation in secret is not a criminal offence and is not prohibited. As long as the recording is for personal use you don’t need to obtain consent or let the other person know.
The security man in the foyer of No 10 Downing Street asks that you turn off your phone and deposit it in a wooden cubby shelf built into the wall. I sometimes wondered what the odds were that someone might walk out with my phone -- a disaster, obviously.

But it is not difficult to keep your phone, as close attention is not paid. (Or, one could enter with more than one phone.) I'm not saying I have ever disobeyed the rules, but I know that it is possible.

Of course the No 10 staffers all have their phones, which are necessary for their work throughout the day. Thus every meeting at the heart of British government is in danger of being surreptitiously but legally recorded.
Dominic Cummings 'has audio recordings of key government conversations', ally claims (Daily Mail)
Dominic Cummings 'has audio recordings of key government conversations' and 'can back up a lot of his claims', ally of the former chief adviser says.
Dominic Cummings kept audio recordings of key conversations, an ally claims. Former chief adviser is locked in an explosive war of words with Boris Johnson.
Whitehall source said officials did not know extent of material Mr Cummings has.
Dominic Cummings kept audio recordings of key conversations in government, an ally claimed last night. The former chief adviser is locked in an explosive war of words with Boris Johnson after Downing Street accused him of a string of damaging leaks.
No 10 attempted to rubbish his claims on Friday night, saying it was not true that the Prime Minister had discussed ending a leak inquiry after a friend of his fiancée Carrie Symonds was identified as the likely suspect. But an ally of Mr Cummings said the PM's former chief adviser had taken a treasure trove of material with him when he left Downing Street last year, including audio recordings of discussions with senior ministers and officials.
'Dom has stuff on tape,' the ally said. 'They are mad to pick a fight with him because he will be able to back up a lot of his claims.'
Dom is an admirer of Bismarck. Never underestimate him.
"With a gentleman I am always a gentleman and a half, and when I have to do with a pirate, I try to be a pirate and a half."
Tories scramble to defend Johnson: Politics Weekly podcast (Guardian)

Note the media have no idea what is really going on, as usual.

## Friday, April 23, 2021

### How a Physicist Became a Climate Truth Teller: Steve Koonin

I read an early draft of Koonin's new book discussed in the WSJ article excerpted below, and I highly recommend it.

Video above is from a 2019 talk discussed in this earlier post: Certainties and Uncertainties in our Energy and Climate Futures: Steve Koonin.
My own views (consistent, as far as I can tell, with what Steve says in the talk):
1. Evidence for recent warming (~1 degree C) is strong.
2. There exist previous eras of natural (non-anthropogenic) global temperature change of similar magnitude to what is happening now.
3. However, it is plausible that at least part of the recent temperature rise is due to increase of atmospheric CO2 due to human activity.
4. Climate models still have significant uncertainties. While the direct effect of CO2 IR absorption is well understood, second order effects like clouds, distribution of water vapor in the atmosphere, etc. are not under good control. The increase in temperature from a doubling of atmospheric CO2 is still uncertain to a factor of 2-3 and at the low range (e.g., 1.5 degree C) is not catastrophic. The direct effect of CO2 absorption is modest and at the low range (~1 degree C) of current consensus model predictions. Potentially catastrophic outcomes are due to second order effects that are not under good theoretical or computational control.
5. Even if a catastrophic outcome is only a low probability tail risk, it is prudent to explore technologies that reduce greenhouse gas production.
6. A Red Team exercise, properly done, would clarify what is certain and uncertain in climate science.
Simply stating these views can get you attacked by crazy people.
Buy Steve's book for an accessible and fairly non-technical explanation of these points.
WSJ: ... Barack Obama is one of many who have declared an “epistemological crisis,” in which our society is losing its handle on something called truth.
Thus an interesting experiment will be his and other Democrats’ response to a book by Steven Koonin, who was chief scientist of the Obama Energy Department. Mr. Koonin argues not against current climate science but that what the media and politicians and activists say about climate science has drifted so far out of touch with the actual science as to be absurdly, demonstrably false.
This is not an altogether innocent drifting, he points out in a videoconference interview from his home in Cold Spring, N.Y. In 2019 a report by the presidents of the National Academies of Sciences claimed the “magnitude and frequency of certain extreme events are increasing.” The United Nations Intergovernmental Panel on Climate Change, which is deemed to compile the best science, says all such claims should be treated with “low confidence.”
... Mr. Koonin, 69, and I are of one mind on 2018’s U.S. Fourth National Climate Assessment, issued in Donald Trump’s second year, which relied on such overegged worst-case emissions and temperature projections that even climate activists were abashed (a revolt continues to this day). “The report was written more to persuade than to inform,” he says. “It masquerades as objective science but was written as—all right, I’ll use the word—propaganda.”
Mr. Koonin is a Brooklyn-born math whiz and theoretical physicist, a product of New York’s selective Stuyvesant High School. His parents, with less than a year of college between them, nevertheless intuited in 1968 exactly how to handle an unusually talented and motivated youngster: You want to go cross the country to Caltech at age 16? “Whatever you think is right, go ahead,” they told him. “I wanted to know how the world works,” Mr. Koonin says now. “I wanted to do physics since I was 6 years old, when I didn’t know it was called physics.”
He would teach at Caltech for nearly three decades, serving as provost in charge of setting the scientific agenda for one of the country’s premier scientific institutions. Along the way he opened himself to the world beyond the lab. He was recruited at an early age by the Institute for Defense Analyses, a nonprofit group with Pentagon connections, for what he calls “national security summer camp: meeting generals and people in congress, touring installations, getting out on battleships.” The federal government sought “engagement” with the country’s rising scientist elite. It worked.
He joined and eventually chaired JASON, an elite private group that provides classified and unclassified advisory analysis to federal agencies. (The name isn’t an acronym and comes from a character in Greek mythology.) He got involved in the cold-fusion controversy. He arbitrated a debate between private and government teams competing to map the human genome on whether the target error rate should be 1 in 10,000 or whether 1 in 100 was good enough.
He began planting seeds as an institutionalist. He joined the oil giant BP as chief scientist, working for John Browne, now Baron Browne of Madingley, who had redubbed the company “Beyond Petroleum.” Using $500 million of BP’s money, Mr. Koonin created the Energy Biosciences Institute at Berkeley that’s still going strong.
Mr. Koonin found his interest in climate science growing, “first of all because it’s wonderful science. It’s the most multidisciplinary thing I know. It goes from the isotopic composition of microfossils in the sea floor all the way through to the regulation of power plants.” From deeply examining the world’s energy system, he also became convinced that the real climate crisis was a crisis of political and scientific candor. He went to his boss and said, “John, the world isn’t going to be able to reduce emissions enough to make much difference.”
Mr. Koonin still has a lot of Brooklyn in him: a robust laugh, a gift for expression and for cutting to the heart of any matter. His thoughts seem to be governed by an all-embracing realism. Hence the book coming out next month, Unsettled: What Climate Science Tells Us, What It Doesn’t, and Why It Matters.
Any reader would benefit from its deft, lucid tour of climate science, the best I’ve seen. His rigorous parsing of the evidence will have you questioning the political class’s compulsion to manufacture certainty where certainty doesn’t exist. You will come to doubt the usefulness of centurylong forecasts claiming to know how 1% shifts in variables will affect a global climate that we don’t understand with anything resembling 1% precision. ...

Note Added from comments: If you're older like Koonin or myself you can remember a time when climate change was entirely devoid of tribal associations -- it was not in the political domain at all. It is easier for us just to concentrate on where the science is, and indeed we can remember where it was in the 1990s or 2000s.

Koonin was MUCH more concerned about alternative energy and climate than the typical scientist and that was part of his motivation for supporting the Berkeley Energy Biosciences Institute, created 2007. The fact that it was a $500M partnership between Berkeley and BP was a big deal and much debated at the time, but there was never any evidence that the science they did was negatively impacted.

It is IRONIC that his focus on scientific rigor now gets him labeled as a climate denier (or sympathetic to the "wrong" side). ALL scientists should be sceptical, especially about claims regarding long term prediction in complex systems.

Contrast the uncertainty estimates in the IPCC reports (which are not defensible and did not change for ~20y!) vs the (g-2) anomaly that was in the news recently.

When I was at Harvard the physics department and applied science and engineering school shared a coffee lounge. I used to sit there and work in the afternoon and it happened that one of the climate modeling labs had their group meetings there. So for literally years I overheard their discussions about uncertainties concerning water vapor, clouds, etc. which to this day are not fully under control. This is illustrated in Fig. 1 at the link: https://infoproc.blogspot.c...

The gap between what real scientists say in private and what the public (or non-specialists) gets second hand through the media or politically-focused "scientific policy reports" is vast...

If you don't think we can have long-lasting public delusions regarding "settled science" (like a decade long stock or real estate bubble), look up nuclear winter, which has a lot of similarities to greenhouse gas-driven climate change. Note, I am not claiming that I know with high confidence that nuclear winter can't happen, but I AM claiming that the confidence level expressed by the climate scientists working on it at the time was absurd and communicated in a grotesquely distorted fashion to political leaders and the general public. Even now I would say the scientific issue is not settled, due to its sheer complexity, which is LESS than the complexity involved in predicting long term climate change!

## Sunday, April 18, 2021

### Francois Chollet - Intelligence and Generalization, Psychometrics for Robots (AI/ML)

If you have thought a lot about AI and deep learning you may find much of this familiar. Nevertheless I enjoyed the discussion. Apparently Chollet's views (below) are controversial in some AI/ML communities but I do not understand why.

Chollet's Abstraction and Reasoning Corpus (ARC) = Raven's Matrices for AIs :-)
Show Notes:
...Francois has a clarity of thought that I've never seen in any other human being! He has extremely interesting views on intelligence as generalisation, abstraction and an information conversion ratio. He wrote on the measure of intelligence at the end of 2019 and it had a huge impact on my thinking. He thinks that NNs can only model continuous problems, which have a smooth learnable manifold and that many "type 2" problems which involve reasoning and/or planning are not suitable for NNs. He thinks that many problems have type 1 and type 2 enmeshed together. He thinks that the future of AI must include program synthesis to allow us to generalise broadly from a few examples, but the search could be guided by neural networks because the search space is interpolative to some extent.
Tim Intro [00:00:00]
Manifold hypothesis and interpolation [00:06:15]
Yann LeCun skit [00:07:58]
Discrete vs continuous [00:11:12]
NNs are not Turing machines [00:14:18]
Main show kick-off [00:16:19]
DNN models are locally sensitive hash tables and only efficiently encode some kinds of data well [00:18:17]
Why do natural data have manifolds? [00:22:11]
Finite NNs are not "Turing complete" [00:25:44]
The dichotomy of continuous vs discrete problems, and abusing DL to perform the former [00:27:07]
Reality really annoys a lot of people, and ...GPT-3 [00:35:55]
There are type one problems and type 2 problems, but...they are enmeshed [00:39:14]
Chollet's definition of intelligence and how to construct analogy [00:41:45]
How are we going to combine type 1 and type 2 programs? [00:47:28]
Will topological analogies be robust and escape the curse of brittleness? [00:52:04]
Is type 1 and 2 two different physical systems? Is there a continuum? [00:54:26]
Building blocks and the ARC Challenge [00:59:05]
Solve ARC == intelligent? [01:01:31]
Measure of intelligence formalism -- it's a whitebox method [01:03:50]
Generalization difficulty [01:10:04]
Let's create a marketplace of generated intelligent ARC agents! [01:11:54]
Mapping ARC to psychometrics [01:16:01]
Keras [01:16:45]
New backends for Keras? JAX? [01:20:38]
Intelligence Explosion [01:25:07]
Bottlenecks in large organizations [01:34:29]
Summing up the intelligence explosion [01:36:11]
Post-show debrief [01:40:45]
This is Chollet's paper which is the focus of much of the discussion.
On the Measure of Intelligence
François Chollet
https://arxiv.org/abs/1911.01547
To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to "buy" arbitrary levels of skills for a system, in a way that masks the system's own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
Notes on the paper by Robert Lange (TU-Berlin), including illustrations like the ones below.