Wednesday, May 26, 2021
How Dominic Cummings And The Warner Brothers Saved The UK
Saturday, May 22, 2021
Feynman Lectures on the Strong Interactions (Jim Cline notes)
Feynman Lectures on the Strong Interactions
Richard P. Feynman, James M. Cline
These twenty-two lectures, with exercises, comprise the extent of what was meant to be a full-year graduate-level course on the strong interactions and QCD, given at Caltech in 1987-88. The course was cut short by the illness that led to Feynman's death. Several of the lectures were finalized in collaboration with Feynman for an anticipated monograph based on the course. The others, while retaining Feynman's idiosyncrasies, are revised similarly to those he was able to check. His distinctive approach and manner of presentation are manifest throughout. Near the end he suggests a novel, nonperturbative formulation of quantum field theory in D dimensions. Supplementary material is provided in appendices and ancillary files, including verbatim transcriptions of three lectures and the corresponding audiotaped recordings.
Sunday, May 16, 2021
Ditchley Foundation meeting: China Today and Tomorrow
China Today and Tomorrow
20 MAY 2021 - 21 MAY 2021
This Ditchley conference will focus on China, its internal state and sense of self today, its role in the region and world, and how these might evolve in years to come.
There are broadly two current divergent narratives about China. The first is that China’s successful response to the pandemic has accelerated its ascent to become the world’s pre-eminent economic power. The Made in China 2025 strategy will also see China take the lead in some technologies beyond 5G, become self-sufficient in silicon chip production and free itself largely of external constraints on growth. China’s internal market will grow, lessening dependence on exports, and that continued growth will maintain the bargain between the Chinese people and the Chinese Communist Party through prosperity and stability. Retaining some elements of previous Chinese strategy, though, this confidence is combined with a degree of humility: China is concerned with itself and its region, not with becoming a global superpower or challenging the US. Economic supremacy is the aim, but military strategy remains focused on defence, not on increasing international leverage or scope of action.
The second, competing narrative is that China’s position is more precarious than it appears. The Belt and Road Initiative will bring diplomatic support from client countries but not real economic gains. Human rights violations will damage China abroad. Internally, the pressures on natural resources will prove hard to sustain. Democratic and free-market innovation, combined with a bit more industrial strategy, will outstrip China’s efforts. Careful attention to supply chains in the West will meanwhile reduce critical reliance on China and curb China’s economic expansion. This perceived fragility is often combined, though, with a sense of heightened Chinese ambition abroad, not just through the Belt and Road Initiative but in challenging the democratic global norms established since 1989 by presenting technologically-enabled and effective authoritarian rule as an alternative model for the world, rather than just a Chinese solution.
What is the evidence today for where we should settle between these narratives? What trends should we watch to determine likely future results? ...
[Suggested background reading at link above.]

Unfortunately this meeting will be virtual. The video below gives some sense of the unique charm of in-person workshops at Ditchley.
... analysis by German academic Gunnar Heinsohn. Two of his slides appear below.
1. It is possible that by 2050 the highly able STEM workforce in PRC will be ~10x larger than in the US and comparable to or larger than the rest of the world combined. Here "highly able" means roughly top few percentile math ability in developed countries (e.g., EU), as measured by PISA at age 15.
[ It is trivial to obtain this kind of estimate: PRC population is ~4x US population and the fraction of university students in STEM is at least ~2x higher. The pool of highly able 15 year olds, as estimated by the PISA or TIMSS international testing regimes, is much larger than in the US, even per capita. Heinsohn's estimate is somewhat high because he uses PISA numbers that probably overstate the population fraction of Level 6 kids in PRC: current PISA studies disproportionately sample from more developed areas of China. At bottom (asterisk) he uses results from Taiwan/Macau that give a smaller ~20x advantage of PRC vs USA. My own ~10x estimate is quite conservative in comparison; a back-of-the-envelope version is sketched below. ]

2. The trajectory of international patent filings shown below is likely to continue.
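For concreteness, here is that back-of-the-envelope estimate as a few lines of Python. The cohort sizes and fractions are my own illustrative inputs (not Heinsohn's exact numbers), chosen to match the rough factors stated above.

```python
# Back-of-the-envelope sketch (illustrative inputs, not Heinsohn's data):
# highly able STEM pipeline ~ cohort size x STEM fraction x top-percentile fraction.
cohort = {"PRC": 16e6, "US": 4e6}        # assumed annual birth cohorts
stem_frac = {"PRC": 0.40, "US": 0.20}    # assumed fraction of graduates in STEM
top_frac = {"PRC": 0.04, "US": 0.03}     # assumed fraction at top-percentile math level

pipeline = {c: cohort[c] * stem_frac[c] * top_frac[c] for c in cohort}
for c, n in pipeline.items():
    print(c, f"{n:,.0f} highly able STEM entrants per year")
print(f"ratio ~ {pipeline['PRC'] / pipeline['US']:.1f}x")   # ~10x with these inputs
```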
Wednesday, May 12, 2021
Neural Tangent Kernels and Theoretical Foundations of Deep Learning
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot, Franck Gabriel, Clément Hongler
At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: during gradient descent on the parameters of an ANN, the network function fθ (which maps input vectors to output vectors) follows the kernel gradient of the functional cost (which is convex, in contrast to the parameter cost) w.r.t. a new kernel: the Neural Tangent Kernel (NTK). This kernel is central to describe the generalization features of ANNs. While the NTK is random at initialization and varies during training, in the infinite-width limit it converges to an explicit limiting kernel and it stays constant during training. This makes it possible to study the training of ANNs in function space instead of parameter space. Convergence of the training can then be related to the positive-definiteness of the limiting NTK. We prove the positive-definiteness of the limiting NTK when the data is supported on the sphere and the non-linearity is non-polynomial. We then focus on the setting of least-squares regression and show that in the infinite-width limit, the network function fθ follows a linear differential equation during training. The convergence is fastest along the largest kernel principal components of the input data with respect to the NTK, hence suggesting a theoretical motivation for early stopping. Finally we study the NTK numerically, observe its behavior for wide networks, and compare it to the infinite-width limit.

The results are remarkably well summarized in the wikipedia entry on Neural Tangent Kernels:
For most common neural network architectures, in the limit of large layer width the NTK becomes constant. This enables simple closed form statements to be made about neural network predictions, training dynamics, generalization, and loss surfaces. For example, it guarantees that wide enough ANNs converge to a global minimum when trained to minimize an empirical loss. ...
An Artificial Neural Network (ANN) with scalar output consists in a family of functions $f(\cdot;\theta): \mathbb{R}^{n_{\rm in}} \to \mathbb{R}$ parametrized by a vector of parameters $\theta \in \mathbb{R}^P$.

The Neural Tangent Kernel (NTK) is a kernel $\Theta: \mathbb{R}^{n_{\rm in}} \times \mathbb{R}^{n_{\rm in}} \to \mathbb{R}$ defined by

$$\Theta(x, y; \theta) = \sum_{p=1}^P \partial_{\theta_p} f(x;\theta)\, \partial_{\theta_p} f(y;\theta).$$

In the language of kernel methods, the NTK is the kernel associated with the feature map $x \mapsto \nabla_\theta f(x;\theta)$.

For a dataset $(x_i, z_i)_{i=1,\dots,n}$ with scalar labels $z_i \in \mathbb{R}$ and a loss function $c: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, the associated empirical loss, defined on functions $f$, is given by

$$\mathcal{C}(f) = \sum_{i=1}^n c\big(f(x_i), z_i\big).$$

When the ANN is trained to fit the dataset (i.e. minimize $\mathcal{C}$) via continuous-time gradient descent, the parameters evolve through the ordinary differential equation

$$\partial_t \theta(t) = -\nabla_\theta\, \mathcal{C}\big(f(\cdot;\theta(t))\big).$$

During training the ANN output function follows an evolution differential equation given in terms of the NTK:

$$\partial_t f(x;\theta(t)) = -\sum_{i=1}^n \Theta\big(x, x_i; \theta(t)\big)\, \partial_{\hat z}\, c\big(f(x_i;\theta(t)), z_i\big).$$

This equation shows how the NTK drives the dynamics of $f(\cdot;\theta(t))$ in the space of functions during training.
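These definitions are easy to check numerically. Below is a minimal sketch (my own toy setup, not code from the paper): it evaluates the empirical NTK of a one-hidden-layer tanh network directly from the definition $\Theta(x,y;\theta) = \nabla_\theta f(x) \cdot \nabla_\theta f(y)$, with the gradients computed by hand. The width m, the tanh nonlinearity, and the 1/√m scaling are illustrative assumptions.

```python
# Minimal sketch (toy setup, not from the paper): empirical NTK of a
# one-hidden-layer network f(x) = (1/sqrt(m)) * sum_j a_j * tanh(w_j . x),
# evaluated from the definition Theta(x,y) = <grad_theta f(x), grad_theta f(y)>.
import numpy as np

rng = np.random.default_rng(0)
n_in, m = 3, 10000                       # input dimension, hidden width
W = rng.normal(size=(m, n_in))           # hidden weights w_j
a = rng.normal(size=m)                   # output weights a_j

def grad_f(x):
    """Gradient of f(x; theta) with respect to all parameters (a, W)."""
    pre = W @ x                                               # preactivations w_j . x
    g_a = np.tanh(pre) / np.sqrt(m)                           # df/da_j
    g_W = (a * (1 - np.tanh(pre)**2))[:, None] * x / np.sqrt(m)  # df/dw_j
    return np.concatenate([g_a, g_W.ravel()])

def ntk(x, y):
    return grad_f(x) @ grad_f(y)

x, y = rng.normal(size=n_in), rng.normal(size=n_in)
print(ntk(x, y))
# Re-sampling W, a and re-evaluating gives nearly the same value when m is
# large, illustrating the convergence of the NTK in the infinite-width limit.
```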
Saturday, May 08, 2021
Three Thousand Years and 115 Generations of 徐 (Hsu / Xu)
Cheng Ting Hsu was born December 1, 1923 in Wenling, Zhejiang province, China. His grandfather, Zan Yao Hsu, was a poet and doctor of Chinese medicine. His father, Guang Qiu Hsu, graduated from college in the 1920s and was an educator, lawyer, and poet.
Cheng Ting was admitted at age 16 to the elite National Southwest Unified University (Lianda), which was created during WWII by merging Tsinghua, Beijing, and Nankai Universities. This university produced numerous famous scientists and scholars such as the physicists C.N. Yang and T.D. Lee.
Cheng Ting studied aerospace engineering (originally part of Tsinghua), graduating in 1944. He became a research assistant at China's Aerospace Research Institute and a lecturer at Sichuan University. He also taught aerodynamics for several years to advanced students at the air force engineering academy.
In 1946 he was awarded one of only two Ministry of Education fellowships in his field to pursue graduate work in the United States. In 1946-1947 he published a three-volume book, co-authored with Professor Li Shoutong, on the structures of thin-walled airplanes.
In January 1948, he left China by ocean liner, crossing the Pacific and arriving in San Francisco. ...

My mother's father was a KMT general, and her family was related to Chiang Kai Shek by marriage. Both my grandfather and Chiang attended the military academy Shinbu Gakko in Tokyo. When the KMT lost to the communists, her family fled China and arrived in Taiwan in 1949. My mother's family had been converted to Christianity in the 19th century and became Methodists, like Sun Yat Sen. (I attended Methodist Sunday school while growing up in Ames, IA.) My grandfather was a partner of T.V. Soong in the distribution of bibles in China in the early 20th century.
Wikipedia: The State of Xu (Chinese: 徐) (also called Xu Rong (徐戎) or Xu Yi (徐夷) by its enemies) was an independent Huaiyi state of the Chinese Bronze Age that was ruled by the Ying family (嬴) and controlled much of the Huai River valley for at least two centuries. It was centered in northern Jiangsu and Anhui. ...
Generations 114 and 115:
Four-volume history of the Hsu (Xu) family, beginning in the 10th century BC. The first 67 generations are covered rather briefly, indicating only the prominent individuals in each generation of the family tree. The books are mostly devoted to generations 68-113, who lived in Zhejiang. (Earlier I wrote that it was two volumes, but it's actually four; the printing I have is two thick books.)
Sunday, May 02, 2021
40 Years of Quantum Computation and Quantum Information
This is a great article on the 1981 conference which, one could say, gave birth to quantum computing / quantum information.
Technology Review: Quantum computing as we know it got its start 40 years ago this spring at the first Physics of Computation Conference, organized at MIT’s Endicott House by MIT and IBM and attended by nearly 50 researchers from computing and physics—two groups that rarely rubbed shoulders.
Twenty years earlier, in 1961, an IBM researcher named Rolf Landauer had found a fundamental link between the two fields: he proved that every time a computer erases a bit of information, a tiny bit of heat is produced, corresponding to the entropy increase in the system. In 1972 Landauer hired the theoretical computer scientist Charlie Bennett, who showed that the increase in entropy can be avoided by a computer that performs its computations in a reversible manner. Curiously, Ed Fredkin, the MIT professor who cosponsored the Endicott Conference with Landauer, had arrived at this same conclusion independently, despite never having earned even an undergraduate degree. Indeed, most retellings of quantum computing’s origin story overlook Fredkin’s pivotal role.
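[ For scale, Landauer's bound is easy to evaluate: erasing one bit at temperature T dissipates at least kT ln 2 of heat. A quick illustrative calculation of mine, assuming room temperature: ]

```python
# Illustrative arithmetic (not from the article): Landauer's bound,
# minimum heat dissipated per erased bit, E = k_B * T * ln 2.
import math

k_B = 1.380649e-23     # Boltzmann constant, J/K (exact in SI since 2019)
T = 300.0              # assumed room temperature, K
E_bit = k_B * T * math.log(2)
print(f"{E_bit:.3e} J per erased bit")   # ~2.9e-21 J
# Real logic gates today dissipate many orders of magnitude more than this
# per bit operation, which is why reversible computing remains of interest.
```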
Fredkin’s unusual career began when he enrolled at the California Institute of Technology in 1951. Although brilliant on his entrance exams, he wasn’t interested in homework—and had to work two jobs to pay tuition. Doing poorly in school and running out of money, he withdrew in 1952 and enlisted in the Air Force to avoid being drafted for the Korean War.
A few years later, the Air Force sent Fredkin to MIT Lincoln Laboratory to help test the nascent SAGE air defense system. He learned computer programming and soon became one of the best programmers in the world—a group that probably numbered only around 500 at the time.
Upon leaving the Air Force in 1958, Fredkin worked at Bolt, Beranek, and Newman (BBN), which he convinced to purchase its first two computers and where he got to know MIT professors Marvin Minsky and John McCarthy, who together had pretty much established the field of artificial intelligence. In 1962 he accompanied them to Caltech, where McCarthy was giving a talk. There Minsky and Fredkin met with Richard Feynman ’39, who would win the 1965 Nobel Prize in physics for his work on quantum electrodynamics. Feynman showed them a handwritten notebook filled with computations and challenged them to develop software that could perform symbolic mathematical computations. ...
... in 1974 he headed back to Caltech to spend a year with Feynman. The deal was that Fredkin would teach Feynman computing, and Feynman would teach Fredkin quantum physics. Fredkin came to understand quantum physics, but he didn’t believe it. He thought the fabric of reality couldn’t be based on something that could be described by a continuous measurement. Quantum mechanics holds that quantities like charge and mass are quantized—made up of discrete, countable units that cannot be subdivided—but that things like space, time, and wave equations are fundamentally continuous. Fredkin, in contrast, believed (and still believes) with almost religious conviction that space and time must be quantized as well, and that the fundamental building block of reality is thus computation. Reality must be a computer! In 1978 Fredkin taught a graduate course at MIT called Digital Physics, which explored ways of reworking modern physics along such digital principles.
Feynman, however, remained unconvinced that there were meaningful connections between computing and physics beyond using computers to compute algorithms. So when Fredkin asked his friend to deliver the keynote address at the 1981 conference, he initially refused. When promised that he could speak about whatever he wanted, though, Feynman changed his mind—and laid out his ideas for how to link the two fields in a detailed talk that proposed a way to perform computations using quantum effects themselves.
Feynman explained that computers are poorly equipped to help simulate, and thereby predict, the outcome of experiments in particle physics—something that’s still true today. Modern computers, after all, are deterministic: give them the same problem, and they come up with the same solution. Physics, on the other hand, is probabilistic. So as the number of particles in a simulation increases, it takes exponentially longer to perform the necessary computations on possible outputs. The way to move forward, Feynman asserted, was to build a computer that performed its probabilistic computations using quantum mechanics.
[ Note to reader: the discussion in the last sentences above is a bit garbled. The exponential difficulty that classical computers have with quantum calculations has to do with entangled states which live in Hilbert spaces of exponentially large dimension. Probability is not really the issue; the issue is the huge size of the space of possible states. Indeed, quantum computations are strictly deterministic unitary operations acting in this Hilbert space. ]
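[ To make the exponential concrete, here is a small illustrative calculation (mine, not from the article): the classical memory needed just to store an n-qubit state vector grows as 2^n. ]

```python
# Illustrative sketch (not from the article): memory to store an n-qubit
# state vector as complex128 amplitudes: 2**n amplitudes, 16 bytes each.
for n in (10, 20, 30, 40, 50):
    nbytes = 16 * 2**n
    print(f"{n:2d} qubits: {nbytes / 2**30:.3e} GiB")
# ~16 KiB at 10 qubits, ~16 TiB at 40, ~16 PiB at 50: the exponential growth
# of Hilbert-space dimension that motivates simulating physics quantum mechanically.
```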
Feynman's 1981 lecture Simulating Physics With Computers.

Feynman hadn’t prepared a formal paper for the conference, but with the help of Norm Margolus, PhD ’87, a graduate student in Fredkin’s group who recorded and transcribed what he said there, his talk was published in the International Journal of Theoretical Physics under the title “Simulating Physics with Computers.” ...
We derive a fundamental upper bound on the rate at which a device can process information (i.e., the number of logical operations per unit time), arising from quantum mechanics and general relativity. In Planck units a device of volume V can execute no more than the cube root of V operations per unit time. We compare this to the rate of information processing performed by nature in the evolution of physical systems, and find a connection to black hole entropy and the holographic principle.
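A quick illustrative evaluation of the bound (my own arithmetic and reading of the abstract, assuming a liter-sized device):

```python
# Illustrative arithmetic (my reading of the stated bound, not from the paper):
# in Planck units a device of volume V performs at most V**(1/3) operations
# per unit time, i.e. roughly (V**(1/3) / l_P) operations per Planck time.
l_P = 1.616255e-35     # Planck length, m
t_P = 5.391247e-44     # Planck time, s
V = 1e-3               # assumed device volume: one liter, in m^3

ops_per_planck_time = V**(1/3) / l_P     # linear size in Planck units
ops_per_second = ops_per_planck_time / t_P
print(f"{ops_per_second:.2e} ops/s")     # ~1e77 operations per second
# Astronomically far above any realistic device; the interesting feature is
# the V**(1/3) (linear size) scaling, which the abstract connects to black
# hole entropy and the holographic principle.
```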
Physics of Computation Conference, Endicott House, MIT, May 6–8, 1981. 1 Freeman Dyson, 2 Gregory Chaitin, 3 James Crutchfield, 4 Norman Packard, 5 Panos Ligomenides, 6 Jerome Rothstein, 7 Carl Hewitt, 8 Norman Hardy, 9 Edward Fredkin, 10 Tom Toffoli, 11 Rolf Landauer, 12 John Wheeler, 13 Frederick Kantor, 14 David Leinweber, 15 Konrad Zuse, 16 Bernard Zeigler, 17 Carl Adam Petri, 18 Anatol Holt, 19 Roland Vollmar, 20 Hans Bremermann, 21 Donald Greenspan, 22 Markus Buettiker, 23 Otto Floberth, 24 Robert Lewis, 25 Robert Suaya, 26 Stan Kugell, 27 Bill Gosper, 28 Lutz Priese, 29 Madhu Gupta, 30 Paul Benioff, 31 Hans Moravec, 32 Ian Richards, 33 Marian Pour-El, 34 Danny Hillis, 35 Arthur Burks, 36 John Cocke, 37 George Michaels, 38 Richard Feynman, 39 Laurie Lingham, 40 P. S. Thiagarajan, 41 Marin Hassner, 42 Gerald Vichniac, 43 Leonid Levin, 44 Lev Levitin, 45 Peter Gacs, 46 Dan Greenberger. (Photo courtesy Charles Bennett)