
Wednesday, November 10, 2021

Fundamental limit on angular measurements and rotations from quantum mechanics and general relativity (published version)

This is the published version of our recent paper. See previous discussion of the arXiv preprint: Finitism and Physics.
Physics Letters B Volume 823, 10 December 2021, 136763 
Fundamental limit on angular measurements and rotations from quantum mechanics and general relativity 
Xavier Calmet and Stephen D.H. Hsu 
https://doi.org/10.1016/j.physletb.2021.136763 
Abstract 
We show that the precision of an angular measurement or rotation (e.g., on the orientation of a qubit or spin state) is limited by fundamental constraints arising from quantum mechanics and general relativity (gravitational collapse). The limiting precision is 1/r in Planck units, where r is the physical extent of the (possibly macroscopic) device used to manipulate the spin state. This fundamental limitation means that spin states cannot be experimentally distinguished from each other if they differ by a sufficiently small rotation. Experiments cannot exclude the possibility that the space of quantum state vectors (i.e., Hilbert space) is fundamentally discrete, rather than continuous. We discuss the implications for finitism: does physics require infinity or a continuum?
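A quick back-of-the-envelope illustration of the bound (my numbers, not from the paper): restoring ordinary units, a limiting precision of 1/r in Planck units means a minimal resolvable rotation angle of roughly the Planck length divided by the size of the device. A minimal sketch in Python, assuming only the 1/r scaling quoted in the abstract:

```python
# Minimal resolvable rotation angle: delta_theta ~ 1/r in Planck units,
# i.e. delta_theta ~ l_Planck / r for a device of physical extent r (in meters).
l_planck = 1.616e-35  # Planck length in meters

for r_meters, label in [(1e-2, "1 cm apparatus"),
                        (1.0, "1 m apparatus"),
                        (6.4e6, "Earth-sized apparatus")]:
    print(f"{label:>20}: delta_theta ~ {l_planck / r_meters:.1e} rad")
```

Even an apparatus the size of the Earth cannot resolve rotations finer than ~10^-42 radians, which is the sense in which no experiment can distinguish a continuous Hilbert space from a sufficiently fine discrete one.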

In the revision we edited the second paragraph below to clarify the history regarding Hilbert's program, Gödel, and the status of the continuum in analysis. The continuum was quite controversial at the time and was one of the primary motivations for Hilbert's axiomatization. There is a kind of modern middle-brow view that epsilon and delta proofs are sufficient to resolve the question of rigor in analysis, but this ignores far more fundamental problems that forced Hilbert, von Neumann, Weyl, etc. to resort to logic and set theory.

In the early 20th century little was known about neuroscience (i.e., our finite brains made of atoms), and it had not been appreciated that the laws of physics themselves might contain internal constraints that prevent any experimental test of infinitely continuous structures. Hence we can understand Weyl's appeal to human intuition as a basis for the mathematical continuum (Platonism informed by Nature; Gödel was also a kind of Platonist), even if today it appears implausible. Now we suspect that our minds are simply finite machines and nothing more, and that Nature itself does not require a continuum -- i.e., it can be simulated perfectly well with finitary processes.

It may come as a surprise to physicists that infinity and the continuum are even today the subject of debate in mathematics and the philosophy of mathematics. Some mathematicians, called finitists, accept only finite mathematical objects and procedures [30]. The fact that physics does not require infinity or a continuum is an important empirical input to the debate over finitism. For example, a finitist might assert (contra the Platonist perspective adopted by many mathematicians) that human brains built from finite arrangements of atoms, and operating under natural laws (physics) that are finitistic, are unlikely to have trustworthy intuitions concerning abstract concepts such as the continuum. These facts about the brain and about physical laws stand in contrast to intuitive assumptions adopted by many mathematicians. For example, Weyl (Das Kontinuum [26], [27]) argues that our intuitions concerning the continuum originate in the mind's perception of the continuity of space-time.
There was a concerted effort beginning in the 20th century to place infinity and the continuum on a rigorous foundation using logic and set theory. As demonstrated by Gödel, Hilbert's program of axiomatization using finitary methods (originally motivated, in part, by the continuum in analysis) could not succeed. Opinions are divided on modern approaches which are non-finitary. For example, the standard axioms of Zermelo-Fraenkel (ZFC) set theory applied to infinite sets lead to many counterintuitive results such as the Banach-Tarski Paradox: given any two solid objects, the cut pieces of either one can be reassembled into the other [28]. When examined closely all of the axioms of ZFC (e.g., Axiom of Choice) are intuitively obvious if applied to finite sets, with the exception of the Axiom of Infinity, which admits infinite sets. (Infinite sets are inexhaustible, so application of the Axiom of Choice leads to pathological results.) The Continuum Hypothesis, which proposes that there is no cardinality strictly between that of the integers and reals, has been shown to be independent (neither provable nor disprovable) in ZFC [29]. Finitists assert that this illustrates how little control rigorous mathematics has on even the most fundamental properties of the continuum.
Weyl was never satisfied that the continuum and classical analysis had been placed on a solid foundation.
Das Kontinuum (Stanford Encyclopedia of Philosophy)
Another mathematical “possible” to which Weyl gave a great deal of thought is the continuum. During the period 1918–1921 he wrestled with the problem of providing the mathematical continuum—the real number line—with a logically sound formulation. Weyl had become increasingly critical of the principles underlying the set-theoretic construction of the mathematical continuum. He had come to believe that the whole set-theoretical approach involved vicious circles[11] to such an extent that, as he says, “every cell (so to speak) of this mighty organism is permeated by contradiction.” In Das Kontinuum he tries to overcome this by providing analysis with a predicative formulation—not, as Russell and Whitehead had attempted, by introducing a hierarchy of logically ramified types, which Weyl seems to have regarded as excessively complicated—but rather by confining the comprehension principle to formulas whose bound variables range over just the initial given entities (numbers). Accordingly he restricts analysis to what can be done in terms of natural numbers with the aid of three basic logical operations, together with the operation of substitution and the process of “iteration”, i.e., primitive recursion. Weyl recognized that the effect of this restriction would be to render unprovable many of the central results of classical analysis—e.g., Dirichlet’s principle that any bounded set of real numbers has a least upper bound[12]—but he was prepared to accept this as part of the price that must be paid for the security of mathematics.
As Weyl saw it, there is an unbridgeable gap between intuitively given continua (e.g. those of space, time and motion) on the one hand, and the “discrete” exact concepts of mathematics (e.g. that of natural number[13]) on the other. The presence of this chasm meant that the construction of the mathematical continuum could not simply be “read off” from intuition. It followed, in Weyl’s view, that the mathematical continuum must be treated as if it were an element of the transcendent realm, and so, in the end, justified in the same way as a physical theory. It was not enough that the mathematical theory be consistent; it must also be reasonable.
Das Kontinuum embodies Weyl’s attempt at formulating a theory of the continuum which satisfies the first, and, as far as possible, the second, of these requirements. In the following passages from this work he acknowledges the difficulty of the task:
… the conceptual world of mathematics is so foreign to what the intuitive continuum presents to us that the demand for coincidence between the two must be dismissed as absurd. (Weyl 1987, 108)
… the continuity given to us immediately by intuition (in the flow of time and of motion) has yet to be grasped mathematically as a totality of discrete “stages” in accordance with that part of its content which can be conceptualized in an exact way. 
See also The History of the Planck Length and the Madness of Crowds.

Monday, August 30, 2021

Finitism and Physics

New paper on arXiv today.

A brief précis: Gravitational collapse limits the amount of energy present in any space-time region. This in turn limits the precision of any measurement or experimental process that takes place in the region. This implies that the class of models of physics which are discrete and finite (finitistic) cannot be excluded experimentally by any realistic process. Note that any digital computer simulation of physical phenomena is a finitistic model.

We conclude that physics (Nature) requires neither infinity nor the continuum. For instance, neither space-time nor the Hilbert space structure of quantum mechanics need be absolutely continuous. This has consequences for the finitist perspective in mathematics -- see excerpt below.
Fundamental Limit on Angular Measurements and Rotations from Quantum Mechanics and General Relativity 
arXiv:2108.11990 
Xavier Calmet, Stephen D.H. Hsu 
We show that the precision of an angular measurement or rotation (e.g., on the orientation of a qubit or spin state) is limited by fundamental constraints arising from quantum mechanics and general relativity (gravitational collapse). The limiting precision is 1/r in Planck units, where r is the physical extent of the (possibly macroscopic) device used to manipulate the spin state. This fundamental limitation means that spin states S1 and S2 cannot be experimentally distinguished from each other if they differ by a sufficiently small rotation. Experiments cannot exclude the possibility that the space of quantum state vectors (i.e., Hilbert space) is fundamentally discrete, rather than continuous. We discuss the implications for finitism: does physics require infinity or a continuum?

From the conclusions:

IV. FINITISM: DOES PHYSICS REQUIRE A CONTINUUM? 
Our intuitions about the existence and nature of a continuum arise from perceptions of space and time [21]. But the existence of a fundamental Planck length suggests that spacetime may not be a continuum. In that case, our intuitions originate from something (an idealization) that is not actually realized in Nature. 
Quantum mechanics is formulated using continuous structures such as Hilbert space and a smoothly varying wavefunction, incorporating complex numbers of arbitrary precision. However beautiful these structures may be, it is possible that they are idealizations that do not exist in the physical world. 
The introduction of gravity limits the precision necessary to formulate a model of fundamental quantum physics. Indeed, any potential structure smaller than the Planck length or the minimal angle considered here cannot be observed by any device subject to quantum mechanics, general relativity, and causality. Our results suggest that quantum mechanics combined with gravity does not require a continuum, nor any concept of infinity. 
It may come as a surprise to physicists that infinity and the continuum are even today the subject of debate in mathematics and the philosophy of mathematics. Some mathematicians, called finitists, accept only finite mathematical objects and procedures [25]. The fact that physics does not require infinity or a continuum is an important empirical input to the debate over finitism. For example, a finitist might assert (contra the Platonist perspective adopted by many mathematicians) that human brains built from finite arrangements of atoms, and operating under natural laws (physics) that are finitistic, are unlikely to have trustworthy intuitions concerning abstract concepts such as the continuum. These facts about the brain and about physical laws stand in contrast to intuitive assumptions adopted by many mathematicians. For example, Weyl (Das Kontinuum [21, 22]) argues that our intuitions concerning the continuum originate in the mind’s perception of the continuity of space-time. 
There was a concerted effort beginning in the 20th century to place infinity and the continuum on a rigorous foundation using logic and set theory. However, these efforts have not been successful. For example, the standard axioms of Zermelo-Fraenkel (ZFC) set theory applied to infinite sets lead to many counterintuitive results such as the Banach-Tarski Paradox: given any two solid objects, the cut pieces of either one can be reassembled into the other [23]. When examined closely all of the axioms of ZFC (e.g., Axiom of Choice) are intuitively obvious if applied to finite sets, with the exception of the Axiom of Infinity, which admits infinite sets. (Infinite sets are inexhaustible, so application of the Axiom of Choice leads to pathological results.) The Continuum Hypothesis, which proposes that there is no cardinality strictly between that of the integers and reals, has been shown to be independent (neither provable nor disprovable) in ZFC [24]. Finitists assert that this illustrates how little control rigorous mathematics has on even the most fundamental properties of the continuum. 
David Deutsch [26]: The reason why we find it possible to construct, say, electronic calculators, and indeed why we can perform mental arithmetic, cannot be found in mathematics or logic. The reason is that the laws of physics “happen to” permit the existence of physical models for the operations of arithmetic such as addition, subtraction and multiplication. 
This suggests the primacy of physical reality over mathematics, whereas usually the opposite assumption is made. From this perspective, the parts of mathematics which are simply models or abstractions of “real” physical things are most likely to be free of contradiction or misleading intuition. Aspects of mathematics which have no physical analog (e.g., infinite sets, the continuum) are prone to problems in formalization or mechanization. Physics – i.e., models which can be compared to experimental observation, actual “effective procedures” – does not ever require infinity, although it may be of some conceptual convenience. Hence it seems possible, and the finitists believe, that the Axiom of Infinity and its equivalents do not provide a sound foundation for mathematics.
See also 

We experience the physical world directly, so the highest confidence belief we have is in its reality. Mathematics is an invention of our brains, and cannot help but be inspired by the objects we find in the physical world. Our idealizations (such as "infinity") may or may not be well-founded. In fact, mathematics with infinity included may be very sick, as evidenced by Gödel's results, or paradoxes in set theory. There is no reason that infinity is needed (as far as we know) to do physics. It is entirely possible that there are only a (large but) finite number of degrees of freedom in the physical universe.
Paul Cohen: I will ascribe to Skolem a view, not explicitly stated by him, that there is a reality to mathematics, but axioms cannot describe it. Indeed one goes further and says that there is no reason to think that any axiom system can adequately describe it.
This "it" (mathematics) that Cohen describes may be the set of idealizations constructed by our brains extrapolating from physical reality. But there is no guarantee that these idealizations have a strong kind of internal consistency and indeed they cannot be adequately described by any axiom system.



Note added: I should clarify the paragraph from our paper that begins
There was a concerted effort beginning in the 20th century to place infinity and the continuum on a rigorous foundation using logic and set theory. However, these efforts have not been successful. ...
This refers to Hilbert's Program:
In the early 1920s, the German mathematician David Hilbert (1862–1943) put forward a new proposal for the foundation of classical mathematics which has come to be known as Hilbert’s Program. It calls for a formalization of all of mathematics in axiomatic form, together with a proof that this axiomatization of mathematics is consistent. The consistency proof itself was to be carried out using only what Hilbert called “finitary” methods. The special epistemological character of finitary reasoning then yields the required justification of classical mathematics. Although Hilbert proposed his program in this form only in 1921, various facets of it are rooted in foundational work of his going back until around 1900, when he first pointed out the necessity of giving a direct consistency proof of analysis. ...
which Gödel showed is not possible to carry out. Note that one of Hilbert's main motivations was the continuum (e.g., construction of the Reals in analysis). What has subsequently been adopted as the rigorous basis for analysis does not satisfy Hilbert's desire for axiomatic, finitary methods. 

The remaining sentences in the paragraph are meant to elucidate aspects of the modern treatment that its critics find unappealing. Of course, judgements of this type are philosophical in nature. 
... For example, the standard axioms of Zermelo-Fraenkel (ZFC) set theory applied to infinite sets lead to many counterintuitive results such as the Banach-Tarski Paradox: given any two solid objects, the cut pieces of either one can be reassembled into the other [23]. When examined closely all of the axioms of ZFC (e.g., Axiom of Choice) are intuitively obvious if applied to finite sets, with the exception of the Axiom of Infinity, which admits infinite sets. (Infinite sets are inexhaustible, so application of the Axiom of Choice leads to pathological results.) The Continuum Hypothesis, which proposes that there is no cardinality strictly between that of the integers and reals, has been shown to be independent (neither provable nor disprovable) in ZFC [24]. Finitists assert that this illustrates how little control rigorous mathematics has on even the most fundamental properties of the continuum. 
See also Paul Cohen on this topic (source of the quote above about Skolem and axiomatization):
Skolem and pessimism about proof in mathematics 
Abstract: Attitudes towards formalization and proof have gone through large swings during the last 150 years. We sketch the development from Frege’s first formalization, to the debates over intuitionism and other schools, through Hilbert’s program and the decisive blow of the Gödel Incompleteness Theorem. A critical role is played by the Skolem–Löwenheim Theorem, which showed that no first-order axiom system can characterize a unique infinite model. Skolem himself regarded this as a body blow to the belief that mathematics can be reliably founded only on formal axiomatic systems. In a remarkably prescient paper, he even sketches the possibility of interesting new models for set theory itself, something later realized by the method of forcing. This is in contrast to Hilbert’s belief that mathematics could resolve all its questions. We discuss the role of new axioms for set theory, questions in set theory itself, and their relevance for number theory. We then look in detail at what the methods of the predicate calculus, i.e. mathematical reasoning, really entail. The conclusion is that there is no reasonable basis for Hilbert’s assumption. The vast majority of questions even in elementary number theory, of reasonable complexity, are beyond the reach of any such reasoning ... 
... The startling conclusion that Skolem drew is the famous Skolem Paradox, that any of the usual axiom systems for set theory will have countable models, unless they are contradictory. Since I will not assume that my audience are all trained logicians, I point out that though the set of reals from the countable model is countable seen from outside, there is no function ‘living in the model’ which puts it in one-to-one correspondence with the set of integers of the model. This fact and other considerations led Skolem to this viewpoint:
I believed that it was so clear that axiomatization in terms of sets was not a satisfactory ultimate foundation of mathematics, that mathematicians would, for the most part, not be very much concerned by it.
The view that I shall present differs somewhat from this, and is in a sense more radical, namely that it is unreasonable to expect that any reasoning of the type we call rigorous mathematics can hope to resolve all but the tiniest fraction of possible mathematical questions.
The theorem of Löwenheim–Skolem was the first truly important discovery about formal systems in general, and it remains probably the most basic. ...
Conclusion: ...Therefore, my conclusion is the following. I believe that the vast majority of statements about the integers are totally and permanently beyond proof in any reasonable system. Here I am using proof in the sense that mathematicians use that word. Can statistical evidence be regarded as proof ? I would like to have an open mind, and say ‘Why not?’. If the first ten billion zeros of the zeta function lie on the line whose real part is 1/2, what conclusion shall we draw? I feel incompetent even to speculate on how future generations will regard numerical evidence of this kind. 
In this pessimistic spirit, I may conclude by asking if we are witnessing the end of the era of pure proof, begun so gloriously by the Greeks. I hope that mathematics lives for a very long time, and that we do not reach that dead end for many generations to come.

Wednesday, May 12, 2021

Neural Tangent Kernels and Theoretical Foundations of Deep Learning

A colleague recommended this paper to me recently. See also earlier post Gradient Descent Models Are Kernel Machines.
Neural Tangent Kernel: Convergence and Generalization in Neural Networks 
Arthur Jacot, Franck Gabriel, Clément Hongler 
At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: during gradient descent on the parameters of an ANN, the network function fθ (which maps input vectors to output vectors) follows the kernel gradient of the functional cost (which is convex, in contrast to the parameter cost) w.r.t. a new kernel: the Neural Tangent Kernel (NTK). This kernel is central to describe the generalization features of ANNs. While the NTK is random at initialization and varies during training, in the infinite-width limit it converges to an explicit limiting kernel and it stays constant during training. This makes it possible to study the training of ANNs in function space instead of parameter space. Convergence of the training can then be related to the positive-definiteness of the limiting NTK. We prove the positive-definiteness of the limiting NTK when the data is supported on the sphere and the non-linearity is non-polynomial. We then focus on the setting of least-squares regression and show that in the infinite-width limit, the network function fθ follows a linear differential equation during training. The convergence is fastest along the largest kernel principal components of the input data with respect to the NTK, hence suggesting a theoretical motivation for early stopping. Finally we study the NTK numerically, observe its behavior for wide networks, and compare it to the infinite-width limit.
The results are remarkably well summarized in the wikipedia entry on Neural Tangent Kernels:

For most common neural network architectures, in the limit of large layer width the NTK becomes constant. This enables simple closed form statements to be made about neural network predictions, training dynamics, generalization, and loss surfaces. For example, it guarantees that wide enough ANNs converge to a global minimum when trained to minimize an empirical loss. ...

An Artificial Neural Network (ANN) with scalar output consists in a family of functions f( · ; θ): R^d → R parametrized by a vector of parameters θ ∈ R^P.

The Neural Tangent Kernel (NTK) is a kernel Θ(x, x′; θ) defined by

Θ(x, x′; θ) = Σ_{p=1..P} ∂f(x; θ)/∂θ_p · ∂f(x′; θ)/∂θ_p .

In the language of kernel methods, the NTK Θ is the kernel associated with the feature map x ↦ ∇_θ f(x; θ).

For a dataset x_1, …, x_n with scalar labels y_1, …, y_n and a loss function c, the associated empirical loss, defined on functions f, is given by

C(f) = (1/n) Σ_{i=1..n} c(f(x_i), y_i).

When the ANN is trained to fit the dataset (i.e. minimize C) via continuous-time gradient descent, the parameters θ(t) evolve through the ordinary differential equation

dθ(t)/dt = −∇_θ C(f_θ(t)).

During training the ANN output function follows an evolution differential equation given in terms of the NTK:

d f_θ(t)(x)/dt = −(1/n) Σ_{i=1..n} Θ(x, x_i; θ(t)) ∂_1 c(f_θ(t)(x_i), y_i),

where ∂_1 c denotes the derivative of c with respect to its first argument. This equation shows how the NTK drives the dynamics of f_θ(t) in the space of functions during training.
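As a concrete check of these definitions, here is a toy sketch of my own (not from the paper or the Wikipedia entry): the empirical NTK of a one-hidden-layer network, computed directly from parameter gradients. At large width the kernel value barely depends on the random initialization, illustrating the concentration/constancy described above.

```python
import numpy as np

# Empirical NTK of a toy one-hidden-layer network f(x) = a . tanh(W x) / sqrt(width).
# Theta(x, x') = sum_p df(x)/dtheta_p * df(x')/dtheta_p, with theta = (W, a).
def make_grads(W, a, width):
    def grads(x):
        h = np.tanh(W @ x)
        g_a = h / np.sqrt(width)                              # df/da
        g_W = np.outer(a * (1.0 - h**2), x) / np.sqrt(width)  # df/dW, since tanh' = 1 - tanh^2
        return np.concatenate([g_W.ravel(), g_a])
    return grads

d, width = 3, 8192
x1, x2 = np.ones(d), np.arange(1.0, d + 1)
for seed in (0, 1, 2):  # different random initializations give nearly identical NTK values
    rng = np.random.default_rng(seed)
    W, a = rng.normal(size=(width, d)), rng.normal(size=width)
    g = make_grads(W, a, width)
    print(seed, float(g(x1) @ g(x2)))
```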

This is a very brief (3 minute) summary by the first author:



This 15 minute IAS talk gives a nice overview of the results, and their relation to fundamental questions (both empirical and theoretical) in deep learning. Longer (30m) version: On the Connection between Neural Networks and Kernels: a Modern Perspective.  




I hope to find time to explore this in more depth. Large width seems to provide a limiting case (analogous to the large-N limit in gauge theory) in which rigorous results about deep learning can be proved.

Some naive questions:

What is the expansion parameter of the finite width expansion?

What role does concentration of measure play in the results? (See 30m video linked above.)

Simplification seems to be a consequence of overparametrization. But the proof method seems to apply to a regularized (but still convex, e.g., using L1 penalization) loss function that imposes sparsity. It would be interesting to examine this specific case in more detail.

Notes to self:

The overparametrized (width ~ w^2) network starts in a random state and by concentration of measure this initial kernel K is just its expectation, which is the NTK. Because of the large number of parameters the effect of training (i.e., gradient descent) on any individual parameter is 1/w, and the change in the eigenvalue spectrum of K is also 1/w. It can be shown that the eigenvalue spectrum is positive and bounded away from zero, and this property does not change under training. Also, the evolution of f is linear in K up to corrections which are suppressed by 1/w. Hence evolution follows a convex trajectory and can achieve global minimum loss in a finite (polynomial) time. (A toy version of this linear evolution is sketched below, after these notes.) 

The parametric 1/w expansion may depend on quantities such as the smallest NTK eigenvalue k: the proof might require  k >> 1/w  or  wk large.

In the large w limit the function space has such high dimensionality that any typical initial f is close (within a ball of radius 1/w?) to an optimal f. 

These properties depend on specific choice of loss function.
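To make the kernel dynamics above concrete: for squared loss and a constant kernel, the training-set outputs obey the linear ODE df/dt = -K (f - y), which solves in closed form. Each kernel eigendirection decays at a rate set by its eigenvalue, so the largest principal components are fit first -- the early-stopping motivation mentioned in the abstract. A toy sketch of my own (assuming a fixed positive-definite Gram matrix, not the paper's exact setup):

```python
import numpy as np

# Linearized NTK dynamics for squared loss: df/dt = -K (f - y).
# Closed form: f(t) = y + exp(-K t) (f0 - y); the component along the
# eigenvector with eigenvalue lambda_i decays like exp(-lambda_i * t).
rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, n))
K = A @ A.T + 1e-3 * np.eye(n)   # stand-in positive-definite kernel Gram matrix
y = rng.normal(size=n)           # training targets
f0 = np.zeros(n)                 # outputs at initialization

lam, U = np.linalg.eigh(K)       # K = U diag(lam) U^T

def f(t):
    return y + U @ (np.exp(-lam * t) * (U.T @ (f0 - y)))

for t in (0.0, 0.1, 1.0, 10.0):
    print(f"t = {t:5.1f}   residual |f - y| = {np.linalg.norm(f(t) - y):.3e}")
```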

Friday, March 26, 2021

John von Neumann, 1966 Documentary

 

This 1966 documentary on von Neumann was produced by the Mathematical Association of America. It includes interviews with Wigner, Ulam, Halmos, Goldstine, and others. 

At ~34m Bethe (leader of the Los Alamos theory division) gives primary credit to vN for the implosion method in fission bombs. While vN's previous work on shock waves and explosive lenses is often acknowledged as important for solving the implosion problem, this is the first time I have seen him given credit for the idea itself. Seth Neddermeyer's Enrico Fermi Award citation gives him credit for "invention of the implosion technique" and the original solid core design was referred to as the "Christy gadget" after Robert Christy. As usual, history is much more complicated than the simplified narrative that becomes conventional.
Teller: He could and did talk to my three-year-old son on his own terms and I sometimes wondered whether his relations to the rest of us were a little bit similar.
A recent application of vN's Quantum Ergodic Theorem: Macroscopic Superposition States in Isolated Quantum Systems.

Cloning vN (science fiction): short story, longer (AI vs genetic engineering).

Wednesday, January 27, 2021

Yuri Milner interviews Donaldson, Kontsevich, Lurie, Tao, and Taylor (2015 Breakthrough Prize)

 

I came across this panel discussion recently, with Yuri Milner (former theoretical physicist, internet billionaire, and sponsor of the Breakthrough Prize) as interlocutor and panelists Simon Donaldson, Maxim Kontsevich, Jacob Lurie, Terence Tao, and Richard Taylor. 

Among the topics covered: the nature of mathematics, the simulation question, AGI and automated proof, human-machine collaboration in mathematics. Kontsevich marvels at the crystalline form of quantum mechanics: why linearity? why a vector space structure? 

Highly recommended!


See also 

The Quantum Simulation Hypothesis: Do we live in a quantum multiverse simulation? 

Saturday, January 16, 2021

Harvard CMSA talks (video)

I recently came across this channel on YouTube, produced by CMSA at Harvard.
The new Center for Mathematical Sciences and Applications in the Faculty of Arts and Sciences will serve as a fusion point for mathematics, statistics, physics, and related sciences. Evergrande will support new professorships, research, and core programming. 
Shing-Tung Yau, Harvard’s William Caspar Graustein Professor of Mathematics, will serve as the center’s first director. 
“The Center for Mathematical Sciences and Applications will establish applied mathematics at Harvard as a first-class, interdisciplinary field of study, relating mathematics with many other important fields,” Yau said. “The center will not only carry out the most innovative research but also train young researchers from all over the world, especially those from China. The center marks a new chapter in the development of mathematical science.”
If I'm not mistaken Evergrande is a big real estate developer in China. It's nice to see them supporting mathematics and science in the US :-) 

In 2010 I accompanied S.T. Yau and a number of other US academics and technologists to visit Alibaba, which wanted to establish a center for data science in China. Unfortunately this never really got off the ground, but CMSA looks like it is off to a good start. 

Here are some talks I found interesting. There are quite a few more.






The talk on Atiyah, Geometry, and Physics led me to this poem which I like very much. Sadly, Atiyah passed in 2019. I believe we met once at a dinner at the Society of Fellows, but I hardly knew him.
In the broad light of day mathematicians check their equations and their proofs, leaving no stone unturned in their search for rigour. 
But, at night, under the full moon, they dream, they float among the stars and wonder at the mystery of the heavens: they are inspired. 
Without dreams there is no art, no mathematics, no life. 
—Michael Atiyah

Monday, June 24, 2019

Ulam on von Neumann, Gödel, and Einstein


Ulam expresses so much in a few sentences! From his memoir, Adventures of a Mathematician. Above: Einstein and Gödel. Bottom: von Neumann, Feynman, Ulam.
When it came to other scientists, the person for whom he [vN] had a deep admiration was Kurt Gödel. This was mingled with a feeling of disappointment at not having himself thought of "undecidability." For years Gödel was not a professor at Princeton, merely a visiting fellow, I think it was called. Apparently there was someone on the faculty who was against him and managed to prevent his promotion to a professorship. Johnny would say to me, "How can any of us be called professor when Gödel is not?" ...

As for Gödel, he valued Johnny very highly and was much interested in his views. I believe knowing the importance of his own discovery did not prevent Gödel from a gnawing uncertainty that maybe all he had discovered was another paradox à la Burali-Forti or Russell. But it is much, much more. It is a revolutionary discovery which changed both the philosophical and the technical aspects of mathematics.

When we talked about Einstein, Johnny would express the usual admiration for his epochal discoveries which had come to him so effortlessly, for the improbable luck of his formulations, and for his four papers on relativity, on the Brownian motion, and on the photo-electric quantum effect. How implausible it is that the velocity of light should be the same emanating from a moving object, whether it is coming toward you or whether it is receding. But his admiration seemed mixed with some reservations, as if he thought, "Well, here he is, so very great," yet knowing his limitations. He was surprised at Einstein's attitude in his debates with Niels Bohr—at his qualms about quantum theory in general. My own feeling has always been that the last word has not been said and that a new "super quantum theory" might reconcile the different premises.

Tuesday, April 09, 2019

Genomic prediction of student flow through high school math curriculum

Compute polygenic EA scores for 3000 US high school students of European ancestry. Track individual progress from 9th to 12th grade, focusing on mathematics courses. The students are out-of-sample: not used in training of predictor. In fact, a big portion (over half?) of individuals used in predictor training are not even from the US -- they are from the UK/EU.

Results: predictor captures about as much variance as family background (SES = socioeconomic status). Students with lower polygenic scores are less likely to take advanced math (e.g., Geometry and beyond).

Typical education paths of individuals with, e.g., bottom few percentile polygenic score are radically different from those in the top percentiles, even after controlling for SES. For example, consider only rich kids or kids at superior schools and compare educational trajectory vs polygenic score. Looks like (bottom figure) odds ratio for taking Geometry in 9th grade is about 4-6x higher for top polygenic score kids.
Genetic Associations with Mathematics Tracking and Persistence in Secondary School

K. Paige Harden and Benjamin W. Domingue, et al.

...we address this question using student polygenic scores, which are DNA-based indicators of propensity to succeed in education [8]. We integrated genetic and official school transcript data from over 3,000 European-ancestry students from U.S. high schools. We used polygenic scores as a molecular tracer to understand how the flow of students through the high school math pipeline differs in socioeconomically advantaged versus disadvantaged schools. Students with higher education polygenic scores were tracked to more advanced math already at the beginning of high school and persisted in math for more years...

...including family-SES and school-SES as covariates attenuated the association between the education-PGS and mathematics tracking in the 9th-grade only by about 20% (attenuated from b = 0.583, SE = .034, to b = 0.461, SE = .036, p < 2 × 10^-16, Supplementary Table S3). Note that the association with genetics was roughly comparable in magnitude to the association with family-SES...







See also Game Over: Genomic Prediction of Social Mobility (some overlap in authors with the new paper).



A talk by the first author:


Monday, December 10, 2018

Music and Mathematics: Noam Elkies


Dinner with two old Harvard friends -- mathematician Noam Elkies and MSU physicist Dean Lee. Noam is in town this week to give a lecture, a colloquium, and perform a piano recital.

At 26 Noam became the youngest full professor in Harvard history, and the youngest to ever receive tenure. He has an amazing Wikipedia entry :-)
In 1981, at age 14, he was awarded a gold medal at the 22nd International Mathematical Olympiad, receiving a perfect score of 42 and becoming one of just 26 participants to attain this score,[3] and one of the youngest ever to do so. Elkies graduated from Stuyvesant High School in 1982[4][5] and went on to Columbia University, where he won the Putnam competition at the age of sixteen years and four months, making him one of the youngest Putnam Fellows in history.[6] He was a Putnam Fellow two more times during his undergraduate years. After graduating as valedictorian at age 18 with a summa cum laude in Mathematics and Music, he earned his Ph.D. at the age 20 under the supervision of Benedict Gross and Barry Mazur at Harvard University.[7]

From 1987 to 1990 he was a junior fellow of the Harvard Society of Fellows.[8]

In 1987, he proved that an elliptic curve over the rational numbers is supersingular at infinitely many primes. In 1988, he found a counterexample to Euler's sum of powers conjecture for fourth powers.[9] His work on these and other problems won him recognition and a position as an associate professor at Harvard in 1990.[4] In 1993, he was made a full, tenured professor at the age of 26. This made him the youngest full professor in the history of Harvard.[10] Along with A. O. L. Atkin he extended Schoof's algorithm to create the Schoof–Elkies–Atkin algorithm.
Noam, Dean, and I are all veterans of the Malkin Athletic Center weight room, when it was old-school and gritty :-)

Here's an earlier version of the talk Noam gave tonight. Video should start with him constructing a canon from thin air!

Saturday, September 22, 2018

The French Way: Alain Connes interview


I came across this interview with Fields Medalist Alain Connes (excerpt below) via an essay by Dominic Cummings (see his blog here).

Dom's essay is also highly recommended. He has spent considerable effort to understand the history of highly effective scientific / research organizations. There is a good chance that his insights will someday be put to use in service of the UK. Dom helped create a UK variant of Kolmogorov's School for Physics and Mathematics.

On the referendum and on Expertise: the ARPA/PARC ‘Dream Machine’, science funding, high performance, and UK national strategy


Topics discussed by Connes: CNRS as a model for nurturing talent, materialism and hedonic treadmill as the enemy to intellectual development, string theory (pro and con!), US, French, and Soviet systems for science / mathematics, his entry into Ecole Normale and the '68 Paris convulsions.

France and Ecole Normale produce great mathematicians far in excess of their population size.
Connes: I believe that the most successful systems so far were these big institutes in the Soviet union, like the Landau institute, the Steklov institute, etc. Money did not play any role there, the job was just to talk about science. It is a dream to gather many young people in an institute and make sure that their basic activity is to talk about science without getting corrupted by thinking about buying a car, getting more money, having a plan for career etc. ... Of course in the former Soviet Union there were no such things as cars to buy etc. so the problem did not arise. In fact CNRS comes quite close to that dream too, provided one avoids all interference from our society which nowadays unfortunately tends to become more and more money oriented.


Q: You were criticizing the US way of doing research and approach to science but they have been very successful too, right? You have to work hard to get tenure, and research grants. Their system is very unified in the sense they have very few institutes like Institute for Advanced Studies but otherwise the system is modeled after universities. So you become first an assistant professor and so on. You are always worried about your raise but in spite of all these hazards the system is working.


Connes: I don’t really agree. The system does not function as a closed system. The US are successful mostly because they import very bright scientists from abroad. For instance they have imported all of the Russian mathematicians at some point.


Q: But the system is big enough to accommodate all these people this is also a good point.


Connes: If the Soviet Union had not collapsed there would still be a great school of mathematics there with no pressure for money, no grants and they would be more successful than the US. In some sense once they migrated in the US they survived and did very well but I believed they would have bloomed better if not transplanted. By doing well they give the appearance that the US system is very successful but it is not on its own by any means. The constant pressure for producing reduces the “time unit” of most young people there. Beginners have little choice but to find an adviser that is sociologically well implanted (so that at a later stage he or she will be able to write the relevant recommendation letters and get a position for the student) and then write a technical thesis showing that they have good muscles, and all this in a limited amount of time which prevents them from learning stuff that requires several years of hard work. We badly need good technicians, of course, but it is only a fraction of what generates progress in research. It reminds me of an anecdote about Andre Weil who at some point had some problems with elliptic operators so he invited a great expert in the field and he gave him the problem. The expert sat at the kitchen table and solved the problem after several hours. To thank him, Andre Weil said “when I have a problem with electricity I call an electrician, when I have a problem with ellipticity I use an elliptician”.

From my point of view the actual system in the US really discourages people who are truly original thinkers, which often goes with a slow maturation at the technical level. Also the way the young people get their position on the market creates “feudalities” namely a few fields well implanted in key universities which reproduce themselves leaving no room for new fields.

....

Q: So you were in Paris [ Ecole Normale ] in the best place and in the best time.

Connes: Yes it was a good time. I think it was ideal that we were a small group of people and our only motivation was pure thought and no talking about careers. We couldn’t care the less and our main occupation was just discussing mathematics and challenging each other with problems. I don’t mean ”puzzles” but problems which required a lot of thought, time or speed was not a factor, we just had all the time we needed. If you could give that to gifted young people it would be perfect.
See also Defining Merit:
... As a parting shot, Wilson could not resist accusing Ford of anti-intellectualism; citing Ford's desire to change Harvard's image, Wilson asked bluntly: "What's wrong with Harvard being regarded as an egghead college? Isn't it right that a country the size of the United States should be able to afford one university in which intellectual achievement is the most important consideration?"

E. Bright Wilson was Harvard professor of chemistry and member of the National Academy of Sciences, later a recipient of the National Medal of Science. The last quote from Wilson could easily have come from anyone who went to Caltech! Indeed, both E. Bright Wilson and his son, Nobel Laureate Ken Wilson (theoretical physics), earned their doctorates at Caltech (the father under Linus Pauling, the son under Murray Gell-Mann).
Where Nobel winners get their start (Nature):
Top Nobel-producing undergraduate institutions

Rank  School                    Country  Nobelists per capita (UG alumni)
1     École Normale Supérieure  France   0.00135
2     Caltech                   US       0.00067
3     Harvard University        US       0.00032
4     Swarthmore College        US       0.00027
5     Cambridge University      UK       0.00025
6     École Polytechnique       France   0.00025
7     MIT                       US       0.00025
8     Columbia University       US       0.00021
9     Amherst College           US       0.00019
10    University of Chicago     US       0.00017

Tuesday, July 17, 2018

ICML notes

It's never been a better time to work on AI/ML. Vast resources are being deployed in this direction, by corporations and governments alike. In addition to the marvelous practical applications in development, a theoretical understanding of Deep Learning may emerge in the next few years.

The notes below are to keep track of some interesting things I encountered at the meeting.

Some ML learning resources:

Metacademy
Depth First study of AlphaGo


I heard a more polished version of this talk by Elad at the Theory of Deep Learning workshop. He is trying to connect results in sparse learning (e.g., performance guarantees for L1 or threshold algos) to Deep Learning. (Video is from UCLA IPAM.)



It may turn out that the problems on which DL works well are precisely those in which the training data (and underlying generative processes) have a hierarchical structure which is sparse, level by level. Layered networks perform a kind of coarse graining (renormalization group flow): first layers filter by feature, subsequent layers by combinations of features, etc. But the whole thing can be understood as products of sparse filters, and the performance under training is described by sparse performance guarantees (ReLU = thresholded penalization?). Given the inherent locality of physics (atoms, molecules, cells, tissue; atoms, words, sentences, ...) it is not surprising that natural phenomena generate data with this kind of hierarchical structure.


Off-topic: At dinner with one of my former students and his colleague (both researchers at an AI lab in Germany), the subject of Finitism came up due to a throwaway remark about the Continuum Hypothesis.

Wikipedia
Horizons of Truth
Chaitin on Physics and Mathematics

David Deutsch:
The reason why we find it possible to construct, say, electronic calculators, and indeed why we can perform mental arithmetic, cannot be found in mathematics or logic. The reason is that the laws of physics "happen" to permit the existence of physical models for the operations of arithmetic such as addition, subtraction and multiplication.
My perspective: We experience the physical world directly, so the highest confidence belief we have is in its reality. Mathematics is an invention of our brains, and cannot help but be inspired by the objects we find in the physical world. Our idealizations (such as "infinity") may or may not be well-founded. In fact, mathematics with infinity included may be very sick, as evidenced by Gödel's results, or paradoxes in set theory. There is no reason that infinity is needed (as far as we know) to do physics. It is entirely possible that there are only a (large but) finite number of degrees of freedom in the physical universe.

Paul Cohen:
I will ascribe to Skolem a view, not explicitly stated by him, that there is a reality to mathematics, but axioms cannot describe it. Indeed one goes further and says that there is no reason to think that any axiom system can adequately describe it.
This "it" (mathematics) that Cohen describes may be the set of idealizations constructed by our brains extrapolating from physical reality. But there is no guarantee that these idealizations have a strong kind of internal consistency and indeed they cannot be adequately described by any axiom system.

Tuesday, June 12, 2018

Big Ed on Classical and Quantum Information Theory

I'll have to carve out some time this summer to look at these :-) Perhaps on an airplane...

When I visited IAS earlier in the year, Witten was sorting out Lieb's (nontrivial) proof of strong subadditivity. See also Big Ed.
A Mini-Introduction To Information Theory
https://arxiv.org/abs/1805.11965

This article consists of a very short introduction to classical and quantum information theory. Basic properties of the classical Shannon entropy and the quantum von Neumann entropy are described, along with related concepts such as classical and quantum relative entropy, conditional entropy, and mutual information. A few more detailed topics are considered in the quantum case.
Notes On Some Entanglement Properties Of Quantum Field Theory
https://arxiv.org/abs/1803.04993

These are notes on some entanglement properties of quantum field theory, aiming to make accessible a variety of ideas that are known in the literature. The main goal is to explain how to deal with entanglement when – as in quantum field theory – it is a property of the algebra of observables and not just of the states.
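The basic objects in the first set of notes are easy to play with numerically. A minimal sketch of my own (not Witten's): the von Neumann entropy S(ρ) = -Tr ρ log ρ is just the Shannon entropy of the eigenvalues of ρ, and the entropy of the diagonal of ρ (the distribution seen by measuring in a basis that does not diagonalize ρ) can only be larger.

```python
import numpy as np

def shannon(p):
    """Shannon entropy -sum p log p in nats, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def von_neumann(rho):
    """S(rho) = -Tr rho log rho = Shannon entropy of the eigenvalues of rho."""
    return shannon(np.linalg.eigvalsh(rho))

rho = np.array([[0.75, 0.25],
                [0.25, 0.25]])   # a valid qubit state: Hermitian, PSD, trace 1
print(von_neumann(rho))          # ~0.416 nats
print(shannon(np.diag(rho)))     # ~0.562 nats >= S(rho)
```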
Years ago at Caltech, walking back to Lauritsen after a talk on quantum information, with John Preskill and a famous string theorist not to be named. When I asked the latter what he thought of the talk, he laughed and said Well, after all, it's just linear algebra :-)

Sunday, December 03, 2017

Big Ed


Today I came across a recent interview with Ed Witten in Quanta Magazine. The article has some nice photos like the one above. I was struck by the following quote from Witten ("It from Qubit!"):
When I was a beginning grad student, they had a series of lectures by faculty members to the new students about theoretical research, and one of the people who gave such a lecture was Wheeler. He drew a picture on the blackboard of the universe visualized as an eye looking at itself. I had no idea what he was talking about. It’s obvious to me in hindsight that he was explaining what it meant to talk about quantum mechanics when the observer is part of the quantum system. I imagine there is something we don’t understand about that.  [ Italics mine ]
The picture he refers to is reproduced below.


This question has been of interest to me since I was first exposed to quantum mechanics, although I put it off for a long time because quantum foundations is not considered a respectable area by most physicists! Of course it should be obvious that if quantum mechanics is to be a universal theory of nature, then observers like ourselves can't help but be part of the (big) quantum system.

See related posts Feynman and Everett, Schwinger on Quantum Foundations, Gell-Man on Quantum Foundations, and Weinberg on Quantum Foundations.

Here's a similar figure, meant to represent the perspective of an observer inside the wavefunction of the universe (which evolves deterministically and unitarily; the degrees of freedom of the observer's mind are part of the Hilbert space of Psi; time runs vertically and Psi evolves into exp(-iHT) Psi while we are "inside" :-). The little person (observer) is watching particles collapse into a black hole, which then evaporates into Hawking radiation. The figure was drawn on the whiteboard of my University of Oregon office and persisted there for a year or more. I doubt any visitors (other than perhaps one special grad student) understood what it was about.



For some powerful Witten anecdotes like the one below, see here. (If you don't know who Ed Witten is this should clarify things a bit!)
I met him in Boston in 1977, when I was getting interested in the connection between physics and mathematics. I attended a meeting, and there was this young chap with the older guys. We started talking, and after a few minutes I realized that the younger guy was much smarter than the old guys. He understood all the mathematics I was talking about, so I started paying attention to him. That was Witten. And I’ve kept in touch with him ever since.

In 2001, he invited me to Caltech, where he was a visiting professor. I felt like a graduate student again. Every morning I would walk into the department, I’d go to see Witten, and we’d talk for an hour or so. He’d give me my homework. I’d go away and spend the next 23 hours trying to catch up. Meanwhile, he’d go off and do half a dozen other things. We had a very intense collaboration. It was an incredible experience because it was like working with a brilliant supervisor. I mean, he knew all the answers before I got them. If we ever argued, he was right and I was wrong. It was embarrassing!

(Fields Medalist Michael Atiyah, on what it was like to collaborate with Witten)
The closest thing I have read to a personal intellectual history of Witten is his essay Adventures in Physics and Math, which I highly recommend. The essay addresses some common questions, such as What was Ed like as a kid? How did he choose a career in Physics? How does he know so much Mathematics? For example,
At about age 11, I was presented with some relatively advanced math books. My father is a theoretical physicist and he introduced me to calculus. For a while, math was my passion. My parents, however, were reluctant to push me too far, too fast with math (as they saw it) and so it was a long time after that before I was exposed to any math that was really more advanced than basic calculus. I am not sure in hindsight whether their attitude was best or not.
A great video, suggested by a commenter:

Thursday, November 30, 2017

CMSE (Computational Mathematics, Science and Engineering) at MSU



At Oregon I was part of an interdisciplinary institute that included theoretical physicists and chemists, mathematicians, and computer scientists. We tried to create a program (not even a new department, just an interdisciplinary program) in applied math and computation, but failed due to lack of support from higher administration. When I arrived at MSU as VPR I learned that the faculty here had formulated a similar plan for a new department. Together with the Engineering dean and the Natural Sciences dean we pushed it through and created an entirely new department in just a few years. This new department already has a research ranking among the top 10 in the US (according to Academic Analytics).

Computational Mathematics, Science and Engineering at MSU.


Tuesday, March 28, 2017

The brute tyranny of g-loading: Lawrence Krauss and Joe Rogan



I love Joe Rogan -- he has an open, inquisitive mind and is a sharp observer of life and human society. See, for example, this interview with Dan Bilzerian about special forces, professional poker, sex, drugs, heart attacks, life, happiness, hedonic treadmill, social media, girls, fame, prostitution, money, steroids, stem cell therapy, and plenty more.

I know Lawrence Krauss quite well -- he and I work in the same area of theoretical physics. However, the 20+ minute opening segment in which Krauss tries to explain gauge symmetry (1, 2, 3) to Joe is downright painful. Some things are just conceptually hard, and are built up from other concepts that are themselves non-obvious.


Gauge symmetry is indeed central to modern theoretical physics -- all of the known forces of nature are gauge interactions. I've been at an uncountable number of cocktail parties (sometimes with other professors) where I've tried to explain this concept to someone as sincerely interested as Rogan is in the video. Who doesn't like to hear about fundamental laws of Nature and deep principles of physical reality?

No matter how clearly a very g-loaded concept is explained, it is challenging for the typical person to comprehend. (This is almost a definition.) Many ideas in physics are challenging even to college professors. One sad aspect of the Internet is that there isn't any significant discussion forum or blog comment section where even much simpler concepts such as regression to the mean are understood by all the participants.

Listening to the conversation between Joe and Lawrence about gauge theory and the Higgs field, I couldn't help but think of this Far Side cartoon:



Oppenheimer: Mathematics is "an immense enlargement of language, an ability to talk about things which in words would be simply inaccessible."

See also this Reddit discussion of the podcast episode.

Wednesday, January 18, 2017

Oppenheimer on Bohr (1964 UCLA)



I came across this 1964 UCLA talk by Oppenheimer, on his hero Niels Bohr.

Oppenheimer: Mathematics is "an immense enlargement of language, an ability to talk about things which in words would be simply inaccessible."

I find it strange that psychometricians usually define "verbal ability" over a vocabulary set that excludes words from mathematics and other scientific areas. A person's verbal score is enhanced by knowing many (increasingly obscure) words for the same concept, as opposed to knowing words which describe new concepts beyond those which appear in ordinary language.

Is it more valuable to have mastery of these words: esoteric, abstruse, enigmatic, cryptic, recondite, inscrutable, opaque, ... (all describe similar concepts; they are synonyms for not easily understood),

or these: mean, variance, standard deviation, fluctuation, scaling, dimensionality, eigenvector, orthogonal, kernel, null space (these describe distinct but highly useful concepts not found in ordinary language)?

Among the simplest (and most useful) mathematical words/concepts that flummox ordinary people are statistical terms such as mean, variance, standard deviation, etc. One could be familiar with all of these words and concepts, yet obtain a low score on a test of verbal ability due to an insufficiently large grasp of (relatively useless) esoteric synonyms.

See also Thought vectors and the dimensionality of the space of concepts , Toward a Geometry of Thought and High V, Low M

Added from comments:
I'd like to clarify something that was probably confusing in the original post and my subsequent comments. 
One of the things I noticed in the SAT reading comprehension sections my kids were looking at is that one is NOT being asked to make subtle distinctions between nearby concepts/words. One is merely being asked to know that X (esoteric word) is a synonym for Y (common word), without having to know the subtle difference between X and Y. 
So, if my kid didn't know that "Brobdingnagian" is a synonym for "big" they might not be able to answer a multiple-choice question about a paragraph containing the sentence: "But of course the error was of Brobdingnagian proportions." To answer the question doesn't require knowledge of Gulliver's Travels -- I could un-befuddle my kid (allowing him or her to easily answer the question) just by saying "Brobdingnagian means big"! 
So, at least this psychometric exam (the SAT) isn't even testing fine distinctions -- it just tests whether you know that X1, X2, ... , XN are synonyms of a very primitive concept like BIG. What is the value of taking N larger and larger (in this sense; not the fine distinction sense)? Surely there are diminishing returns...
