Showing posts with label computing.

Thursday, September 21, 2023

Huawei and the US-China Chip War — Manifold #44

 

TP Huang is a computer scientist and analyst of global technology development. He posts often on X: https://twitter.com/tphuang 


Steve and TP discuss: 

0:00 Introduction: TP Huang and semiconductor technology 
5:40 Huawei’s new phone and SoC 
23:19 SMIC 7nm chip production in China: Yield and economics 
28:21 Impact on Qualcomm 
36:08 U.S. sanctions solved the coordination problem for Chinese semiconductor companies 
42:48 5G modem and RF chips: impact on Qualcomm, Broadcom, Apple, etc. 
47:14 5G and Huawei 
52:50 Satellite capabilities of Huawei phones 
56:46 Huawei vs Apple and Chinese consumers 
1:01:33 Chip War and AI model training

Thursday, February 02, 2023

ChatGPT, LLMs, and AI — Manifold #29

 

Steve discusses Large Language Model AIs such as ChatGPT. 

0:00 How do LLMs work? 
10:22 Impact of ChatGPT 
15:21 AI landscape 
24:13 Hallucination and Focus 
33:09 Applications 
39:29 Future landscape 

Manifold interview with John Schulman of OpenAI: 


Blog posts on word vectors and approximately linear vector space of concepts used by the human mind:
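Related to the word-vector posts referenced above, here is a toy numpy sketch of the "approximately linear vector space of concepts" idea, in which analogies become vector arithmetic. The tiny hand-made 3-d vectors are purely illustrative; real word vectors (word2vec, GloVe) are learned from large corpora and have hundreds of dimensions.

```python
# Toy illustration of word analogies as vector arithmetic: king - man + woman ~ queen.
# The 3-d vectors below are hand-made for illustration only; real word vectors
# are learned from large corpora and have hundreds of dimensions.
import numpy as np

vecs = {
    "king":  np.array([0.9,  0.7, 0.1]),   # (royalty, maleness, other)
    "queen": np.array([0.9, -0.7, 0.1]),
    "man":   np.array([0.1,  0.7, 0.2]),
    "woman": np.array([0.1, -0.7, 0.2]),
    "apple": np.array([0.0,  0.0, 0.9]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

query = vecs["king"] - vecs["man"] + vecs["woman"]
candidates = {w: cosine(query, v) for w, v in vecs.items()
              if w not in ("king", "man", "woman")}
print(max(candidates, key=candidates.get))   # -> "queen"
```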
 

Thursday, July 14, 2022

Tim Palmer (Oxford): Status and Future of Climate Modeling — Manifold Podcast #16

 

Tim Palmer is Royal Society Research Professor in Climate Physics, and a Senior Fellow at the Oxford Martin Institute. He is interested in the predictability and dynamics of weather and climate, including extreme events. 

He was involved in the first five IPCC assessment reports and was co-chair of the international scientific steering group of the World Climate Research Programme project (CLIVAR) on climate variability and predictability. 

After completing his DPhil at Oxford in theoretical physics, Tim worked at the UK Meteorological Office and later the European Centre for Medium-Range Weather Forecasts. For a large part of his career, Tim has developed ensemble methods for predicting uncertainty in weather and climate forecasts. 

In 2020 Tim was elected to the US National Academy of Sciences. 

Steve, Corey Washington, and Tim first discuss his career path from physics to climate research and then explore the science of climate modeling and the main uncertainties in state-of-the-art models. 

In this episode, we discuss: 

00:00 Introduction 
1:48 Tim Palmer's background and transition from general relativity to climate modeling 
15:13 Climate modeling uncertainty 
46:41 Navier-Stokes equations in climate modeling 
53:37 Where climate change is an existential risk 
1:01:26 Investment in climate research 

Links: 
 
Tim Palmer (Oxford University) 

The scientific challenge of understanding and estimating climate change (2019) https://www.pnas.org/doi/pdf/10.1073/pnas.1906691116 

ExtremeEarth 

Physicist Steve Koonin on climate change


Note added: For some background on the importance of water vapor (cloud) distribution within the primitive cells used in these climate simulations, see:


Low clouds tend to cool the Earth by reflecting solar radiation back to space, while high, thin clouds tend to warm it by trapping outgoing IR. The net effect on heating from the distribution of water vapor and cloud is crucial in these models. However, due to the complexity of the Navier-Stokes equations, current simulations cannot actually solve for this distribution from first principles. Rather, the modelers hand-code assumptions about fine-grained behavior within each cell. The resulting uncertainty in (e.g., long-term) climate prediction from these approximations is unknown.
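One classic example of the kind of hand-coded subgrid assumption described above is a relative-humidity-based (Sundqvist-type) cloud-fraction scheme. The sketch below is illustrative only; the functional form and the critical threshold RH_CRIT are exactly the sort of tunable parameters that differ from model to model.

```python
# Illustrative sketch of a Sundqvist-type relative-humidity cloud-fraction
# parameterization, one example of a hand-coded subgrid assumption.
# RH_CRIT is a tunable parameter; the value 0.8 here is illustrative,
# not taken from any specific climate model.
import numpy as np

RH_CRIT = 0.8  # grid-mean relative humidity at which cloud begins to form

def cloud_fraction(rh):
    """Diagnose subgrid cloud fraction from grid-mean relative humidity."""
    rh = np.clip(rh, 0.0, 1.0)
    frac = 1.0 - np.sqrt((1.0 - rh) / (1.0 - RH_CRIT))
    return np.clip(frac, 0.0, 1.0)

for rh in (0.7, 0.85, 0.95, 1.0):
    print(f"RH = {rh:.2f} -> cloud fraction = {cloud_fraction(rh):.2f}")
```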

Sunday, June 12, 2022

Von Neumann: The Interaction of Mathematics and Computing, Stan Ulam 1976 talk (video)

 

Von Neumann: The Interaction of Mathematics and Computing, by Stan Ulam. 

See A History of Computing in the Twentieth Century, Edited by: N. METROPOLIS, J. HOWLETT and GIAN-CARLO ROTA.
 
More videos from the conference here. (Konrad Zuse!)

See at 50 minutes for an interesting story about von Neumann's role in the implosion mechanism for atomic bombs. vN apparently solved the geometrical problem for the shape of the explosive lens overnight after hearing a seminar on the topic. Still classified?
To solve this problem, the Los Alamos team planned to produce an “explosive lens”, a combination of different explosives with different shock wave speeds. When molded into the proper shape and dimensions, the high-speed and low-speed shock waves would combine with each other to produce a uniform concave pressure wave with no gaps. This inwardly-moving concave wave, when it reached the plutonium sphere at the center of the design, would instantly squeeze the metal to at least twice the density, producing a compressed ball of plutonium that contained about 5 times the necessary critical mass. A nuclear explosion would then result.
More here.

Sunday, December 06, 2020

AlphaFold 2: protein folding solved?

 

This is a good discussion of DeepMind's AlphaFold 2, a big breakthrough in protein folding. The details of how AlphaFold 2 works have not been published -- the video mainly discusses the January 2020 paper on the earlier version of AlphaFold, which already had world-leading performance. However, it provides a good introduction both to protein folding as a physical/biological problem and to AI/ML approaches.

I visited DeepMind in 2018 to give a talk on genomic prediction. I was hoping to get them interested! However, they were already focused on the protein folding problem. Most of my time there was spent discussing the latter topic with some of the AlphaFold team. They probably thought that a physicist who works on genomics might be worth talking to about protein folding, but I'm sure I learned more from them about it than vice versa...

In 2013 I blogged about a talk by Fields Medalist Stephen Smale on ML approaches to protein folding. He convinced me that ML approaches might work better than solving physics equations by brute force. 

Deep neural nets excel at learning high-dimensional nonlinear functions that have some internal hierarchical structure (e.g., by length scale). Protein folding falls into this category. AlphaFold was able to utilize ~170k training samples and extensive information from MSAs (Multiple Sequence Alignments), which give estimates of 3D distances: see, e.g., here.
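As a toy illustration of why an MSA carries distance information: columns that co-vary across evolutionarily related sequences tend to be spatially close, and mutual information between columns is a crude, classic proxy for contacts (modern coevolution methods, and AlphaFold's learned features, are far more sophisticated). The tiny alignment below is made up for illustration.

```python
# Toy sketch: mutual information between MSA columns as a crude contact proxy.
# Columns that co-vary across sequences tend to be spatially close in 3D.
# The tiny alignment is invented for illustration only.
from collections import Counter
from itertools import combinations
from math import log2

msa = [
    "ARNDC",
    "AKNEC",
    "ARNDC",
    "GKQEC",
    "GRQDC",
    "GKQEC",
]

def column(i):
    return [seq[i] for seq in msa]

def mutual_information(i, j):
    n = len(msa)
    pi, pj = Counter(column(i)), Counter(column(j))
    pij = Counter(zip(column(i), column(j)))
    return sum((c / n) * log2((c / n) / ((pi[a] / n) * (pj[b] / n)))
               for (a, b), c in pij.items())

L = len(msa[0])
scores = sorted(((mutual_information(i, j), i, j)
                 for i, j in combinations(range(L), 2)), reverse=True)
for mi, i, j in scores[:3]:
    print(f"columns {i},{j}: MI = {mi:.2f} bits")   # the co-varying pairs score highest
```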



Monday, September 28, 2020

Feynman on AI

Thanks to a reader for sending the video to me. The first clip is of Feynman discussing AI, taken from the longer 1985 lecture in the second video.

There is not much to disagree with in his remarks on AI. He was remarkably well calibrated and would not have been very surprised by what has happened in the following 35 years, except that he did not anticipate (at least, did not explicitly predict) the success that neural nets and deep learning would have for the problem he describes several times as "pattern recognition" (face recognition, fingerprint recognition, gait recognition). Feynman was well aware of early work on neural nets, through his colleague John Hopfield.  [1] [2] [3]

I was at Caltech in 1985 and this is Feynman as I remember him. To me, still a teenager, he seemed ancient. But his mind was marvelously active! As you can see from the talk, he was following the fields of AI and computation rather closely. 

Of course, he and other Manhattan Project physicists were present at the creation. They had to use crude early contraptions for mechanical calculation in bomb design computations. Thus, the habit of reducing a complex problem (whether in physics or machine learning) to primitive operations was second nature. Even for kids of my generation it was not second nature -- we grew up with early "home computers" like the Apple II and Commodore, so there was already a black-box magic aspect to programming in high-level languages. Machine language was useful for speeding up video games, but not everyone learned it. The problem is even worse today: children first encounter computers as phones or tablets that already seem like magic. The highly advanced nature of these devices discourages them from trying to grasp the underlying first principles.  

If I am not mistaken the t-shirt he is wearing is from the startup Thinking Machines, which built early parallel supercomputers.

Just three years later he was gone. The finely tuned neural connections in his brain -- which allowed him to reason with such acuity and communicate with such clarity still in 1985 -- were lost forever.



Thursday, May 17, 2018

Exponential growth in compute used for AI training


Chart shows the total amount of compute, in petaflop/s-days, used in training (e.g., optimizing an objective function in a high dimensional space). This exponential trend is likely to continue for some time -- leading to qualitative advances in machine intelligence.
AI and Compute (OpenAI blog): ... since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.5 month-doubling time (by comparison, Moore’s Law had an 18-month doubling period). Since 2012, this metric has grown by more than 300,000x (an 18-month doubling period would yield only a 12x increase). Improvements in compute have been a key component of AI progress, so as long as this trend continues, it’s worth preparing for the implications of systems far outside today’s capabilities.

... Three factors drive the advance of AI: algorithmic innovation, data (which can be either supervised data or interactive environments), and the amount of compute available for training. Algorithmic innovation and data are difficult to track, but compute is unusually quantifiable, providing an opportunity to measure one input to AI progress. Of course, the use of massive compute sometimes just exposes the shortcomings of our current algorithms. But at least within many current domains, more compute seems to lead predictably to better performance, and is often complementary to algorithmic advances.

...We see multiple reasons to believe that the trend in the graph could continue. Many hardware startups are developing AI-specific chips, some of which claim they will achieve a substantial increase in FLOPS/Watt (which is correlated to FLOPS/$) over the next 1-2 years. ...
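A quick sanity check of the growth arithmetic in the quote above. The ~64-month span (roughly late 2012 to early 2018) is my assumption, chosen so that a 3.5-month doubling time reproduces the quoted ~300,000x figure.

```python
# Sanity check of the growth arithmetic quoted above. The ~64-month span is
# my assumption (roughly late 2012 to early 2018), chosen so that a 3.5-month
# doubling time reproduces the quoted ~300,000x figure.
from math import log2

months = 64

def growth(doubling_time_months, months):
    return 2 ** (months / doubling_time_months)

print(f"3.5-month doubling over {months} months: {growth(3.5, months):,.0f}x")
print(f"18-month doubling over {months} months:  {growth(18, months):.1f}x")
print(f"doublings needed for 300,000x: {log2(300_000):.1f}")
```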

Thursday, November 30, 2017

CMSE (Computational Mathematics, Science and Engineering) at MSU



At Oregon I was part of an interdisciplinary institute that included theoretical physicists and chemists, mathematicians, and computer scientists. We tried to create a program (not even a new department, just an interdisciplinary program) in applied math and computation, but failed due to lack of support from higher administration. When I arrived at MSU as VPR I learned that the faculty here had formulated a similar plan for a new department. Together with the Engineering dean and the Natural Sciences dean we pushed it through and created an entirely new department in just a few years. This new department already has a research ranking among the top 10 in the US (according to Academic Analytics).

Computational Mathematics, Science and Engineering at MSU.


Saturday, November 18, 2017

Robot Overlords and the Academy


In a previous post Half of all jobs (> $60k/y) coding related? I wrote
In the future there will be two kinds of jobs. Workers will either

Tell computers what to do    
      or
Be told by computers what to do
I've been pushing Michigan State University to offer a coding bootcamp experience to all undergraduates who want it: e.g., Codecademy.com. The goal isn't to turn non-STEM majors into software developers, but to give all interested students exposure to an increasingly important and central aspect of the modern world.

I even invited the CodeNow CEO to campus to help push the idea. We're still working on it at the university -- painfully SLOWLY, if you ask me. But this fall I learned my kids are taking a class based on Codecademy at their middle school! Go figure.

(Image via 1, 2)

Sunday, June 11, 2017

Rise of the Machines: Survey of AI Researchers


These predictions are from a recent survey of AI/ML researchers. See SSC and also here for more discussion of the results.
When Will AI Exceed Human Performance? Evidence from AI Experts

Katja Grace, John Salvatier, Allan Dafoe, Baobao Zhang, Owain Evans

Advances in artificial intelligence (AI) will transform modern life by reshaping transportation, health, science, finance, and the military. To adapt public policy, we need to better anticipate these advances. Here we report the results from a large survey of machine learning researchers on their beliefs about progress in AI. Researchers predict AI will outperform humans in many activities in the next ten years, such as translating languages (by 2024), writing high-school essays (by 2026), driving a truck (by 2027), working in retail (by 2031), writing a bestselling book (by 2049), and working as a surgeon (by 2053). Researchers believe there is a 50% chance of AI outperforming humans in all tasks in 45 years and of automating all human jobs in 120 years, with Asian respondents expecting these dates much sooner than North Americans. These results will inform discussion amongst researchers and policymakers about anticipating and managing trends in AI.
Another figure:


Keep in mind that the track record for this type of prediction, even by experts, is not great:


See below for the cartoon version :-)



Sunday, June 04, 2017

Epistemic Caution and Climate Change

[ UPDATE: See 2019 post: Certainties and Uncertainties in our Energy and Climate Futures: Steve Koonin ]

I have not, until recently, invested significant time in trying to understand climate modeling. These notes are primarily for my own use; however, I welcome comments from readers who have studied this issue in more depth.

I take a dim view of people who express strong opinions about complex phenomena without having understood the underlying uncertainties. I have yet to personally encounter anyone who claims to understand all of the issues discussed below, but I constantly meet people with strong views about climate change.

See my old post on epistemic caution Intellectual honesty: how much do we know?
... when it comes to complex systems like society or economy (and perhaps even climate), experts have demonstrably little predictive power. In rigorous studies, expert performance is often no better than random.  
... worse, experts are usually wildly overconfident about their capabilities. ... researchers themselves often have beliefs whose strength is entirely unsupported by available data.
Now to climate and CO2. AFAIU, the direct heating effect is only a logarithmic function of the CO2 concentration (all the absorption is in a narrow frequency band). The main heating effects in climate models come from secondary effects such as the distribution of water vapor in the atmosphere, which are neither calculable from first principles nor under good experimental/observational control. Certainly any "catastrophic" outcomes would have to result from these secondary feedback effects.
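To make the "settled science" part concrete, here is a back-of-the-envelope version of the direct, no-feedback calculation. The logarithmic forcing coefficient 5.35 W/m^2 (Myhre et al. 1998) and the ~0.3 K per W/m^2 Planck response are standard textbook values quoted from memory, not numbers taken from the paper cited below.

```python
# Back-of-the-envelope estimate of the no-feedback (direct) CO2 warming,
# using the standard logarithmic forcing approximation. The coefficient
# 5.35 W/m^2 (Myhre et al. 1998) and the ~0.3 K per W/m^2 no-feedback
# Planck response are textbook values, not taken from the AJP paper below.
from math import log

ALPHA = 5.35       # radiative forcing, W/m^2 per e-fold of CO2 concentration
LAMBDA_0 = 0.30    # K per W/m^2, no-feedback (Planck) climate sensitivity

def delta_T_direct(c_ratio):
    """No-feedback surface warming for a CO2 concentration ratio c/c0."""
    forcing = ALPHA * log(c_ratio)      # radiative forcing, W/m^2
    return LAMBDA_0 * forcing           # temperature response, K

print(f"CO2 doubling:    {delta_T_direct(2.0):.1f} K")   # ~1.1 K
print(f"CO2 quadrupling: {delta_T_direct(4.0):.1f} K")   # logarithmic, so only ~2.2 K
```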

The first paper below gives an elementary calculation of direct effects from atmospheric CO2. This is the "settled science" part of climate change -- it depends on relatively simple physics. The prediction is about 1 degree Celsius of warming from a doubling of CO2 concentration. Anything beyond this is due to secondary effects which, in their totality, are not well understood -- see second paper below, about model tuning, which discusses rather explicitly how these unknowns are dealt with.
Simple model to estimate the contribution of atmospheric CO2 to the Earth’s greenhouse effect
Am. J. Phys. 80, 306 (2012)
http://dx.doi.org/10.1119/1.3681188

We show how the CO2 contribution to the Earth’s greenhouse effect can be estimated from relatively simple physical considerations and readily available spectroscopic data. In particular, we present a calculation of the “climate sensitivity” (that is, the increase in temperature caused by a doubling of the concentration of CO2) in the absence of feedbacks. Our treatment highlights the important role played by the frequency dependence of the CO2 absorption spectrum. For pedagogical purposes, we provide two simple models to visualize different ways in which the atmosphere might return infrared radiation back to the Earth. The more physically realistic model, based on the Schwarzschild radiative transfer equations, uses as input an approximate form of the atmosphere’s temperature profile, and thus includes implicitly the effect of heat transfer mechanisms other than radiation.
From Conclusions:
... The question of feedbacks, in its broadest sense, is the whole question of climate change: namely, how much and in which way can we expect the Earth to respond to an increase of the average surface temperature of the order of 1 degree, arising from an eventual doubling of the concentration of CO2 in the atmosphere? And what further changes in temperature may result from this response? These are, of course, questions for climate scientists to resolve. ...
The paper below concerns model tuning. It should be apparent that there are many adjustable parameters hidden in any climate model. One wonders whether the available data, given its own uncertainties, can constrain this high dimensional parameter space sufficiently to produce predictive power in a rigorous statistical sense.

The first figure below illustrates how different choices of these parameters can affect model predictions. Note the huge range of possible outcomes! The second figure below illustrates some of the complex physical processes which are subsumed in the parameter choices. Over longer timescales (e.g., decades), uncertainties such as the response of ecosystems (e.g., plant growth rates) to increased CO2 would play a role in the models. It is obvious that we do not (may never?) have control over these unknowns.
THE ART AND SCIENCE OF CLIMATE MODEL TUNING

AMERICAN METEOROLOGICAL SOCIETY MARCH 2017 | 589

... Climate model development is founded on well-understood physics combined with a number of heuristic process representations. The fluid motions in the atmosphere and ocean are resolved by the so-called dynamical core down to a grid spacing of typically 25–300 km for global models, based on numerical formulations of the equations of motion from fluid mechanics. Subgrid-scale turbulent and convective motions must be represented through approximate subgrid-scale parameterizations (Smagorinsky 1963; Arakawa and Schubert 1974; Edwards 2001). These subgrid-scale parameterizations include coupling with thermodynamics; radiation; continental hydrology; and, optionally, chemistry, aerosol microphysics, or biology.

Parameterizations are often based on a mixed, physical, phenomenological and statistical view. For example, the cloud fraction needed to represent the mean effect of a field of clouds on radiation may be related to the resolved humidity and temperature through an empirical relationship. But the same cloud fraction can also be obtained from a more elaborate description of processes governing cloud formation and evolution. For instance, for an ensemble of cumulus clouds within a horizontal grid cell, clouds can be represented with a single-mean plume of warm and moist air rising from the surface (Tiedtke 1989; Jam et al. 2013) or with an ensemble of such plumes (Arakawa and Schubert 1974). Similar parameterizations are needed for many components not amenable to first-principle approaches at the grid scale of a global model, including boundary layers, surface hydrology, and ecosystem dynamics. Each parameterization, in turn, typically depends on one or more parameters whose numerical values are poorly constrained by first principles or observations at the grid scale of global models. Being approximate descriptions of unresolved processes, there exist different possibilities for the representation of many processes. The development of competing approaches to different processes is one of the most active areas of climate research. The diversity of possible approaches and parameter values is one of the main motivations for model inter-comparison projects in which a strict protocol is shared by various modeling groups in order to better isolate the uncertainty in climate simulations that arises from the diversity of models (model uncertainty). ...

... All groups agreed or somewhat agreed that tuning was justified; 91% thought that tuning global-mean temperature or the global radiation balance was justified (agreed or somewhat agreed). ... the following were considered acceptable for tuning by over half the respondents: atmospheric circulation (74%), sea ice volume or extent (70%), and cloud radiative effects by regime and tuning for variability (both 52%).






Here is Steve Koonin, formerly Obama's Undersecretary for Science at DOE and a Caltech theoretical physicist, calling for a "Red Team" analysis of climate science, just a few months ago (un-gated link):
WSJ: ... The outcome of a Red/Blue exercise for climate science is not preordained, which makes such a process all the more valuable. It could reveal the current consensus as weaker than claimed. Alternatively, the consensus could emerge strengthened if Red Team criticisms were countered effectively. But whatever the outcome, we scientists would have better fulfilled our responsibilities to society, and climate policy discussions would be better informed.

Note Added: In 2014 Koonin ran a one day workshop for the APS (American Physical Society), inviting six leading climate scientists to present their work and engage in an open discussion. The APS committee responsible for reviewing the organization's statement on climate change were the main audience for the discussion. The 570+ page transcript, which is quite informative, is here. See Physics Today coverage, and an annotated version of Koonin's WSJ summary.

Below are some key questions Koonin posed to the panelists in preparation for the workshop. After the workshop he declared: "The idea that 'Climate science is settled' runs through today’s popular and policy discussions. Unfortunately, that claim is misguided."
The estimated equilibrium climate sensitivity to CO2 has remained between 1.5 and 4.5 in the IPCC reports since 1979, except for AR4 where it was given as 2-5.5.

What gives rise to the large uncertainties (factor of three!) in this fundamental parameter of the climate system?

How is the IPCC’s expression of increasing confidence in the detection/attribution/projection of anthropogenic influences consistent with this persistent uncertainty?

Wouldn’t detection of an anthropogenic signal necessarily improve estimates of the response to anthropogenic perturbations?
I seriously doubt that the process by which the 1.5 to 4.5 range is computed is statistically defensible. From the transcript, it appears that IPCC results of this kind are largely the result of "Expert Opinion" rather than a specific computation! It is rather curious that the range has not changed in 30+ years, despite billions of dollars spent on this research. More here.

Saturday, June 03, 2017

Python Programming in one video



Putting this here in hopes I can get my kids to watch it at some point 8-)

Please recommend similar resources in the comments!

Saturday, April 15, 2017

History of Bayesian Neural Networks



This talk gives the history of neural networks in the framework of Bayesian inference. Deep learning is (so far) quite empirical in nature: things work, but we lack a good theoretical framework for understanding why or even how. The Bayesian approach offers some progress in these directions, and also toward quantifying prediction uncertainty.

I was sad to learn from this talk that David MacKay passed away last year, from cancer. I recommended his book Information Theory, Inference, and Learning Algorithms back in 2007.

Yarin Gal's dissertation Uncertainty in Deep Learning, mentioned in the talk.
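A minimal sketch of the Monte Carlo dropout recipe discussed in Gal's thesis: keep dropout switched on at test time and use the spread of many stochastic forward passes as an uncertainty estimate. The untrained random weights below are a placeholder for a real trained network; this shows the mechanics only, not Gal's derivation.

```python
# Minimal sketch of Monte Carlo dropout: keep dropout active at test time and
# treat the spread of many stochastic forward passes as predictive uncertainty.
# The random, untrained weights are a placeholder for a trained network;
# this illustrates the mechanics only, not Gal's exact procedure.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(1, 50))    # stand-in for trained weights
W2 = rng.normal(size=(50, 1))
P_DROP = 0.5
T = 1000                         # number of stochastic forward passes

def forward(x):
    h = np.tanh(x @ W1)
    mask = rng.random(h.shape) > P_DROP          # dropout stays on at test time
    h = h * mask / (1.0 - P_DROP)                # inverted dropout scaling
    return h @ W2

x = np.array([[0.3]])
samples = np.array([forward(x).item() for _ in range(T)])
print(f"predictive mean ~ {samples.mean():.3f}, std ~ {samples.std():.3f}")
```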

I suppose I can thank my Caltech education for a quasi-subconscious understanding of neural nets despite never having worked on them. They were in the air when I was on campus, due to the presence of John Hopfield (he co-founded the Computation and Neural Systems PhD program at Caltech in 1986). See also Hopfield on physics and biology.

Amusingly, I discovered this talk via deep learning: YouTube's recommendation engine, powered by deep neural nets, suggested it to me this Saturday afternoon :-)

Friday, November 25, 2016

Von Neumann: "If only people could keep pace with what they create"

I recently came across this anecdote in Von Neumann, Morgenstern, and the Creation of Game Theory: From Chess to Social Science, 1900-1960.

One night in early 1945, just back from Los Alamos, vN woke in a state of alarm in the middle of the night and told his wife Klari:
"... we are creating ... a monster whose influence is going to change history ... this is only the beginning! The energy source which is now being made available will make scientists the most hated and most wanted citizens in any country.

The world could be conquered, but this nation of puritans will not grab its chance; we will be able to go into space way beyond the moon if only people could keep pace with what they create ..."
He then predicted the future indispensable role of automation, becoming so agitated that he had to be put to sleep by a strong drink and sleeping pills.

In his obituary for John von Neumann, Ulam recalled a conversation with vN about the "ever accelerating progress of technology and changes in the mode of human life, which gives the appearance of approaching some essential singularity in the history of the race beyond which human affairs, as we know them, could not continue." This is the origin of the concept of technological singularity. Perhaps we can even trace it to that night in 1945 :-)

How will humans keep pace? See Super-Intelligent Humans are Coming and Don't Worry, Smart Machines Will Take Us With Them.

Monday, September 05, 2016

World's fastest supercomputer: Sunway TaihuLight (41k nodes, 11M cores)



Jack Dongarra, professor at UT Knoxville, discusses the strengths and weaknesses of the Sunway TaihuLight, currently the world's fastest supercomputer. The fastest US supercomputer, Titan (#3 in the world), is at Oak Ridge National Lab, near UTK. More here and here.

MSU's latest HPC cluster would be ranked ~150 in the world.
Top 500 Supercomputers in the world

Sunway TaihuLight, a system developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC) and installed at the National Supercomputing Center in Wuxi, which is in China's Jiangsu province is the No. 1 system with 93 petaflop/s (Pflop/s) on the Linpack benchmark. The system has 40,960 nodes, each with one SW26010 processor for a combined total of 10,649,600 computing cores. Each SW26010 processor is composed of 4 MPEs, 4 CPEs, (a total of 260 cores), 4 Memory Controllers (MC), and a Network on Chip (NoC) connected to the System Interface (SI). Each of the four MPEs, CPEs, and MCs have access to 8GB of DDR3 memory. The system is based on processors exclusively designed and built in China. The Sunway TaihuLight is almost three times as fast and three times as efficient as Tianhe-2, the system it displaces in the number one spot. The peak power consumption under load (running the HPL benchmark) is at 15.371 MW or 6 Gflops/W. This allows the TaihuLight system to hold one of the top spots on the Green500 in terms of the Performance/Power metric. [ IIRC, these processors are inspired by the old Digital Alpha chips that I used to use... ]

...

The number of systems installed in China has increased dramatically to 167, compared to 109 on the last list. China is now at the No. 1 position as a user of HPC. Additionally, China now is at No. 1 position in the performance share thanks to the big contribution of the systems at No. 1 and No. 2.

The number of systems installed in the USA declines sharply and is now at 165 systems, down from 199 in the previous list. This is the lowest number of systems installed in the U.S. since the list was started 23 years ago.

...

The U.S., the leading consumer of HPC systems since the inception of the TOP500 lists is now second for the first time after China with 165 of the 500 systems. China leads the systems and performance categories now thanks to the No.1 and No. 2 system and a surge in industrial and research installations registered over the last few years. The European share (105 systems compared to 107 last time) has fallen and is now lower than the dominant Asian share of 218 systems, up from 173 in November 2015.

Dominant countries in Asia are China with 167 systems (up from 109) and Japan with 29 systems (down from 37).

In Europe, Germany is the clear leader with 26 systems followed by France with 18 and the UK with 12 systems.

Sunday, August 14, 2016

Half of all jobs (> $60k/y) coding related?

In the future there will be two kinds of jobs. Workers will either

Tell computers what to do    

or

Be told by computers what to do





See this jobs report, based on BLS statistics and analysis of 26 million job postings scraped from job boards, newspapers, and other online sources in 2015.
Coding jobs represent a large and growing part of the job market. There were nearly 7 million job openings in the U.S. last year for roles requiring coding skills. This represents 20% of the total market for career-track jobs that pay $15 an hour or more. Jobs with coding skills are projected to grow 12% faster than the job market overall in the next 10 years. IT jobs are expected to grow even more rapidly: 25% faster than the overall market.1

Programming skills are in demand across a range of industries. Half of all programming openings are in Finance, Manufacturing, Health Care, and other sectors outside of the technology industry.

...

Jobs valuing coding skills pay $22,000 per year more, on average, than jobs that don’t: $84,000 vs $62,000 per year. The value of these skills is striking and, for students looking to increase their potential income, few other skills open the door to as many well-paying careers. Slicing the data another way, 49% of the jobs in the top wage quartile (>$58,000/yr) value coding skills.

...

We define coding jobs as those in any occupation where knowing how to write computer code makes someone a stronger candidate and where employers commonly request coding skills in job postings. In some cases, coding is a prerequisite skill for the role, such as for Database Administrators. In other cases, such as Graphic Designers, knowing how to code may not be required in all cases, but job seekers with relevant programming skills will typically have an advantage.
See also The Butlerian Jihad and Darwin among the Machines.

Thursday, April 21, 2016

Deep Learning tutorial: Yoshua Bengio, Yann Lecun NIPS 2015



I think these are the slides.

One of the topics I've remarked on before is the absence of local minima in the high-dimensional optimization required to tune these DNNs. In the limit of high dimensionality, a critical point is overwhelmingly likely to be a saddle point (i.e., to have at least one negative Hessian eigenvalue). This means that even though the surface is not strictly convex, the optimization is tractable.
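A quick numerical illustration of the counting argument, using a GOE-like random symmetric matrix as a stand-in for the Hessian at a critical point. This toy ensemble is not the loss surface of any actual DNN; it just shows how rapidly "all eigenvalues positive" becomes unlikely as the dimension grows.

```python
# Toy illustration: for a random symmetric "Hessian" with eigenvalues centered
# on zero, the chance that all eigenvalues are positive (a local minimum rather
# than a saddle) collapses rapidly with dimension. Not a real DNN loss surface.
import numpy as np

rng = np.random.default_rng(0)

def frac_minima(dim, trials=5000):
    count = 0
    for _ in range(trials):
        a = rng.normal(size=(dim, dim))
        h = (a + a.T) / 2.0                      # random symmetric matrix
        if np.all(np.linalg.eigvalsh(h) > 0):
            count += 1
    return count / trials

for d in (1, 2, 3, 5, 8):
    print(f"dim {d}: fraction of all-positive spectra ~ {frac_minima(d):.4f}")
```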

Thursday, April 14, 2016

The story of the Monte Carlo Algorithm



George Dyson is Freeman's son. I believe this talk was given at SciFoo or Foo Camp.

More Ulam (neither he nor von Neumann was really a logician, at least not primarily).

Wikipedia on Monte Carlo Methods. I first learned these in Caltech's Physics 129: Mathematical Methods, which used the textbook by Mathews and Walker. This book was based on lectures taught by Feynman, emphasizing practical techniques developed at Los Alamos during the war. The students in the class were about half undergraduates and half graduate students. For example, Martin Savage was a first year graduate student that year. Martin is now a heavy user of Monte Carlo in lattice gauge theory :-)
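For readers who haven't seen it, here is the textbook toy version of the method, estimating pi by random sampling (just the standard classroom example, nothing specific to the Los Alamos applications in the talk).

```python
# Standard toy Monte Carlo example: estimate pi from the area of a quarter circle.
import numpy as np

rng = np.random.default_rng(0)

def estimate_pi(n):
    xy = rng.random((n, 2))                      # uniform points in the unit square
    inside = (xy ** 2).sum(axis=1) <= 1.0        # inside the quarter circle?
    return 4.0 * inside.mean()

for n in (10**3, 10**5, 10**7):
    est = estimate_pi(n)
    print(f"n = {n:>8}: pi ~ {est:.4f}  (error ~ {abs(est - np.pi):.4f})")
```

The statistical error falls off like 1/sqrt(n) independent of dimension, which is why the method is so useful for the high-dimensional integrals that arise in physics.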

Monday, February 29, 2016

Moore's Law and AI

By now you've probably heard that Moore's Law is really dead. So dead that the semiconductor industry roadmap for keeping it on track has more or less been abandoned: see, e.g., here, here or here. (Reported on this blog 2 years ago!)

What I have not yet seen discussed is how a significantly reduced rate of improvement in hardware capability will affect AI and the arrival of the dreaded (in some quarters) Singularity. The fundamental physical problems associated with ~nm-scale feature sizes could take decades or more to overcome. How much faster are today's cars and airplanes than those of 50 years ago?

Hint to technocratic planners: invest more in physicists, chemists, and materials scientists. The recent explosion in value from technology has been driven by physical science -- software gets way too much credit. From the former we got a factor of a million or more in compute power, data storage, and bandwidth. From the latter, we gained (perhaps) an order of magnitude or two in effectiveness: how much better are current OSes and programming languages than Unix and C, both of which are ~50 years old now?


HLMI = ‘high–level machine intelligence’ = one that can carry out most human professions at least as well as a typical human. (From Minds and Machines.)

Of relevance to this discussion: a big chunk of AlphaGo's performance improvement over other Go programs is due to raw compute power (link via Jess Riedel). The vertical axis is Elo rating. You can see that without multi-GPU compute, AlphaGo has relatively pedestrian strength.


The Elo range 2000-3000 spans amateur to lower professional Go ranks. The compute power certainly affects the depth of the Monte Carlo tree search. The initial training of the value and policy neural networks using KGS Go server positions might still have been possible with slower machines, but would have taken a long time.
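To give a feel for what differences on the Elo scale mean, here is the standard logistic Elo formula (the generic model, not DeepMind's exact evaluation protocol):

```python
# Standard logistic Elo formula: expected score of player A against player B.
def elo_win_prob(rating_a, rating_b):
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

for gap in (100, 250, 500, 1000):
    print(f"Elo gap {gap:>4}: expected win rate ~ {elo_win_prob(gap, 0):.2f}")
```

A 250-point gap already corresponds to winning roughly four games out of five, so the hundreds of points separating the single-machine and distributed versions represent a very lopsided matchup.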

Thursday, February 26, 2015

Second-generation PLINK

"... these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM"  :-)

Interview with author Chris Chang. User Google group.

If one estimates a user population of ~1000, each saving on the order of $1000 in CPU/work time per year, then over the next few years PLINK 1.9 and its successors will deliver millions of dollars in value to the scientific community.
Second-generation PLINK: rising to the challenge of larger and richer datasets

Background
PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1’s primary data format.

Findings
To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, O(√n)-time/constant-space Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0).

Conclusions
The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.
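The "bit-level parallelism" mentioned in the Findings section above refers, in PLINK's actual C/C++ code, to packing genotypes two bits each and counting alleles with word-wide popcount operations. Here is a rough Python sketch of the encoding idea only, not PLINK's implementation or its on-disk format.

```python
# Rough sketch of 2-bit genotype packing with popcount-style allele counting,
# the idea behind PLINK's "bit-level parallelism". Illustration only: PLINK's
# actual C/C++ code and .bed encoding differ in detail.
def pack_genotypes(genos):
    """Pack genotypes (0, 1, or 2 alt alleles per sample) 2 bits each into one int."""
    word = 0
    for i, g in enumerate(genos):
        word |= (g & 0b11) << (2 * i)
    return word

def alt_allele_count(word, n_samples):
    """Count alt alleles across all packed samples using bit masks + popcount."""
    mask = int("01" * n_samples, 2)              # selects the low bit of each 2-bit field
    low_bits = word & mask
    high_bits = (word >> 1) & mask
    # genotype value = 2*high + low, so total = popcount(low) + 2*popcount(high)
    return bin(low_bits).count("1") + 2 * bin(high_bits).count("1")

genos = [0, 1, 2, 2, 1, 0, 2, 1]
word = pack_genotypes(genos)
print(alt_allele_count(word, len(genos)), "vs", sum(genos))   # should match
```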
