Although Google sits astride the English (and Indo-European)-language Web like a colossus, its market share in Chinese is less than 25 percent and hasn't grown at all in recent years, even under the lead of celebrity hire Kai-Fu Lee (who recently left Google to run his own venture fund). Search in China is dominated by Baidu, which appears to be Google's only rival in any language (eat your heart out, Quaero and Agence de l'innovation industrielle :-)
Given that China has the largest number of internet users (over 300 million and still growing very fast) and generates more search queries than the US, this is an interesting strategic development.
For a discussion of Baidu by founder Robin Li, watch this talk. The first 40 minutes or so will be old hat to any startup veteran, but the last 20 minutes are worth watching -- Li makes some interesting comments about the future of innovation in China. He notes that the Chinese-language search index triples in size each year, whereas in English and other languages it only grows about 50% per year.
Here's a goofy idea for a sci fi story: a conversation or encounter between two big early AIs, one which evolved out of Google and its Indo-European corpus of data + "Mechanical Turk" / user input, and the other which evolved from Baidu and its primarily Sinic corpus and input. Would they be different in a fundamental way?
Pessimism of the Intellect, Optimism of the Will Favorite posts | Manifold podcast | Twitter: @hsu_steve
Friday, October 02, 2009
Tuesday, May 12, 2009
WolframAlpha: W|A
If you are attuned to the geek universe, you've probably already heard the rumors about WolframAlpha, the "computational knowledge engine" from Stephen Wolfram, the creator of Mathematica. The service is scheduled to launch this weekend, and media hype has already reached a frenzy -- see, e.g., this NYTimes article.
Curious to find out more, I emailed a former theoretical physicist who works (remotely) for Wolfram Research but sits just down the hall from me. Years ago I had lobbied for an office for him because, although he isn't primarily in physics research any more, he's a brilliant guy and was bound to end up doing something interesting. (Wanting to take advantage of the opportunity to work remotely at Wolfram he had researched the best places to live and ended up choosing Eugene.) Little did I know he's been leading the W|A project from here! He stumbled into my office yesterday to answer my questions and tell me that he's knee deep in reviewing code commits and getting ready for the launch :-)
My experience has confirmed over and over again that big brains tend to do interesting things. Whenever I meet someone who is "scary smart" (there are not so many in the world), I keep an eye out for what they do later in life.
Here is what AI researcher Doug Lenat wrote about W|A:
...At its heart is a formal Mathematica representation. Its inference engine is basically a large number of individually hand-engineered scripts for tapping into data which he and his team have spent the last several years gathering and "curating". For example, he has assembled tables of historical financial information about countries' GDP's and about companies' stock prices. In a small number of cases, he also connects via API to third party information, but mostly for realtime data such as a current stock price or current temperature. Rather than connecting to and relying on the current or future Semantic Web, Alpha computes its answers primarily from his own curated data to the extent possible; he sees Alpha as the home for almost all the information it needs, and will use to answer users' queries.
In an important sense, Alpha is a logical extension of Mathematica: it extends the range of types of information for which significant power can be gained by manually, and exhaustively, enumerating a large set of cases: airplane designs, cities, currencies, etc. I.e., Alpha extends what Mathematica has done previously for things like chemical compounds, geometric surfaces, topological configurations, arithmetic series, trigonometric ratios, and equations. In the new cases, as Mathematica did in those abstract math cases, Alpha excels at not just retrieving the stored data but performing various appropriate numeric calculations on the data, and displaying the results in beautiful graphs and easily comprehended tables for the user.
The resulting mosaic covers a large portion of the space of queries that the average person might genuinely want to ask, in the course of their day. The interface is not exactly natural language, but can be treated by the user as though it were -- just as users of browsers can treat them as though they parsed sentences even though they don't. A better way to think of it is a DWIMM ("do what I might mean"), so if you type in something like "gdp France / Germany", it calculates and returns a graph of the relative fraction of France's annual GDP to Germany's GDP, over the last 30 years or so. If you just type in "gdp", it looks up your local host and (in my case) displays the GDP of the USA over the last 30 years, plus various pieces of information about what gross domestic product is, from a mathematical formula perspective but not from a semantic one. It does not have an ontology, so what it knows about, say, GDP, or population, or stock price, is no more nor less than the equations that involve that term.
See also this video of a lecture by Wolfram about W|A.
Friday, October 05, 2007
Two book reviews
I've had both of these on my shelf for some time, but haven't found time to write detailed reviews.
Google's PageRank and Beyond, Langville and Meyer (Princeton University Press).
Written by two math professors, this is the best technical account I could find of search algorithms. The math (mainly linear algebra and a little graph theory) is accessible and introduced in a self-contained way in a separate chapter. The coverage isn't limited to beautiful algorithmic ideas (the primary one being to find the dominant eigenvector of the matrix representing the graph of hyperlinks) -- the discussion includes nitty gritty details like how to treat dangling nodes, how to accelerate computations, etc. There's also a running historical summary of Google's development up to and including the IPO.
If you're wondering why I have this book, it's not just academic curiosity -- the PageRank algorithm in its basic form can be understood pretty quickly from overviews available online. I'm interested in understanding the current state of the art and the possibility of improvements ;-)
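For readers who haven't seen those overviews, here is a minimal sketch of the basic idea: power iteration converging on the dominant eigenvector of the link matrix, with dangling nodes handled by spreading their rank evenly over all pages. This is only an illustration, not the book's code or Google's implementation. The function name `pagerank`, the toy graph, and the damping value of 0.85 (the commonly quoted figure) are all my own assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a small link graph (illustrative sketch only).

    links maps each page to the list of pages it links to.
    damping=0.85 is an assumed, commonly quoted value.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution

    for _ in range(max_iter):
        # Rank sitting on dangling pages (no out-links) is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Every other page splits its rank equally among its out-links.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)

        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:  # stop once the vector has stopped changing
            break
    return rank


# Hypothetical four-page web; page "d" is a dangling node.
toy_web = {"a": ["b", "c", "d"], "b": ["c"], "c": ["a"], "d": []}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The ranks always sum to one, so they can be read as the long-run fraction of time a random surfer spends on each page. Tricks for accelerating the iteration on a web-scale graph are beyond a sketch like this.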
An Engine, Not a Camera, D. MacKenzie (MIT Press)
This is the best history of modern finance and options pricing theory I have yet read. MacKenzie has a sufficient understanding of the theory and of the subtle sociological issues involved (strangely, he is not an economist but a sociologist). Figures like Mandelbrot (the mathematician), Thorp (perhaps the real inventor of Black-Scholes) and Osborne (a physicist) appear along with better known economists like Samuelson, Fama, Miller, Sharpe, Black, Scholes, Merton, etc. The section on Mandelbrot and Levy distributions is especially good, as is the account of LTCM. The title is from Milton Friedman, who (controversially) characterized economic theory as an "engine to analyze the world, not a photographic reproduction of it".
