Pessimism of the Intellect, Optimism of the Will
Showing posts with label algorithms.
Tuesday, October 30, 2018
Algorithms Rule Us All - VPRO documentary - 2018
ALGOS BAD!!! ... and these instances prove it ... ;-)
Wednesday, October 25, 2017
AlphaGo Zero: algorithms over data and compute
AlphaGo Zero was trained entirely through self-play -- no data from human play was used. The resulting program is the strongest Go player ever by a large margin, and is extremely efficient in its use of compute (running on only 4 TPUs).
Previous versions of AlphaGo initially trained on thousands of human amateur and professional games to learn how to play Go. AlphaGo Zero skips this step and learns to play simply by playing games against itself, starting from completely random play. In doing so, it quickly surpassed human level of play and defeated the previously published champion-defeating version of AlphaGo by 100 games to 0.
Rapid progress from a random initial state is rather amazing, but perhaps something we should get used to given that:
1. Deep Neural Nets are general enough to learn almost any function (i.e., high dimensional mathematical function) no matter how complex
2. The optimization process is (close to) convex
A widely discussed AI mystery: how do human babies manage to learn (language, intuitive physics, theory of mind) so quickly and with relatively limited training data? AlphaGo Zero's impressive results are highly suggestive in this context -- the right algorithms make a huge difference.
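The recipe is simple enough to caricature in a few lines. Below is a minimal self-play reinforcement-learning sketch in that spirit: a toy game and a tabular value table stand in for AlphaGo Zero's MCTS-guided deep network, and none of this is DeepMind's actual code. The only training signal is the outcome of games played against the current policy, starting from random values.
```python
# Minimal self-play reinforcement-learning sketch (toy game + tabular values; the real
# AlphaGo Zero uses MCTS guided by a deep network, trained on TPUs).
import random
from collections import defaultdict

WIN, LOSS = 1.0, -1.0
value = defaultdict(float)          # stands in for the network's value estimates

def legal_moves(n):                 # toy game: take 1-3 stones from a pile; taking the last stone wins
    return [m for m in (1, 2, 3) if m <= n]

def choose_move(n, temperature=0.5):
    # pick the move with the best current value estimate, plus exploration noise
    # (a crude stand-in for MCTS visit counts)
    moves = legal_moves(n)
    scores = [value[(n, m)] + random.gauss(0, temperature) for m in moves]
    return moves[scores.index(max(scores))]

def self_play_game(start=15):
    history, n, player = [], start, 0
    while n > 0:
        m = choose_move(n)
        history.append((player, n, m))
        n -= m
        player ^= 1
    return history, history[-1][0]  # the player who took the last stone wins

def train(games=5000, lr=0.1):
    for _ in range(games):
        history, winner = self_play_game()
        for player, n, m in history:          # game outcomes are the only training signal
            target = WIN if player == winner else LOSS
            value[(n, m)] += lr * (target - value[(n, m)])

train()
# inspect learned move preferences for small piles (optimal play leaves a multiple of 4)
print(sorted(((n, m), round(v, 2)) for (n, m), v in value.items() if n <= 8))
```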
It seems certain that great things are coming in the near future...
Sunday, November 01, 2015
David Donoho interview at HKUST
A long interview with Stanford professor David Donoho (academic web page) at the IAS at HKUST.
Donoho was a pioneer in thinking about sparsity in high dimensional statistical problems. The motivation for this came from real world problems in geosciences (oil exploration), encountered in Texas when he was still a student. Geophysicists were using Compressed Sensing long before the rigorous mathematical basis was established.
The figure below, from the earlier post Compressed Sensing and Genomes, exhibits the Donoho-Tanner phase transition.
For more discussion of our recent paper The human genome as a compressed sensor, see this blog post by my collaborator Carson Chow and another on the machine learning blog Nuit Blanche. One of our main points in the paper is that the phase transition between the regimes of poor and good recovery of the L1 penalized algorithm (LASSO) is readily detectable, and that the scaling behavior of the phase boundary allows theoretical estimates for the necessary amount of data required for good performance at a given sparsity. Apparently, this reasoning has appeared before in the compressed sensing literature, and has been used to optimize hardware designs for sensors. In our case, the sensor is the human genome, and its statistical properties are fixed. Fortunately, we find that genotype matrices are in the same universality class as random matrices, which are good compressed sensors.
The black line in the figure below is the theoretical prediction (Donoho 2006) for the location of the phase boundary. The shading shows results from our simulations. The scale on the right is L2 (norm squared) error in the recovered effects vector compared to the actual effects.
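A minimal sketch of the L1-penalized recovery underlying the figure, using scikit-learn's Lasso; the dimensions, noise level, and penalty below are illustrative, not the values used in the paper. Sweeping the sample size and sparsity of this toy problem traces out exactly the kind of phase boundary shown above.
```python
# Minimal L1 (LASSO) sparse-recovery sketch: random sensing matrix, sparse effects vector.
# Dimensions, noise level, and penalty are illustrative, not the values used in the paper.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, n, s = 2000, 600, 40                  # predictors, samples, non-zero effects

X = rng.standard_normal((n, p))          # stands in for a standardized genotype matrix
beta = np.zeros(p)
beta[rng.choice(p, s, replace=False)] = rng.standard_normal(s)
y = X @ beta + 0.1 * rng.standard_normal(n)

fit = Lasso(alpha=0.05, max_iter=10000).fit(X, y)
err = np.sum((fit.coef_ - beta) ** 2) / np.sum(beta ** 2)   # normalized L2 error, as in the figure
print(f"normalized L2 error: {err:.3f}, nonzeros found: {int(np.sum(fit.coef_ != 0))}")
```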
From Donoho's autobiographical sketch, provided for the Shaw Prize:
During 2004-2010, Jared Tanner and I discovered the precise tradeoff between sparsity and undersampling, showing when L1-minimization can work successfully with random measurements. Our work developed the combinatorial geometry of sparse solutions to underdetermined systems, a beautiful subject involving random high-dimensional polytopes. What my whole life I thought of privately as ‘non-classical’ mathematics was absorbed into classical high-dimensional convex geometry. [Discussed at ~1:38 in the video.]
More about John Tukey, Donoho's undergraduate advisor at Princeton.
Thursday, February 26, 2015
Second-generation PLINK
"... these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM" :-)
Interview with author Chris Chang. User Google group.
If one estimates a user population of ~1000, each saving of order $1000 in CPU/work time per year, then in the next few years PLINK 1.9 and its successors will deliver millions of dollars in value to the scientific community.
Second-generation PLINK: rising to the challenge of larger and richer datasets
Background
PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1’s primary data format.
Findings
To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, O(√n)-time/constant-space Hardy-Weinberg equilibrium and Fisher’s exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0).
Conclusions
The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.
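As an illustration of the bit-level parallelism the abstract refers to (a generic sketch, not PLINK's actual code): genotypes packed two bits each can be counted with word-level mask-and-popcount operations instead of per-sample branches.
```python
# Sketch of 2-bit genotype packing and word-level counting (illustrative, not PLINK internals).
# PLINK 1 .bed encoding packs four genotypes per byte: 00 = hom A1, 01 = missing,
# 10 = het, 11 = hom A2.
import random

def pack(genotypes):                     # genotypes: iterable of 2-bit codes
    words, w, shift = [], 0, 0
    for g in genotypes:
        w |= g << shift
        shift += 2
        if shift == 64:                  # 32 genotypes per 64-bit word
            words.append(w)
            w, shift = 0, 0
    if shift:
        words.append(w)
    return words

def count_hom_a2(words):
    # A field equals 11 exactly when both of its bits are set: AND each word with itself
    # shifted right by one, mask the low bit of every 2-bit field, then popcount.
    # One pass handles 32 genotypes instead of 32 per-sample branches.
    mask = 0x5555555555555555
    return sum(bin((w >> 1) & w & mask).count("1") for w in words)

genos = [random.choice([0b00, 0b01, 0b10, 0b11]) for _ in range(1000)]
assert count_hom_a2(pack(genos)) == sum(g == 0b11 for g in genos)
print(count_hom_a2(pack(genos)))
```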
Friday, February 20, 2015
Coding for kids
I've been trying to get my kids interested in coding. I found this nice game called Lightbot, in which one writes simple programs that control the discrete movements of a bot. It's very intuitive and in just one morning my kids learned quite a bit about the idea of an algorithm and the notion of a subroutine or loop. Some of the problems (e.g., involving nested loops) are challenging.
Browser (Flash?) version.
There are Android and iOS versions as well.
Other coding for kids recommendations?
Monday, May 26, 2014
The Mystery of Go
Nice article on the progress of computer Go.
See also The Laskers and the Go master: "While the baroque rules of Chess could only have been created by humans, the rules of Go are so elegant, organic, and rigorously logical that if intelligent life forms exist elsewhere in the universe, they almost certainly play Go." (Edward Lasker, International Master and US Chess champion)
WIRED: ... Even in the West, Go has long been a favorite game of mathematicians, physicists, and computer scientists. Einstein played Go during his time at Princeton, as did mathematician John Nash. Seminal computer scientist Alan Turing was a Go aficionado, and while working as a World War II code-breaker, he introduced the game to fellow cryptologist I.J. Good. ... Good gave the game a huge boost in Europe with a 1965 article for New Scientist entitled “The Mystery of Go.”
The current handicap accorded a computer against a professional player is 4 stones. In the story below, world chess champion Emanuel Lasker and the author, Edward Lasker, are given 9 stones by a visiting Japanese mathematician (shodan = lowest non-beginner rank, roughly a black belt).
... Good opens the article by suggesting that Go is inherently superior to all other strategy games, an opinion shared by pretty much every Go player I’ve met. “There is chess in the western world, but Go is incomparably more subtle and intellectual,” says South Korean Lee Sedol, perhaps the greatest living Go player and one of a handful who make over seven figures a year in prize money. Subtlety, of course, is subjective. But the fact is that of all the world’s deterministic perfect information games — tic-tac-toe, chess, checkers, Othello, xiangqi, shogi — Go is the only one in which computers don’t stand a chance against humans.
...
After the match, I ask Coulom when a machine will win without a handicap. “I think maybe ten years,” he says. “But I do not like to make predictions.” His caveat is a wise one. In 2007, Deep Blue’s chief engineer, Feng-Hsiung Hsu, said much the same thing. Hsu also favored alpha-beta search over Monte Carlo techniques in Go programs, speculating that the latter “won’t play a significant role in creating a machine that can top the best human players.”
Even with Monte Carlo, another ten years may prove too optimistic. And while programmers are virtually unanimous in saying computers will eventually top the humans, many in the Go community are skeptical. “The question of whether they’ll get there is an open one,” says Will Lockhart, director of the Go documentary The Surrounding Game. “Those who are familiar with just how strong professionals really are, they’re not so sure.”
According to University of Sydney cognitive scientist and complex systems theorist Michael Harré, professional Go players behave in ways that are incredibly hard to predict. In a recent study, Harré analyzed Go players of various strengths, focusing on the predictability of their moves given a specific local configuration of stones. “The result was totally unexpected,” he says. “Moves became steadily more predictable until players reached near-professional level. But at that point, moves started getting less predictable, and we don’t know why. Our best guess is that information from the rest of the board started influencing decision-making in a unique way.” ...
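The Monte Carlo approach discussed above is simple to state: evaluate a candidate move by playing many random games to completion and averaging the outcomes; real Go programs wrap this in tree search (UCT) and heavy domain knowledge. A toy sketch on tic-tac-toe rather than Go (the game and numbers are illustrative only):
```python
# Flat Monte Carlo move selection on tic-tac-toe (a toy stand-in for Go):
# score each legal move by the average result of many uniformly random playouts.
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6), (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal(board):
    return [i for i, x in enumerate(board) if x == "."]

def playout(board, player):
    board = board[:]
    while winner(board) is None and legal(board):
        board[random.choice(legal(board))] = player
        player = "O" if player == "X" else "X"
    return winner(board)                  # None means a draw

def mc_move(board, player, n_playouts=200):
    opponent = "O" if player == "X" else "X"
    best, best_score = None, -1.0
    for move in legal(board):
        trial = board[:]
        trial[move] = player
        results = [playout(trial, opponent) for _ in range(n_playouts)]
        score = sum(1.0 if r == player else 0.5 if r is None else 0.0 for r in results) / n_playouts
        if score > best_score:
            best, best_score = move, score
    return best

# X holds two corners of the 0-4-8 diagonal; a sensible reply for O is the center (index 4)
board = list("X.O" "..." "..X")
print(mc_move(board, "O"))
```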
Mr. Kitabatake one day told us that a Japanese mathematician was going to pass through Berlin on his way to London, and if we wanted to we could play a game with him at the Japanese Club. Dr. Lasker asked him whether he and I could perhaps play a game with him in consultation, and was wondering whether the master – he was a shodan – would give us a handicap.
“Well, of course,” said Mr. Kitabatake.
“How many stones do you think he would give us?" asked Lasker.
“Nine stones, naturally,” replied Mr. Kitabatake.
“Impossible!” said Lasker. “There isn’t a man in the world who can give me nine stones. I have studied the game for a year, and I know I understood what they were doing.”
Mr. Kitabatake only smiled.
“You will see,” he said.
The great day came when we were invited to the Japanese Club and met the master – I remember to this day how impressed I was by his technique – he actually spotted us nine stones, and we consulted on every move, playing very carefully. We were a little disconcerted by the speed with which the master responded to our deepest combinations. He never took more than a fraction of a second. We were beaten so badly at the end that Emanuel Lasker was quite heartbroken. On the way home he told me we must go to Japan and play with the masters there, then we would quickly improve and be able to play them on even terms. I doubted that very strongly, but I agreed that I was going to try to find a way to make the trip.
Labels: ai, algorithms, chess, computing, Go, machine learning
Monday, August 20, 2012
Genomic prediction: no bull
This Atlantic article discusses the application of genomic prediction to cattle breeding. Breeders have recently started switching from pedigree-based methods to statistical models incorporating SNP genotypes. We can now make good predictions of phenotypes like milk and meat production using genetic data alone. Cows are easier than people because, as domesticated animals, they have a smaller effective breeding population and less genetic diversity. Nevertheless, I expect very similar methods to be applied to humans within the next 5-10 years.
The Atlantic: ... the semen that Badger-Bluff Fanny Freddie produces has become such a hot commodity in what one artificial-insemination company calls "today's fast paced cattle semen market." In January of 2009, before he had a single daughter producing milk, the United States Department of Agriculture took a look at his lineage and more than 50,000 markers on his genome and declared him the best bull in the land. And, three years and 346 milk- and data-providing daughters later, it turns out that they were right. "When Freddie [as he is known] had no daughter records our equations predicted from his DNA that he would be the best bull," USDA research geneticist Paul VanRaden emailed me with a detectable hint of pride. "Now he is the best progeny tested bull (as predicted)."
Data-driven predictions are responsible for a massive transformation of America's dairy cows. While other industries are just catching on to this whole "big data" thing, the animal sciences -- and dairy breeding in particular -- have been using large amounts of data since long before VanRaden was calculating the outsized genetic impact of the most sought-after bulls with a pencil and paper in the 1980s.
Dairy breeding is perfect for quantitative analysis. Pedigree records have been assiduously kept; relatively easy artificial insemination has helped centralized genetic information in a small number of key bulls since the 1960s; there are a relatively small and easily measurable number of traits -- milk production, fat in the milk, protein in the milk, longevity, udder quality -- that breeders want to optimize; each cow works for three or four years, which means that farmers invest thousands of dollars into each animal, so it's worth it to get the best semen money can buy. The economics push breeders to use the genetics.
The bull market (heh) can be reduced to one key statistic, lifetime net merit, though there are many nuances that the single number cannot capture. Net merit denotes the likely additive value of a bull's genetics. The number is actually denominated in dollars because it is an estimate of how much a bull's genetic material will likely improve the revenue from a given cow. A very complicated equation weights all of the factors that go into dairy breeding and -- voila -- you come out with this single number. For example, a bull that could help a cow make an extra 1000 pounds of milk over her lifetime only gets an increase of $1 in net merit while a bull who will help that same cow produce a pound more protein will get $3.41 more in net merit. An increase of a single month of predicted productive life yields $35 more.
See below -- theoretical calculations suggest that even outliers with net merit of $700-800 will be eclipsed by specimens with 10x higher merit that can be produced by further selection on existing genetic variation. Similar results apply to humans.
When you add it all up, Badger-Bluff Fanny Freddie has a net merit of $792. No other proven sire ranks above $750 and only seven bulls in the country rank above $700.
... It turned out they were in the perfect spot to look for statistical rules. They had databases of old and new bull semen. They had old and new production data. In essence, it wasn't that difficult to generate rules for transforming genomic data into real-world predictions. Despite -- or because of -- the effectiveness of traditional breeding techniques, molecular biology has been applied in the field for years in different ways. Given that breeders were trying to discover bulls' hidden genetic profiles by evaluating the traits in their offspring that could be measured, it just made sense to start generating direct data about the animals' genomes.
"Each of the bulls on the sire list, we have 50,000 genetic markers. Most of those, we have 700,000," the USDA's VanRaden said. "Every month we get another 12,000 new calves, the DNA readings come in and we send the predictions out. We have a total of 200,000 animals with DNA analysis. That's why it's been so easy. We had such a good phenotype file and we had DNA stored on all these bulls."
... Nowadays breeders can choose between "genomic bulls," which have been evaluated based purely on their genes, and "proven bulls," for which real-world data is available. Discussions among dairy breeders show that many are beginning to mix younger bulls with good-looking genomic data into the breeding regimens. How well has it gone? The first of the bulls who were bred from their genetic profiles alone are receiving their initial production data. So far, it seems as if the genomic estimates were a little high, but more accurate than traditional methods alone.
The unique dataset and success of dairy breeders now has other scientists sniffing around their findings. Leonid Kruglyak, a genomics professor at Princeton, told me that "a lot of the statistical techniques and methodology" that connect phenotype and genotype were developed by animal breeders. In a sense, they are like codebreakers. If you know the rules of encoding, it's not difficult to put information in one end and have it pop out the other as a code. But if you're starting with the code, that's a brutally difficult problem. And it's the one that dairy geneticists have been working on.
(Kruglyak was a graduate student in biophysics at Berkeley under Bill Bialek when I was there.)
... John Cole, yet another USDA animal improvement scientist, generated an estimate of the perfect bull by choosing the optimal observed genetic sequences and hypothetically combining them. He found that the optimal bull would have a net merit value of $7,515, which absolutely blows any current bull out of the water. In other words, we're nowhere near creating the perfect milk machine.
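Mechanically, net merit is just a weighted sum of a bull's predicted trait contributions. A toy version using only the three weights quoted above (the real USDA index combines many more traits, and the weights are updated over time):
```python
# Toy lifetime net merit: a weighted sum of predicted trait contributions, using only the
# three weights quoted in the article. The real USDA index combines many more traits.
WEIGHTS = {"milk_lbs": 1.0 / 1000, "protein_lbs": 3.41, "productive_life_months": 35.0}

def net_merit(predicted_traits):
    """predicted_traits: a bull's predicted additive contribution for each trait."""
    return sum(WEIGHTS[trait] * value for trait, value in predicted_traits.items())

# hypothetical bull: +1500 lbs milk, +40 lbs protein, +4 months of productive life
print(round(net_merit({"milk_lbs": 1500, "protein_lbs": 40, "productive_life_months": 4}), 2))
```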
Here's a recent paper on the big data aspects of genomic selection applied to animal breeding.
Saturday, May 26, 2012
Algo vs Algo and the Facebook IPO
Nanex research on algo vs algo activity during Facebook's IPO (via zerohedge).
Do we need to impose a small random delay on all transactions, or possibly a small transaction tax?
Confused? See here :-)
Did a Stuck Quote Prevent a Facebook Opening Day Pop?
On 18-May-2012, within seconds of the opening in Facebook, we noticed an exceptional occurrence: Nasdaq quotes had higher bid prices than ask prices. This is called a cross market and occurs frequently between two different exchanges, but practically never on the same exchange (the buyer just needs to match up with the seller, which is fundamentally what an exchange does).
When Nasdaq's ask price dropped below its bid price, the quote was marked non-firm -- indicating something is wrong with it, and for software to exclude it from any best bid/offer calculations. However, in several of the earlier occurrences the first non-firm crossed quote was immediately preceded by a regular or firm crossed quote!
During the immediate period of time when the Nasdaq quote went from normal to non-firm, you can see an immediate evaporation in quotes from other exchanges, often accompanied by a flurry of trades. We first noticed this behavior while making a video of quotes during the opening period in Facebook trading.
The reaction to the crossed quote often caused the spread to widen from 1 cent to 70 cents or more in 1/10th of a second! It is important to realize that algorithms (algos) which are based on speed use existing prices (orders) from other exchanges as their primary (if not sole) input. So it is quite conceivable, if not highly likely, that these unusual and rare inverted quotes coming from Nasdaq influenced algorithms running on other exchanges.
It is now more than a curiosity that the market was unable to penetrate Nasdaq's crossed $42.99 bid, which appeared within 30 seconds of the open and remained stuck until 13:50. Could this have prevented the often expected pop (increase) in an IPO's stock price for Facebook?
This also brings another example of the dangers of placing a blind, mindless emphasis on speed above everything else. Algos reacting to prices created by other algos reacting to prices created by still other algos. Somewhere along the way, it has to start with a price based on economic reality. But the algos at the bottom of the intelligence chain can't waste precious milliseconds for that. They are built to simply react faster than the other guys' algos. Why? Because the other guy figured out how to go faster! We don't need this in our markets. We need more intelligence. The economic and psychological costs stemming from Facebook not getting the traditional opening day pop are impossible to measure. That it may have been caused by algos reacting to a stuck quote from one exchange is not, sadly, surprising anymore.
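A crossed quote is simply a bid at or above the ask on the same book; the scan Nanex describes amounts to flagging such quotes and checking whether they were marked firm. A minimal sketch with made-up quote records:
```python
# Minimal crossed-quote scan over a stream of (time, bid, ask, firm) quote records.
# Field names and sample data are made up for illustration.
from collections import namedtuple

Quote = namedtuple("Quote", "time bid ask firm")

def crossed_quotes(quotes):
    """Yield quotes whose bid meets or exceeds the ask (a locked or crossed market)."""
    for q in quotes:
        if q.bid >= q.ask:
            yield q

stream = [
    Quote("11:30:01.10", 42.05, 42.06, True),
    Quote("11:30:01.20", 42.99, 42.66, True),    # crossed AND still marked firm -- the troubling case
    Quote("11:30:01.30", 42.99, 42.66, False),   # same cross, now marked non-firm
]

for q in crossed_quotes(stream):
    print(q.time, "crossed by", round(q.bid - q.ask, 2), "firm" if q.firm else "non-firm")
```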
Tuesday, May 24, 2011
MacKenzie on high frequency trading
Donald MacKenzie writes on high frequency trading in the London Review of Books.
Earlier posts: MacKenzie on the credit crisis , two book reviews
LRB: ... No one in the markets contests the legitimacy of electronic market making or statistical arbitrage. Far more controversial are algorithms that effectively prey on other algorithms. Some algorithms, for example, can detect the electronic signature of a big VWAP, a process called ‘algo-sniffing’. This can earn its owner substantial sums: if the VWAP is programmed to buy a particular corporation’s shares, the algo-sniffing program will buy those shares faster than the VWAP, then sell them to it at a profit. Algo-sniffing often makes users of VWAPs and other execution algorithms furious: they condemn it as unfair, and there is a growing business in adding ‘anti-gaming’ features to execution algorithms to make it harder to detect and exploit them. However, a New York broker I spoke to last October defended algo-sniffing:
"I don’t look at it as in any way evil … I don’t think the guy who’s trying to hide the supply-demand imbalance [by using an execution algorithm] is any better a human being than the person trying to discover the true supply-demand. I don’t know why … someone who runs an algo-sniffing strategy is bad … he’s trying to discover the guy who has a million shares [to sell] and the price then should readjust to the fact that there’s a million shares to buy."
Whatever view one takes on its ethics, algo-sniffing is indisputably legal. More dubious in that respect is a set of strategies that seek deliberately to fool other algorithms. An example is ‘layering’ or ‘spoofing’. A spoofer might, for instance, buy a block of shares and then issue a large number of buy orders for the same shares at prices just fractions below the current market price. Other algorithms and human traders would then see far more orders to buy the shares in question than orders to sell them, and be likely to conclude that their price was going to rise. They might then buy the shares themselves, causing the price to rise. When it did so, the spoofer would cancel its buy orders and sell the shares it held at a profit. It’s very hard to determine just how much of this kind of thing goes on, but it certainly happens. In October 2008, for example, the London Stock Exchange imposed a £35,000 penalty on a firm (its name has not been disclosed) for spoofing.
... As Steve Wunsch, one of the pioneers of electronic exchanges, put it in another TABB forum discussion, US share trading ‘is now so complex as a system that no one can predict what will happen when something new is added to it, no matter how much vetting is done.’ If Wunsch is correct, there is a risk that attempts to make the system safer – by trying to find mechanisms that would prevent a repetition of last May’s events, for example – may have unforeseen and unintended consequences.
Systems that are both tightly coupled and highly complex, Perrow argues in Normal Accidents (1984), are inherently dangerous. Crudely put, high complexity in a system means that if something goes wrong it takes time to work out what has happened and to act appropriately. Tight coupling means that one doesn’t have that time. Moreover, he suggests, a tightly coupled system needs centralised management, but a highly complex system can’t be managed effectively in a centralised way because we simply don’t understand it well enough ...
This is the funniest thing I've seen in a long time, if you can understand what the chimp is saying :-)
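For reference, the VWAP in the excerpt is the volume-weighted average price, and a VWAP execution algorithm slices a large parent order across the day roughly in proportion to expected volume, which is what gives it the detectable "electronic signature." A minimal sketch with a made-up volume profile:
```python
# Minimal VWAP benchmark plus naive VWAP-style order slicing (made-up volume profile).
def vwap(trades):
    """trades: list of (price, size); returns the volume-weighted average price."""
    notional = sum(price * size for price, size in trades)
    volume = sum(size for _, size in trades)
    return notional / volume

def slice_order(total_shares, volume_profile):
    """Split a parent order in proportion to an expected intraday volume profile."""
    total_volume = sum(volume_profile)
    return [round(total_shares * v / total_volume) for v in volume_profile]

trades = [(38.05, 1000), (38.10, 3000), (38.00, 2000)]
print(round(vwap(trades), 4))                    # the benchmark the execution is judged against

# buy 50,000 shares over 7 hourly buckets with a U-shaped expected volume profile
print(slice_order(50_000, [18, 10, 8, 7, 9, 14, 34]))
```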
Saturday, October 24, 2009
Eric Baum: What is Thought?
Last week we had AI researcher and former physicist Eric Baum here as our colloquium speaker. (See here for an 11 minute video of a similar, but shorter, talk he gave at the 2008 Singularity Summit.)
Here's what I wrote about Baum and his book What is Thought back in 2008:
My favorite book on AI is Eric Baum's What is Thought? (Google books version). Baum (former theoretical physicist retooled as computer scientist) notes that evolution has compressed a huge amount of information in the structure of our brains (and genes), a process that AI would have to somehow replicate. A very crude estimate of the amount of computational power used by nature in this process leads to a pessimistic prognosis for AI even if one is willing to extrapolate Moore's Law well into the future. Most naive analyses of AI and computational power only ask what is required to simulate a human brain, but do not ask what is required to evolve one. I would guess that our best hope is to cheat by using what nature has already given us -- emulating the human brain as much as possible.
This perspective seems quite obvious now that I have kids -- their rate of learning about the world is clearly enhanced by pre-evolved capabilities. They're not generalized learning engines -- they're optimized to do things like recognize patterns (e.g., faces), use specific concepts (e.g., integers), communicate using language, etc.
What is Thought?
In What Is Thought? Eric Baum proposes a computational explanation of thought. Just as Erwin Schrodinger in his classic 1944 work What Is Life? argued ten years before the discovery of DNA that life must be explainable at a fundamental level by physics and chemistry, Baum contends that the present-day inability of computer science to explain thought and meaning is no reason to doubt there can be such an explanation. Baum argues that the complexity of mind is the outcome of evolution, which has built thought processes that act unlike the standard algorithms of computer science and that to understand the mind we need to understand these thought processes and the evolutionary process that produced them in computational terms.
Baum proposes that underlying mind is a complex but compact program that exploits the underlying structure of the world. He argues further that the mind is essentially programmed by DNA. We learn more rapidly than computer scientists have so far been able to explain because the DNA code has programmed the mind to deal only with meaningful possibilities. Thus the mind understands by exploiting semantics, or meaning, for the purposes of computation; constraints are built in so that although there are myriad possibilities, only a few make sense. Evolution discovered corresponding subroutines or shortcuts to speed up its processes and to construct creatures whose survival depends on making the right choice quickly. Baum argues that the structure and nature of thought, meaning, sensation, and consciousness therefore arise naturally from the evolution of programs that exploit the compact structure of the world.
When I first looked at What is Thought? I was under the impression that Baum's meaning, underlying structure and compact program were defined in terms of algorithmic complexity. However, it's more complicated than that. While Nature is governed by an algorithmically simple program (the Standard Model Hamiltonian can, after all, be written down on a single sheet of paper) a useful evolved program has to run in a reasonable amount of time, under resource (memory, CPU) constraints that Nature itself does not face. Compressible does not imply tractable -- all of physics might reduce to a compact Theory of Everything, but it probably won't be very useful for designing jet airplanes.
Useful programs have to be efficient in many ways -- algorithmically and computationally. So it's not a tautology that Nature is very compressible, therefore there must exist compact (useful) programs that exploit this compressibility. It's important that there are many intermediate levels of compression (i.e., description -- as in quarks vs molecules vs bulk solids vs people), and computationally effective programs to deal with those levels. I'm not sure what measure is used in computer science to encompass both algorithmic and computational complexity. Baum discusses something called minimum description length, but it's not clear to me exactly how the requirement of effective means of computation is formalized. In the language of physicists, Baum's compact (useful) programs are like effective field theories incorporating the relevant degrees of freedom for a certain problem -- they are not only a compressed model of the phenomena, but also allow simple computations.
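The minimum description length idea mentioned above can be illustrated with a crude two-part code: total cost = bits to state the model plus bits to encode the data given the model. The bit-counting below is only a toy version of the idea (a fixed cost per parameter, Gaussian code length for residuals), not a rigorous MDL criterion.
```python
# Toy two-part description-length comparison for polynomial fits of increasing degree.
# Bit counts are crude stand-ins: a fixed cost per parameter plus a Gaussian code length
# for the residuals (up to a discretization constant).
import math
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 200)
y = 1.0 + 2.0 * x - 1.5 * x**2 + 0.05 * rng.standard_normal(x.size)   # truly quadratic + noise

def description_length(degree, bits_per_param=32):
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    sigma = max(resid.std(), 1e-6)
    model_bits = bits_per_param * (degree + 1)
    data_bits = x.size * 0.5 * math.log2(2 * math.pi * math.e * sigma**2)
    return model_bits + data_bits

for degree in range(6):
    print(degree, round(description_length(degree), 1))
# the total is typically minimized near the true degree (2): extra parameters only pay off
# while they keep shrinking the residual code length
```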
Evolution has, using a tremendous amount of computational power, found these programs, and our best hope for AI is to exploit their existence to speed our progress. If Baum is correct, the future may be machine learning guided by human Mechanical Turk workers.
Baum has recently relocated to Berkeley to pursue a startup based on his ideas. (Ah, the excitement! I did the same in 2000 ...) His first project is to develop a world class Go program (no lack of ambition :-), with more practical applications down the road. Best of Luck!
Saturday, August 29, 2009
The cost of liquidity?
Speaking of high frequency trading, here's Zero Hedge on the Renaissance vs. Volfbeyn and Belpolsky matter. Related posts here and here.
130. While employed by Renaissance, Dr. Volfbeyn's superiors repeatedly asked him to assist Renaissance in conducting securities transactions that Dr. Volfbeyn believed to be illegal.
The illegality of these activities touched upon three main topical areas:
ITG-POSIT (The Dark Pool angle)
131. In particular, Dr. Volfbeyn was instructed to devise a strategy to defraud investors trading through the Portfolio System for Institutional Trading ("POSIT"). POSIT is an electronic trading system operated by Investment Technology Group ("ITG"). POSIT collects buy and sell orders from large traders and attempts to match them.
132. On information and belief, POSIT is completely confidential. It does not reveal information about orders to anyone. For its customers, this confidentiality is an essential aspect of the system.
133. Renaissance asked Dr. Volfbeyn to create a computer algorithm to reveal information that POSIT intended to keep confidential [REDACTED]
134. Renaissance intended to, and did, use this trading strategy [the POSIT strategy] to profit [REDACTED]
135. Dr. Volfbeyn believed that [REDACTED] [the POSIT strategy] violated securities laws. He expressed his opinion to his superiors at Renaissance and refused to build the computer algorithm as they requested.
Limit Order Strategy [Stealing Liquidity]
139. Renaissance asked Dr. Volfbeyn to develop a computer algorithm [REDACTED] [the "limit order strategy"]
140. A limit order is an instruction to trade at the best price available, but only if the price is no worse than a "limit price" specified by the trader. Standing limit orders are placed in a file, called a limit-order book. Limit-order books on the New York Stock Exchange and NASDAQ are available to be viewed by anyone.
141. By [REDACTED], Renaissance intended to profit illegally.
142. Dr. Volfbeyn refused to participate in such activities. He explained that his refusal was based on his belief that the proposed transactions violated securities laws [2nd time RenTec allegedly used an illegal strategy]
143. Senior Renaissance personnel, including Executive Vice President Peter Brown and Vice President Mark Silber, attempted to persuade Dr. Volfbeyn to engage in the [REDACTED] limit order strategy, despite his objections. Mark Silber is the compliance officer for Renaissance, responsible for implementing systems to ensure that Renaissance does not violate the securities laws, and for protecting employees who complain about potentially illegal conduct.
144. On information and belief, Renaissance did not implement the [REDACTED] limit order strategy prior to Dr. Volfbeyn's termination. [What about after?]
Swap Transactions [The Naked Short Scam]
145. At all times relevant to this action, Rule 3350 of the NASD prohibited NASD members, with certain exceptions, from effecting short sales in any Nasdaq security at or below the current national best (inside) bid when the current national best (inside) bid is below the preceding national best (inside) bid in the security.
146. At all times relevant to this action, Rule 10a-1 under the Securities Exchange Act of 1934 provided that, subject to certain exceptions, an exchange-listed security could only be sold short at a price above the immediately preceding reported price or at the last sale price if it is higher than the last different reported price.
147. During the period when Dr. Volfbeyn and Dr. Belopolsky were employed at Renaissance, plaintiff engaged in a massive scam [REDACTED] [the "swap transaction strategy"]
148. [REDACTED]
149. [REDACTED]
150. Renaissance conducted [REDACTED] in violation of Rule 3350 and Rule 10a-1. Renaissance also intentionally [REDACTED] in violation of SEC and NASD rules. [REDACTED] Renaissance profited from the strategy [REDACTED].
151. Researchers at Renaissance expressed their concern to Executive Vice President Peter Brown and other officials of Renaissance about the legality of these swap transactions, including concerns that the transactions violated the tax laws and securities laws. Renaissance failed to halt the transactions. On information and belief, the swap transactions are continuing and generate substantial profits for Renaissance.
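The limit-order book described in paragraph 140 is easy to sketch: standing orders rest at their limit prices with price-time priority, and a trade occurs whenever the best bid meets the best ask. A minimal toy implementation (illustrative only, with no relation to any actual exchange or strategy):
```python
# Minimal limit-order book: standing limit orders with price-time priority, a visible
# best bid/offer, and a trade whenever the best bid meets the best ask. Illustrative only.
import heapq

class LimitOrderBook:
    def __init__(self):
        self.bids = []   # max-heap via negated price: (-price, seq, size)
        self.asks = []   # min-heap: (price, seq, size)
        self.seq = 0

    def add(self, side, price, size):
        self.seq += 1
        if side == "buy":
            heapq.heappush(self.bids, (-price, self.seq, size))
        else:
            heapq.heappush(self.asks, (price, self.seq, size))
        self._match()

    def _match(self):
        # trade while the best bid meets or exceeds the best ask
        while self.bids and self.asks and -self.bids[0][0] >= self.asks[0][0]:
            neg_bid, bseq, bid_size = heapq.heappop(self.bids)
            ask, aseq, ask_size = heapq.heappop(self.asks)
            traded = min(bid_size, ask_size)
            print(f"trade {traded} @ {ask}")                # fill at the resting ask (a simplification)
            if bid_size > traded:
                heapq.heappush(self.bids, (neg_bid, bseq, bid_size - traded))
            if ask_size > traded:
                heapq.heappush(self.asks, (ask, aseq, ask_size - traded))

    def best(self):
        bid = -self.bids[0][0] if self.bids else None
        ask = self.asks[0][0] if self.asks else None
        return bid, ask

book = LimitOrderBook()
book.add("buy", 10.00, 300)
book.add("sell", 10.05, 200)
print(book.best())             # (10.0, 10.05): the publicly visible inside quote
book.add("buy", 10.05, 150)    # crosses the spread and trades against the standing ask
print(book.best())
```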
Sunday, August 23, 2009
High speed trading: an $8 billion per year tax?
Because there is money to be made at this game (estimated in the article to be in excess of $8 billion per year), it is attracting both financial and human capital. The human capital might be doing more productive things elsewhere. Let's suppose that the social utility provided by the market is efficient resource allocation via price signals. I see no benefit to society from making these price signals "efficient" over time scales as small as a tenth of a second (i.e., much smaller than the time scales over which physical resources can be allocated, or humans can make executive decisions). Imposing a 1-100 second random delay on any order placed at an exchange would not (as far as I can see) have any negative impact on the actual economy, but would eliminate an expensive arms race that transfers money from small or long-term investors to brainy hedge funds.
These activities are particularly appealing to banks and hedge funds in the current environment because a trader can book a real profit or loss at the end of each day (or even every few seconds!) -- very different from the illiquid positions that led to the credit crisis. In the long run, even complex derivatives like CDO and CDS contracts have the potential of providing some social good -- the ability to diversify risks, etc. I see no comparable redeeming value in high speed trading.
A couple years ago I listened to a talk by a former high energy physicist about how his firm was using FPGAs to execute their trading algorithms in hardware. Sound like a good use of brainpower and resources to you?
Regarding the Aleynikov case reported on below, 32 MB of proprietary source code is not a small amount of code. It could be the core of Goldman's algorithms. I would think a court-appointed expert could easily determine the value of the code Aleynikov downloaded. If he merely grabbed it by mistake while downloading some open source directories (his claim), it would be unlikely to contain the key algorithms.
Related posts here (Aleynikov vs. Goldman) and here (Volfbeyn and Belopolsky vs. Renaissance).
Clarification in response to comments: I guess I should make my claims more precise. It seems to me that imposing a random delay of average length T would
1) remove the possibility of gaming the system on timescales much less than T (thereby sending lots of smart people back to productive activities)
2) not affect market liquidity on timescales much larger than T. I claim that for T of order a minute (or even longer) there is no social value from liquidity on much smaller time scales.
I am not highly confident of my statements because market making is a dynamical system, with interacting agents, etc. Doyne Farmer and Santa Fe Institute researchers did some modeling for NASDAQ in anticipation of its 2001 decimalization (see here); the details are complicated. Yes, the price of decimalization and liquidity on very short time scales is the current arms race, but I have yet to see an argument for why those things are good for society.
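A back-of-the-envelope check of points (1) and (2): give every order an independent random delay with mean of order T and see how often the arrival order of two orders is scrambled as a function of the gap between them. This toy calculation says nothing about liquidity or price formation; it only shows that the randomization destroys ordering information at timescales well below T and barely touches it well above T.
```python
# Toy check: a random delay with mean ~T scrambles message ordering at gaps << T but
# barely at gaps >> T. Says nothing about liquidity or price formation.
import random

def reversed_fraction(gap, T, trials=20000):
    """Fraction of order pairs submitted `gap` seconds apart whose arrival order flips."""
    flipped = 0
    for _ in range(trials):
        d_first = random.uniform(0, 2 * T)        # delay applied to the earlier order
        d_second = random.uniform(0, 2 * T)       # delay applied to the later order
        if gap + d_second < d_first:              # the later order arrives first
            flipped += 1
    return flipped / trials

T = 30.0   # mean delay, seconds
for gap in (0.001, 0.1, 1.0, 30.0, 300.0):
    print(f"gap {gap:>7}s -> reversed {reversed_fraction(gap, T):.3f}")
```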
For a defense of high frequency trading, see discussion at Scott Locklin's blog here and here.
NYTimes: ... the charges, along with civil cases in Chicago and New York involving other Wall Street firms, offer a glimpse into the turbulent world of ultrafast computerized stock trading.
Little understood outside the securities industry, the business has suddenly become one of the most competitive and controversial on Wall Street. At its heart are computer programs that take years to develop and are treated as closely guarded secrets.
Mr. Aleynikov, who is free on $750,000 bond, is suspected of having taken pieces of Goldman software that enables the buying and selling of shares in milliseconds. Banks and hedge funds use such programs to profit from tiny price discrepancies among markets and in some instances leap in front of bigger orders.
Defenders of the programs say they make trading more efficient. Critics say they are little more than a tax on long-term investors and can even worsen market swings.
But no one disputes that high-frequency trading is highly profitable. The Tabb Group, a financial markets research firm, estimates that the programs will make $8 billion this year for Wall Street firms. Bernard S. Donefer, a distinguished lecturer at Baruch College and the former head of markets systems at Fidelity Investments, says profits are even higher.
“It is certainly growing,” said Larry Tabb, founder of the Tabb Group. “There’s more talent around, and the technology is getting cheaper.”
The profits have led to a gold rush, with hedge funds and investment banks dangling million-dollar salaries at software engineers. In one lawsuit, the Citadel Investment Group, a $12 billion hedge fund, revealed that it had paid tens of millions to two top programmers in the last seven years.
“A geek who writes code — those guys are now the valuable guys,” Mr. Donefer said.
The spate of lawsuits reflects the highly competitive nature of ultrafast trading, which is evolving quickly, largely because of broader changes in stock trading, securities industry experts say.
Until the late 1990s, big investors bought and sold large blocks of shares through securities firms like Morgan Stanley. But in the last decade, the profits from making big trades have vanished, so investment banks have become reluctant to take such risks.
Today, big investors divide large orders into smaller trades and parcel them to many exchanges, where traders compete to make a penny or two a share on each order. Ultrafast trading is an outgrowth of that strategy.
As Mr. Aleynikov and other programmers have discovered, investment banks do not take kindly to their leaving, especially if the banks believe that the programmers are taking code — the engine that drives trading — on their way out.
Mr. Aleynikov immigrated to the United States from Russia in 1991. In 1998, he joined IDT, a telecommunications company, where he wrote software to route calls and data more efficiently. In 2007, Goldman hired him as a vice president, paying him $400,000 a year, according to the federal complaint against him.
He lived in the central New Jersey suburbs with his wife and three young daughters. This year, the family moved to a $1.14 million mansion in North Caldwell, best known as Tony Soprano’s hometown. ...
This spring, Mr. Aleynikov quit Goldman to join Teza Technologies, a new trading firm, tripling his salary to about $1.2 million, according to the complaint. He left Goldman on June 5. In the days before he left, he transferred code to a server in Germany that offers free data hosting.
At Mr. Aleynikov’s bail hearing, Joseph Facciponti, the assistant United States attorney prosecuting the case, said that Goldman discovered the transfer in late June. On July 1, the company told the government about the suspected theft. Two days later, agents arrested Mr. Aleynikov at Newark.
After his arrest, Mr. Aleynikov was taken for interrogation to F.B.I. offices in Manhattan. Mr. Aleynikov waived his rights against self-incrimination, and agreed to allow agents to search his house.
He said that he had inadvertently downloaded a portion of Goldman’s proprietary code while trying to take files of open source software — programs that are not proprietary and can be used freely by anyone. He said he had not used the Goldman code at his new job or distributed it to anyone else, and the criminal complaint offers no evidence that he has.
Why he downloaded the open source software from Goldman, rather than getting it elsewhere, and how he could at the same time have inadvertently downloaded some of the firm’s most confidential software, is not yet clear.
At Mr. Aleynikov’s bail hearing, Mr. Facciponti said that simply by sending the code to the German server, he had badly damaged Goldman.
“The bank itself stands to lose its entire investment in creating this software to begin with, which is millions upon millions of dollars,” Mr. Facciponti said.
Sabrina Shroff, a public defender who represents Mr. Aleynikov, responded that he had transferred less than 32 megabytes of Goldman proprietary code, a small fraction of the overall program, which is at least 1,224 megabytes. Kevin N. Fox, the magistrate judge, ordered Mr. Aleynikov released on bond.
Monday, July 06, 2009
More algorithm wars
Some time ago I posted about two MIT-trained former physicists who were sued by Renaissance for theft of trade secrets related to algorithmic trading and market making. Reportedly, Belopolsky and Volfbeyn won their court case and are now printing money at a well-known hedge fund. The Bloomberg article below is about a former Goldman employee who may have made off with code used in prop trading and market making.
The story is also covered in the WSJ (whose reporters and editors don't know the difference between "code" and "codes" -- as in software vs cryptographic keys), where it is revealed that Aleynikov was paid $400k per year at Goldman and left to join a fund in Chicago which offered him three times as much.
Goldman Trading-Code Investment Put at Risk by Theft
2009-07-06 23:18:39.529 GMT
July 6 (Bloomberg) -- Goldman Sachs Group Inc. may lose its investment in a proprietary trading code and millions of dollars from increased competition if software allegedly stolen by an ex-employee gets into the wrong hands, a prosecutor said.
Sergey Aleynikov, an ex-Goldman Sachs computer programmer, was arrested July 3 after arriving at Liberty International Airport in Newark, New Jersey, U.S. officials said. Aleynikov, 39, who has dual American and Russian citizenship, is charged in a criminal complaint with stealing the trading software. At a court appearance July 4 in Manhattan, Assistant U.S. Attorney Joseph Facciponti told a federal judge that Aleynikov’s alleged theft poses a risk to U.S. markets. Aleynikov transferred the code, which is worth millions of dollars, to a computer server in Germany, and others may have had access to it, Facciponti said, adding that New York-based Goldman Sachs may be harmed if the software is disseminated. ...
The prosecutor added, “Once it is out there, anybody will be able to use this, and their market share will be adversely affected.” The proprietary code lets the firm do “sophisticated, high-speed and high-volume trades on various stock and commodities markets,” prosecutors said in court papers. The trades generate “many millions of dollars” each year.
... “Someone stealing that code is basically stealing the way that Goldman Sachs makes money in the equity marketplace,” said Larry Tabb, founder of TABB Group, a financial-market research and advisory firm. “The more sophisticated market makers -- and Goldman is one of them -- spend significant amounts of money developing software that’s extremely fast and can analyze different execution strategies so they can be the first one to make a decision.”
Aleynikov studied applied mathematics at the Moscow Institute of Transportation Engineering before transferring to Rutgers University, where he received a bachelor’s degree in computer science in 1993 and a master’s of science degree, specializing in medical image processing and neural networks, in 1996, according to his profile on the social-networking site LinkedIn.
Saturday, February 07, 2009
Ant algorithms
A review of E.O. Wilson's latest in The NY Review of Books. Ants seem to get a lot done based on a few simple capabilities: they can lay down odors, detect and differentiate those odors, and count.
In Surely You're Joking, Feynman recounts some great experiments he did on ants in his Princeton dorm room. See, e.g., here. My wife is totally uninterested when I do these types of things at home, but perhaps my kids will like it when they get a bit older :-)
...However, ants clearly are fundamentally different from us. A whimsical example concerns the work of ant morticians, which recognize ant corpses purely on the basis of the presence of a product of decomposition called oleic acid. When researchers daub live ants with the acid, they are promptly carried off to the ant cemetery by the undertakers, despite the fact that they are alive and kicking. Indeed, unless they clean themselves very thoroughly they are repeatedly dragged to the mortuary, despite showing every other sign of life.
The means that ants use to find their way in the world are fascinating. It has recently been found that ant explorers count their steps to determine where they are in relation to home. This remarkable ability was discovered by researchers who lengthened the legs of ants by attaching stilts to them. The stilt-walking ants, they observed, became lost on their way home to the nest at a distance proportionate to the length of their stilts.
The principal tools ants use, however, in guiding their movements and actions are potent chemical signals known as pheromones. ...
One can hardly help but admire the intelligence of the ant colony, yet theirs is an intelligence of a very particular kind. "Nothing in the brain of a worker ant represents a blueprint of the social order," Hölldobler and Wilson tell us, and there is no overseer or "brain caste" that carries such a master plan in its head. Instead, the ants have discovered how to create strength from weakness, by pooling their individually limited capacities into a collective decision-making system that bears an uncanny resemblance to our own democratic processes.
[How peculiar is it? Just replace the word "ant" by "cell" or "neuron" or something like that. Unless you don't believe in AI ;-)]
This capacity is perhaps most clearly illustrated when an ant colony finds reason to move. Many ants live in cavities in trees or rocks, and the size, temperature, humidity, and precise form and location of the chamber are all critically important to the success of the superorganism. Individual ants appear to size up the suitability of a new cavity using a rule of thumb called Buffon's needle algorithm. They do this by laying a pheromone trail across the cavity that is unique to that individual ant, then walking about the space for a given period of time. The more often they cross their own trail, the smaller the cavity is.
This yields only a rough measure of the cavity's size, for some ants using it may choose cavities that are too large, and others will choose cavities that are too small. The cavity deemed most suitable by the majority, however, is likely to be the best. The means employed by the ants to "count votes" for and against a new cavity is the essence of elegance and simplicity, for the cavity visited by the most ants has the strongest pheromone trail leading to it, and it is in following this trail that the superorganism makes its collective decision.
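The "Buffon's needle" rule of thumb can be cartooned as follows: mark a trail over part of the cavity, wander over it again, and infer the area from how often the second pass hits the trail, since the expected number of hits scales like trail length times walk length divided by area. A toy grid version (the real argument, and the real ants, use crossings of continuous trails):
```python
# Cartoon of the ants' cavity-size estimate: mark a "pheromone trail" over some cells of the
# cavity, wander over some more cells, and infer the area from how often the second pass hits
# the trail, since E[hits] ~ trail_cells * walk_steps / area.
import random

def estimate_area(side, trail_steps=400, walk_steps=400):
    cells = side * side
    trail = {random.randrange(cells) for _ in range(trail_steps)}            # cells marked on the first pass
    hits = sum(random.randrange(cells) in trail for _ in range(walk_steps))  # "crossings" on the second pass
    if hits == 0:
        return float("inf")        # too few crossings: the cavity reads as "huge"
    return len(trail) * walk_steps / hits

random.seed(2)
for side in (20, 40, 80):
    estimate = sum(estimate_area(side) for _ in range(200)) / 200            # average over repeat visits
    print(f"true area {side * side:>5}, estimated {estimate:.0f}")
```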
Sunday, May 11, 2008
On data mining
Last week we had Jiawei Han of UIUC here to give a talk: Exploring the Power of Links in Information Network Mining. He's the author of a well-known book on data mining.
During our conversation we discussed a number of projects his group has worked on in the past, all of which involve teasing out the structure in large bodies of data. Being a lazy theorist, my attitude in the past about data mining has been as follows: sit and think about the problem, come up with a list of potential signals, analyze data to see which signals actually work. The point being that the good signals would turn out to be a subset (or possibly combination) of the ones you could think of a priori -- i.e., for which there is a plausible, human-comprehensible reason.
In many of the examples we discussed I was able to guess the main signals that turned out to be useful. However, Han impressed on me that, these days, with gigantic corpora of data available, one often encounters very subtle signals that are identified only by algorithm -- that human intuition completely fails to identify. (Gee, why that weird linear combination of those inputs, with alternating signs, even?! :-)
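A toy version of the "weird linear combination" phenomenon: when two inputs share a large nuisance component, their difference (an alternating-sign combination neither input suggests on its own) can carry nearly all the signal, and a penalized linear fit finds it automatically. Synthetic data, illustrative only:
```python
# Synthetic example: the useful signal is x1 - x2 (two noisy inputs sharing a large nuisance
# component); a penalized linear fit finds the alternating-sign combination automatically.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(3)
n = 2000
nuisance = 5.0 * rng.standard_normal(n)          # big shared component, irrelevant to the target
signal = rng.standard_normal(n)
x1 = nuisance + signal + 0.1 * rng.standard_normal(n)
x2 = nuisance - signal + 0.1 * rng.standard_normal(n)
distractors = rng.standard_normal((n, 8))
X = np.column_stack([x1, x2, distractors])
y = 2.0 * signal + 0.2 * rng.standard_normal(n)

print(round(np.corrcoef(x1, y)[0, 1], 2), round(np.corrcoef(x2, y)[0, 1], 2))   # each input alone: weak
fit = LassoCV(cv=5).fit(X, y)
print(np.round(fit.coef_[:2], 2))   # roughly (+1, -1): the "weird" alternating-sign combination
```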
Our conversation made me want to get my hands dirty on some big data mining project. Of course, it's much easier for him -- his group has something like ten graduate students at a time! Interestingly, he identified this ability to tap into large chunks of manpower as an advantage of being in academia as opposed to, e.g., at Microsoft Research. Of course, if you are doing very commercially applicable research you can access even greater resources at a company lab/startup, but for blue-sky academic work that wouldn't be the case.
Friday, February 01, 2008
Dating by algorithm
In this post NYTimes science reporter John Tierney, who writes the blog TierneyLab, does a little experiment on the dating site eHarmony. eHarmony uses a complicated algorithm to match couples based on a lengthy personality questionnaire. Tierney seems surprised that the algorithm doesn't match him up with his wife, even when the search is restricted geographically to his NYC zip code, and even after further tweaking of the survey responses and a consultation with eHarmony's chief scientist.
What Tierney doesn't seem to understand is that, under almost any algorithm for matching (including the "correct" algorithm that would predict happiness in his case), it is highly unlikely that the wife he found is actually optimal :-) Within a 10 mile radius (in NYC) there are dozens (hundreds? thousands?) of better matches he unfortunately never met. It's unromantic but true that chance played a bigger role in his marriage choice than optimality.
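A back-of-the-envelope version of this argument, with numbers invented purely for illustration: if there are N plausible matches within the radius and you only ever meet k of them, then even marrying the best person you met gives you just a k/N chance of having found the overall best.

```python
# Toy simulation: candidates ranked 0 (best) to N-1; you meet a random k of
# them and marry the best one you met. How often is that the global optimum?
import random

N, k, trials = 10_000, 50, 100_000
hits = sum(0 in random.sample(range(N), k) for _ in range(trials))
print(f"simulated: {hits/trials:.4f}   theory (k/N): {k/N:.4f}")
```

With these made-up numbers the chance is half a percent, which is the whole point: chance dominates optimality.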
On a related note, I wonder whether social networking and online dating are gradually increasing the overall quality of marriages. It seems much easier to meet compatible partners than it was in the pre-Internet dark ages.