Showing posts with label prisoner's dilemma. Show all posts

Saturday, July 28, 2012

Iterated Prisoner's Dilemma is an Ultimatum Game

Amazing new results on the iterated prisoner's dilemma (IPD) by Bill Press (Numerical Recipes) and Freeman Dyson. There is something new under the sun. Once again, physicists invade an adjacent field and add value.
Extortion and cooperation in the Prisoner’s Dilemma (June 18, 2012) 
The two-player Iterated Prisoner’s Dilemma game is a model for both sentient and evolutionary behaviors, especially including the emergence of cooperation. It is generally assumed that there exists no simple ultimatum strategy whereby one player can enforce a unilateral claim to an unfair share of rewards. Here, we show that such strategies unexpectedly do exist. In particular, a player X who is witting of these strategies can (i) deterministically set her opponent Y’s score, independently of his strategy or response, or (ii) enforce an extortionate linear relation between her and his scores. Against such a player, an evolutionary player’s best response is to accede to the extortion. Only a player with a theory of mind about his opponent can do better, in which case Iterated Prisoner’s Dilemma is an Ultimatum Game.
Accompanying commentary in PNAS. See these comments by Press and Dyson.
[[Press]] I was originally wondering about a much more modest question that, annoyingly, I couldn’t find already answered in the Prisoner’s Dilemma literature. ... The story now becomes one of symbiosis between computer and human intelligence: The computer could find instances, but not generalize them. I knew that the exact complement of computer intelligence, as yin to yang, is Freeman-Dyson-intelligence. So I showed what I had to Freeman. A day later, he sent me an email with the general result, equations 1-7 in our paper, all worked out. These equations immediately expose all the ZD strategies, including the successful extortionate ones. 
... The successful extortionate strategies have been mathematically present in IPD from the moment that Axelrod defined the game; they just went, seemingly, unnoticed. On a planet in another galaxy, seven million years ago, Axelrod-Prime independently invented the same IPD game. He (it?) was of a species several times more intelligent than Homo sapiens [[i.e., like Dyson!]] and so recognized immediately that, between sentient players, the IPD game is dominated by an obvious extortionate strategy. Hence, for Axelrod-Prime, IPD was just another instantiation of the well-studied Ultimatum Game. He (it?) thus never bothered to publish it.
The history of IPD shows that bounded cognition prevented the dominant strategies from being discovered for over 60 years, despite significant attention from game theorists, computer scientists, economists, evolutionary biologists, etc. Press and Dyson have shown that IPD is effectively an ultimatum game, which is very different from the Tit for Tat stories told by generations of people who worked on IPD (Axelrod, Dawkins, etc., etc.).
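
To make the "extortionate linear relation" from the abstract concrete, here is a minimal numerical sketch. It rests on assumptions that are not stated in this post: the conventional payoffs (T, R, P, S) = (5, 3, 1, 0), memory-one strategies written as four cooperation probabilities, and a particular strategy vector of the extortionate form with extortion factor 3. The function name and the sample opponents are hypothetical. If the relation holds, X's long-run surplus over the mutual-defection payoff should come out as three times Y's, whatever Y plays.

```python
# Conventional payoffs (T, R, P, S) = (5, 3, 1, 0), an assumption.
# States are ordered CC, CD, DC, DD with X's move written first.
PAYOFF_X = (3.0, 0.0, 5.0, 1.0)
PAYOFF_Y = (3.0, 5.0, 0.0, 1.0)

# X's probability of cooperating after each state. This vector has the
# extortionate form with factor chi = 3 under the payoffs assumed above.
EXTORT_3 = (11 / 13, 1 / 2, 7 / 26, 0.0)


def long_run_scores(p, q, steps=5000):
    """Approximate long-run per-round scores (s_X, s_Y) for strategies p and q.

    q is written from Y's own point of view, so Y's entries for the
    states CD and DC are swapped when read in X's state order.
    """
    q_in_x_order = (q[0], q[2], q[1], q[3])
    dist = [0.25, 0.25, 0.25, 0.25]  # arbitrary starting distribution
    for _ in range(steps):  # power iteration toward the stationary distribution
        nxt = [0.0, 0.0, 0.0, 0.0]
        for state, weight in enumerate(dist):
            px, qy = p[state], q_in_x_order[state]
            nxt[0] += weight * px * qy
            nxt[1] += weight * px * (1 - qy)
            nxt[2] += weight * (1 - px) * qy
            nxt[3] += weight * (1 - px) * (1 - qy)
        dist = nxt
    s_x = sum(w * v for w, v in zip(dist, PAYOFF_X))
    s_y = sum(w * v for w, v in zip(dist, PAYOFF_Y))
    return s_x, s_y


if __name__ == "__main__":
    # Hypothetical opponents: always cooperate, coin flip, and a mixed rule.
    for q in [(1.0, 1.0, 1.0, 1.0), (0.5, 0.5, 0.5, 0.5), (0.9, 0.1, 0.6, 0.3)]:
        s_x, s_y = long_run_scores(EXTORT_3, q)
        print(q, s_x - 1.0, 3.0 * (s_y - 1.0))  # the two numbers should agree
```

Power iteration is used only to keep the sketch dependency-free; it assumes the chain settles to a single stationary distribution, which holds for the sample opponents above but not for every possible pairing.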

How can we expect markets populated by apes to find optimal solutions in finite time under realistic conditions, when the underlying parameters of the game (unlike in IPD) are constantly changing? You cannot think of a simpler quasi-realistic game of cooperation and defection than IPD, yet the game was not understood properly until Dyson investigated it! Economists should think deeply about the history of the academic study of IPD, and what it implies about rationality, heuristics, and "efficient" markets (i.e., everyone can be wrong for a long, long time).

For evolutionary biologists: Dyson clearly thinks this result has implications for multilevel selection (group vs. individual):

... Cooperation loses and defection wins. The ZD strategies confirm this conclusion and make it sharper. ... The system evolved to give cooperative tribes an advantage over non-cooperative tribes, using punishment to give cooperation an evolutionary advantage within the tribe. This double selection of tribes and individuals goes way beyond the Prisoners' Dilemma model.
See also What use is game theory? and Plenty of room at the top.

Zero-Determinant Strategies in the Iterated Prisoner’s Dilemma provides a pedagogical summary of the new results.

Sunday, February 04, 2007

Anatol Rapoport, 1911-2007

Economist's View notes the passing of Anatol Rapoport, a mathematician turned game theorist and political theorist. I first discovered Rapoport from his introduction to an edition of von Clausewitz's On War. I found the introduction far more lucid and useful than von Clausewitz's own presentation. You can often detect a first rate thinker from a relatively short piece of work, and this brief introduction piqued my interest in Rapoport many years ago. As noted here,

I read some of his 1984 book "Mathematical Methods in the Social and Behavioral Sciences" and it's a great book. There are not many people who have a strong and original mathematical mind and yet know how to apply it with wisdom, but Rapoport's reach and depth in the book is hugely impressive.

Rapoport was the author of Tit for Tat, the benevolent strategy for prisoner's dilemma that won the earliest tournaments conducted by Axelrod at Michigan.

Globe&Mail: That year also saw publication of political scientist Robert Axelrod's seminal book, The Evolution of Co-operation, which asked a simple, yet age-old, question: If living things evolve through competition, how can co-operation ever emerge? A computer tournament was organized to study the relationship of game theory to evolution -- a variation on the Prisoner's Dilemma. Entries came from the world's top theorists.

Dr. Rapoport entered a program he wrote called Tit-For-Tat, consisting of four lines of code. It was by far the simplest entry, and it won. Betraying the retributive implications of its name, the program opened by co-operating with its opponent. Thereafter, it played exactly as the other side had played in the preceding game. If the other side had defected, Tit-For-Tat also defected for that one game. If the other side had co-operated, it co-operated on the next round.
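
The rule described in the paragraph above is short enough to sketch directly. What follows is an illustration rather than Rapoport's original program: the payoff values (5, 3, 1, 0), the `always_defect` opponent, and the `play` helper are all assumptions introduced for the example.

```python
# Payoffs to (first player, second player). The conventional values
# T=5, R=3, P=1, S=0 are an assumption, not taken from the article.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_moves):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_moves[-1] if opponent_moves else "C"


def always_defect(opponent_moves):
    """Hypothetical baseline opponent that never cooperates."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play a fixed number of rounds; return both move lists and total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each strategy sees only the other's history
        b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return moves_a, moves_b, score_a, score_b


if __name__ == "__main__":
    print(play(tit_for_tat, always_defect, 5))
    print(play(tit_for_tat, tit_for_tat, 5))
```

Against the unconditional defector, Tit-For-Tat is exploited only in the first round and then defects for as long as the opponent does. Against a copy of itself it cooperates throughout.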

"In effect, Tit-For-Tat punished the other player for selfish behaviour and rewarded her for co-operative behaviour -- but the punishment lasted only as long as the selfish behaviour lasted," observed Metta Spencer, editor of Peace Magazine, on the occasion of Dr. Rapoport's 90th birthday. "This proved to be an exceptionally effective sanction, quickly showing the other side the advantages of co-operating. . . . It also set moral philosophers to proposing this as a workable principle to use in real life interactions."
