Extortion and cooperation in the Prisoner’s Dilemma (June 18, 2012)
The two-player Iterated Prisoner’s Dilemma game is a model for both sentient and evolutionary behaviors, especially including the emergence of cooperation. It is generally assumed that there exists no simple ultimatum strategy whereby one player can enforce a unilateral claim to an unfair share of rewards. Here, we show that such strategies unexpectedly do exist. In particular, a player X who is witting of these strategies can (i) deterministically set her opponent Y’s score, independently of his strategy or response, or (ii) enforce an extortionate linear relation between her and his scores. Against such a player, an evolutionary player’s best response is to accede to the extortion. Only a player with a theory of mind about his opponent can do better, in which case Iterated Prisoner’s Dilemma is an Ultimatum Game.
Accompanying commentary in PNAS. See these comments by Press and Dyson.
[[Press]] I was originally wondering about a much more modest question that, annoyingly, I couldn’t find already answered in the Prisoner’s Dilemma literature. ... The story now becomes one of symbiosis between computer and human intelligence: The computer could find instances, but not generalize them. I knew that the exact complement of computer intelligence, as yin to yang, is Freeman-Dyson-intelligence. So I showed what I had to Freeman. A day later, he sent me an email with the general result, equations 1-7 in our paper, all worked out. These equations immediately expose all the ZD strategies, including the successful extortionate ones.
... The successful extortionate strategies have been mathematically present in IPD from the moment that Axelrod defined the game; they just went, seemingly, unnoticed. On a planet in another galaxy, seven million years ago, Axelrod-Prime independently invented the same IPD game. He (it?) was of a species several times more intelligent than Homo sapiens [[i.e., like Dyson!]] and so recognized immediately that, between sentient players, the IPD game is dominated by an obvious extortionate strategy. Hence, for Axelrod-Prime, IPD was just another instantiation of the well-studied Ultimatum Game. He (it?) thus never bothered to publish it.
The history of IPD shows that bounded cognition prevented the dominant strategies from being discovered for over 60 years, despite significant attention from game theorists, computer scientists, economists, evolutionary biologists, etc. Press and Dyson have shown that IPD is effectively an ultimatum game, which is very different from the Tit for Tat stories told by generations of people who worked on IPD (Axelrod, Dawkins, etc.).
How can we expect markets populated by apes to find optimal solutions in finite time under realistic conditions, when the underlying parameters of the game (unlike in IPD) are constantly changing? You cannot think of a simpler quasi-realistic game of cooperation and defection than IPD, yet the game was not understood properly until Dyson investigated it! Economists should think deeply about the history of the academic study of IPD, and what it implies about rationality, heuristics, "efficient" markets (i.e., everyone can be wrong for a long, long time).
For evolutionary biologists: Dyson clearly thinks this result has implications for multilevel (group vs. individual) selection:
... Cooperation loses and defection wins. The ZD strategies confirm this conclusion and make it sharper. ... The system evolved to give cooperative tribes an advantage over non-cooperative tribes, using punishment to give cooperation an evolutionary advantage within the tribe. This double selection of tribes and individuals goes way beyond the Prisoners' Dilemma model.
See also What use is game theory? and Plenty of room at the top.
Zero-Determinant Strategies in the Iterated Prisoner’s Dilemma provides a pedagogical summary of the new results.
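For readers who want to check claims (i) and (ii) of the abstract numerically, here is a minimal Python sketch. It is my own illustration, not code from the paper. It assumes the conventional payoffs (T, R, P, S) = (5, 3, 1, 0), memory-one strategies written as cooperation probabilities after the outcomes CC, CD, DC, DD (own move first), and a plain power iteration to approximate long-run play. The function names are hypothetical, and the strategy formulas are written out from the general zero-determinant condition as I understand it.

```python
# Sketch only: payoff values and all function names are illustrative assumptions.
R, S, T, P = 3.0, 0.0, 5.0, 1.0   # conventional values with T > R > P > S

S_X = (R, S, T, P)   # X's payoff in outcomes CC, CD, DC, DD (X's move first)
S_Y = (R, T, S, P)   # Y's payoff in the same four outcomes


def extortion_strategy(chi, phi):
    """X's cooperation probabilities meant to enforce s_X - P = chi * (s_Y - P)."""
    tilde = [phi * ((a - P) - chi * (b - P)) for a, b in zip(S_X, S_Y)]
    p = (1 + tilde[0], 1 + tilde[1], tilde[2], tilde[3])
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("chi and phi do not give valid probabilities")
    return p


def equalizer_strategy(p1, p4):
    """X's strategy meant to pin Y's score; returns (strategy, target score)."""
    p2 = (p1 * (T - P) - (1 + p4) * (T - R)) / (R - P)
    p3 = ((1 - p1) * (P - S) + p4 * (R - S)) / (R - P)
    p = (p1, p2, p3, p4)
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("p1 and p4 do not give valid probabilities")
    target = ((1 - p1) * P + p4 * R) / ((1 - p1) + p4)
    return p, target


def long_run_scores(p, q, steps=5000):
    """Approximate average payoffs (s_X, s_Y) by iterating the 4-state Markov chain."""
    # Y sees each outcome with the roles swapped, so CD and DC trade places.
    q_seen = (q[0], q[2], q[1], q[3])
    rows = [(px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy))
            for px, qy in zip(p, q_seen)]
    v = [0.25] * 4                      # start from a uniform distribution
    for _ in range(steps):
        v = [sum(v[i] * rows[i][j] for i in range(4)) for j in range(4)]
    s_x = sum(vi * a for vi, a in zip(v, S_X))
    s_y = sum(vi * b for vi, b in zip(v, S_Y))
    return s_x, s_y


if __name__ == "__main__":
    opponents = [(1, 1, 1, 1), (0.9, 0.2, 0.6, 0.3), (0.5, 0.5, 0.5, 0.5)]
    extort = extortion_strategy(chi=3.0, phi=1.0 / 26.0)
    pin, target = equalizer_strategy(p1=0.9, p4=0.1)
    for q in opponents:
        s_x, s_y = long_run_scores(extort, q)
        print("extortion:", q, (s_x - P) / (s_y - P))
        print("equalizer:", q, long_run_scores(pin, q)[1], "target", target)
```

With chi = 3 and phi = 1/26 the extortionate strategy comes out as (11/13, 1/2, 7/26, 0). Against an unconditional cooperator it earns about 3.73 per round to the opponent's 1.91, and the surplus over P keeps the 3:1 ratio for the other opponents tried here. The equalizer example holds Y's score at 2 regardless of which of these strategies Y plays. Power iteration is just the simplest thing that works for these opponents; it can converge slowly when mutual defection is nearly absorbing, in which case solving the linear system for the stationary distribution would be the safer choice.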