Friday, May 14, 2010

Prisoner's Dilemma


The prisoner's dilemma is a central problem in game theory that shows how two people can fail to cooperate even when it is in each person's best interest to do so. Suppose the police arrest two individuals suspected of committing a crime. Lacking enough evidence to convict either suspect on the main charge, the police separate the prisoners into two rooms and offer each the same deal. If one confesses while the other remains silent, the confessor goes free and the other receives a ten-year jail sentence. If both remain silent, they each go to jail for one year on a lesser charge. If both confess, they each go to jail for five years. In this situation, it is in both prisoners' interests to cooperate with each other (remain silent), and both want to avoid the ten-year sentence. However, since neither prisoner knows what the other will choose, it would be irrational to cooperate unilaterally (remain silent and risk going to jail for ten years) because of the chance that the other will defect (confess and go free). Since the goal is to avoid the maximum sentence, each prisoner should choose to confess: he will either go free or go to jail for five years, each of which is preferable to going to jail for ten years. The prisoner's dilemma is important because it shows that perfectly rational actors cannot trust the other players in the game, regardless of the greater good they might achieve by cooperating.
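
To make the payoffs concrete, here is a minimal sketch in Python. The sentence lengths are the ones from the story above; the function name and the "C"/"D" labels are just my own shorthand.

# Payoff matrix from the story above, measured in years in jail (lower is
# better). "C" = cooperate with the other prisoner (stay silent),
# "D" = defect (confess to the police).
SENTENCE = {
    ("C", "C"): 1,   # both stay silent: one year each on the lesser charge
    ("C", "D"): 10,  # I stay silent, he confesses: I serve ten years
    ("D", "C"): 0,   # I confess, he stays silent: I go free
    ("D", "D"): 5,   # both confess: five years each
}

def best_response(their_move):
    """The move that minimizes my sentence, given the other prisoner's move."""
    return min(("C", "D"), key=lambda my_move: SENTENCE[(my_move, their_move)])

# Defection is a dominant strategy: it is my best response either way.
print(best_response("C"))  # "D" -- go free instead of serving one year
print(best_response("D"))  # "D" -- five years instead of ten

Whatever the other prisoner does, confessing leaves you better off, which is exactly why mutual defection is the equilibrium discussed next.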

In this case, the Nash equilibrium is mutual defection, because neither prisoner can improve his payoff by unilaterally switching to cooperation (remaining silent). In fact, in a one-shot version of the prisoner's dilemma, you should always defect. But what if you played an iterated version of the game, many times over?

Two RAND researchers, Merrill Flood and Melvin Dresher, used two friends as guinea pigs and had them play the game one hundred times. Each player was shown the payoff table and had no advance knowledge of what his opponent would do, but information accumulated as the rounds were played: each player's payoff told him what the other had chosen in the previous iteration. As stated earlier, mutual defection is the Nash equilibrium in a prisoner's dilemma. However, when Flood and Dresher ran the experiment, player 1 chose the non-equilibrium strategy (cooperation) 68 times and player 2 chose it 78 times! Both players kept a log of comments written down after each round, and the logs show a struggle to cooperate. The payoffs were skewed in favor of player 1, which meant player 2 stood to gain more by unilateral defection. However, each time he defected, player 1 retaliated by defecting on the next round. Surprisingly, these "punishments" occurred infrequently, and both players kept returning to mutual cooperation. Flood and Dresher presented the results of this experiment to Nash, who dismissed it on the grounds that there was too much interaction between the players.
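
The dynamic recorded in those logs is easy to simulate. Below is a hedged sketch, not a reconstruction of the actual experiment: Flood and Dresher's real table was asymmetric, so I am using standard symmetric point values (higher is better), and the two strategies are stand-ins of my own for the retaliate-then-forgive behavior the logs describe.

# Assumed textbook payoffs in points (higher is better):
# (my move, their move) -> my points.
PAYOFF = {
    ("C", "C"): 3,  # reward for mutual cooperation
    ("C", "D"): 0,  # sucker's payoff: I cooperated, he defected
    ("D", "C"): 5,  # temptation: I defected against a cooperator
    ("D", "D"): 1,  # punishment for mutual defection
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move (retaliation)."""
    return "C" if not opponent_history else opponent_history[-1]

def occasional_defector(opponent_history):
    """Mostly cooperate, but try a unilateral defection every tenth round."""
    return "D" if len(opponent_history) % 10 == 9 else "C"

def play(strategy_a, strategy_b, rounds=100):
    history_a, history_b = [], []   # each strategy only sees the other's moves
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(history_b)
        b = strategy_b(history_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        history_a.append(a)
        history_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, occasional_defector))

Run it and each unilateral defection pays off for one round, gets punished on the next, and then both strategies settle back into mutual cooperation, which is roughly the pattern Flood and Dresher observed.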

Let me go back to the difference between a one-shot version and an iterated version of this game. As I said earlier, it's best to defect if you're only playing the game once, but over the long run both players gain even more if they cooperate with each other. There is a problem with that reasoning, too, and it comes from a concept known as backward induction. Say you and a friend were to play this game one hundred times, like Flood and Dresher's guinea pigs. You would probably quickly realize that mutual cooperation is best over the long run. But when you get to the hundredth game, it turns into a one-shot prisoner's dilemma. At the risk of being redundant, in a one-shot version you should always defect. It is safe to assume that your friend realizes that as well and says to himself, "Why should I be the only one to get the sucker's payoff in the last game? I'm going to start defecting in game 99." Wait a minute... you're smart too, and you realize he could do exactly that: defect in game 99. So now you start defecting in game 98, because you don't want to get the sucker's payoff either. See the point? Both players can logically deduce that the other will defect at some point because the game cannot go on indefinitely. Therefore, both players defect from the first game on, even though they could maximize their individual payoffs by mutually cooperating.
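
Here is a small sketch of that unraveling, again with assumed textbook payoffs rather than anything from the original experiment. The key step is that once play in all later games is settled by induction, the current game cannot influence the future, so it collapses to a one-shot dilemma where defection dominates.

# Assumed textbook payoffs: (my move, their move) -> my points, higher is better.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def one_shot_dominant():
    """In a single round, defection strictly beats cooperation against either reply."""
    assert all(PAYOFF[("D", other)] > PAYOFF[("C", other)] for other in ("C", "D"))
    return "D"

plan = {}
for game in range(100, 0, -1):   # reason backward: game 100, then 99, 98, ...
    # Play in every later game is already settled by induction, so this game
    # cannot influence the future and reduces to a one-shot prisoner's dilemma.
    plan[game] = one_shot_dominant()

print(set(plan.values()))   # {'D'}: defect in every game, all the way back to game 1
# Yet mutual cooperation would pay 3 * 100 = 300 points per player,
# versus 1 * 100 = 100 for the "rational" all-defect plan.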

This leads to two different paradoxes in the prisoner's dilemma. First, in a one-shot version, the rational choice is for both players to defect, yet their individual payoffs would be higher if they both cooperated. Second, in an iterated version, the backward induction paradox comes into play: two rational players get stuck with the punishment payoff in every round, while a pair of "irrational" cooperators ends up significantly ahead.

Nobody has been able to resolve the prisoner's dilemma, and I doubt anyone will be able to in the near future.