Prisoners Dilemma Algorithm

After watching The Dark Knight, I became interested in the prisoner's dilemma concept. There should be an algorithm that maximizes one's own gain, given the situation.

For those who consider this a stranger: http://en.wikipedia.org/wiki/Prisoner%27s_dilemma

Very, very interesting stuff.

Edit: The question is, what, if any, is the most efficient algorithm that exists for prisoners' dilemmas?

+6
performance algorithm game-theory
source share
13 answers

Since there is only one choice, and in the absence of any removable inputs, your algorithm will either:

cooperate = true; 

... or...

 cooperate = false 

It is more interesting to find a strategy for the Iterated Prisoners dilemma, which has been done by many people. For example http://www.iterated-prisoners-dilemma.net/

Even then, this is not “solvable”, since the other player is not predictable.

+12
source share

It seems that the Wikipedia page gives all the answers ... for the dilemma of a one-time prisoner, the most optimal solution for each prisoner (and not both prisoners) is to betray.

For the prisoner's iterative dilemma, it is best to remain silent for the first time, and then do what the other prisoner did the last time.

+7
source share

The whole point of the dilemma is that the optimal solution (both prisoners remain calm) is dangerous because part of the problem is out of your hands. Thus, choosing a suboptimal solution seems to maximize your winnings, but it is still suboptimal

I do not see how the algorithm can provide a solution when part of the problem is unknown.

+3
source share

I recommend reading Axelrod Evolution of Collaboration . This is a computer experiment of competing strategies for the iterative prison dilemma. When I heard about this for the last time, the tit-for-tat strategy came out first. Perhaps he has changed.

+3
source share

For a one-time version of the game, the best strategy should always be defective, since there is no possibility of retaliation.

It becomes more interesting for the re-release, as players can respond to previous versions of their opponents.

If we know in advance how many rounds there will be, then the logical “best” strategy will always be deformed. This is due to the fact that it always makes sense to cause damage at the last turn, since there is no possibility of retaliation. Of course, our reasonable opponent will know this, as well as always make mistakes in the last move. This allows us to make a mistake in the penultimate turn, since in any case there is no chance of cooperation in the final turn. Following this logic to its natural conclusion, we must be mistaken at every step.

When the total number of rounds is unknown, everything becomes more interesting. A good game strategy should try to predict what the opponent will do. I researched using evolutionary algorithms and simple machine learning with enemy modeling to create strategies for the game for my master's degree. If you are really interested, you can read my thesis .

As recommended by Yuval, perhaps the best place to start is the original Axelrod book . If you are really interested in this, it was the 20th anniversary of the follow-up , which included many more recent work on the IPD (Iterated Prisoner Dilemma) by other researchers.

In addition, I fully recommended the William Poundston Prisoners Dilemma , which is part of John von Neumann's biography and a partial introduction to game theory.

+2
source share

Well, in my opinion, pattern recognition is also a huge part of it. Finding another prisoner's habit is how often he stays calm and when he pops. You must also cross-reference your own decisions to determine what you have done to make it react in a certain way.

I think this is a bit more complicated than the wiki explained. This is not what the prisoner did the last time, but everything goes so far that it extends to infinity.

+1
source share

Oh yes. It made me remember this old article about Prisoner Software Development Dilemma

For the algorithmic contest PD look here

This one was good too

0
source share

Since then, you cannot categorically predict the behavior of the second prisoner.

There are all kinds of “solutions” that make fundamental, but very restrictive assumptions about the behavior of the second prisoner, but they can say little about the unconditional problem (which makes it such an insurmountable dilemma).

My two cents, given that you cannot rely on the behavior of second prisoners, is that it comes down to: Are you an optimist or a cynic? Will the two prisoners gather together (honor among thieves), or are they going to land at each other as soon as possible in order to save their throat ...?

0
source share

In addition, in a game with iterative prisoners, the optimal strategy will depend on the other strategies in the game.

In a series against a player who ALWAYS defects always defects - the best strategy. When you play against a player who can collaborate, a strategy that responds but sometimes forgives is likely to be better.

I must add that this applies only to a game of unknown length. Any game with a known length is identical to a one-round game.

0
source share

Trying to find the best solution for prisoners' dilemmas is like trying to find one for Ro-Sham-Bo (scissor benches). The best you can do is simulate your opponent and try to use patterns.

In the early days of game theory and computer science, John von Neumann and Rand Corporation spent a huge amount of skull trying to find an optimal algorithm for solving the Prisoner Dilemma without success, and, in the end, it was mathematically proved that there was no optimal solution.

0
source share

The whole point of the prisoners dilemma is that your optimal strategy is to betray another prisoner. O (1)

0
source share

Mathematically, other messages answer the question, but in reality there may be additional options. No matter how absurd these options are, they will lead to additional possibilities of outcome, and they can lead to an increase in the likelihood of personal gain. For example, in the case of Batman, this will destroy the plot, but he could just kill the Joker - thus destroying any additional effects that the Joker will have on the result. Allowing the Joker to live, Batman involuntarily allows the Joker to only “win” what he needs.

0
source share

The game becomes much more interesting when you step back and view the entire tournament. For example, a few years ago, a PD team won a PD tournament with several entries. One of them was a "master", and the other - "slaves." All of them will begin to play a certain sequence of actions that will allow the masters and slaves to recognize each other. Once a recognized master was mistaken, and the slave will cooperate for the remaining iterations. Thus, the master won the tournament, but at the cost of slaves.

The strategy made economic sense, because in the first place was a cash prize, but the cost of entry was low.

In general, when writing a program for a TD tournament, you need to look at a larger picture:

  • How are prizes awarded?
  • Can you conspire with other members?
  • What are the entry costs? fines?

Otherwise, yes, the one-shot PD defect is the dominant strategy. Axelrod, as already mentioned, showed that tit-for-tat was reliable in a series of tournaments, but in these tournaments no one thought of conspiring with other participants.

0
source share

All Articles