Grade Level
The Prisoner’s Dilemma is a math problem that incorporates ideas from game theory. It aims to mathematically represent behavior in a situation where an individual's success in making choices depends on the choices of others. Because the problem does not reduce to a single “standard probability” calculation and instead involves recursive reasoning about another person’s choices, the game is recommended for high school students. Less complex versions of the game, however, can be played and understood by students with a basic grasp of probability and deductive logic.
The Problem
Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated both prisoners, visit each of them to offer the same deal. If one testifies for the prosecution against the other (“betrays”) and the other remains silent (“cooperates”), the betrayer goes free and the silent accomplice receives the full twenty-year sentence. If both remain silent, both prisoners are sentenced to only five years in jail for a minor charge. If each betrays the other, each receives a nine-year sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How would you act if you were one of the prisoners?
Materials List
• Pieces of candy needed = 100 x (# of students in class) (Skittles or M&M's recommended)
• Screens made of poster board, cardboard, etc. that can stand between two desks
• Non-transparent containers
• Decks of 54 cards – one deck is needed for every 13 pairs of students
Required Setup
• Arrange students’ desks into pairs, face to face, creating an inner circle and an outer circle.
• Make sure a screen is between each pair of desks so that students cannot communicate with their “partners-in-crime”.
• Each student should be handed 100 pieces of candy, which represents “100 years of life”.
• A non-transparent container should be placed in the middle of each pair of desks, which symbolizes “years in prison”. Every skittle dropped into the container is a year in prison.
• Students in the inner circle should each be handed one black card and one red card of the same even value (e.g., the black six of spades and the red six of hearts).
• Students in the outer circle should each be handed one black card and one red card of the same odd value (e.g., the black jack of clubs and the red jack of diamonds).
• Emphasize that students should not communicate with anyone else and should not talk while the game is in progress.
Instructions
• Read “The Problem” to students and tell them that each piece of candy represents a year of life that may be ‘taken’ by the state as part of a jail sentence. The person sitting across from them behind the screen is their partner-in-crime.
• Write the following table on the board:
| | Prisoner B Stays Silent | Prisoner B Testifies |
|---|---|---|
| Prisoner A Stays Silent | Prisoner A: 5 years; Prisoner B: 5 years | Prisoner A: 20 years; Prisoner B: goes free |
| Prisoner A Testifies | Prisoner A: goes free; Prisoner B: 20 years | Prisoner A: 9 years; Prisoner B: 9 years |
• Ask each student to make a decision based on the table of consequences, emphasizing that they cannot communicate with their partners-in-crime.
• If a student decides to betray their partner-in-crime, they should place their red card in the middle. If a student decides to stay silent, they should place their black card in the middle.
• When everyone has made a decision, ask the students in each pair to take turns looking at the two cards and drop the requisite number of skittles into the “jail container” in the middle of the two desks. (For example: if student A put in a red card and student B put in a black card, student B should put 20 skittles into the container and student A can keep all of their skittles this round.) Students should still not be communicating with each other.
• Play can then continue in two ways:
1. Independent Trials: The outer circle of students should stand up and rotate one to the right, repeating the game with a different partner-in-crime, maintaining lack of communication.
2. Dependent Trials: The students stay in the same pairs and repeat the game with the knowledge of whether or not their partner-in-crime stayed silent or betrayed them last turn. Again, lack of communication should be enforced.
• Goal: At the end of the game, possess the most pieces of candy.
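The scoring step in these instructions can be sketched in Python for a teacher who wants to check the candy counts after a round. This is only an illustrative sketch: the names `SENTENCES` and `play_round` are hypothetical, and the rule that a student cannot drop more candy than they still hold is an assumption not stated in the lesson.

```python
# Years in prison for (student A, student B), keyed by (A's card, B's card).
# "black" = stay silent, "red" = testify, following the card convention above.
SENTENCES = {
    ("black", "black"): (5, 5),
    ("black", "red"): (20, 0),
    ("red", "black"): (0, 20),
    ("red", "red"): (9, 9),
}


def play_round(candy_a, candy_b, card_a, card_b):
    """Return each student's remaining candy after one round."""
    years_a, years_b = SENTENCES[(card_a, card_b)]
    # Assumption: a student cannot drop more candy than they still hold.
    return max(candy_a - years_a, 0), max(candy_b - years_b, 0)


# Student A plays red (testifies), student B plays black (stays silent).
print(play_round(100, 100, "red", "black"))  # (100, 80)
```

Calling `play_round` once per round, feeding each result into the next call, reproduces the “Amount of life left” column that students fill in by hand.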
Lesson Objectives
• Balance objective mathematical reasoning with subjective personal decision-making values
• Develop and understand the strategy for keeping the most pieces of candy in both limited play and extended repetitive play either with different partners or the same partner.
• Understand real-life applications of the Prisoner’s Dilemma in society, economics, politics, etc.
Relevant NCTM Standards
The Prisoner’s Dilemma incorporates various problem solving and reasoning skills outlined by NCTM. Students “engage in a task for which the solution method is not known in advance” and must “draw on their knowledge” of probability and deductive reasoning of pros and cons in order to make a decision of whether to stay quiet or betray their partner-in-crime. Through repetitive play, students should begin to notice “patterns, structure, or regularities” in the consequences of their decisions and should therefore be able to rationally adjust their decision-making to justify their intentions. The scope of the game allows students to develop a mathematical strategy in order to preserve the most pieces of candy by the end.
Student Activity Sheet
Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated both prisoners, visit each of them to offer the same deal. If one testifies for the prosecution against the other (“betrays”) and the other remains silent (“cooperates”), the betrayer goes free and the silent accomplice receives the full twenty-year sentence. If both remain silent, both prisoners are sentenced to only five years in jail for a minor charge. If each betrays the other, each receives a nine-year sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How would you act if you were one of the prisoners?
Experimental Data (mark B for betray or C for cooperate)
| Independent Trials (partner varies) game # | Your action | Partner’s action | Amount of life left | Dependent Trials (constant partner) game # | Your action | Partner’s action | Amount of life left |
|---|---|---|---|---|---|---|---|
| start | ---- | ---- | 100 years | start | ---- | ---- | 100 years |
| 1 | | | | 1 | | | |
| 2 | | | | 2 | | | |
| 3 | | | | 3 | | | |
| 4 | | | | 4 | | | |
| 5 | | | | 5 | | | |
| 6 | | | | 6 | | | |
| 7 | | | | 7 | | | |
| 8 | | | | 8 | | | |
| 9 | | | | 9 | | | |
| 10 | | | | 10 | | | |
Reflection Questions (answer on a separate sheet of paper)
1. If you only played the game once with your partner and both of you are trying to minimize your individual jail sentences, what is more rationally beneficial, remaining silent or testifying against your partner? Why?
2. If you and your partner had been able to communicate, would your decision have changed? If so, why?
3. What was your strategy when you repeatedly played games with the same partner? Who ended up spending a shorter amount of time in jail? Why?
4. Would you have chosen a different strategy if you were to repeatedly play against your partner again? If so, which strategy would you adopt? If not, why not?
5. Did your moral or personal values play a part in your strategy? If so, do you think they put you at an advantage or a disadvantage? Why?
Student Activity Sheet Answers
1. If you only played the game once with your partner and both of you are trying to minimize your individual jail sentences, what is more rationally beneficial, remaining silent or testifying against your partner? Why? If we assume that each player cares only about minimizing his or her own time in jail without any concern for the other player, the most rational choice is to testify. If you testify and your partner stays silent, you go free instead of serving five years. If you testify and your partner also testifies, you serve nine years instead of twenty. Staying silent only pays off if your partner also stays silent, and even then you would have done better by testifying. Since we’re assuming your partner is also trying to minimize his or her own sentence, it isn’t likely he or she will stay silent. Therefore, assuming your partner will testify, it’s better to testify and spend nine years in jail than to stay silent and spend twenty years in jail.
2. If you and your partner had been able to communicate, would your decision have changed? If so, why? Answers may vary depending on the student, but it should be recognized that by cooperating, both people receive the same five-year sentence. No one gets off scot-free, and no one is sentenced to twenty years in jail. Students may reason that even if both agree to stay silent, one might still testify in an attempt to get off scot-free without any regard for his or her partner.
3. What was your strategy when you repeatedly played games with the same partner? Who ended up spending a shorter amount of time in jail? Why? Answers may vary. Computer simulations of the repeated game have suggested, however, that the player who spends the least time in jail over many games is often one who plays “tit for tat”. The strategy is simply to cooperate on the first round of the game. After that, the player does what his or her opponent did on the previous move.
4. Would you have chosen a different strategy if you were to repeatedly play against your partner again? If so, which strategy would you adopt? If not, why not? Answers may vary.
5. Did your moral or personal values play a part in your strategy? If so, do you think they put you at an advantage or a disadvantage? Why? Answers may vary.
Teaching Notes
In this game, regardless of what the opponent chooses, each player always receives a lesser jail sentence by testifying. We assume that each player cares only about minimizing his or her own time in jail without any concern for the other player. If Prisoner A testifies and Prisoner B stays silent, Prisoner A gets off free while Prisoner B spends twenty years in jail. If Prisoner A testifies and Prisoner B also testifies, both will spend nine years in jail. Since we’re assuming both prisoners are trying to minimize their sentences, it isn’t likely either will be staying silent. A rational player doesn’t even need to guess what their partner will do in order to pick a strategy: If one’s partner testifies, it’s better to testify and spend nine years in jail instead of staying silent and spending twenty years in jail. If one’s partner does not testify, it’s better to testify and spend no years in jail instead of staying silent and spending five years in jail. When trials are independent (your current partner doesn’t know the result of your last game and your next partner won’t know the result of this game) a rational player will always testify.
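The dominance argument above can be verified with a short Python sketch. The names `YEARS` and `best_reply` are hypothetical, chosen only for illustration; the sentence lengths are the ones from the table written on the board.

```python
# Years in jail for the player, keyed by (own choice, partner's choice).
YEARS = {
    ("silent", "silent"): 5,
    ("silent", "testify"): 20,
    ("testify", "silent"): 0,
    ("testify", "testify"): 9,
}


def best_reply(partner_choice):
    """Return the choice that gives the player the fewest years in jail."""
    return min(("silent", "testify"), key=lambda own: YEARS[(own, partner_choice)])


# Whatever the partner does, the best reply turns out to be the same.
for partner_choice in ("silent", "testify"):
    print(partner_choice, "->", best_reply(partner_choice))
# silent -> testify
# testify -> testify
```

Because the best reply does not depend on the partner's choice, a player does not need to guess what the partner will do, which is exactly the point made in the paragraph above.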
The game changes significantly when trials are dependent. In this case, if Prisoner A betrays Prisoner B, Prisoner B will have a chance for revenge next turn. Studies of the game have shown that greedy strategies are less effective than strategies that are at least partly cooperative. Computer algorithms and studies have shown that over a long period of time with many players, each with different strategies, selfish strategies and completely altruistic strategies tended to do very poorly while rationally cooperative strategies did better. The best strategy was found to be “tit for tat”. The strategy is to cooperate on the first round of the game. After that, the player does what his or her opponent did on the previous move. Additionally, the most successful players do not testify before their opponents do. However, the player must not be blindly cooperative and trusting or selfish players will take advantage of this.
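As a rough illustration of dependent trials, the sketch below plays “tit for tat” against two simple strategies over ten rounds. It is not a reproduction of the studies mentioned above: the function names, the ten-round length, and the choice of opponents are all assumptions made for this example.

```python
# Years in jail for (player A, player B), keyed by (A's move, B's move).
# "C" = cooperate (stay silent), "B" = betray (testify).
YEARS = {
    ("C", "C"): (5, 5),
    ("C", "B"): (20, 0),
    ("B", "C"): (0, 20),
    ("B", "B"): (9, 9),
}


def tit_for_tat(own_history, partner_history):
    """Cooperate first, then copy the partner's previous move."""
    return partner_history[-1] if partner_history else "C"


def always_betray(own_history, partner_history):
    return "B"


def play(strategy_a, strategy_b, rounds=10):
    """Return the total years in jail for each player after all rounds."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        years_a, years_b = YEARS[(move_a, move_b)]
        total_a += years_a
        total_b += years_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat))      # (50, 50)
print(play(tit_for_tat, always_betray))    # (101, 81)
print(play(always_betray, always_betray))  # (90, 90)
```

In this small example, two “tit for tat” players lose far fewer years than two players who always betray, and “tit for tat” is exploited only once by a player who always betrays.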
In society, the likelihood of betrayal in a group of people may be reduced if individuals in the group promote trust through cooperation. Even though cooperating individuals may sacrifice their own short-term interests, their behavior may set a moral example for the rest of the group. In the long run, cooperation then begins to pay off for the group as a whole. For example, in debates over global warming, all countries will benefit from reduced CO2 emissions, but individual countries are often hesitant to curb their own emissions. In the short term it is more beneficial for an individual country to “betray”, even though each country sacrificing some of its own interests would benefit all countries in the long run.
The Prisoner's Dilemma can also apply to the controversy over performance-enhancing drugs in athletics. Ideally, no athlete would take the drugs, so that everyone avoids the harmful side effects. If, however, one athlete takes the drugs, he or she will gain an advantage. If all athletes take the drugs, the advantage is taken away, but the negative side effects remain. Ask students if they can think of other real-life examples that involve the same concepts of probability and deductive reasoning.
The student activity sheet is for reflection purposes after students have played through the game multiple times and started to recognize patterns in game play, winning strategies, and losing strategies. Students’ answers may vary depending on the depth of understanding of principles of probability and deductive reasoning. Teachers should use the questions as starting points for facilitating discussion about mathematical as well as real-world representations of game theory.
It may be beneficial for students to discuss their thoughts and strategies with partners they’ve worked with to see the thought processes of their “partners” or “opponents”. Instead of giving the “right answer” as proven by computer algorithms, teachers should encourage thoughtful deliberation on proposed strategies including the moral implications or reasons behind students’ decisions. Teachers can assess students’ understanding of probability and reasoning through this kind of classroom debate.
After playing the game and completing and discussing the questions on the student activity sheet, an advanced class might also discuss how changing the sentence lengths within the game affects the game’s properties and strategies. For example, the teacher might ask students, “How high can T be without changing the mathematical theory that applies to the game?”
| | Prisoner B Stays Silent | Prisoner B Testifies |
|---|---|---|
| Prisoner A Stays Silent | Prisoner A: L years; Prisoner B: L years | Prisoner A: H years; Prisoner B: T years |
| Prisoner A Testifies | Prisoner A: T years; Prisoner B: H years | Prisoner A: M years; Prisoner B: M years |
The prisoner’s dilemma is preserved in the independent case as long as H > M > L > T. This condition simply preserves the benefit of always betraying one’s partner. The prisoner’s dilemma is preserved in the dependent case if, additionally, 2L < H+T. This condition preserves the optimal “game sum” (see below for details): two rounds of mutual silence cost each player fewer years than one round as the betrayer plus one round as the betrayed. If the game is altered so that 2L > H+T, then the players can try to cooperate by taking turns between BC and CB, which will leave them with more skittles at the end than cooperating with CC every round.
Advanced Topics Relating to the Prisoner’s Dilemma
If this lesson is presented to an advanced math class, the teacher may also consider including any of the following topics in applied discussion or lecture.
| | Prisoner B Stays Silent | Prisoner B Testifies |
|---|---|---|
| Prisoner A Stays Silent | CC (both cooperate) | CB |
| Prisoner A Testifies | BC | BB (both betray) |
The Nash Equilibrium of a game is an outcome in which neither player can improve his or her own result by changing only his or her own choice. Nash equilibrium is a common real-life result of competitions since it results from every player using a greedy strategy (if you can take more, you do) in response to what the other player does. In the prisoner’s dilemma, the only Nash Equilibrium is BB – both partners betray.
A Pareto Optimal solution to a game is a result from which any change to make someone better off is impossible without making someone else worse off (CC, BC, and CB are all Pareto optimal). Pareto optimal solutions are observed in real life when the participants are considerate of their partners. The prisoner’s dilemma is interesting because the Nash Equilibrium of the game is not Pareto optimal: CC is better for both players. In general, this condition (the equilibrium is not Pareto optimal) is what causes a game to have different results played independently and dependently.
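For a class comfortable with programming, both definitions can be tested directly on the four outcomes of the game. The sketch below is one possible way to do this; the names `is_nash` and `is_pareto_optimal` are hypothetical, and the sentence lengths are the ones from the original table.

```python
from itertools import product

# Years in jail for (Prisoner A, Prisoner B), keyed by (A's choice, B's choice).
# "C" = cooperate (stay silent), "B" = betray (testify).
YEARS = {
    ("C", "C"): (5, 5),
    ("C", "B"): (20, 0),
    ("B", "C"): (0, 20),
    ("B", "B"): (9, 9),
}
CHOICES = ("C", "B")


def is_nash(a, b):
    """Neither prisoner can reduce their own sentence by switching alone."""
    years_a, years_b = YEARS[(a, b)]
    a_stays = all(YEARS[(alt, b)][0] >= years_a for alt in CHOICES)
    b_stays = all(YEARS[(a, alt)][1] >= years_b for alt in CHOICES)
    return a_stays and b_stays


def is_pareto_optimal(a, b):
    """No other outcome helps one prisoner without hurting the other."""
    years_a, years_b = YEARS[(a, b)]
    for other_a, other_b in YEARS.values():
        no_one_worse = other_a <= years_a and other_b <= years_b
        someone_better = other_a < years_a or other_b < years_b
        if no_one_worse and someone_better:
            return False
    return True


outcomes = list(product(CHOICES, repeat=2))
print([o for o in outcomes if is_nash(*o)])
# [('B', 'B')]
print([o for o in outcomes if is_pareto_optimal(*o)])
# [('C', 'C'), ('C', 'B'), ('B', 'C')]
```

The single Nash equilibrium, BB, is the one outcome missing from the Pareto optimal list, which is the tension that defines the dilemma.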
This result is also related to the fact that the Prisoner’s Dilemma is not a zero-sum game. A zero-sum game is a game in which any benefit to one player creates an equal loss to the other and vice versa. A game in which adversaries only trade game counters (skittles) is generally zero-sum since any skittles lost by player A are given to player B. Neither version of the prisoner’s dilemma is zero-sum, since skittles are removed from the game entirely and the total number removed varies with the outcome. In the dependent trials version of the prisoner’s dilemma, you can argue that the players should be cooperating to minimize total skittles lost. In this case, they would recognize CC as the best strategy and play accordingly.
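The claim that the game is not zero-sum can be illustrated by adding up the skittles both players lose under each outcome. This is a minimal sketch with hypothetical variable names, using the sentence lengths from the original table.

```python
# Years in jail (skittles lost) for (Prisoner A, Prisoner B),
# keyed by (A's choice, B's choice). "C" = cooperate, "B" = betray.
YEARS = {
    ("C", "C"): (5, 5),
    ("C", "B"): (20, 0),
    ("B", "C"): (0, 20),
    ("B", "B"): (9, 9),
}

# Total skittles removed from the game under each outcome.
totals = {outcome: sum(years) for outcome, years in YEARS.items()}
print(totals)
# {('C', 'C'): 10, ('C', 'B'): 20, ('B', 'C'): 20, ('B', 'B'): 18}

# If one player's loss were always the other's gain, every total would be equal.
is_constant_sum = len(set(totals.values())) == 1
print(is_constant_sum)  # False

# The outcome that costs the pair the fewest skittles overall.
best_joint = min(totals, key=totals.get)
print(best_joint)  # ('C', 'C')
```

The totals differ from outcome to outcome, so the players are not simply trading skittles with each other, and CC is the outcome that loses the fewest skittles for the pair.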
A very advanced class might discuss the relationships between these different properties that games can have. These topics also relate to the questions:
• What properties make a game competitive?
• What properties make collaboration beneficial in a game?
• For games that you play often, what properties of these games affect your strategies?
• Can you think of games that have other combinations of the properties above? How do strategies for these games differ from strategies for the prisoner’s dilemma game?
This lesson was written by students at the Massachusetts Institute of Technology (M.I.T.) as part of their coursework for 11.124, Introduction to Teaching and Learning Science and Math.


