PrisonersDilemmaAutomaton

The Prisoner's Dilemma is a classic problem in game theory that shows why two completely rational individuals might not cooperate, even if it appears that it is in their best interests to do so. It was originally framed by Merrill Flood and Melvin Dresher while working at RAND in 1950. Albert W. Tucker formalized the game with prison sentence rewards and named it "Prisoner's Dilemma". In the scenario, two individuals are arrested, but the evidence against them is not sufficient for a conviction. Each prisoner faces a choice: cooperate with the other by remaining silent, or betray the other by confessing. The dilemma arises because the outcome depends on the choices made by both individuals. If both cooperate, they receive a moderate sentence. If one betrays the other, the betrayer goes free, while the betrayed receives a harsh sentence. If both betray, they both receive a moderately severe sentence.

In the context of the Iterated Prisoner's Dilemma (IPD), the scenario is repeated multiple times, allowing for the exploration of strategies that consider the history of interactions between the individuals.

Scope

This project is intended as a systematic exploration of the iterated prisoner's dilemma (IPD) rule space. The specific rule is assigned to an agent using the principle of binary finite state machines. The decision-making process of the agent is fully determined by its binary ID. The specific behavior exhibited by an agent in IDP is determined by the binary ID's of both agents.

Project Implementation

The project is implemented using three interrelated Python classes: Agent, Environment, and Tournament. Detailed documentation of the classes is provided in the respective .py files.

`Agent`

In the Iterated Prisoner’s Dilemma (IPD) game, an agent is a player whose objective is to maximize their score. The score is calculated based on a payoff function, and the agent’s decision is based on the history of the game. Each Agent class is assigned a binary ID, which is converted into a single bit and a 2D matrix. The single bit determines the agent’s initial decision, while the 2D matrix, when multiplied with the game’s history, determines the agent’s decision in subsequent rounds. The agent cooperates if the sum of the multiplication is positive or zero, and defects if the sum is negative. The decisions are stored and passed to the agents in the next round.

`Environment`

The Environment class in the Iterated Prisoner's Dilemma (IPD) game serves as the arena where two agents compete. It accepts a payoff function, a length parameter, and a Poisson parameter. The payoff function calculates the agents' scores based on their decisions, while the Poisson parameter introduces randomness to the game's length. The game iterates over the specified length, with agents making decisions based on the game's history. These decisions are stored and passed to the agents in subsequent rounds. The Environment class returns the accumulated points from the payoff function and the history of decisions.

`Tournament`

The Tournament class in the Iterated Prisoner's Dilemma (IPD) game generates agents with unique binary IDs and allows them to compete against each other. The binary IDs are generated by converting a decimal count, ranging from 0 to the number of competitors, into a binary number with leading digits. The number of leading digits is determined by the ceiling of the log2 of the number of competitors. To fully explore the search space, the number of competitors should be an odd power of 2. The binary number is then converted into a numpy array, and all agents are generated and compete against each other. The results of the competition are stored in a numpy array of shape ($n^2$, 6), where n is the number of competitors.

Example Usage

Find an example of how the repository can be used in the prisoners_dilemma.ipynb notebook.

Project Results

The results can also be found in the prisoners_dilemma.ipynb notebook. The highlight of the results is a plot of the score of each agent in the tournament against each other player of the tournament with $2^7$ competitors. In total, there were $2^{14}$ games played.

Requirements

The project was developed using Python 3.11.5. The following packages are required to run the code:

numpy
matplotlib
tqdm

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
assets		assets
automaton		automaton
LICENSE		LICENSE
README.md		README.md
prisoners_dilemma.ipynb		prisoners_dilemma.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PrisonersDilemmaAutomaton

Scope

Project Implementation

`Agent`

`Environment`

`Tournament`

Example Usage

Project Results

Requirements

About

Releases

Packages

Languages

License

pietroluigiwilli/PrisonersDilemmaAutomaton

Folders and files

Latest commit

History

Repository files navigation

PrisonersDilemmaAutomaton

Scope

Project Implementation

Agent

Environment

Tournament

Example Usage

Project Results

Requirements

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

`Agent`

`Environment`

`Tournament`

Packages