Adding Boltzmann Model and WolfSheep Model to Mesa_RL #197
Conversation
for more information, see https://pre-commit.ci
Updating Branch
- Visualization Script: Visualize the trained agent's behavior with Mesa's visualization tools, presenting agent movement and Gini values within the grid. You can run `server.py` file to test it with pre-trained model.

## Model Behaviour
The adgent as seen below learns to move towards a corner of the grid. These brings all the agents together allowing exchange of money between them resulting in reward maximization.
The result is a bit sus to me. How could all agents simultaneously decide to go to 1 corner? This implies they all use the same weights, which are biased toward the top left. They should instead have looked for nearby agents and sought to get closer to their neighbors, until they are all in the same cell.
Yes, all the weights are the same. Stable Baselines doesn't allow multiple weights. This example shows controlling multiple agents from a single weight. Is it not explicit from the README?
Nowhere in the README can I find any indication of such fine print. It needs an explicit disclaimer that the behavior is not what we should ideally expect: the agents seeking out the other agents.
adgent -- should be agent
Per this discussion I would reword the Model Behaviour block to
Model Behaviour
As stable baselines controls multiple agents with the same weights, the agents learn to move towards a corner of the grid. This brings all the agents together, allowing exchange of money between them and resulting in reward maximization.
Very cool addition with the .gif... nicely done!
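To make the shared-weights point concrete, here is a hypothetical stand-alone sketch (not the PR's code, and not using Stable Baselines) of why a single policy can send every agent to the same corner: when all agents map the same kind of observation through the same weights, any learned bias is applied identically to all of them.

```python
# Hypothetical illustration: one shared "policy" for every agent.
# The function below plays the role of a single trained network whose
# weights happen to be biased toward the (0, 0) corner.

def shared_policy(pos):
    """Same weights for every agent: always step toward (0, 0)."""
    x, y = pos
    return (max(x - 1, 0), max(y - 1, 0))

def run(agents, steps=20):
    """Apply the one shared policy to every agent each step."""
    for _ in range(steps):
        agents = [shared_policy(pos) for pos in agents]
    return agents

final = run([(3, 7), (9, 2), (5, 5)])
print(final)  # every agent ends up in the same corner: [(0, 0), (0, 0), (0, 0)]
```

Because the bias lives in the one shared set of weights rather than in any agent's observation of its neighbors, all agents converge on the same cell regardless of where they start.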
Good work @harshmahesheka I am looking forward to presenting this via George Mason. The big comment is please switch to the Solara visualization and there are some other questions and comments throughout.
# Agents can also give money to other agents in the same cell if they have greater wealth.
# The model is trained by a scientist who believes in an equal society and wants to minimize the Gini coefficient, which measures wealth inequality.
# The model is trained using the Proximal Policy Optimization (PPO) algorithm from the stable-baselines3 library.
# The trained model is saved as "ppo_money_model".
A bit nitpicky but can you change this to multi-line string (""") instead of single line string (#)
I would make some changes to this description to something similar to as follows
""" This code implements a multi-agent reinforcement learning (MARL) variation of the Boltzmann Wealth Model. The model observes the distribution of wealth among agents in a grid environment as they randomly exchange one unit of wealth with each other each time step. Each agent can move to neighboring cells and randomly gives money to other agents in the same cell if they have greater wealth. The goal of the agents in this model is to minimize the Gini coefficient, which measures wealth inequality. (A Gini coefficient of 1 means one agent has all the money; a Gini coefficient of 0 means all agents have exactly the same amount.) The model is trained using the Proximal Policy Optimization (PPO) algorithm from the stable-baselines3 library. The trained model is saved as "ppo_money_model". """
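Since the Gini coefficient is the training objective under discussion, a small hypothetical helper (not part of the PR) may help pin down the endpoints mentioned in the description. This uses the mean-absolute-difference form of the Gini coefficient; with `n` agents the maximum attainable value is `(n - 1) / n`, which approaches 1 as `n` grows.

```python
# Hypothetical sketch of the Gini coefficient the agents minimize.
# 0.0 -> all agents hold the same wealth; (n-1)/n -> one agent holds everything.

def gini(wealths):
    """Gini coefficient via mean absolute difference over all ordered pairs."""
    n = len(wealths)
    total = sum(wealths)
    if total == 0:
        return 0.0  # no wealth at all counts as perfectly equal
    diff_sum = sum(abs(a - b) for a in wealths for b in wealths)
    return diff_sum / (2 * n * total)

print(gini([1, 1, 1, 1]))  # 0.0  -> perfectly equal
print(gini([4, 0, 0, 0]))  # 0.75 -> one of four agents has everything, (n-1)/n
```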
import os

import mesa
from mesa.visualization.ModularVisualization import ModularServer
Why are you using the older visualization set up instead of Solara Viz?
    MoneyModelRL, [grid, chart], "Money Model", {"N": 10, "width": 10, "height": 10}
)
server.port = 8521  # The default
server.launch()
Please update to Solara visualization
if len(cellmates) > 1:
    # Choose a random agent from the cellmates
    other_agent = random.choice(cellmates)
    if other_agent.wealth > self.wealth:
This is more of a nice-to-have, but this seems to make the result deterministic regardless of the RL; it seems you would get there eventually after enough steps, and RL just makes it more efficient.
Is it possible/easy to make it so the agent gets to choose who it gives wealth to, so it "learns" to give wealth to someone with less money?
The idea was to keep these examples really simple and easy to train, hence the minimal action space. But if you want, I can change it.
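The alternative the reviewer suggests could be sketched roughly as below. This is hypothetical code, not from the PR: `give_random` mirrors the current random-cellmate behavior, while `give_to_poorest` stands in for what a policy could learn if the recipient were part of the action space.

```python
import random

# Hypothetical sketch of the two giving rules being discussed.

def give_random(wealths, rng):
    """Current behaviour: a random cellmate receives one unit of wealth."""
    i = rng.randrange(len(wealths))
    wealths[i] += 1
    return wealths

def give_to_poorest(wealths):
    """Learnable alternative: target the cellmate with the least wealth,
    which directly pushes the Gini coefficient down."""
    i = min(range(len(wealths)), key=lambda j: wealths[j])
    wealths[i] += 1
    return wealths

print(give_to_poorest([5, 1, 3]))  # [5, 2, 3]
```

Making the recipient part of the action space would enlarge the action space (one choice per cellmate) and likely slow training, which matches the author's stated trade-off of keeping the examples simple.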
""" | ||
Create a new WolfRL-Sheep model with the given parameters. | ||
""" | ||
super().__init__( |
Do you need to inherit all this from mesa_models? My concern is that any change to that model will break this model, and inheriting these parameters is not necessary, so it makes it unnecessarily brittle.
I inherited it to show how easily you can integrate your code with Mesa. Not inheriting the code block and rewriting everything would weaken the whole point. Also, if we change something in the main example, we need to change it here as well to keep it updated. So, I think we can keep this arrangement, and whenever some major change takes place in the original examples, we verify here as well. The code is relatively simple, so it shouldn't be a major task.
class SheepRL(Sheep):
    def step(self):
Do you need to inherit this from mesa_models? My concern is any change to that model will break this model and that inheriting the Sheep class is not necessary so it makes it unnecessarily brittle.
        self.model.schedule.add(lamb)


class WolfRL(Wolf):
Do you need to inherit this from mesa_models? My concern is that any change to that model will break this model, and inheriting these parameters is not necessary, so it makes it unnecessarily brittle.
import os

import mesa
import numpy as np
Please switch this to SolaraViz and not the old server
"policy_wolf": PolicySpec(config=PPOConfig.overrides(framework_str="torch")), | ||
}, | ||
"policy_mapping_fn": lambda agent_id, *args, **kwargs: "policy_sheep" | ||
if agent_id[0:5] == "sheep" |
I am genuinely curious here, why is this only 0 to 5?
Ah, this. Basically, agent_id is "Sheep[number]", and wolf is "Wolf[number]". So, we are checking the first 5 letters in Sheep to partition it. I will add a comment
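The prefix check can be reproduced outside RLlib with a tiny stand-alone function (the exact agent-id format in the PR may differ; the ids below are illustrative). The point is simply that "sheep" is five characters long, so slicing `agent_id[0:5]` cleanly partitions sheep ids from wolf ids.

```python
# Minimal reproduction (not RLlib itself) of the prefix check under
# discussion: ids like "sheep_12" and "wolf_3" are split on the first
# 5 characters, the length of the word "sheep".

def policy_mapping_fn(agent_id, *args, **kwargs):
    return "policy_sheep" if agent_id[0:5] == "sheep" else "policy_wolf"

print(policy_mapping_fn("sheep_12"))  # policy_sheep
print(policy_mapping_fn("wolf_3"))    # policy_wolf
```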
Thanks for the appreciation. As discussed with @EwoutH and @rht, the visualization here is exactly the same as the one from mesa-examples. So, as soon as mesa-examples gets updated, we can very easily update here. Basically, as soon as we get #154 merged.
I have added the remaining two examples, the Boltzmann Model and the WolfSheep model, to the rlo folder. The remaining task would be to modify the main README.md to include a description of mesa_rl. Any suggestions on it are welcome.
Currently, I have kept things similar to the previous pull request. After merging this, we can open an issue and discuss potential changes/improvements that were left behind.