TransfQMix release #58

mttga · 2024-02-12T17:16:57Z

No description provided.

amacrutherford

looks cool! couple of minor points, also would be interesting to see the SMAX results

amacrutherford · 2024-02-12T18:57:48Z

jaxmarl/wrappers/baselines.py

need to remove this no?

oh yes true, will do

amacrutherford · 2024-02-12T18:59:02Z

jaxmarl/wrappers/transformers.py

will each env need an if clause in __init__? maybe worth putting a note if so

yes, every env that wants to use transformers will need to wrap its observations in some way so that the transformer can read them. I could also reorganize it to have per-env specific wrappers if that would look better. I had put a note in the readme, but now I see I deleted it by mistake
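
A rough sketch of what the per-env alternative could look like, assuming each env only needs a function that reshapes a flat observation into an (n_entities, n_features) matrix. Every name here (`OBS_TO_ENTITIES`, `register`, `wrap_obs`, `toy_env`) is hypothetical and not taken from jaxmarl/wrappers/transformers.py; plain Python lists stand in for JAX arrays to keep the example dependency-free.

```python
# Hypothetical sketch: none of these names come from jaxmarl/wrappers/transformers.py.
from typing import Callable, Dict, List

Matrix = List[List[float]]

# Registry mapping an env name to a function that turns a flat observation
# into an (n_entities, n_features) matrix the transformer can read.
OBS_TO_ENTITIES: Dict[str, Callable[[List[float]], Matrix]] = {}


def register(env_name: str):
    """Decorator that registers a per-env observation reshaper."""
    def decorator(fn):
        OBS_TO_ENTITIES[env_name] = fn
        return fn
    return decorator


def split_rows(flat: List[float], n_features: int) -> Matrix:
    """Split a flat vector into rows of n_features values each."""
    if len(flat) % n_features != 0:
        raise ValueError("observation length is not a multiple of n_features")
    return [flat[i:i + n_features] for i in range(0, len(flat), n_features)]


@register("toy_env")  # assumed env name, for illustration only
def toy_env_entities(flat_obs: List[float]) -> Matrix:
    # Assumption: every entity is described by 3 features.
    return split_rows(flat_obs, n_features=3)


def wrap_obs(env_name: str, flat_obs: List[float]) -> Matrix:
    """Look up the reshaper for env_name; fail loudly for unsupported envs."""
    try:
        reshaper = OBS_TO_ENTITIES[env_name]
    except KeyError:
        raise NotImplementedError(f"no transformer wrapper for {env_name!r}")
    return reshaper(flat_obs)
```

With a registry like this, supporting a new env means adding one decorated function instead of another if clause in `__init__`, and an unsupported env raises immediately rather than passing malformed observations to the network.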

amacrutherford · 2024-02-12T18:59:34Z

baselines/QLearning/utils/fast_attention.py

why not just import from flax?

The multi-head attention from the fast_attention script is significantly faster and more stable than the default Flax one, which is beneficial in RL.

The script was taken from https://github.com/google-research/google-research/blob/master/performer/fast_attention/jax/fast_attention.py

Notice that the use of fast attention is optional:

https://github.com/FLAIROx/JaxMARL/blob/4d78674dc5195683f3cd6e7e9d6799ddd586a714/baselines/QLearning/transf_qmix.py#L51C9-L72C14
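
For context on where the speed-up in Performer-style attention comes from, here is a simplified sketch of the reordering trick: with a positive feature map, keys can be contracted with values first, so the full L x L score matrix is never built. This is not the code from the linked script. It swaps Performer's random features for a fixed `elu(x) + 1` feature map (an assumption made to keep the example short) and uses plain Python lists instead of JAX arrays.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Naive matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def feature_map(a: Matrix) -> Matrix:
    # elu(x) + 1: a simple positive feature map (assumption; Performer uses
    # random features instead).
    return [[x + 1.0 if x > 0 else math.exp(x) for x in row] for row in a]


def quadratic_attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Builds the full L x L score matrix, then normalises each row."""
    scores = matmul(feature_map(q), transpose(feature_map(k)))  # L x L
    out = matmul(scores, v)                                     # L x d_v
    return [[x / sum(s) for x in o] for o, s in zip(out, scores)]


def linear_attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Same result, but contracts keys with values first (d x d_v)."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)                # d x d_v, independent of L^2
    k_sum = [sum(col) for col in zip(*fk)]       # summed key features, length d
    out = matmul(fq, kv)                         # L x d_v
    norm = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return [[x / n for x in o] for o, n in zip(out, norm)]
```

Both functions compute the same output up to floating-point error; the second only ever materialises d x d_v and L x d_v intermediates, which is why this family of methods scales linearly in sequence length.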

amacrutherford · 2024-02-12T18:59:46Z

baselines/QLearning/transf_qmix.py

+    return train
+
+
+def signle_run(config):


yes 🙃 thanks

mttga · 2024-02-13T00:06:47Z

Here are the results averaged across 4 seeds:

test won: https://api.wandb.ai/links/mttga/2beham1m
test returns: https://api.wandb.ai/links/mttga/6yszv25o

Results are better than QMix on most maps, with the exception of 3s5z and 3s5z_vs_3s_6z. But the main advantage of transformers is the potential transferability of the agent parameters and the learned qmix function between scenarios.

amacrutherford · 2024-02-13T10:10:56Z

jaxmarl/wrappers/baselines.py

why are we removing the hanabi option?

That was a preliminary way to create a global state vector for Hanabi. I realized that including the players' hands wasn't adding any new information, since those hands are already represented within the concatenated agent observations.
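
As a minimal illustration of that point, assuming the global state is built by concatenating per-agent observations in a fixed agent order (the function name and the toy observations below are hypothetical, not the wrapper's actual code):

```python
from typing import Dict, List


def world_state(obs: Dict[str, List[float]], agents: List[str]) -> List[float]:
    """Concatenate per-agent observations in a fixed agent order."""
    state: List[float] = []
    for agent in agents:
        state.extend(obs[agent])
    return state


# Toy example with made-up values: whatever each agent observes is already
# part of the concatenated vector, so appending it again adds no information.
toy_obs = {"agent_0": [1.0, 0.0], "agent_1": [0.0, 1.0]}
toy_state = world_state(toy_obs, ["agent_0", "agent_1"])
```

Iterating over an explicit agent list rather than the dict keeps the layout of the state vector stable across steps.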

amacrutherford · 2024-02-13T10:11:28Z

Here are the results averaged across 4 seeds:
* test won: https://api.wandb.ai/links/mttga/2beham1m
* test returns: https://api.wandb.ai/links/mttga/6yszv25o

Results are better than QMix on most maps, with the exception of 3s5z and 3s5z_vs_3s_6z. But the main advantage of transformers is the potential transferability of the agent parameters and the learned qmix function between scenarios.

awesome!

mttga added 24 commits November 19, 2023 15:56

utracking_v0

eb0148f

first fastish version

2493371

Merge branch 'qlearning' into transfqmix

1af86f4

transf_qmix v0

a78e10b

transf_qmix_v0

d5e2d82

vdn with a working transformer agent

b288b80

transf agents working in spread

8700e47

transf agents working in mpe

04145fa

Merge branch 'main' into transfqmix

7ccb5c1

remove some implementation that didn't work

f27135c

Merge branch 'utracking' into transfqmix

76bc1ee

hypertuning on transfqmix

cbd2cb0

transf_qmix working in smax

df1f3f4

starting pre-realease phase

56b836c

just in case, qmix_transf and vdn_transf

444e56f

merging with main and keep minimal files

5b36479

Merge branch 'main' into transfqmix_release

d413dd3

stupid typo when choosing the agent net

c8f3dbe

update readmes

5b94ef7

reset deafult config to default

47e0fb4

remove rendundant fast_attention file

4cd2dec

remove utracking

949ea25

remove qmix smax checkpoint files

b962c71

reset mpe_spread config

4d78674

mttga requested a review from amacrutherford February 12, 2024 17:17

amacrutherford reviewed Feb 12, 2024

solved alex comments

a16cb30

amacrutherford reviewed Feb 13, 2024

amacrutherford merged commit 602616f into main Feb 14, 2024
6 checks passed
