07‐05‐2024 Weekly Tag Up
- Chi Hui
- Joe
- Finished running coordination experiment
  - Policies are mixed in the environment and the center agent is evaluated on its performance (evaluation loop sketched below)
  - Compared to a baseline scenario in which all agents use the same model
  - 15 different scenarios were tested; each policy (queue, asl7, asl10) was tested in 5 different scenarios
  - In some cases, the center agent performed better than its baseline
    - This is surprising and suggests that being surrounded by different kinds of agents may help the center agent perform its job
  - Typically, though, the returns were very similar to the baseline
    - This may be an artifact of traffic control applications, where all the policies are similar to one another in terms of their high-level behavior
  - In all cases, the system-wide return was worse than the baseline
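For reference, a minimal sketch of the mixed-policy evaluation described above, assuming a PettingZoo-style parallel multi-agent API; `evaluate_center_agent`, the `policies` mapping, and the agent ids are illustrative names, not the actual experiment code:

```python
from typing import Callable, Dict, Tuple

# A policy maps an observation to an action; concrete types depend on the env.
Policy = Callable[[object], object]

def evaluate_center_agent(
    env,                          # assumed PettingZoo-style parallel env
    policies: Dict[str, Policy],  # agent_id -> policy; neighbors may differ
    center_agent: str,
    episodes: int = 10,
) -> Tuple[float, float]:
    """Average the center agent's return and the system-wide return
    over several episodes with mixed policies in the environment."""
    center_total, system_total = 0.0, 0.0
    for _ in range(episodes):
        obs, _ = env.reset()
        while env.agents:  # parallel-API loop: runs until all agents finish
            actions = {aid: policies[aid](obs[aid]) for aid in env.agents}
            obs, rewards, _, _, _ = env.step(actions)
            center_total += rewards.get(center_agent, 0.0)
            system_total += sum(rewards.values())
    return center_total / episodes, system_total / episodes
```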
- These results put us in a weird spot: it's not really clear how zero-shot coordination helps in this application
- Also raises questions about how the literature calculates returns for heterogeneous systems
  - Typically the reward is the same for all heterogeneous agents (that's not the case in our experiments; see the sketch below)
  - May be time to pivot again
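To make that distinction concrete, a small sketch; the function names are hypothetical, and the per-agent case stands in for our setup where policies like queue and asl optimize different rewards:

```python
from typing import Dict, List

def shared_team_return(team_rewards: List[float]) -> float:
    """Common literature setup: every agent receives the same team reward
    each step, so a single scalar return describes the whole system."""
    return sum(team_rewards)

def per_agent_returns(rewards_by_agent: Dict[str, List[float]]) -> Dict[str, float]:
    """Our setup: heterogeneous agents earn different rewards (e.g. the
    queue vs. asl policies above), so returns only make sense per agent;
    summing them mixes incomparable quantities."""
    return {agent: sum(steps) for agent, steps in rewards_by_agent.items()}
```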
- Ava could use help with the FCP problem on the Overcooked env
  - The self-play agent group has the best performance, but FCP has the best generalizability
  - Questions about when to assign different players during FCP training (see the sketch below)
  - Questions about increasing the performance of FCP
  - Chi Hui to set up a meeting to discuss
  - Joe to review the multiHRI repo
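On the player-assignment question, a minimal sketch of one common scheme (resampling a partner from the frozen checkpoint population at every episode boundary); the population entries and the demo loop are stand-ins, not code from the multiHRI repo:

```python
import random
from typing import Sequence

def sample_partner(checkpoint_population: Sequence, rng: random.Random):
    """FCP trains a best-response agent against a frozen population of
    self-play checkpoints saved at different skill levels. One simple
    answer to 'when to assign different players' is to draw a new
    partner at every episode boundary."""
    return rng.choice(checkpoint_population)

if __name__ == "__main__":
    rng = random.Random(0)
    population = ["ckpt_early", "ckpt_mid", "ckpt_final"]  # stand-in checkpoints
    for episode in range(3):
        partner = sample_partner(population, rng)  # new partner each episode
        print(f"episode {episode}: training against {partner}")
```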
- Still working on addressing bugs/issues in code