
07‐05‐2024 Weekly Tag Up


Attendees

  • Chi Hui
  • Joe

Updates

  • Finished running coordination experiment

    • Policies are mixed in the environment and the center agent is evaluated for its performance
      • Compared to a baseline scenario in which all agents use the same model
    • 15 different scenarios were tested; each policy (queue, asl7, asl10) was tested in 5 different scenarios
    • In some cases, the center agent performed better than its baseline
      • This is surprising and indicates that being surrounded by different kinds of agents may help the center agent perform its job
      • Typically, though, the returns were very similar to the baseline
      • This may be an artifact of the traffic control application: all of the policies are similar to one another in terms of their high-level behavior
    • In all cases, the system-wide return was worse than the baseline
  • These results put us in a weird spot - it's not really clear how zero-shot coordination helps in this application

  • Also raises questions about how the literature calculates returns for heterogeneous systems (see the sketch below)

    • Typically the reward is the same for all heterogeneous agents (that's not the case in our experiments)
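As a concrete point of reference for the returns question above, here is a minimal sketch of how per-agent and system-wide returns could be tallied when agents receive different rewards. The function name, the agent IDs, and the toy reward log are illustrative assumptions, not the actual experiment code.

```python
def episode_returns(reward_log):
    """Per-agent and system-wide returns from a list of {agent_id: reward} dicts.

    In a heterogeneous setup each agent can receive a different reward at each
    step, so returns are tallied per agent rather than assuming a shared reward.
    """
    agents = reward_log[0].keys()
    per_agent = {aid: sum(step[aid] for step in reward_log) for aid in agents}
    system = sum(per_agent.values())  # summing is one common system-wide choice
    return per_agent, system

# Toy example: a center agent surrounded by neighbors running other policies.
log = [
    {"center": 1.0, "n1": 0.5, "n2": 0.2},
    {"center": 0.8, "n1": 0.4, "n2": 0.3},
]
per_agent, system = episode_returns(log)
print(per_agent["center"], system)  # compare against the all-same-policy baseline
```

With a shared reward (the typical setup in the literature), the per-agent and system-wide returns coincide up to a scale factor; once rewards differ per agent, the two can disagree, which matches what the experiment shows (center agent at or above its baseline while the system-wide return falls below it).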

Next Steps

  • May be time to pivot again
    • Ava could use help with the FCP problem on the Overcooked env
      • The self-play agent group has the best performance, but FCP has the best generalizability
      • Questions about when to assign different players during FCP training (a sketch of one possible assignment scheme follows this list)
      • Questions about increasing the performance of FCP
    • Chi Hui to set up a meeting to discuss
    • Joe to review multiHRI repo
  • Still working on addressing bugs/issues in the code
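A minimal sketch of one way partner assignment in FCP could work, assuming a pool of frozen self-play checkpoints saved at different skill levels. `StubPolicy`, `FCPPartnerPool`, and the per-episode uniform sampling are all hypothetical choices; when and how often to resample the partner is exactly one of the open questions above.

```python
import random

class StubPolicy:
    """Stand-in for a frozen self-play checkpoint (hypothetical)."""
    def __init__(self, name):
        self.name = name

    def act(self, obs):
        return random.randrange(6)  # Overcooked has 6 discrete actions

class FCPPartnerPool:
    """Pool of self-play checkpoints saved at different skill levels."""
    def __init__(self, checkpoints):
        self.checkpoints = checkpoints

    def sample(self):
        # Uniform sampling at the start of each episode is the simplest
        # schedule; curriculum or performance-weighted sampling are
        # alternatives worth discussing.
        return random.choice(self.checkpoints)

pool = FCPPartnerPool([StubPolicy("early"), StubPolicy("mid"), StubPolicy("final")])
partner = pool.sample()  # reassign here (per episode) or less often (per epoch)
print(partner.name)
```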