06‐14‐2024 Weekly Tag Up

Joe Miceli edited this page Jun 14, 2024 · 1 revision

Attendees

Joe
Chi Hui

Updates

Option 1
- Use hierarchical RL to organize the ensemble of experts
  - A manager controls multiple workers
  - Each manager controls multiple actors
- We already have various experts (different models), but we could apply this to Overcooked as well
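
A minimal sketch of the manager/worker idea in Option 1, assuming each expert is a callable from observation to action. It is not full hierarchical RL: the manager here is a simple epsilon-greedy bandit over experts, swapped in only to illustrate the control structure. The names `Manager`, `horizon`, and `update` are hypothetical and do not come from any existing repo.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical types: an observation is a sequence of floats, an action is an int.
Observation = Sequence[float]
Expert = Callable[[Observation], int]


class Manager:
    """High-level policy that picks which expert (worker) acts for the next `horizon` steps."""

    def __init__(self, experts: List[Expert], horizon: int,
                 epsilon: float = 0.1, seed: int = 0) -> None:
        self.experts = experts
        self.horizon = horizon
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.values = [0.0] * len(experts)  # running mean return per expert
        self.counts = [0] * len(experts)
        self.active = 0       # index of the expert currently in control
        self.steps_left = 0   # steps remaining before the manager re-selects

    def select_expert(self) -> int:
        # Explore with probability epsilon, otherwise pick the best expert so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.experts))
        return max(range(len(self.experts)), key=lambda i: self.values[i])

    def act(self, obs: Observation) -> int:
        # Re-select an expert only at the start of each segment.
        if self.steps_left == 0:
            self.active = self.select_expert()
            self.steps_left = self.horizon
        self.steps_left -= 1
        return self.experts[self.active](obs)

    def update(self, expert_idx: int, segment_return: float) -> None:
        # Incremental mean of the returns observed while this expert was in control.
        self.counts[expert_idx] += 1
        n = self.counts[expert_idx]
        self.values[expert_idx] += (segment_return - self.values[expert_idx]) / n
```

In a real setup the bandit would presumably be replaced by a learned manager policy conditioned on the observation, with the existing models plugged in as the experts.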
Option 2
- New self-play (SP) algorithm to control the mixture of experts
- Good option for baselines
- There may be questions about why we need zero-shot coordination (ZSC) if there is no human involved
  - Adding a new intersection to the environment
- Could leverage HRI work https://github.com/HIRO-group/multiHRI/blob/main/oai_agents/agents/rl.py
- Controlling the mixture of experts is what we want to do
  - Could use hierarchical RL to solve the problem (option 1)
    - This would make it difficult to find a baseline
  - Or use self-play, FCP, or another algorithm to let the mixture of experts coordinate with each other
ZSC group setting
- Typically, ZSC addresses only 2-player games (one agent, one human)
- Would be beneficial to consider groups
- Could evaluate by incrementally adding new partners
  - Start with 1 partner, then 2 partners, then 3, etc.
NOTE: "Diverse experts" means agents that have the same actions but different rewards/objectives
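
The incremental evaluation idea for the group setting could be sketched roughly as follows. Everything here is an assumption for illustration: `run_episode` is a toy stand-in for a real environment rollout (the team scores on a step only when every player picks the same action), and `evaluate_incrementally`, `partner_pool`, and the `Policy` type are hypothetical names.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence

# Hypothetical policy type: maps a step index to an action id.
Policy = Callable[[int], int]


def run_episode(agent: Policy, partners: Sequence[Policy], steps: int = 20) -> float:
    """Toy rollout: fraction of steps on which all players chose the same action."""
    score = 0
    for t in range(steps):
        actions = [agent(t)] + [p(t) for p in partners]
        if len(set(actions)) == 1:
            score += 1
    return score / steps


def evaluate_incrementally(agent: Policy, partner_pool: List[Policy],
                           max_partners: int, episodes: int = 5,
                           seed: int = 0) -> Dict[int, float]:
    """Mean score of `agent` when paired with 1, 2, ..., max_partners partners."""
    rng = random.Random(seed)
    results: Dict[int, float] = {}
    for k in range(1, max_partners + 1):
        scores = []
        for _ in range(episodes):
            # Draw k distinct partners from the pool for this episode.
            partners = rng.sample(partner_pool, k)
            scores.append(run_episode(agent, partners))
        results[k] = mean(scores)
    return results
```

Plotting the resulting score against the number of partners would show how quickly coordination degrades as the group grows.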

Next Steps

Review FCP and SP from HRI repo
Deep dive on Option 2 next week