11‐07‐2023 Weekly Tag Up

Joe Miceli edited this page Nov 8, 2023 · 2 revisions

Attendees

Joe
Chi-Hui

Updates

20-round experiment complete (on the 4-agent env)
- All hyperparameters left the same as those used for the paper
- Results very similar to what we saw in the paper
  - Online rollouts show G1 performance similar to the threshold policy (the G1 policy)
  - Online rollouts show G2 performance that's slightly better than the queue policy (the G2 policy)
  - No real trends evident in either case
Ask Joewie how many rounds to train the 9- and 16-agent envs for
- Probably need 10 because that's more common in academia
Expect to hear back about the paper around November 30th
- Will probably get feedback regarding the proof
- Experiments (from last week) are the other thing to prepare
- Set up a meeting with Joewie to discuss how best to prepare for review feedback

Online Rollouts During Training