02‐21‐2024 Weekly Tag Up
Joe Miceli edited this page Feb 21, 2024 · 1 revision
- Chi-Hui
- Joe
- New normalization scheme had little impact on performance
- Results of online-rollouts for exp 15 look very similar to exp 14 (same convergence)
- Almost looks like we're not able to control the mean policy at all
- Update lambda learning scheme to use gradient (previously discussed)
- Run new experiment with constraint ratio of 0.75
- Hopefully we will see that the mean policy is below the threshold
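
As a rough illustration of the gradient-based lambda update mentioned above, here is a minimal sketch. It assumes lambda is a Lagrange multiplier on a constraint of the form "constraint value <= threshold" and is updated by projected gradient ascent. The function name `update_lambda`, its parameters, and the learning rate are hypothetical and not taken from the project code.

```python
def update_lambda(lam, constraint_value, threshold, lr=0.01):
    """One projected gradient-ascent step on a Lagrange multiplier.

    Hypothetical sketch: the gradient of the Lagrangian with respect to
    lambda is the constraint violation (constraint_value - threshold).
    """
    violation = constraint_value - threshold
    # Increase lambda when the constraint is violated, decrease it otherwise,
    # and clip at zero so the multiplier stays non-negative.
    return max(0.0, lam + lr * violation)


if __name__ == "__main__":
    lam = 1.0
    # Made-up numbers: the measured value exceeds the threshold, so lambda grows.
    lam = update_lambda(lam, constraint_value=2.0, threshold=1.0, lr=0.5)
    print(lam)
```

With this kind of update, lambda keeps growing while the mean policy's constraint value stays above the threshold, which increases the penalty on the constraint in the policy objective.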