MixReg

Stable-Baselines implementation of the MixReg regularization technique for PPO2, using the Impala CNN as the feature extractor, as specified in the paper: https://arxiv.org/abs/2010.10814

Note: Depends on the stable-baselines Python library (2.10.1 at time of writing), https://pypi.org/project/stable-baselines/
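For orientation, the core idea from the linked paper is to train on convex combinations of observations and to interpolate the associated supervision signals (for example the reward) with the same coefficient. The following is a minimal standard-library sketch of that interpolation, not the code in this repository: the function name `mixreg_pair`, the flat-list observation format, and the default `alpha=0.2` are assumptions made for illustration.

```python
import random


def mixreg_pair(obs_i, obs_j, rew_i, rew_j, alpha=0.2, rng=random):
    """Mix two transitions with a coefficient lam drawn from Beta(alpha, alpha).

    obs_i and obs_j are flat sequences of floats of equal length
    (hypothetical format; real observations would be image arrays).
    """
    lam = rng.betavariate(alpha, alpha)
    # Interpolate the observations element-wise.
    mixed_obs = [lam * a + (1.0 - lam) * b for a, b in zip(obs_i, obs_j)]
    # Interpolate the reward with the same coefficient.
    mixed_rew = lam * rew_i + (1.0 - lam) * rew_j
    return mixed_obs, mixed_rew, lam


# Example: mix two toy transitions using a seeded generator.
rng = random.Random(0)
obs, rew, lam = mixreg_pair([0.0, 1.0], [1.0, 0.0], 1.0, 0.0, rng=rng)
```

With a small `alpha`, most draws of `lam` fall near 0 or 1, so a mixed sample usually stays close to one of its two sources.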

Use:

from MIXREG_ImpalaCnn import ImpalaCnn
from mixreg import MIXREG

# `env` is an environment created beforehand, as it would be for PPO2.
# MIXREG is used exactly like PPO2.
model = MIXREG(ImpalaCnn, env, verbose=0, n_steps=2048, nminibatches=8)

Results:

Performance of the Impala CNN compared to the Nature CNN as the feature extractor in the base PPO2 model on the FruitBot environment:

[Figure: Impala CNN vs. Nature CNN in base PPO2 on FruitBot]

This implementation of MixReg outperforms base PPO2 in generalization ability when trained on a limited number of FruitBot levels:

[Figures: MixReg vs. base PPO2 on FruitBot with limited training levels]

Source: https://github.com/KristofPusztai/CS-W182-Final-Project