Model Registry

Trained with the improved reward function: DISTANCE_PENALTY=4, MINOR_SAFETY_PENALTY=1, MAJOR_SAFETY_PENALTY=5. No noise during training.

Same reward function. Trained with unbiased (zero-mean) noise, standard deviation 0.1.

Same reward function. Trained with unbiased (zero-mean) noise, standard deviation 1.0.
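
The three entries above differ only in the training noise level, so they can be recorded as structured configs. The following is a minimal sketch, not the project's actual code: `TrainingConfig`, `REGISTRY`, and `add_noise` are hypothetical names, and it assumes the noise is Gaussian and is added to a vector of values (for example observations). The log itself states only that the noise is unbiased and gives its standard deviation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical record for one registry entry (names are illustrative)."""
    distance_penalty: float = 4.0
    minor_safety_penalty: float = 1.0
    major_safety_penalty: float = 5.0
    noise_std: float = 0.0  # 0.0 means no noise during training


# The three logged models share the reward settings and differ in noise_std.
REGISTRY = [
    TrainingConfig(noise_std=0.0),
    TrainingConfig(noise_std=0.1),
    TrainingConfig(noise_std=1.0),
]


def add_noise(values, noise_std, rng):
    """Return a copy of values with zero-mean noise added.

    Assumption: the noise is Gaussian; the log only says it is unbiased.
    """
    if noise_std == 0.0:
        return list(values)
    return [v + rng.gauss(0.0, noise_std) for v in values]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for cfg in REGISTRY:
        print(cfg.noise_std, add_noise([1.0, 2.0, 3.0], cfg.noise_std, rng))
```

Keeping the penalties as defaults makes the noise level the only field that changes between entries, which mirrors how the log describes the second and third models.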