In this project, we tried two learning algorithms for hierarchical RL on the Taxi-v3 environment from OpenAI Gym, SMDP Q-Learning and Intra-Option Q-Learning, and contrasted them with two other methods that involve hardcoding based on human understanding. We conclude that, for this problem, the solutions learnt by machine are far superior to those designed by humans. Intra-Option Q-Learning outperforms SMDP Q-Learning because it makes better use of the SARS samples (similar to experience replay). Our algorithms even outperform the Hardcoded Agent. We also demonstrated the strong effectiveness of state compression on model performance.
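To illustrate why Intra-Option Q-Learning can extract more from each SARS sample, here is a minimal sketch of the two tabular update rules in their textbook form. It is not taken from this repository: the function names, the array layout (`Q[state, option]`, `option_action[option, state]`, `option_beta[option, state]`), the default hyperparameters, and the assumption of deterministic option policies are all illustrative.

```python
import numpy as np


def smdp_q_update(Q, s, o, rewards, s_next, alpha=0.1, gamma=0.9):
    """SMDP Q-learning: one update per *completed* option.

    `rewards` holds the primitive rewards collected while option `o`
    ran from state `s` until it terminated in `s_next` (hypothetical layout).
    """
    k = len(rewards)
    # Discounted return accumulated over the k primitive steps of the option.
    discounted_return = sum(gamma ** i * r for i, r in enumerate(rewards))
    target = discounted_return + gamma ** k * Q[s_next].max()
    Q[s, o] += alpha * (target - Q[s, o])


def intra_option_q_update(Q, s, a, r, s_next, option_action, option_beta,
                          alpha=0.1, gamma=0.9):
    """Intra-option Q-learning: one update per *primitive* step.

    Every option whose (assumed deterministic) policy would also have
    chosen action `a` in state `s` is updated from the same (s, a, r, s') sample.
    """
    consistent = option_action[:, s] == a        # options that agree with the action taken
    beta = option_beta[:, s_next]                # termination probability of each option in s'
    # Value on arrival in s': continue the option, or terminate and act greedily.
    arrival_value = (1.0 - beta) * Q[s_next] + beta * Q[s_next].max()
    Q[s, consistent] += alpha * (r + gamma * arrival_value[consistent] - Q[s, consistent])
```

The SMDP rule touches a single `Q[s, o]` entry only once an option has finished, whereas the intra-option rule can update several options from every primitive transition, which is the sense in which it reuses samples more thoroughly.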
showman-sharma/taxi-v3-learning