After following University College London's (UCL) excellent course on reinforcement learning, I decided to write some implementations to illustrate each lesson and highlight key points (the course videos are the best I have found on reinforcement learning). I strongly suggest watching the videos and also reading the reinforcement learning "bible" by Sutton & Barto: the latest edition (2018) contains new insights into accomplishments up to that date, such as TD-Gammon, Samuel's checkers player, AlphaGo and AlphaGo Zero, that are really worth your time.
While following the UCL lessons, I implemented some of the situations they present to check my own understanding of reinforcement learning. This is how the folder is organized: for each lesson, I present the code that implements the class examples. The goal is to complement the lectures with my own scripts.
With this newly acquired experience, I also applied reinforcement learning to some problems of my own to see whether it could solve those environments. For instance, I tried to optimize elevator control so as to minimize users' waiting time, an implementation I am still working on.
I don't claim to be a reinforcement learning expert, but following this course and reading some of the main references made me see that there is a different way to solve many everyday optimization problems. It showed me that human policies are not optimal in every domain and that reinforcement learning provides a way to learn how best to deal with a problem. We only tell the algorithm what it has to accomplish; we don't tell it how to do it.
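To make that last point concrete, here is a minimal sketch. It is not taken from the course or from the scripts in this folder: the five-cell corridor, the `step` function and all hyperparameters are assumptions chosen purely for illustration. The only thing the agent is told is the reward (1 on reaching the last cell, 0 otherwise). Tabular Q-learning then works out on its own which way to move.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
GOAL = N_STATES - 1


def step(state, action):
    """Toy environment (an assumption): move, stay inside the corridor,
    and receive a reward only when the goal cell is reached."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit, breaking ties at random so early episodes still terminate
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless terminal
            future = 0.0 if done else gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in every non-terminal cell according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Nothing in the code says "go right": the learned policy moves right in every cell only because that is what maximizes the discounted reward.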