The main goal of this project is to train trading agents with reinforcement learning (RL).
Usage (Q-learning table agent): set the parameters as you like. The first parameter is the size of the sliding time window; the second is the number of levels used to discretize the state space; the third is the number of episodes. For example:
python run_this.py ^^GSPC 5 6 2000
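As a rough sketch of how these three parameters shape a Q-learning table agent (the synthetic prices, reward definition, and every name below are assumptions for illustration, not the actual run_this.py code):

    import numpy as np

    np.random.seed(0)
    prices = np.cumsum(np.random.randn(500)) + 100.0    # synthetic stand-in for the ^GSPC series

    window, n_levels, n_episodes = 5, 6, 2000            # the three command-line parameters
    actions = [0, 1, 2]                                  # hold, buy, sell (assumed action set)

    def encode_state(window_returns, n_levels):
        """Discretize each return in the window into one of n_levels buckets,
        then pack the buckets into a single integer state index."""
        bins = np.linspace(-0.02, 0.02, n_levels - 1)    # assumed return range for the mesh
        levels = np.digitize(window_returns, bins)       # each value in [0, n_levels - 1]
        state = 0
        for lv in levels:
            state = state * n_levels + int(lv)
        return state

    # The table has n_levels ** window rows, so the fineness of the mesh directly
    # controls how much data each cell sees.
    Q = np.zeros((n_levels ** window, len(actions)))
    alpha, gamma, eps = 0.1, 0.95, 0.1
    returns = np.diff(prices) / prices[:-1]

    for _ in range(n_episodes):
        for t in range(window, len(returns) - 1):
            s = encode_state(returns[t - window:t], n_levels)
            a = np.random.choice(actions) if np.random.rand() < eps else int(Q[s].argmax())
            r = returns[t] * (1 if a == 1 else -1 if a == 2 else 0)   # toy reward
            s_next = encode_state(returns[t - window + 1:t + 1], n_levels)
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])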
Usage (DQN agent): the first parameter is the size of the sliding time window; the second is the number of episodes. For example:
python run_this_dqn.py ^^GSPC 5 2000
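For orientation, a minimal sketch of the core DQN training step (experience replay plus a target network), written with PyTorch purely for illustration; the network shape, reward handling, and names are assumptions, not the actual run_this_dqn.py code:

    import random
    from collections import deque
    import numpy as np
    import torch
    import torch.nn as nn

    window = 5                                   # first command-line parameter
    q_net = nn.Sequential(nn.Linear(window, 64), nn.ReLU(), nn.Linear(64, 3))
    target_net = nn.Sequential(nn.Linear(window, 64), nn.ReLU(), nn.Linear(64, 3))
    target_net.load_state_dict(q_net.state_dict())
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    buffer = deque(maxlen=10000)                 # (state, action, reward, next_state) tuples
    gamma, batch_size = 0.95, 32

    def train_step():
        """One gradient step on a random minibatch from the replay buffer."""
        if len(buffer) < batch_size:
            return
        s, a, r, s2 = map(np.array, zip(*random.sample(buffer, batch_size)))
        s = torch.tensor(s, dtype=torch.float32)
        s2 = torch.tensor(s2, dtype=torch.float32)
        a = torch.tensor(a, dtype=torch.int64).unsqueeze(1)
        r = torch.tensor(r, dtype=torch.float32)
        q = q_net(s).gather(1, a).squeeze(1)              # Q(s, a) for the actions taken
        with torch.no_grad():
            target = r + gamma * target_net(s2).max(1).values
        loss = nn.functional.mse_loss(q, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()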
Usage (policy gradient agent): the first parameter is the size of the sliding time window; the second is the number of episodes. For example:
python run_this_pg.py ^^GSPC 5 2000
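The policy gradient agent optimizes expected return directly; a minimal REINFORCE-style sketch of that update (again PyTorch, with assumed names and hyperparameters rather than the actual script's code):

    import torch
    import torch.nn as nn

    window, gamma = 5, 0.95
    policy = nn.Sequential(nn.Linear(window, 64), nn.Tanh(), nn.Linear(64, 3))  # action logits
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

    def update(states, actions, rewards):
        """One policy gradient update from a full episode."""
        returns, g = [], 0.0
        for r in reversed(rewards):                       # discounted return-to-go
            g = r + gamma * g
            returns.append(g)
        returns = torch.tensor(list(reversed(returns)), dtype=torch.float32)
        returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # variance reduction
        states = torch.tensor(states, dtype=torch.float32)
        actions = torch.tensor(actions, dtype=torch.int64)
        log_probs = torch.distributions.Categorical(logits=policy(states)).log_prob(actions)
        loss = -(log_probs * returns).mean()              # maximize expected return
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()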
Usage (actor-critic agent): the first parameter is the size of the sliding time window; the second is the number of episodes. For example:
python run_this_AC.py ^^GSPC 5 2000
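The actor-critic agent replaces the full-episode return with a TD error from a learned critic, which acts as the advantage for the actor; a one-step sketch (PyTorch, assumed names and hyperparameters):

    import torch
    import torch.nn as nn

    window, gamma = 5, 0.95
    actor = nn.Sequential(nn.Linear(window, 64), nn.Tanh(), nn.Linear(64, 3))   # action logits
    critic = nn.Sequential(nn.Linear(window, 64), nn.Tanh(), nn.Linear(64, 1))  # state value V(s)
    opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

    def step(s, a, r, s_next, done):
        """One online actor-critic update from a single transition."""
        s = torch.tensor(s, dtype=torch.float32)
        s_next = torch.tensor(s_next, dtype=torch.float32)
        v = critic(s)
        v_next = critic(s_next).detach()
        td_error = r + gamma * v_next * (0.0 if done else 1.0) - v    # advantage estimate
        log_prob = torch.distributions.Categorical(logits=actor(s)).log_prob(torch.tensor(a))
        loss = -log_prob * td_error.detach() + td_error.pow(2)        # actor loss + critic loss
        opt.zero_grad()
        loss.backward()
        opt.step()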
In fact, these are just toy models, and their performance is poor. In the Q-learning table experiment, performance depends heavily on how finely the state space is discretized. In the policy gradient experiment, the training process struggles to converge.
I plan to try Proximal Policy Optimization (PPO) with an actor-critic model in the near future.
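For reference, the clipped surrogate loss that PPO optimizes, sketched here with the usual clip ratio of 0.2 assumed:

    import torch

    def ppo_clip_loss(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
        """PPO clipped surrogate loss: limits how far the updated policy can move
        from the policy that collected the data, which usually stabilizes training."""
        ratio = torch.exp(log_prob_new - log_prob_old)          # pi_new(a|s) / pi_old(a|s)
        clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
        return -torch.min(ratio * advantage, clipped * advantage).mean()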
References:
Reinforcement_Learning_For_Stock_Prediction
Reinforcement-learning-with-tensorflow