[RL] Improve reward function #11

mjuchli · 2018-03-10T21:58:12Z

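Instead of using (p_0 - vwap_t) as the reward, compare against p_0 - (max([p_0; p_t]) + min([p_0; p_t])) / 2, normalized to the range [-1, 1]. That is, measure p_0 against the midpoint of the highest and lowest price over [p_0; p_t]. This should give a stable reward for any kind of price fluctuation.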
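A minimal sketch of the proposed formula, under a few assumptions that the issue does not spell out: `[p_0; p_t]` is read as the sequence of prices observed from step 0 to step t, and the normalization divides by half the price range, which bounds the result to [-1, 1] because p_0 always lies between the minimum and the maximum. The function name `midpoint_reward` is hypothetical and not taken from the repository. The sign convention (whether a positive value is favorable) depends on the trade side and is left open here.

```python
from typing import Sequence


def midpoint_reward(prices: Sequence[float]) -> float:
    """Reward of p_0 relative to the midpoint of the observed price range.

    `prices` is assumed to hold p_0 ... p_t in chronological order.
    Returns a value in [-1, 1]; 0.0 if the price never moved.
    """
    if not prices:
        raise ValueError("prices must contain at least p_0")

    p_0 = prices[0]
    highest = max(prices)
    lowest = min(prices)

    # Assumed normalization: half the range is the largest possible
    # distance between p_0 and the midpoint, so the ratio stays in [-1, 1].
    half_range = (highest - lowest) / 2.0
    if half_range == 0.0:
        return 0.0  # flat price series: no information, neutral reward

    midpoint = (highest + lowest) / 2.0
    return (p_0 - midpoint) / half_range


if __name__ == "__main__":
    print(midpoint_reward([100.0, 90.0, 80.0]))   # p_0 is the maximum
    print(midpoint_reward([100.0, 110.0, 120.0])) # p_0 is the minimum
    print(midpoint_reward([100.0, 110.0, 90.0]))  # p_0 is the midpoint
```

Because the result depends only on where p_0 sits inside the observed range, the reward has the same scale whether prices moved by cents or by dollars.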

mjuchli added the enhancement label Apr 14, 2018
