- Introduces DQNs for control from raw-pixel inputs, which use Bellman updates to learn an action-value function and select actions off-policy from that learned action-value function. Demonstrates how experience replay gives the algorithm a more even, decorrelated distribution of experience (e.g. avoiding "feedback loops"); a minimal sketch of the update follows the links below.
- Learns to play a large number of Atari games, but the algorithm is extremely sample-inefficient.
- Papers:
- Blog posts:
- http://neuro.cs.ut.ee/demystifying-deep-reinforcement-learning/
- https://rubenfiszel.github.io/posts/rl4j/2016-08-24-Reinforcement-Learning-and-DQN.html
- https://medium.com/@awjuliani
- https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/
- https://vmayoral.github.io/robots,/ai,/deep/learning,/rl,/reinforcement/learning/2016/08/07/deep-convolutional-q-learning/
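- A minimal sketch of the core DQN update: a uniform replay buffer plus a one-step Bellman target against a frozen target network. The network shape, hyperparameters, and PyTorch setup are illustrative assumptions, not the paper's exact configuration:

```python
# Minimal DQN sketch: uniform experience replay + one-step Bellman target.
# Sizes and hyperparameters are illustrative, not the paper's exact setup.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F


class QNet(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)  # Q(s, a) for every discrete action


replay = deque(maxlen=100_000)           # uniform replay buffer
q_net, target_net = QNet(4, 2), QNet(4, 2)
target_net.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99


def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    # Sampling uniformly from old experience breaks temporal correlations
    # and avoids the feedback loops of training only on recent transitions.
    batch = random.sample(replay, batch_size)
    s, a, r, s2, done = map(torch.as_tensor, zip(*batch))
    s, s2, r, done = s.float(), s2.float(), r.float(), done.float()

    q_sa = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bellman target: r + gamma * max_a' Q_target(s', a')
        target = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    loss = F.smooth_l1_loss(q_sa, target)

    opt.zero_grad()
    loss.backward()
    opt.step()
```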
- Introduces prioritized experience replay, an improved version of the experience replay strategy used in the DQN paper. In prioritized experience replay, transitions in the replay buffer are sampled with probability weighted by their TD error, which measures how surprising the transition was; a sketch follows below.
- Shows improved performance on the Arcade Learning Environment (ALE).
- Papers:
- Other:
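- A minimal sketch of proportional prioritization, where sampling probability is proportional to |TD error|^alpha. The alpha/epsilon constants are illustrative, and the paper's importance-sampling correction and sum-tree structure are omitted for brevity:

```python
# Sketch of a proportional prioritized replay buffer: transitions are sampled
# with probability proportional to |TD error|^alpha and priorities are updated
# after each learning step. Constants are illustrative assumptions.
import numpy as np


class PrioritizedReplay:
    def __init__(self, capacity=100_000, alpha=0.6, eps=1e-5):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities, self.pos = [], [], 0

    def add(self, transition):
        # New transitions get the current max priority so they are seen at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        probs = np.asarray(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        return idx, [self.data[i] for i in idx]

    def update(self, idx, td_errors):
        # More surprising transitions (larger TD error) get replayed more often.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(float(err)) + self.eps
```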
- Continuous-action version of DQN that parameterizes the Q-function so that argmax over actions can be computed efficiently in closed form; see the sketch below.
- Papers:
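- A sketch of one way to make argmax(Q) cheap for continuous actions (the normalized-advantage-function idea this entry appears to describe): restrict the advantage to a quadratic in the action, so the maximizer is just the network's mu(s). The diagonal P(s) and layer sizes below are simplifying assumptions:

```python
# Sketch: if Q(s, a) = V(s) - 0.5 * (a - mu(s))^T P(s) (a - mu(s)) with P(s)
# positive definite, then argmax_a Q(s, a) = mu(s) with no inner optimization.
# Diagonal P(s) and layer sizes are simplifying assumptions.
import torch
import torch.nn as nn


class NAFHead(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, act_dim)        # greedy (maximizing) action
        self.value = nn.Linear(hidden, 1)           # state value V(s)
        self.log_diag = nn.Linear(hidden, act_dim)  # log of diag(P(s))

    def forward(self, obs, action):
        h = self.body(obs)
        mu, v = self.mu(h), self.value(h).squeeze(-1)
        p_diag = self.log_diag(h).exp()             # positive (diagonal) P(s)
        diff = action - mu
        advantage = -0.5 * (p_diag * diff * diff).sum(-1)
        return v + advantage                        # Q(s, a)

    def argmax_action(self, obs):
        # The maximizing action is simply mu(s).
        return self.mu(self.body(obs))
```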