Trains a vision-based agent with a Deep Q Learning Network (DQN) in Atari's Breakout environment, implemented in TensorFlow.

andi611/DQN-Deep-Q-Network-Atari-Breakout-Tensorflow

Reinforcement Learning: Deep Q Learning Network (DQN) Agent playing Atari Breakout

  • Trains a vision-based agent with a Deep Q Learning Network (DQN) in Atari's Breakout environment, implemented in TensorFlow.

Environment

  • < Python 3.7 >
  • < OpenAI Gym >
    • Install the OpenAI Gym Atari environment: $ pip3 install opencv-python gym "gym[atari]"
    • Atari environment used: BreakoutNoFrameskip-v4
  • < Tensorflow r.1.12.0 >

Implementation

  • Deep Q Learning Network with the following improvements:
    • Experience Replay
    • Fixed Target Q-Network
    • TD error loss function with: Qtarget = reward + (1 - terminal) * (gamma * Qmax(s'))
  • DQN network Settings (in agent_dqn.py):
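
As a rough illustration of the improvements listed above (this is not the code or the settings from agent_dqn.py, which are not reproduced here), the replay memory and the TD target can be sketched in plain NumPy. The names `ReplayBuffer` and `td_target`, the buffer layout, and the default `gamma=0.99` are assumptions made for this example only.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, terminal)."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, terminal):
        self.buffer.append((state, action, reward, next_state, terminal))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up temporal correlation.
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, terminals = zip(*batch)
        return (np.array(states), np.array(actions),
                np.array(rewards, dtype=np.float32),
                np.array(next_states),
                np.array(terminals, dtype=np.float32))

    def __len__(self):
        return len(self.buffer)


def td_target(rewards, terminals, next_q_values, gamma=0.99):
    """Qtarget = reward + (1 - terminal) * (gamma * Qmax(s')).

    next_q_values has shape (batch, n_actions) and is assumed to come from
    the fixed target network rather than the online network.
    """
    q_max = next_q_values.max(axis=1)
    return rewards + (1.0 - terminals) * gamma * q_max
```

The `(1 - terminal)` factor zeroes the bootstrapped term on the last step of an episode, so the target there reduces to the reward alone. With a fixed target Q-network, `next_q_values` would be computed from a separate copy of the weights that is refreshed only occasionally.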

File Description

.
├── ./
|   ├── agent_dqn.py ─────────── DQN model
|   ├── atari_wrapper.py ─────── Atari wrapper
|   ├── environment.py ───────── Gym wrapper
|   ├── runner.py ────────────── Main program for training and testing
|   └── Readme.md ────────────── This file
└── model/
    ├── dqn_learning_curve_compare.png ──────── Figure 1
    ├── dqn_best_setting.png ────────────────── Figure 2
    ├── dqn_learning_curve.png ──────────────── Figure 3
    ├── checkpoint ──────────────────────────── Tensorflow model checkpoint
    ├── model_dqn-25581.data-00000-of-00001 ─── Tensorflow model data
    ├── model_dqn-25581.meta ────────────────── Tensorflow model meta
    └── model_dqn-25581.index ───────────────── Tensorflow model index

Usage

  • Training the DQN Agent: $ python3 runner.py --train_dqn
  • Testing the DQN Agent: $ python3 runner.py --test_dqn
  • Testing the DQN Agent with gameplay rendering: $ python3 runner.py --test_dqn --do_render
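
For readers wondering how the three flags above combine, here is a minimal sketch of how they could be parsed with `argparse`. It assumes each flag is a plain boolean switch; the function name `parse_args` and the help strings are hypothetical, and the actual runner.py may handle its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line switches shown in the Usage list (illustrative)."""
    parser = argparse.ArgumentParser(description="DQN runner (illustrative sketch)")
    # Assumption: each option is an on/off switch with no value.
    parser.add_argument("--train_dqn", action="store_true",
                        help="train the DQN agent")
    parser.add_argument("--test_dqn", action="store_true",
                        help="evaluate a trained agent")
    parser.add_argument("--do_render", action="store_true",
                        help="render gameplay while testing")
    return parser.parse_args(argv)


# Equivalent to: python3 runner.py --test_dqn --do_render
args = parse_args(["--test_dqn", "--do_render"])
```

In this sketch `--do_render` only has an effect alongside `--test_dqn`, matching the way the flags are paired in the commands above.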

Learning Curve

  • Single learning curve:
  • With different plotting windows:
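
One common way to produce curves like these is to average per-episode rewards over a sliding window before plotting. The sketch below assumes that "plotting window" means the number of consecutive episodes averaged per plotted point; the helper name `moving_average` is hypothetical and is not taken from this repository.

```python
import numpy as np


def moving_average(rewards, window):
    """Return the mean of each run of `window` consecutive episode rewards."""
    rewards = np.asarray(rewards, dtype=np.float64)
    if window < 1 or window > len(rewards):
        raise ValueError("window must be between 1 and len(rewards)")
    # A uniform kernel turns the convolution into a sliding mean;
    # mode="valid" keeps only positions where the window fits entirely.
    kernel = np.ones(window) / window
    return np.convolve(rewards, kernel, mode="valid")
```

A larger window gives a smoother but shorter curve, since the output has `len(rewards) - window + 1` points.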