The directory contains a Deep Q-Network architecture inspired by the well-known one developed by Mnih et al., which achieved human-expert-level results in several Atari 2600 games.
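For reference, a minimal PyTorch sketch of that architecture follows. The layer sizes match the Mnih et al. (2015) network; the input channel count (4 stacked frames) and the number of actions are assumptions for a Pong setup, not necessarily the exact values used in this repository.

```python
import torch.nn as nn

class DQN(nn.Module):
    """Convolutional Q-network following the Mnih et al. (2015) layout:
    three conv layers followed by two fully connected layers."""
    def __init__(self, in_channels=4, n_actions=6):  # 4 stacked frames; 6 actions is an assumption for Pong
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),  # 7x7 feature map assumes 84x84 inputs
            nn.Linear(512, n_actions),              # one Q-value per action
        )

    def forward(self, x):
        return self.head(self.features(x / 255.0))  # rescale raw pixels to [0, 1]
```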
Since training the network on these games can be tremendously time consuming, we decided to use the 'PongNoFrameskip-v4' environment provided by the gym library. This choice also ensures that the environment does not apply the built-in frame-skipping behaviour present in some gym environments. Therefore, to make sure the model receives the correct input, we implemented the skip_frames() function, which keeps one out of every 4 frames at each step (a sketch is given below). As noted, this function is useful in environments where frame skipping is not already performed.
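A minimal sketch of the frame-skipping idea follows, assuming the classic gym step API that returns a 4-tuple; the actual skip_frames() in the repository may use a different signature or implementation.

```python
def skip_frames(env, action, skip=4):
    """Repeat `action` for `skip` frames and return only the last observation.

    Rewards from the intermediate frames are accumulated so no learning
    signal is lost, while the agent only observes one frame out of `skip`.
    """
    total_reward = 0.0
    obs, done, info = None, False, {}
    for _ in range(skip):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:  # stop early if the episode ends mid-skip
            break
    return obs, total_reward, done, info
```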
Another important aspect is to correctly preprocess the input fed to the DQN. The solution is to define the functions process() and stack_frames(), which convert each original frame to a grayscale image and then resize it to the fixed square resolution expected by the network (84×84 in the original Mnih et al. setup).
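The following OpenCV sketch illustrates what these two functions might look like; the 84×84 resolution and the 4-frame stack size are assumptions borrowed from the Mnih et al. setup, and the real process() and stack_frames() may differ in detail.

```python
import cv2
import numpy as np
from collections import deque

def process(frame):
    """Convert an RGB Atari frame to grayscale and resize it to 84x84."""
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    return cv2.resize(gray, (84, 84), interpolation=cv2.INTER_AREA)

def stack_frames(stack, frame, is_new_episode, size=4):
    """Keep a deque of the last `size` processed frames and return them
    stacked along the channel axis, giving the network a sense of motion."""
    processed = process(frame)
    if is_new_episode:
        # At the start of an episode, fill the stack with copies of the first frame
        stack = deque([processed] * size, maxlen=size)
    else:
        stack.append(processed)
    return stack, np.stack(stack, axis=0)  # shape: (size, 84, 84)
```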
The results obtained during the training phase are overall satisfactory, as shown in the following plot, where a moving average has been applied to depict the trend of the score with respect to the number of games played.
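As an illustration, the smoothing can be done with a simple NumPy convolution; the window size of 100 games is an arbitrary choice for this sketch, not necessarily the one used for the plot.

```python
import numpy as np
import matplotlib.pyplot as plt

def moving_average(scores, window=100):
    """Smooth the per-game score curve with a simple moving average."""
    return np.convolve(scores, np.ones(window) / window, mode="valid")

# scores: one total reward per game collected during training
# plt.plot(moving_average(scores))
# plt.xlabel("Games"); plt.ylabel("Score (moving average)")
```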
In addition, we provide a model trained on the 'PongNoFrameskip-v4' environment, which can be used with the test function to play as many games as desired. During the test phase it is also possible to render the games to see what is going on.
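A hypothetical test-phase loop, built on the sketches above, could look like the following; the checkpoint file name, the greedy evaluation policy, and the classic gym API are all assumptions, and the repository's actual test function may differ.

```python
import gym
import torch

# Hypothetical evaluation sketch: DQN and stack_frames are the helpers
# sketched above; 'dqn_pong.pt' is an assumed checkpoint name.
env = gym.make('PongNoFrameskip-v4')
model = DQN()
model.load_state_dict(torch.load('dqn_pong.pt'))
model.eval()

obs, done = env.reset(), False
stack, state = stack_frames(None, obs, is_new_episode=True)
while not done:
    env.render()  # watch the game live during the test phase
    with torch.no_grad():
        q_values = model(torch.from_numpy(state).unsqueeze(0).float())
    action = int(q_values.argmax(dim=1).item())  # greedy action selection
    obs, reward, done, _ = env.step(action)
    stack, state = stack_frames(stack, obs, is_new_episode=False)
```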