Unity Tennis environment

Introduction

This repository contains an implementation of PPO that solves the Unity Tennis environment. The implementation is based on existing implementations by Shangtong Zhang and Herimiaina Andria-Ntoanina. Mr. Zhang in particular maintains a DRL repository with modular implementations of several algorithms; be sure to check it out.

The set-up is as follows: two tennis paddles are trained to bounce the ball back and forth. A paddle receives a reward of +0.1 for bouncing the ball over the net, and a reward of -0.01 for letting it fall. At each timestep, each paddle receives its own 8-dimensional observation vector, which encodes the position and velocity of the ball and the paddle. The action space consists of two continuous variables: moving towards or away from the net, and jumping.

Solving the Environment

Each paddle maximises its reward by cooperating with its opponent, since a longer rally earns more reward for both. The optimal policy is therefore identical for each paddle. Because the observation vectors are symmetric, the experience of both paddles can be combined to train a single PPO policy that controls both paddles.
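
To illustrate the idea of one policy serving both paddles, here is a minimal sketch. The linear layer with a tanh output is a hypothetical stand-in for the policy network, not the architecture used in the notebook, and the assumption that actions lie in [-1, 1] is not stated in this README. The observation and action sizes are the ones given in the introduction.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM = 8, 2  # sizes stated in the introduction

# Hypothetical stand-in for the policy network: one linear layer plus tanh.
W = rng.normal(scale=0.1, size=(OBS_DIM, ACT_DIM))
b = np.zeros(ACT_DIM)

def shared_policy(observations):
    """Map a batch of observations (one row per paddle) to actions in [-1, 1]."""
    return np.tanh(observations @ W + b)

# Both paddles' observations are stacked into one batch of shape (2, 8)
# and passed through the same weights.
obs = rng.normal(size=(2, OBS_DIM))
actions = shared_policy(obs)
```

Because both rows pass through the same weights, swapping the two observations simply swaps the two actions, and every transition collected by either paddle can be used to update the same parameters.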

To solve the environment, the paddles must achieve an average score of +0.5 over 100 consecutive episodes, where the score for each episode is the maximum of the two paddles' scores in that episode.
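
The solving criterion can be sketched as a rolling check over episode scores. The function names below are hypothetical and are not taken from the notebook; only the threshold (+0.5), the window (100 episodes) and the max-over-paddles rule come from the description above.

```python
from collections import deque

import numpy as np

def episode_score(paddle_scores):
    """Episode score is the larger of the two paddles' total rewards."""
    return float(np.max(paddle_scores))

def first_solved_episode(per_episode_paddle_scores, target=0.5, window=100):
    """Return the 1-based episode at which the rolling mean over the last
    `window` episode scores first reaches `target`, or None if it never does."""
    recent = deque(maxlen=window)
    for i, scores in enumerate(per_episode_paddle_scores, start=1):
        recent.append(episode_score(scores))
        if len(recent) == window and np.mean(recent) >= target:
            return i
    return None
```

In a training loop, the same check would typically run once per episode on the scores accumulated so far, and training could stop as soon as it returns an episode number.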

Solution

We use the Proximal Policy Optimization (PPO) deep reinforcement learning algorithm to solve the environment.
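
For readers unfamiliar with PPO, its core is the clipped surrogate objective. The following is a minimal NumPy sketch of that objective only. It is not the notebook's training code, and the clipping range `clip_eps=0.2` is an assumed, commonly used default rather than the value used here.

```python
import numpy as np

def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximised).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    """
    ratio = np.exp(np.asarray(new_log_probs) - np.asarray(old_log_probs))
    advantages = np.asarray(advantages)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Taking the elementwise minimum removes the incentive to move the
    # probability ratio outside [1 - clip_eps, 1 + clip_eps].
    return float(np.mean(np.minimum(unclipped, clipped)))
```

A full implementation would also need advantage estimation, a value-function loss and usually an entropy bonus, all of which are omitted from this sketch.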

Try it yourself!

Install Anaconda from here.
Install Unity ML-Agents using the instructions here.
Download the Tennis environment from one of the links below. You need only select the environment that matches your operating system:

Download the Tennis_PPO.ipynb notebook from this repository to train the agent. Follow these simple instructions.
Go to the relevant terminal and create a conda environment:

conda create -n myenv python=3.6

Activate the environment and start Jupyter Notebook:

conda activate myenv
jupyter notebook

Then open up the Tennis_PPO notebook and run it.

If you're not on a Mac, make sure to change the filename of the environment in the notebook to match the build you downloaded.
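
One way to avoid editing the filename by hand is to pick it from the current operating system. This is only a sketch: the helper `tennis_env_path` is hypothetical, and the file names in `ENV_FILES` are assumptions about how the downloaded builds are named, so replace them with the actual paths on your machine.

```python
import platform

# Assumed build file names; replace them with the paths of your download.
ENV_FILES = {
    "Darwin": "Tennis.app",
    "Linux": "Tennis_Linux/Tennis.x86_64",
    "Windows": "Tennis_Windows_x86_64/Tennis.exe",
}

def tennis_env_path(system=None):
    """Return the environment file for `system` (defaults to the current OS)."""
    system = system or platform.system()
    try:
        return ENV_FILES[system]
    except KeyError:
        raise ValueError(f"No Tennis build configured for {system!r}")
```

The returned string can then be used wherever the notebook sets the environment filename.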
