Amizorach edited this page Sep 13, 2020 · 4 revisions

Welcome to the TanksPowerUp wiki!

Introduction

Watching a computer learn and figure out on its own what needs to be done is not exactly a typical day in a programmer's life. I grew up on the notion that computers can do only what they were programmed to do in the first place. Yet when I started university almost 20 years ago and studied cognitive science, it became more than clear to me that the techniques we use to test and study how humans make decisions can easily be applied to a computer program. Essentially we were figuring out, or at least trying to figure out, why we do certain things, using highly simplified versions of the brain's hardware and some emerging software that runs on it. Neural networks, although only covered briefly back then, seemed magical: they seemed to let the computer do exactly what I intended without my explicitly writing the code for that exact operation. It took me a few years to realize that these neural networks, in their simple form, are nothing but a converging mathematical formula that is adjusted as much as needed until it delivers (or does not deliver) the required result.

With the emergence of simple open tools such as TensorFlow, it seemed only natural to play around with them and see what could be done. However, I found myself staring at running equations and trying to figure out what went wrong. And so a fully graphical interface that let me visualize how the computer learns, rather than how the results change (as can be viewed in tools like TensorBoard), was all that was needed to suck me in.

This is a basic tutorial. It assumes basic knowledge of Unity and sets out to let the reader start exploring the ML-Agents suite.

The tutorial covers how to get started writing code for ML-Agents, not how to install Anaconda and set up an ML-Agents project.

If you need installation help, I suggest checking out Unity's installation document and Unity's setup guide for Windows, as well as the many other good tutorials that can easily be found on the net.

What will we be creating

This is an introductory tutorial. You can find more complex scenarios and agents on other branches of this repository, and maybe later on I will write a more advanced tutorial covering the attempts made on those branches. For now we will explore the basic concepts that drive the ML-Agents platform. We are going to keep it simple. Still, since the fun in my eyes is exploring the learning process and being able to view it graphically, we will build some utilities that let us easily run different settings without changing the code.

Using a simplified model of Unity's tank from the Tank tutorial, we will create an agent that learns to drive around an area and collect power-ups, or die.

Now you might be asking yourself: why no shooting? Well, once again, we are keeping things simple for this tutorial.

So let's get started

Before diving into the project setup or writing any code, let's stop and take a look at the main ML-Agents extension objects we will be using.

What is ML-Agents?

"The Unity Machine Learning Agents Toolkit (ML-Agents Toolkit) is an open-source project that enables games and simulations to serve as environments for training intelligent agents." (From the ML-Agents Git repository.)

I will not go into detail on how ML-Agents works and what it does behind the scenes, but if you wish to keep learning I highly recommend checking out the link above to get a better understanding of the what and the how.

For this tutorial we will treat the whole underlying ML-Agents package as one black box: we run the black box using the mlagents-learn command and receive a usable output that can be fed to our GameObject in order to control a trained agent.

But first we need to understand some basic concepts that will be used throughout this tutorial.

Neural network: I will not explain what a neural network is in this tutorial. I find that grasping the details of how a neural network is designed and how it works can be misleading when first approaching machine learning through ML-Agents. So, without committing to the following definition, for this tutorial a neural network is a mathematical equation that takes an array of floats as input and produces actions as output. In addition, this equation is built so that its output is graded externally, and the equation adjusts itself to slowly converge towards a better grade.

Although this may sound mystical and complicated, I assure you it is not. However, since in this tutorial we will not focus on the neural network at all (it is but a black box to us), it is enough to understand that we will be feeding the NN input and receiving output.

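Learning phase: The phase in which the neural network is trained. The rewards sent by the agent grade the network so that it can converge towards a better grade.

Inference phase: The phase in which the trained network produced by the learning phase is fed to our GameObject and controls the agent.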

Agents: This is the core concept we will explore in this tutorial. Basically, an agent is a GameObject that sends input into the neural network and receives output, which it then converts into the actions it takes. During learning it also grades the neural network's performance by sending a reward: positive for good actions and negative for bad actions.

An agent therefore has the following 3 functions:

public override void CollectObservations(VectorSensor sensor);
public override void OnActionReceived(float[] vectorAction);
public void AddReward(float increment);

Every cycle the agent collects observations about the area around it and about its internal state, encodes them into a list of floats, and sends them to the neural network as input.

Later on the OnActionReceived callback is called, and the agent receives a list of actions in the form of a float array and carries them out.

Whenever it wants to, the agent can reward itself with either positive or negative rewards using AddReward. While in training, this grades the neural network so it can converge towards the highest grade.
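As a rough sketch of how these three functions fit together, an agent script might look something like the following. It assumes a Unity project with an ML-Agents package version that still uses the float[] form of OnActionReceived, a continuous action space of size 2, and a vector observation size of 3 set on the Behavior Parameters component. The class name TankAgentSketch, the powerUp, moveSpeed and turnSpeed fields, and all reward values are hypothetical and are not taken from this repository.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

// Hypothetical agent for illustration only; names and numbers are assumptions.
public class TankAgentSketch : Agent
{
    // Assumed to be assigned in the Inspector: the power-up the tank should reach.
    public Transform powerUp;
    public float moveSpeed = 5f;    // assumed units per second
    public float turnSpeed = 120f;  // assumed degrees per second

    // Input: encode what the agent knows as floats for the neural network.
    public override void CollectObservations(VectorSensor sensor)
    {
        // Position of the power-up relative to the tank (adds 3 floats).
        sensor.AddObservation(transform.InverseTransformPoint(powerUp.position));
    }

    // Output: the network's decision arrives as a float array and is carried out.
    public override void OnActionReceived(float[] vectorAction)
    {
        // Assumed layout: [0] = forward/backward, [1] = turn left/right.
        float forward = Mathf.Clamp(vectorAction[0], -1f, 1f);
        float turn = Mathf.Clamp(vectorAction[1], -1f, 1f);

        transform.Rotate(0f, turn * turnSpeed * Time.fixedDeltaTime, 0f);
        transform.Translate(Vector3.forward * forward * moveSpeed * Time.fixedDeltaTime);

        // Small negative reward every step (an assumed value, not from the source).
        AddReward(-0.001f);
    }

    // Grade: positive reward when the tank touches the power-up.
    // Assumes the power-up has a trigger collider and the tank has a Rigidbody.
    private void OnTriggerEnter(Collider other)
    {
        if (other.transform == powerUp)
        {
            AddReward(1f);
        }
    }
}
```

The small per-step penalty is one common way to nudge an agent towards collecting power-ups quickly; whether it actually helps here is something to experiment with.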

Learning environment: An agent lives in an environment. This environment may be static or dynamic and may or may not hold other agents. The agent reports information about the environment through the input of the neural network.

Setting up the environment
