Q-Learning is a strategy for reinforcement learning. This project is a generic Q-learning library that can be applied to a variety of domains.
Example implementations are provided for Tic-Tac-Toe, Frozen Lake, and Finger Chopsticks. Each demonstrates how Q-learning can find an optimal strategy when the state space is relatively small. Tic-Tac-Toe, for example, is a simple game with only a few thousand possible states (fewer if you account for symmetry). For domains where the state space is much larger, some approximation of the set of states is needed instead, such as a neural network.
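For intuition, here is a minimal, library-independent sketch of the tabular Q-value update that this style of learner performs. It is written in Scala, and the names (QTableSketch, alpha, gamma) are illustrative only, not this project's actual API.

```scala
import scala.collection.mutable

object QTableSketch {
  type State = String
  type Action = String

  // Q-values stored in a simple table keyed by (state, action), defaulting to 0.
  private val q = mutable.Map[(State, Action), Double]().withDefaultValue(0.0)

  // Standard Q-learning update:
  //   Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max over a' of Q(s', a') - Q(s, a))
  // alpha is the learning rate; gamma discounts future reward.
  def update(state: State, action: Action, reward: Double,
             nextState: State, nextActions: Seq[Action],
             alpha: Double = 0.1, gamma: Double = 0.9): Unit = {
    val bestNext = if (nextActions.isEmpty) 0.0
                   else nextActions.map(a => q((nextState, a))).max
    val old = q((state, action))
    q((state, action)) = old + alpha * (reward + gamma * bestNext - old)
  }
}
```

Because every (state, action) pair gets its own table entry, this approach is only practical when the state space is small, which is exactly the situation in the example games above.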
From a git-bash, Cygwin, cmd, or online IDE shell (such as Codenvy), run
git clone https://github.com/bb4/bb4-Q-learning.git (to clone the project repository locally)
cd bb4-Q-learning (move into the new local project directory)
./gradlew runTTT (to play Tic-Tac-Toe)
./gradlew runFrozenLake (to run the Frozen Lake demo)
./gradlew runChopsticks (to play Finger Chopsticks)
See my presentation to JLHS students
Below are some surface plots, created with Plotly, that show how well the Q-learning model learns in different domains. The axes on the base are epsilon and the number of learning trials (or episodes). More learning trials generally yield higher accuracy. The epsilon parameter controls the balance between random exploration and exploitation of knowledge learned so far: when epsilon is larger, each transition is more likely to be selected at random, leading to more exploration of the space. A sketch of epsilon-greedy action selection appears after the plot captions below.
Tic-Tac-Toe learning accuracy for different values of epsilon and number of trial runs.
Frozen Lake learning accuracy for different values of epsilon and number of trial runs.
Finger Chopsticks learning accuracy for different values of epsilon and number of trial runs.
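The snippet below is a minimal sketch of epsilon-greedy action selection, again in Scala and with illustrative names only (EpsilonGreedySketch is not part of this project's API). It shows how epsilon trades exploration against exploitation.

```scala
import scala.util.Random

object EpsilonGreedySketch {
  // With probability epsilon choose a random action (explore);
  // otherwise choose the action with the highest current Q-value (exploit).
  def selectAction(actions: Seq[String], qValue: String => Double,
                   epsilon: Double, rnd: Random = new Random()): String = {
    if (rnd.nextDouble() < epsilon) actions(rnd.nextInt(actions.length))
    else actions.maxBy(qValue)
  }
}
```

A larger epsilon means the random branch is taken more often, which explores more of the state space but exploits learned values less; this is the trade-off visible along the epsilon axis of the plots above.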