Implementation of a Q-learning, ϵ-greedy agent that learns how to play the game with the other agents it is connected to.
Updated Sep 11, 2023 - Python
Multi-armed bandit implementation using the Jester dataset.
Analysis of various multi-armed bandit algorithms over normal and heavy-tailed distributions.
Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.
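For readers unfamiliar with the two algorithms named above, here is a minimal sketch of epsilon-greedy and UCB1 arm selection on a Bernoulli bandit. It is not taken from any of the listed repositories: the function names, the `run_bandit` helper, the default `epsilon=0.1`, and the choice of the UCB1 variant with bonus `sqrt(2 ln t / n)` are all assumptions made for illustration.

```python
import math
import random


def epsilon_greedy(counts, values, t, rng, epsilon=0.1):
    """With probability epsilon explore a random arm, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])


def ucb1(counts, values, t, rng=None):
    """Pick the arm with the highest upper confidence bound (UCB1)."""
    # Play every arm once before the bound is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(values)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run_bandit(select, probs, steps, seed=0):
    """Simulate a Bernoulli bandit; probs[a] is the (hidden) win rate of arm a."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)  # running mean reward per arm
    for t in range(1, steps + 1):
        arm = select(counts, values, t, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update avoids storing the reward history.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


if __name__ == "__main__":
    for name, policy in [("epsilon-greedy", epsilon_greedy), ("UCB1", ucb1)]:
        counts, values = run_bandit(policy, probs=[0.2, 0.5, 0.8], steps=2000)
        print(name, counts, [round(v, 2) for v in values])
```

Both policies share one signature so they can be swapped inside the same simulation loop. Epsilon-greedy explores at a fixed rate, while UCB1 explores arms in proportion to how uncertain their estimates still are.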
DQN agent with e-greedy / softmax policy, experience replay and target network.
A reinforcement learning project with two environments. The first is the taxi driver problem in a 4x4 space, solved with the simple Q-learning update rule; in this task, we compared the performance of the e-greedy policy and the Boltzmann policy. For the second environment, we chose LunarLander from OpenAI Gym. Fo…
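The comparison between an e-greedy policy and a Boltzmann (softmax) policy under the tabular Q-learning update can be sketched as follows. This is an illustrative sketch only, not code from the project described above: the function names, the `temperature` parameter, and the default values of `alpha` and `gamma` are assumptions.

```python
import math
import random


def epsilon_greedy_action(q_values, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def boltzmann_probs(q_values, temperature=1.0):
    """Softmax over Q-values; lower temperature means greedier behaviour."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_action(q_values, temperature, rng):
    """Sample an action in proportion to its softmax probability."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.99, done=False):
    """Tabular Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = reward if done else reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])


if __name__ == "__main__":
    rng = random.Random(0)
    q_row = [0.0, 1.0, 0.5]
    print("softmax probabilities:", boltzmann_probs(q_row, temperature=0.5))
    print("e-greedy pick:", epsilon_greedy_action(q_row, 0.1, rng))
    print("Boltzmann pick:", boltzmann_action(q_row, 0.5, rng))
```

The practical difference is that e-greedy explores uniformly over all actions, whereas the Boltzmann policy explores more often among actions whose current Q-values are close to the best one.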