Upper-Confidence-Bounds

I implemented the reinforcement learning model Upper Confidence Bound (UCB) in both Python and R.

To find out which of several ads appeals most to customers, we can use a reinforcement learning approach:

Suppose we have X ads to display to a customer when they connect to the web page.
Each time a user logs in, we count it as a round.
At each round n, we choose one ad to display to the user.
At each round n, ad i gives a reward Ri(n) ∈ {0, 1}: Ri(n) = 1 if the user clicked on the ad and 0 if the user didn't click.
Our goal is to maximize the total reward we get over many rounds.
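
The setup above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `upper_confidence_bound.py`: it assumes the common UCB1-style bound (average reward plus `sqrt(2 * ln(n + 1) / N_i)`), and the function name `run_ucb`, the click probabilities, and the simulated reward table are all hypothetical.

```python
import math
import random


def run_ucb(rewards):
    """Play a UCB1-style policy over a table of 0/1 rewards.

    rewards[n][i] is the reward ad i would give in round n.
    Returns the list of ads shown and the total reward collected.
    """
    num_ads = len(rewards[0])
    times_shown = [0] * num_ads   # how often each ad has been displayed
    reward_sums = [0] * num_ads   # clicks collected by each ad
    ads_shown = []
    total_reward = 0

    for n, round_rewards in enumerate(rewards):
        best_ad, best_bound = 0, -1.0
        for i in range(num_ads):
            if times_shown[i] == 0:
                bound = math.inf  # force every ad to be tried once
            else:
                mean = reward_sums[i] / times_shown[i]
                # Assumed exploration bonus: sqrt(2 * ln(n + 1) / N_i)
                bound = mean + math.sqrt(2 * math.log(n + 1) / times_shown[i])
            if bound > best_bound:
                best_ad, best_bound = i, bound
        reward = round_rewards[best_ad]
        times_shown[best_ad] += 1
        reward_sums[best_ad] += reward
        total_reward += reward
        ads_shown.append(best_ad)

    return ads_shown, total_reward


# Hypothetical simulation: three ads with made-up click probabilities.
rng = random.Random(0)
click_probs = [0.05, 0.10, 0.30]
simulated = [[int(rng.random() < p) for p in click_probs] for _ in range(2000)]
ads_shown, total_reward = run_ucb(simulated)
print("times each ad was shown:", [ads_shown.count(i) for i in range(3)])
print("total reward:", total_reward)
```

Ads that have never been shown get an infinite bound, so every ad is tried once before the averages start to matter. After that, the bonus term shrinks for ads that are shown often, which lets the policy concentrate on the ad with the highest observed click rate while still occasionally re-checking the others.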

Steps :

Comparison between UCB and Thompson Sampling :
