- DATA: 10,000 rows, each recording the arm chosen at random from 10 actions (articles), the reward (whether the article was clicked or not), and a 10-dimensional context vector for each arm.
- The algorithm chooses one of the articles as its recommendation and improves itself in an online manner.
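
The setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual implementation: the data is synthetic (`make_logged_rows` is a hypothetical stand-in for the real file), the policy is a simple epsilon-greedy linear model (`EpsilonGreedyLinear` is a hypothetical name, chosen because the algorithm is not named here), and evaluation uses replay, where the policy learns only from rows in which its choice matches the randomly logged arm.

```python
import random

N_ARMS, DIM = 10, 10  # sizes taken from the data description above


def make_logged_rows(n_rows, seed=0):
    """Synthetic stand-in for the logged data: (contexts, chosen_arm, reward)."""
    rng = random.Random(seed)
    # Hidden click model, used only to simulate rewards (an assumption).
    true_w = [rng.uniform(-1, 1) for _ in range(DIM)]
    rows = []
    for _ in range(n_rows):
        contexts = [[rng.random() for _ in range(DIM)] for _ in range(N_ARMS)]
        arm = rng.randrange(N_ARMS)  # logging policy: arm picked at random
        score = sum(w * x for w, x in zip(true_w, contexts[arm]))
        reward = 1 if score > 0 else 0  # clicked or not
        rows.append((contexts, arm, reward))
    return rows


class EpsilonGreedyLinear:
    """Hypothetical policy: one shared linear reward model, epsilon-greedy choice."""

    def __init__(self, dim, epsilon=0.1, lr=0.05, seed=0):
        self.w = [0.0] * dim
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def choose(self, contexts):
        # Explore with probability epsilon, otherwise pick the best-scoring arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(contexts))
        scores = [self.predict(x) for x in contexts]
        return max(range(len(contexts)), key=scores.__getitem__)

    def update(self, x, reward):
        # One stochastic-gradient step on the squared prediction error.
        err = reward - self.predict(x)
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]


def replay(policy, rows):
    """Replay evaluation: use a row only when the policy agrees with the log."""
    matched, clicks = 0, 0
    for contexts, logged_arm, reward in rows:
        if policy.choose(contexts) == logged_arm:
            matched += 1
            clicks += reward
            policy.update(contexts[logged_arm], reward)  # online update
    return matched, clicks


rows = make_logged_rows(10000)
policy = EpsilonGreedyLinear(DIM)
matched, clicks = replay(policy, rows)
print(f"matched rows: {matched}, clicks: {clicks}")
```

Because the logged arm is chosen at random, roughly one row in ten should match the policy's choice, and `clicks / matched` gives a replay estimate of its click-through rate.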