week05_explore

Open In Colab

Slides - here

Exploration and exploitation

  • [main] David Silver lecture on exploration and exploitation - video
  • Alternative lecture by J. Schulman - video
  • Alternative lecture by N. de Freitas (with bayesian opt) - video
  • Our lectures (in Russian)
    • "mathematical" lecture (by Alexander Vorobev) '17 - slides, video
    • "practical" lecture '18 - video
    • Seminar - video

More materials

  • Gittins Index - a less heuristic approach to bandit exploration - article
  • "Deep" version: variational information maximizing exploration - video
    • Same topics in Russian - video
  • Lecture covering intrinsically motivated reinforcement learning - video
    • Slides
    • Same topics in Russian - video
    • Note: UCB-1 is not specific to Bernoulli rewards: it holds for arbitrary rewards r in [0, 1], so you can scale any bounded reward to [0, 1] and use it with peace of mind. It is derived directly from Hoeffding's inequality (see the sketch after this list).
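
As noted above, UCB-1 applies to any reward bounded in [0, 1]. Below is a minimal sketch of the algorithm with the Hoeffding-style confidence bonus; the class name and structure are illustrative, not taken from the course materials:

import numpy as np

class UCB1:
    """UCB-1 bandit for rewards scaled to [0, 1] (illustrative sketch)."""
    def __init__(self, n_actions):
        self.counts = np.zeros(n_actions)  # number of pulls per arm
        self.sums = np.zeros(n_actions)    # total observed reward per arm
    def select(self):
        untried = np.flatnonzero(self.counts == 0)
        if len(untried) > 0:
            return int(untried[0])         # try every arm at least once
        means = self.sums / self.counts
        total = self.counts.sum()
        bonus = np.sqrt(2 * np.log(total) / self.counts)  # Hoeffding-based exploration bonus
        return int(np.argmax(means + bonus))
    def update(self, action, reward):
        self.counts[action] += 1           # reward is assumed to be pre-scaled to [0, 1]
        self.sums[action] += reward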

Seminar

In this seminar, you'll be solving basic and contextual bandits with uncertainty-based exploration methods such as Bayesian UCB and Thompson Sampling.
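
For reference, here is a minimal sketch of Thompson Sampling on a Bernoulli bandit with Beta posteriors; the function name and arm probabilities are hypothetical and not the seminar's code:

import numpy as np

def thompson_sampling(bandit_probs, n_steps=1000, seed=0):
    """Thompson Sampling with a Beta(1, 1) prior on each arm's success probability."""
    rng = np.random.RandomState(seed)
    alpha = np.ones(len(bandit_probs))  # posterior successes + 1
    beta = np.ones(len(bandit_probs))   # posterior failures + 1
    total_reward = 0.0
    for _ in range(n_steps):
        samples = rng.beta(alpha, beta)            # sample a plausible mean for each arm
        action = int(np.argmax(samples))           # act greedily on the sampled means
        reward = float(rng.rand() < bandit_probs[action])
        alpha[action] += reward
        beta[action] += 1.0 - reward
        total_reward += reward
    return total_reward

print(thompson_sampling([0.3, 0.5, 0.7]))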

You will also need Bayesian neural networks, which require Theano/Lasagne:

# either
conda install Theano
# or
pip install --upgrade https://github.com/Theano/Theano/archive/master.zip
# and then lasagne
pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
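
To verify the installation worked, a quick check (assuming a standard Python environment):

python -c "import theano, lasagne; print(theano.__version__, lasagne.__version__)"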

Everything else is in the notebook :)