- kernel_vi.py - Approximate Smooth Kernel Value Iteration
- kernel.py - Kernel definition
- gridworld_mdp.py - GridWorld domain
- plot.py - Plot performance metrics of one run
- plot.R - Plot performance metrics across runs
- kernel_vi.sh - Generates multiple runs across random seeds
Requirements: Python, NumPy, Matplotlib
The parameters are described in the help message:
python kernel_vi.py --help
- Value Iteration
python kernel_vi.py --plan plans/plan0.txt --random-slide 0.15 --opt-v plans/opt_v0_rew_5_rs_0.15.txt --max-iter 20 --plot metrics.png
- Approximate Value Iteration with the Bellman operator sampled at 10 states
python kernel_vi.py --plan plans/plan0.txt --random-slide 0.15 --opt-v plans/opt_v0_rew_5_rs_0.15.txt --max-iter 100 --s 10 --log-steps 10 --plot metrics.png
- Kernel Value Iteration with Neural Tangent Kernel
python kernel_vi.py --plan plans/plan0.txt --random-slide 0.15 --opt-v plans/opt_v0_rew_5_rs_0.15.txt --max-iter 20 --kernel --kernel-type ntk --plot metrics.png
- Approximate Kernel Value Iteration with Neural Tangent Kernel and the Bellman operator sampled at one state
python kernel_vi.py --plan plans/plan0.txt --random-slide 0.15 --opt-v plans/opt_v0_rew_5_rs_0.15.txt --max-iter 20 --s 1 --kernel --kernel-type ntk --plot metrics.png
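The variants above differ in where the Bellman operator is applied and in how the resulting residual is propagated to the value function. The following is a minimal NumPy sketch of that idea, not the code in kernel_vi.py. The names `bellman_backup`, `value_iteration`, `num_sampled` and `kernel` are hypothetical, the MDP is a small random one rather than the GridWorld domain, and the kernel update rule shown is an assumption chosen for illustration (an identity kernel reduces it to the tabular update).

```python
import numpy as np


def bellman_backup(P, R, V, gamma, states):
    """Bellman optimality backup evaluated only at the given states.

    P: transitions, shape (A, S, S); R: rewards, shape (S, A); V: values, shape (S,).
    """
    # Q[s, a] = R[s, a] + gamma * sum_n P[a, s, n] * V[n], for s in `states`
    Q = R[states] + gamma * np.einsum("asn,n->sa", P[:, states, :], V)
    return Q.max(axis=1)


def value_iteration(P, R, gamma=0.9, max_iter=100, num_sampled=None,
                    kernel=None, seed=0):
    """Value iteration with optional state sampling and kernel smoothing.

    num_sampled=None backs up every state (plain value iteration);
    otherwise that many distinct states are drawn at each iteration.
    kernel=None writes the backup into the sampled entries only;
    otherwise the residual is spread to all states through the
    (S, S) kernel matrix (an illustrative assumption).
    """
    rng = np.random.default_rng(seed)
    n_states = R.shape[0]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        if num_sampled is None:
            states = np.arange(n_states)
        else:
            states = rng.choice(n_states, size=num_sampled, replace=False)
        residual = bellman_backup(P, R, V, gamma, states) - V[states]
        if kernel is None:
            V = V.copy()
            V[states] += residual
        else:
            V = V + kernel[:, states] @ residual
    return V


# Small random MDP used only for this illustration.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)  # each row becomes a probability distribution
R = rng.random((n_states, n_actions))

V_full = value_iteration(P, R, max_iter=500)
V_sampled = value_iteration(P, R, max_iter=2000, num_sampled=2)
print(np.max(np.abs(V_full - V_sampled)))
```

In this sketch the sampled variant needs more iterations than the full one because each iteration updates only a subset of states, which is consistent with the larger --max-iter used in the sampled example above.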
Generates NUM_RUNS runs of Approximate Smooth Kernel Value Iteration across random seeds and saves the per-iteration performance metrics for each seed into the 'export' directory.
bash kernel_vi.sh 0 NUM_RUNS
Requirements: R, with the required packages installed via install.packages(c('ggplot2', 'reshape2', 'dplyr'))
Reads the 'export' directory from the previous step and generates plot.pdf in the current directory
Rscript plot.R
[1] Smirnova, Elena. On Convergence of Neural asynchronous Q-iteration. EWRL, 2022.