OfflineQRL

Offline Quantum Reinforcement Learning (QRL) Framework.

Algorithms:

  • DQN (used as the online behavior policy)
  • BCQ (offline policy; classical and quantum variants)

Environments:

  • OpenAI Gym CartPole

Commands

Examples of how to use the framework.

Fill a buffer by specifying a config file containing the environment config (-c), a path to a directory in which to save the buffer (-b), and the number of steps to collect (-s):

python main.py -m fill_buffer -c ./config/cart_pole_classical_BCQ.yml -b ./buffers -s 10000
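
The config files under ./config are YAML, so you can peek at one before running anything. The snippet below is a minimal sketch assuming PyYAML is installed; it makes no assumptions about the keys inside the file and simply prints whatever the file defines.

# Peek at an environment/algorithm config. Assumes PyYAML is
# installed (pip install pyyaml).
import yaml

with open("./config/cart_pole_classical_BCQ.yml") as f:
    config = yaml.safe_load(f)

print(config)  # nested dict of the settings the framework reads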

Alternatively, specify a path to an experiment containing a trained policy (-p), a path to a directory in which to save the buffer (-b), and the number of steps to collect (-s):

python main.py -m fill_buffer -p ./experiments/cart_pole_classical_DQN -b ./buffers -s 75

Train an offline policy by setting the mode (-m) to offline and specifying a configuration file for the algorithm etc. (-c) and the buffer to train on (-b):

python main.py -m offline -c ./config/cart_pole_quantum_BCQ.yml -b ./buffers/cart_pole_single_trajectory_100_steps20231005-092357.npz
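
Buffers are saved as NumPy .npz archives, so one way to sanity-check a buffer before training is to list its arrays. This uses only standard NumPy calls; the key names inside the archive are defined by the framework's buffer writer, so the snippet discovers them rather than assuming them.

# List the arrays stored in a saved replay buffer.
import numpy as np

buffer = np.load("./buffers/cart_pole_single_trajectory_100_steps20231005-092357.npz")
for key in buffer.files:
    print(key, buffer[key].shape)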

To continue training from a checkpoint, specify the path to the checkpoint (-r):

python main.py -m offline -c ./config/cart_pole_classical_BCQ.yml -b ./buffers/cart_pole_single_trajectory_100_steps20231005-092357.npz -r ./experiments/CHECKPOINT_NAME

Train an online behavior policy by setting the mode (-m) to online and specifying a configuration file for the algorithm etc. (-c):

python main.py -m online -c ./config/cart_pole_classical_DQN.yml

To evaluate an offline policy, set the mode (-m) to offline_eval and specify the number of evaluation episodes (--num_eval_eps) and the path to the model checkpoint (-r):

python main.py -m offline_eval --num_eval_eps 2 -r ./experiments/CHECKPOINT_NAME

To evaluate an online policy, set the mode (-m) to online_eval and specify the number of evaluation episodes (--num_eval_eps) and the path to the model checkpoint (-r):

python main.py -m online_eval --num_eval_eps 2 -r ./experiments/Classical_DQN_CartPole/checkpoints/CHECKPOINT_NAME

Customization

To implement your own VQC, create a class that inherits from VqcBase and implement your circuit architecture and observables there (see, e.g., src/models/skolikVqc).
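
Purely as an illustration, such a subclass might take the shape sketched below. VqcBase is from this repository, but the import path, the method names (build_circuit, observables), and the gate helpers are assumptions, not the actual interface; mirror src/models/skolikVqc for the real one.

# Hypothetical sketch only: the import path, method names, and gate
# helpers below are assumptions about the VqcBase interface.
from src.models.vqcBase import VqcBase

class MyVqc(VqcBase):
    """One layer of parameterized rotations followed by a ring of
    entangling gates (illustrative architecture)."""

    def build_circuit(self, params):
        for qubit in range(self.num_qubits):
            self.ry(params[qubit], qubit)  # single-qubit rotation
        for qubit in range(self.num_qubits):
            self.cnot(qubit, (qubit + 1) % self.num_qubits)  # entangle neighbors

    def observables(self):
        # E.g., one Pauli-Z expectation per action dimension.
        return [self.pauli_z(qubit) for qubit in range(self.num_qubits)]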

Furthermore, to add additional algorithms, create a class that inherits from Policy and override the corresponding abstract methods.
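
Likewise, a new algorithm could look roughly like the following. Policy is from this repository, but the import path and the abstract-method names shown (select_action, train_step) are placeholders; override whatever abstract methods Policy actually declares.

# Hypothetical sketch only: the import path and method names are
# assumptions about the Policy interface.
from src.policies.policy import Policy

class MyAlgorithm(Policy):
    """Skeleton for a new offline RL algorithm."""

    def select_action(self, state):
        # Return the action the current policy chooses for this state.
        raise NotImplementedError

    def train_step(self, batch):
        # Update the policy from one batch sampled from the buffer and
        # return logging metrics, e.g. {"loss": ...}.
        raise NotImplementedError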

Dependencies

[Image: dependency diagram]

Call Diagram

[Image: call diagram]

Class Diagram

[Image: class diagram]

License

Apache 2.0 License
