The configurable tree graph (CT-graph): measurable problems in partially observable and distal reward environments for lifelong reinforcement learning
Copyright (C) 2019-2023 Andrea Soltoggio, Pawel Ladosz, Eseoghene Ben-Iwhiwhu, Jeff Dick, Christos Peridis, Saptarshi Nath.
The CT-graph is a benchmark for deep RL built on an exponentially growing tree graph. It can be configured to arbitrarily high complexity to test the limits of, and break, any deep RL algorithm, with exact measurability of the search space, reward sparsity and other properties.
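To give a feel for what "exponentially growing" means here, the sketch below counts the leaves of a generic full tree and the chance of reaching one designated leaf by picking a branch uniformly at random at every decision point. It is an illustration only: the names `count_leaves` and `single_goal_sparsity` are hypothetical, the code does not use the gym_CTgraph package, and the assumption of a single rewarded leaf is made purely for the example.

```python
# Illustrative sketch only: a generic full tree, not the gym_CTgraph API.
# Assumption: exactly one leaf carries the reward and branches are chosen
# uniformly at random at every decision point.


def count_leaves(branching: int, depth: int) -> int:
    """Return the number of leaves of a full tree with the given branching factor and depth."""
    if branching < 1 or depth < 0:
        raise ValueError("branching must be >= 1 and depth must be >= 0")
    return branching ** depth


def single_goal_sparsity(branching: int, depth: int) -> float:
    """Return the probability of reaching one designated leaf under uniform random choices."""
    return 1.0 / count_leaves(branching, depth)


if __name__ == "__main__":
    # Print how quickly the number of leaves grows with depth.
    for depth in range(1, 6):
        leaves = count_leaves(2, depth)
        chance = single_goal_sparsity(2, depth)
        print(f"branching=2 depth={depth} leaves={leaves} p(random hit)={chance:.4f}")
```

With a branching factor of 2, each extra level of depth doubles the number of leaves and halves the chance of hitting the designated leaf by random choice.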
Objectives
The CT-graph is designed to assess the following RL properties:
- learning with variable and measurable degrees of partial observability;
- learning action sequences of adjustable length and increasing memory requirements;
- learning with adjustable and measurable sparsity of rewards;
- learning multiple tasks and testing speed of adaptation (lifelong learning scenarios);
- learning multiple tasks where the knowledge of task similarity is a required metric (meta-learning or multi-task learning);
- learning hierarchical knowledge representation and skill-reuse for fast adaptation to dynamic rewards (lifelong learning scenarios);
- testing attention mechanisms to identify key states from noise or confounding states;
- testing meta-learning approaches for optimised exploration policies;
- learning a model of the environment;
- learning a combination of innate and learned knowledge to cope with invariant and variant aspects of the environment.
Publications
The CT-graph has been used as a simulation tool in the following papers:
- Nath, Saptarshi, Christos Peridis, Eseoghene Ben-Iwhiwhu, Xinran Liu, Shirin Dora, Cong Liu, Soheil Kolouri, and Andrea Soltoggio (2023) "Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks." Second Conference on Lifelong Learning Agents (CoLLAs). arXiv preprint arXiv:2305.10997.
- Ben-Iwhiwhu, E., Nath, S., Pilly, P. K., Kolouri, S., & Soltoggio, A. (2022). Lifelong Reinforcement Learning with Modulating Masks. arXiv preprint arXiv:2212.11110.
- Ben-Iwhiwhu, E., Dick, J., Ketz, N. A., Pilly, P. K., & Soltoggio, A. (2022). Context meta-reinforcement learning via neuromodulation. Neural Networks, 152, 70-79. arXiv preprint
- Ladosz, Pawel et al., "Deep Reinforcement Learning With Modulated Hebbian Plus Q-Network Architecture," in IEEE Transactions on Neural Networks and Learning Systems, doi: 10.1109/TNNLS.2021.3110281. arXiv preprint
- Dick, Jeffery et al. "Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments." Frontiers in Neurorobotics, vol. 14, 578675. 23 Dec. 2020, doi:10.3389/fnbot.2020.578675
- Ben-Iwhiwhu, Eseoghene, et al. "Evolving inborn knowledge for fast adaptation in dynamic POMDP problems." Proceedings of the 2020 Genetic and Evolutionary Computation Conference. 2020. arXiv preprint
The CT-graph paper
Soltoggio, Andrea, Eseoghene Ben-Iwhiwhu, Christos Peridis, Pawel Ladosz, Jeffery Dick, Praveen K. Pilly, and Soheil Kolouri. "The configurable tree graph (CT-graph): measurable problems in partially observable and distal reward environments for lifelong reinforcement learning." arXiv preprint arXiv:2302.10887 (2023). https://arxiv.org/abs/2302.10887
@article{soltoggio2023configurable, title={The configurable tree graph (CT-graph): measurable problems in partially observable and distal reward environments for lifelong reinforcement learning}, author={Soltoggio, Andrea and Ben-Iwhiwhu, Eseoghene and Peridis, Christos and Ladosz, Pawel and Dick, Jeffery and Pilly, Praveen K and Kolouri, Soheil}, journal={arXiv preprint arXiv:2302.10887}, year={2023} }
Installation
pip install -e .
Instructions
Files:
- gym_CTgraph: folder with the CT-graph code.
- test_graph.py: script to perform basic tests of the CT-graph environments.
- testDimRed.py: script to perform checks on the input image dataset, e.g. dimensionality reduction and visualization with t-SNE.
- ilearn.py: simple script to perform classification on the input image dataset.
Using TensorBoard: tensorboard --logdir='./logs' --port=6707
Graphical representation of depth 1 and depth 2
Example of two CT-graphs with depth 2 and 3
Acknowledgement
This material is based upon work supported by the United States Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agency (DARPA) under Contract No. FA8750-18-C-0103 (L2M: Lifelong Learning Machines) and Contract No. HR00112190132 (ShELL: Shared Experience Lifelong Learning).
Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agency (DARPA).
License
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.