This repository provides the official implementation of our paper: Incremental Learning of Retrievable Skills for Efficient Continual Task Adaptation
Poster: NeurIPS 2024
#Incremental Learning #Imitation Learning #Skills #NeurIPS2024
Continual Imitation Learning (CiL) involves extracting and accumulating task knowledge from demonstrations across multiple stages and tasks to achieve a multi-task policy. With recent advancements in foundation models, there has been a growing interest in adapter-based CiL approaches, where adapters are introduced in a parameter-efficient way for newly demonstrated tasks. While these approaches effectively isolate parameters for different tasks—helping mitigate catastrophic forgetting—they often limit knowledge sharing across tasks.
We introduce IsCiL, an adapter-based CiL framework that addresses the limitation of knowledge sharing by incrementally learning shareable skills from different demonstrations. This enables sample-efficient task adaptation, especially in non-stationary CiL environments. In IsCiL, demonstrations are mapped into a state embedding space, where proper skills can be retrieved from a prototype-based memory. These retrievable skills are then incrementally refined on their own skill-specific adapters. Our experiments on complex tasks in Franka-Kitchen and MetaWorld demonstrate robust performance of IsCiL in both task adaptation and sample efficiency. Additionally, we provide a simple extension of IsCiL for task unlearning scenarios.
Implementation highlights:
- Incremental creation of skill-specific adapters.
- K-means is used to build skill bases, improving the accuracy of similarity searches between inputs and the corresponding skill.
- Evaluation is performed by applying each skill adapter to a pre-trained model, enabling effective handling of new or changing inputs.
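For intuition, the prototype-based retrieval described in the highlights above can be sketched as follows. This is an illustrative sketch only, not the repo's API (the actual implementation lives in `clus`); the function names, the deterministic K-means initialization, and all shapes are assumptions.

```python
import numpy as np

def build_skill_bases(embeddings, n_skills, n_iters=20):
    """Cluster state embeddings with plain Lloyd's K-means; the centers act as skill bases."""
    # simple deterministic init for illustration: evenly spaced samples as initial centers
    idx = np.linspace(0, len(embeddings) - 1, n_skills).astype(int)
    centers = embeddings[idx].copy()
    for _ in range(n_iters):
        # assign each embedding to its nearest center
        dists = np.linalg.norm(embeddings[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its assigned embeddings
        for k in range(n_skills):
            if np.any(labels == k):
                centers[k] = embeddings[labels == k].mean(axis=0)
    return centers

def retrieve_skill(state_embedding, centers):
    """Similarity search: index of the nearest skill base for a query embedding."""
    return int(np.linalg.norm(centers - state_embedding, axis=-1).argmin())
```

Each retrieved index would select the corresponding skill-specific adapter at evaluation time.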
We thank the authors of DMPEL; our method was also evaluated on the LIBERO environment following their setup.
- Installation
- Environment Setup
- Dataset and Environment Setup
- Running the Experiments
- Baselines and Implementation Details
- Evaluation
- Error Management and Troubleshooting
`clus` is the development-version name for IsCiL. All related implementations are located inside the `clus` directory.
- Python 3.10.13
- mujoco210
- Create and activate a conda environment using the `environment.yml` file:

  ```bash
  conda env create -f environment.yml
  conda activate iscil
  ```

- Verify successful activation:

  ```bash
  conda info --envs
  ```
We rely on a specific version of gym:

```bash
pip install setuptools==65.5.0 "wheel<0.40.0"  # prevents an error when installing gym 0.21.0
pip install gym==0.21.0
```

For more details on why the first line is required, see: Why is pip install gym failing?
In your project directory, run:

```bash
pip install -e .
```

Make sure the following environment variable is set:

```bash
export XLA_PYTHON_CLIENT_PREALLOCATE=false
```

If the steps above are completed, you can test the Kitchen environment directly. For Metaworld, follow the instructions below.
- Download the environment files from the following link: Google Drive
- After downloading:
  1. Unzip the file and locate the `IsCiL_Env/data` folder.
  2. Install each environment:

     ```bash
     # For mmworld
     cd IsCiL_Env/env/mmworld
     pip install -e .

     # For Metaworld
     cd ../Metaworld
     pip install -e .
     ```

  3. Return to the root directory if necessary.
The above steps cover both dataset and environment requirements (e.g., Kitchen, Metaworld). Make sure you have everything installed before proceeding.
- Download the `pre_trained_models.zip` file from the provided link: Download
- Move the file to `data` and unzip it.
- Check that the contents are properly placed in the `data/pre_trained_models` directory.
To run the IsCiL experiment, navigate to the `clus` directory and execute:

```bash
bash src/IsCiL.sh
```

This script launches the incremental learning process described in the paper.
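Conceptually, each stage of the incremental learning process maps incoming demonstrations into the state embedding space and either reuses a nearby skill adapter or creates a new one. A minimal sketch of that retrieve-or-create mechanic follows; the class name, the distance threshold, and the adapter placeholder are assumptions for illustration, not the repo's actual interface.

```python
import numpy as np

class SkillMemory:
    """Prototype memory sketch: retrieve a nearby skill adapter or create a new one."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.prototypes = []   # one state embedding per skill
        self.adapters = []     # placeholder for per-skill adapter parameters

    def retrieve_or_create(self, emb):
        emb = np.asarray(emb, dtype=float)
        if self.prototypes:
            dists = np.linalg.norm(np.stack(self.prototypes) - emb, axis=-1)
            k = int(dists.argmin())
            if dists[k] <= self.threshold:
                return k                       # reuse the existing skill adapter
        # no sufficiently close prototype: register a new skill
        self.prototypes.append(emb)
        self.adapters.append({"lora_dim": 4})  # illustrative adapter placeholder
        return len(self.adapters) - 1
```

Reused adapters are then refined on the new demonstrations, which is what enables knowledge sharing across tasks.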
IsCiL provides comprehensive evaluation capabilities to assess both standard task performance and generalization to unseen tasks. The evaluation framework measures continual learning metrics including Backward Transfer (BWT), Forward Transfer (FWT), and Area Under the Curve (AUC).
To evaluate trained models on tasks seen during training and create the log files:

```bash
python src/l2m_evaluator.py --save_id <experiment_id>
```

This will:

- Load saved models from each training stage
- Evaluate performance on all learned tasks
- Generate evaluation logs and metrics files

Example:

```bash
python src/l2m_evaluator.py --save_id HelloIsCiL_complete_0
```

Calculate continual learning metrics from the evaluation logs:

```bash
python clus/utils/metric.py -al <algo> -g <grep string>
```

The `-g` parameter filters the logs by the given grep string, which can be the experiment ID or any other identifier.
```text
========================================
Continual Learning Metrics (in %)
========================================
BWT (Backward Transfer): XX.XX%
FWT (Forward Transfer) : XX.XX%
AUC (Average Score) : XX.XX%
========================================
```
To test generalization on unseen tasks, use the pretrained models and evaluate them on data streams with unseen tasks:

```bash
python src/unseen/unseen_evaluator.py -e <env> -al <algo> -u <unseen_type> -id <evaluation_id>
```

Parameters:

- `-e`/`--env`: Environment (`kitchen` or `mmworld`)
- `-al`/`--algo`: Algorithm (`iscil`, `seq`, `ewc`, `mtseq`)
- `-u`/`--unseen_type`: Unseen task type
- `-id`/`--id`: Evaluation ID

Example:

```bash
python src/unseen/unseen_evaluator.py -e kitchen -al iscil -id HelloIsCiL_complete_0
```

Calculate continual learning metrics from the evaluation logs:

```bash
python src/unseen/unseen_metrics.py -al <algo>
```

The metrics calculator displays both overall metrics and unseen-only metrics (suffixed with `-A`):
```text
========================================
Continual Learning Metrics (in %)
========================================
BWT (Backward Transfer): XX.XX%
FWT (Forward Transfer) : XX.XX%
AUC (Average Score) : XX.XX%
----------------------------------------
BWT-A (Unseen Only) : XX.XX%
FWT-A (Unseen Only) : XX.XX%
AUC-A (Unseen Only) : XX.XX%
========================================
```
Metrics Explanation:

- BWT (Backward Transfer): Measures knowledge retention; negative values indicate forgetting, positive values indicate improvement
- FWT (Forward Transfer): Initial performance on new tasks; higher values indicate better knowledge transfer
- AUC (Average Score): Overall performance across all tasks and phases
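Under one common formulation of these metrics, they can be computed from a score matrix `R` where `R[s][t]` is the success rate on task `t` evaluated after training stage `s`. This sketch follows the descriptions above and may differ in detail from the repo's `metric.py`:

```python
import numpy as np

def continual_metrics(R):
    """Compute BWT / FWT / AUC from a (stages x tasks) score matrix.

    R[s][t] = score on task t after stage s; task t is first trained at
    stage t. One common formulation; the repo's metric.py may differ.
    """
    R = np.asarray(R, dtype=float)
    T = R.shape[0]
    # BWT: final performance vs. just-after-training (negative = forgetting)
    bwt = np.mean([R[-1, t] - R[t, t] for t in range(T - 1)])
    # FWT (here): initial performance on each task when it is first learned
    fwt = np.mean([R[t, t] for t in range(T)])
    # AUC: average score over all tasks seen so far, across all stages
    auc = np.mean([R[s, t] for s in range(T) for t in range(s + 1)])
    return bwt, fwt, auc
```

For example, a method that ends at 60% on task 0 after once reaching 100% would contribute -40 percentage points of forgetting to BWT.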
Output Locations:

- Training logs: `data/IsCiL_exp/<algo>/<env>/<id>/training_log.log`
- Unseen evaluation logs: `data/Unseen_experiments/<algo>/<env>/<unseen_type>/<id>/training_log.log`
- Model checkpoints: `data/IsCiL_exp/<algo>/<env>/<id>/models/`
This section provides details about the baseline algorithms and implementation configurations used in our experiments.
IsCiL is compared against several continual learning baselines. All algorithms can be specified using the `-al` parameter:
| Algorithm | Code | Description | Key Features |
|---|---|---|---|
| IsCiL | `iscil` | Our method | Incremental skill learning with dynamic LoRA adapters and multifaceted prototype retrieval |
| Sequential LoRA | `seqlora` | Sequential adapter learning | Single large LoRA adapter (dim=64) updated sequentially |
| TAIL | `tail` | Task-Adaptive Incremental Learning | Fixed small adapters (dim=16) with task-specific allocation |
| TAIL-G | `tailg` | TAIL with sub-goal IDs | Sub-goal-specific adapters (dim=4) |
| L2M | `l2m` / `l2mg` | Learn-to-Modulate | 100 learnable keys with small adapters (dim=4) |
IsCiL (iscil):

- Memory Pool: 100 skill-specific adapters
- LoRA Dimension: 4 (parameter-efficient)
- Retrieval: multiple bases for multifaceted skill retrieval
- Key Features:
  - Multifaceted prototype generation based on K-means clustering
  - Meta-initialization from existing skills
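A generic dim-4 LoRA adapter, and meta-initialization by copying an existing skill's weights, might look like the following. This is a sketch only: IsCiL's actual adapters are implemented in `clus`, and the function names here are assumptions.

```python
import numpy as np

def make_lora(d_in, d_out, dim=4, seed=0):
    """A LoRA adapter: low-rank update delta_W = B @ A with rank `dim`."""
    rng = np.random.default_rng(seed)
    # standard LoRA init: small random A, zero B, so the adapter starts as a no-op
    return {"A": rng.normal(0.0, 0.01, (dim, d_in)), "B": np.zeros((d_out, dim))}

def meta_init(existing):
    """Initialize a new adapter from an existing skill's weights (deep copy)."""
    return {name: w.copy() for name, w in existing.items()}

def apply_lora(W, adapter, x):
    """Forward pass through the adapted weight W + B @ A."""
    return (W + adapter["B"] @ adapter["A"]) @ x
```

Starting a new skill from a copy of a related skill's adapter, rather than from scratch, is one way to realize the sample-efficient meta-initialization mentioned above.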
Sequential LoRA (seqlora):
- Single LoRA adapter with dimension 64
- Updated continuously across all tasks
- No explicit skill separation
TAIL (tail):
- Fixed allocation of adapters per task
- LoRA dimension: 16
- Task-specific adapter selection
- [Tricks] The implementation is the same as Sequential LoRA; in the paper, the FWT calculated using `metric.py` is used as the AUC, since there is no forgetting (BWT = 0).
TAIL-G (tailg):
- Similar to TAIL but uses sub-goal level adapter selection
L2M (l2m/l2mg):
- 100 learnable prototype keys
- Small LoRA adapters (dim=4)
- Similarity-based retrieval
- Variants: `l2m` (base embeddings), `l2mg` (split embeddings)
To run experiments with different algorithms:

```bash
# IsCiL (our method)
bash src/IsCiL.sh

# Run a specific baseline
python src/train.py --algo seqlora --env kitchen --save_id baseline_seq_0

# Run all baseline comparisons
bash src/baselines.sh
```

Occasionally, you might encounter errors when running the scripts. Below are common issues and how to fix them.
If errors occur while running:

```bash
bash src/IsCiL.sh
```

you can identify problematic imports by checking the console logs. Please note these possible fixes:
Some code modifications may be necessary due to updated JAX libraries:
- Replace occurrences of `jax.linear_util` with `jax.extend.linear_util` in:
  - `[qax package directory]/qax/implicit/implicit_array.py`
  - `[qax package directory]/qax/implicit/implicit_utils.py`
If you get errors related to `abc`, ensure `collections.abc.Mapping` is used instead of `collections.Mapping` in:

`[D4RL package directory]/kitchen/adept_envs/mujoco_env.py`
If you encounter wandb installation errors, simply reinstall wandb:

```bash
pip install wandb
```

Enjoy exploring Incremental Learning of Retrievable Skills for Efficient Continual Task Adaptation!
```bibtex
@article{lee2024incremental,
  title={Incremental learning of retrievable skills for efficient continual task adaptation},
  author={Lee, Daehee and Yoo, Minjong and Kim, Woo Kyung and Choi, Wonje and Woo, Honguk},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  pages={17286--17312},
  year={2024}
}
```

