This repository provides the official implementation of our paper: Incremental Learning of Retrievable Skills for Efficient Continual Task Adaptation
Poster: NeurIPS 2024
#Incremental Learning #Imitation Learning #Skills #NeurIPS2024
Continual Imitation Learning (CiL) involves extracting and accumulating task knowledge from demonstrations across multiple stages and tasks to achieve a multi-task policy. With recent advancements in foundation models, there has been a growing interest in adapter-based CiL approaches, where adapters are introduced in a parameter-efficient way for newly demonstrated tasks. While these approaches effectively isolate parameters for different tasks—helping mitigate catastrophic forgetting—they often limit knowledge sharing across tasks.
We introduce IsCiL, an adapter-based CiL framework that addresses the limitation of knowledge sharing by incrementally learning shareable skills from different demonstrations. This enables sample-efficient task adaptation, especially in non-stationary CiL environments. In IsCiL, demonstrations are mapped into a state embedding space, where proper skills can be retrieved from a prototype-based memory. These retrievable skills are then incrementally refined on their own skill-specific adapters. Our experiments on complex tasks in Franka-Kitchen and MetaWorld demonstrate robust performance of IsCiL in both task adaptation and sample efficiency. Additionally, we provide a simple extension of IsCiL for task unlearning scenarios.
Implementation highlights:
- Incremental creation of skill-specific adapters.
- K-means is used to build skill bases, improving the accuracy of similarity searches between inputs and the corresponding skill.
- Evaluation is performed by applying each skill adapter to a pre-trained model, enabling effective handling of new or changing inputs.
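For intuition, the prototype-based retrieval described in the highlights above can be sketched as follows. This is an illustrative sketch only, not the repo's API (the actual implementation lives in `clus`); the function names, the deterministic K-means initialization, and all shapes are assumptions.

```python
import numpy as np

def build_skill_bases(embeddings, n_skills, n_iters=20):
    """Cluster state embeddings with plain Lloyd's K-means; the centers act as skill bases."""
    # simple deterministic init for illustration: evenly spaced samples as initial centers
    idx = np.linspace(0, len(embeddings) - 1, n_skills).astype(int)
    centers = embeddings[idx].copy()
    for _ in range(n_iters):
        # assign each embedding to its nearest center
        dists = np.linalg.norm(embeddings[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its assigned embeddings
        for k in range(n_skills):
            if np.any(labels == k):
                centers[k] = embeddings[labels == k].mean(axis=0)
    return centers

def retrieve_skill(state_embedding, centers):
    """Similarity search: index of the nearest skill base for a query embedding."""
    return int(np.linalg.norm(centers - state_embedding, axis=-1).argmin())
```

Each retrieved index would select the corresponding skill-specific adapter at evaluation time.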
We thank the authors of DMPEL; our method was also evaluated on the LIBERO environment following their setup.
- Installation
- Environment Setup
- Dataset and Environment Setup
- Running the Experiments
- Baselines and Implementation Details
- Evaluation
- Error Management and Troubleshooting
`clus` is the development-version name for IsCiL. All related implementations are located inside the `clus` directory.
- Python 3.10.13
- mujoco210
- Create and activate a conda environment using the `environment.yml` file:

  ```bash
  conda env create -f environment.yml
  conda activate iscil
  ```

- Verify successful activation:

  ```bash
  conda info --envs
  ```
We rely on a specific version of gym:

```bash
pip install setuptools==65.5.0 "wheel<0.40.0"  # prevents an error when installing gym 0.21.0
pip install gym==0.21.0
```

For more details on why the first line is required, see: Why is pip install gym failing?
In your project directory, run:

```bash
pip install -e .
```

Make sure the following environment variable is set:

```bash
export XLA_PYTHON_CLIENT_PREALLOCATE=false
```

If the steps above are completed, you can test the Kitchen environment directly. For Metaworld, follow the instructions below.
- Download the environment files from the following link: Google Drive
- After downloading:
  1. Unzip the file and locate the `IsCiL_Env/data` folder.
  2. Install each environment:

     ```bash
     # For mmworld
     cd IsCiL_Env/env/mmworld
     pip install -e .

     # For Metaworld
     cd ../Metaworld
     pip install -e .
     ```

  3. Return to the root directory if necessary.
The above steps cover both dataset and environment requirements (e.g., Kitchen, Metaworld). Make sure you have everything installed before proceeding.
- Download the `pre_trained_models.zip` file from the provided link: Download
- Move the file to `data` and unzip it.
- Check that the contents are properly placed in the `data/pre_trained_models` directory.
To run the IsCiL experiment, navigate to the `clus` directory and execute:

```bash
bash src/IsCiL.sh
```

This script launches the incremental learning process described in the paper.
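Conceptually, each stage of the incremental learning process maps incoming demonstrations into the state embedding space and either reuses a nearby skill adapter or creates a new one. A minimal sketch of that retrieve-or-create mechanic follows; the class name, the distance threshold, and the adapter placeholder are assumptions for illustration, not the repo's actual interface.

```python
import numpy as np

class SkillMemory:
    """Prototype memory sketch: retrieve a nearby skill adapter or create a new one."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.prototypes = []   # one state embedding per skill
        self.adapters = []     # placeholder for per-skill adapter parameters

    def retrieve_or_create(self, emb):
        emb = np.asarray(emb, dtype=float)
        if self.prototypes:
            dists = np.linalg.norm(np.stack(self.prototypes) - emb, axis=-1)
            k = int(dists.argmin())
            if dists[k] <= self.threshold:
                return k                       # reuse the existing skill adapter
        # no sufficiently close prototype: register a new skill
        self.prototypes.append(emb)
        self.adapters.append({"lora_dim": 4})  # illustrative adapter placeholder
        return len(self.adapters) - 1
```

Reused adapters are then refined on the new demonstrations, which is what enables knowledge sharing across tasks.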
IsCiL provides comprehensive evaluation capabilities to assess both standard task performance and generalization to unseen tasks. The evaluation framework measures continual learning metrics including Backward Transfer (BWT), Forward Transfer (FWT), and Area Under the Curve (AUC).
To evaluate trained models on tasks seen during training and create the log files:

```bash
python src/l2m_evaluator.py --save_id <experiment_id>
```

This will:

- Load saved models from each training stage
- Evaluate performance on all learned tasks
- Generate evaluation logs and metrics files

Example:

```bash
python src/l2m_evaluator.py --save_id HelloIsCiL_complete_0
```

Calculate continual learning metrics from the evaluation logs:

```bash
python clus/utils/metric.py -al <algo> -g <grep string>
```

The `-g` parameter filters the logs by the given grep string, which can be the experiment ID or any other identifier.
```text
========================================
Continual Learning Metrics (in %)
========================================
BWT (Backward Transfer): XX.XX%
FWT (Forward Transfer) : XX.XX%
AUC (Average Score) : XX.XX%
========================================
```
To test generalization on unseen tasks, use the pretrained models and evaluate them on data streams with unseen tasks:

```bash
python src/unseen/unseen_evaluator.py -e <env> -al <algo> -u <unseen_type> -id <evaluation_id>
```

Parameters:

- `-e`/`--env`: Environment (`kitchen` or `mmworld`)
- `-al`/`--algo`: Algorithm (`iscil`, `seq`, `ewc`, `mtseq`)
- `-u`/`--unseen_type`: Unseen task type
- `-id`/`--id`: Evaluation ID

Example:

```bash
python src/unseen/unseen_evaluator.py -e kitchen -al iscil -id HelloIsCiL_complete_0
```

Calculate continual learning metrics from the evaluation logs:

```bash
python src/unseen/unseen_metrics.py -al <algo>
```

The metrics calculator displays both overall metrics and unseen-only metrics (suffixed with `-A`):
```text
========================================
Continual Learning Metrics (in %)
========================================
BWT (Backward Transfer): XX.XX%
FWT (Forward Transfer) : XX.XX%
AUC (Average Score) : XX.XX%
----------------------------------------
BWT-A (Unseen Only) : XX.XX%
FWT-A (Unseen Only) : XX.XX%
AUC-A (Unseen Only) : XX.XX%
========================================
```
Metrics Explanation:

- BWT (Backward Transfer): Measures knowledge retention; negative values indicate forgetting, positive values indicate improvement
- FWT (Forward Transfer): Initial performance on new tasks; higher values indicate better knowledge transfer
- AUC (Average Score): Overall performance across all tasks and phases
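Under one common formulation of these metrics, they can be computed from a score matrix `R` where `R[s][t]` is the success rate on task `t` evaluated after training stage `s`. This sketch follows the descriptions above and may differ in detail from the repo's `metric.py`:

```python
import numpy as np

def continual_metrics(R):
    """Compute BWT / FWT / AUC from a (stages x tasks) score matrix.

    R[s][t] = score on task t after stage s; task t is first trained at
    stage t. One common formulation; the repo's metric.py may differ.
    """
    R = np.asarray(R, dtype=float)
    T = R.shape[0]
    # BWT: final performance vs. just-after-training (negative = forgetting)
    bwt = np.mean([R[-1, t] - R[t, t] for t in range(T - 1)])
    # FWT (here): initial performance on each task when it is first learned
    fwt = np.mean([R[t, t] for t in range(T)])
    # AUC: average score over all tasks seen so far, across all stages
    auc = np.mean([R[s, t] for s in range(T) for t in range(s + 1)])
    return bwt, fwt, auc
```

For example, a method that ends at 60% on task 0 after once reaching 100% would contribute -40 percentage points of forgetting to BWT.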
Output Locations:

- Training logs: `data/IsCiL_exp/<algo>/<env>/<id>/training_log.log`
- Unseen evaluation logs: `data/Unseen_experiments/<algo>/<env>/<unseen_type>/<id>/training_log.log`
- Model checkpoints: `data/IsCiL_exp/<algo>/<env>/<id>/models/`
This section provides details about the baseline algorithms and implementation configurations used in our experiments.
IsCiL is compared against several continual learning baselines. All algorithms can be specified using the `-al` parameter:
| Algorithm | Code | Description | Key Features |
|---|---|---|---|
| IsCiL | `iscil` | Our method | Incremental skill learning with dynamic LoRA adapters and multifaceted prototype retrieval |
| Sequential LoRA | `seqlora` | Sequential adapter learning | Single large LoRA adapter (dim=64) updated sequentially |
| TAIL | `tail` | Task-Adaptive Incremental Learning | Fixed small adapters (dim=16) with task-specific allocation |
| TAIL-G | `tailg` | TAIL with sub-goal IDs | Sub-goal-specific adapters (dim=4) |
| L2M | `l2m` / `l2mg` | Learn-to-Modulate | 100 learnable keys with small adapters (dim=4) |
IsCiL (iscil):

- Memory Pool: 100 skill-specific adapters
- LoRA Dimension: 4 (parameter-efficient)
- Retrieval: multiple bases for multifaceted skill retrieval
- Key Features:
  - Multifaceted prototype generation based on K-means clustering
  - Meta-initialization from existing skills
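A generic dim-4 LoRA adapter, and meta-initialization by copying an existing skill's weights, might look like the following. This is a sketch only: IsCiL's actual adapters are implemented in `clus`, and the function names here are assumptions.

```python
import numpy as np

def make_lora(d_in, d_out, dim=4, seed=0):
    """A LoRA adapter: low-rank update delta_W = B @ A with rank `dim`."""
    rng = np.random.default_rng(seed)
    # standard LoRA init: small random A, zero B, so the adapter starts as a no-op
    return {"A": rng.normal(0.0, 0.01, (dim, d_in)), "B": np.zeros((d_out, dim))}

def meta_init(existing):
    """Initialize a new adapter from an existing skill's weights (deep copy)."""
    return {name: w.copy() for name, w in existing.items()}

def apply_lora(W, adapter, x):
    """Forward pass through the adapted weight W + B @ A."""
    return (W + adapter["B"] @ adapter["A"]) @ x
```

Starting a new skill from a copy of a related skill's adapter, rather than from scratch, is one way to realize the sample-efficient meta-initialization mentioned above.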
Sequential LoRA (seqlora):
- Single LoRA adapter with dimension 64
- Updated continuously across all tasks
- No explicit skill separation
TAIL (tail):
- Fixed allocation of adapters per task
- LoRA dimension: 16
- Task-specific adapter selection
- [Tricks] The implementation is the same as Sequential LoRA; in the paper, the FWT calculated using `metric.py` is used as the AUC, since there is no forgetting (BWT = 0).
TAIL-G (tailg):
- Similar to TAIL but uses sub-goal level adapter selection
L2M (l2m/l2mg):
- 100 learnable prototype keys
- Small LoRA adapters (dim=4)
- Similarity-based retrieval
- Variants: `l2m` (base embeddings), `l2mg` (split embeddings)
To run experiments with different algorithms:

```bash
# IsCiL (our method)
bash src/IsCiL.sh

# Run a specific baseline
python src/train.py --algo seqlora --env kitchen --save_id baseline_seq_0

# Run all baseline comparisons
bash src/baselines.sh
```

Occasionally, you might encounter errors when running the scripts. Below are common issues and how to fix them.
If errors occur while running:

```bash
bash src/IsCiL.sh
```

you can identify problematic imports by checking the console logs. Please note these possible fixes:
Some code modifications may be necessary due to updated JAX libraries:
- Replace occurrences of `jax.linear_util` with `jax.extend.linear_util` in:
  - `[qax package directory]/qax/implicit/implicit_array.py`
  - `[qax package directory]/qax/implicit/implicit_utils.py`
If you get errors related to `abc`, ensure `collections.abc.Mapping` is used instead of `collections.Mapping` in:

`[D4RL package directory]/kitchen/adept_envs/mujoco_env.py`
If you encounter wandb installation errors, simply reinstall wandb:

```bash
pip install wandb
```

Enjoy exploring Incremental Learning of Retrievable Skills for Efficient Continual Task Adaptation!
```bibtex
@article{lee2024incremental,
  title={Incremental learning of retrievable skills for efficient continual task adaptation},
  author={Lee, Daehee and Yoo, Minjong and Kim, Woo Kyung and Choi, Wonje and Woo, Honguk},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  pages={17286--17312},
  year={2024}
}
```

