This repository contains the official implementation of Scalable Offline Model-Based RL with Action Chunking (MAC).
If you use this code for your research, please consider citing our paper:
```bibtex
@article{park2025_MAC,
  title={Scalable Offline Model-Based RL with Action Chunking},
  author={Kwanyoung Park and Seohong Park and Youngwoon Lee and Sergey Levine},
  journal={arXiv preprint},
  year={2025}
}
```
This codebase contains implementations of 1) model-free baselines, 2) model-based baselines, 3) MAC, and 4) ablated versions of MAC for ablation studies. Specifically:
```python
# Baselines (Model-free)
from agents.gciql import GCIQLAgent        # GCIQL
from agents.ngcsacbc import NGCSACBCAgent  # n-step GCSAC+BC
from agents.sharsa import SHARSAAgent      # SHARSA

# Baselines (Model-based)
from agents.fmpc import FMPCAgent          # FMPC
from agents.leq import LEQAgent            # LEQ
from agents.mopo import MOPOAgent          # MOPO
from agents.mobile import MOBILEAgent      # MOBILE

# Our method (MAC)
from agents.mac import MACAgent            # MAC
from agents.mbrs_ac import ACMBRSAgent     # MAC (Gau)
from agents.mbfql import MBFQLAgent        # MAC (FQL)
from agents.model_ac import ACModelAgent   # Model inaccuracy analysis
```
Please install the required libraries using `requirements.txt`:

```bash
pip install -r requirements.txt
```

To download the datasets, please follow the instructions of Horizon Reduction Makes RL Scalable.
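The run commands below pass the dataset location via `--dataset_dir`. They assume each dataset sits in its own folder under a single root directory; this layout is illustrative, not something the code enforces:

```bash
# Illustrative dataset layout; --dataset_dir points at one specific folder.
# <YOUR_DATA_DIRECTORY>/
#     puzzle-4x5-play-100m-v0/
#     cube-double-play-singletask-v0/
```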
For MVE-based MBRL algorithms (MAC, LEQ, FMPC) and model-free RL algorithms (SHARSA, GCIQL, n-step GCSAC+BC), use `main.py`:
```bash
# MAC in puzzle-4x5-play-oraclerep-v0 (100M)
python main.py --env_name=puzzle-4x5-play-oraclerep-v0 --dataset_dir=<YOUR_DATA_DIRECTORY>/puzzle-4x5-play-100m-v0 --agent=agents/mac.py

# SHARSA in puzzle-4x5-play-oraclerep-v0 (100M)
python main.py --env_name=puzzle-4x5-play-oraclerep-v0 --dataset_dir=<YOUR_DATA_DIRECTORY>/puzzle-4x5-play-100m-v0 --agent=agents/sharsa.py

# MAC in cube-double-play-singletask-task2-v0 (default dataset)
python main.py --env_name=cube-double-play-singletask-task2-v0 --dataset_dir=<YOUR_DATA_DIRECTORY>/cube-double-play-singletask-v0 --agent=agents/mac.py
```
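To compare several agents on the same environment, the documented flags can be looped over directly. This is a sketch: the `DATASET_DIR` variable and the loop are illustrative, while the flags and agent file names come from the commands and import list above:

```bash
# Sweep several main.py-compatible agents on puzzle-4x5-play (100M).
DATASET_DIR=<YOUR_DATA_DIRECTORY>/puzzle-4x5-play-100m-v0
for agent in mac leq fmpc sharsa gciql ngcsacbc; do
    python main.py \
        --env_name=puzzle-4x5-play-oraclerep-v0 \
        --dataset_dir=${DATASET_DIR} \
        --agent=agents/${agent}.py
done
```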
For MBPO-based algorithms (MOPO, MOBILE), use `main_mbpo.py`:
```bash
# MOPO in puzzle-4x5-play-oraclerep-v0 (100M)
python main_mbpo.py --env_name=puzzle-4x5-play-oraclerep-v0 --dataset_dir=<YOUR_DATA_DIRECTORY>/puzzle-4x5-play-100m-v0 --agent=agents/mopo.py
```
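For example, MOBILE can be launched the same way; the flags mirror the MOPO command above, with only the agent file changed:

```bash
# MOBILE in puzzle-4x5-play-oraclerep-v0 (100M)
python main_mbpo.py --env_name=puzzle-4x5-play-oraclerep-v0 --dataset_dir=<YOUR_DATA_DIRECTORY>/puzzle-4x5-play-100m-v0 --agent=agents/mobile.py
```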
This codebase is built on top of the codebase of Horizon Reduction Makes RL Scalable.