The LIBERO experiments rely on a pre-trained language model, bert-base-cased. We recommend downloading it manually beforehand.
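If you prefer to script the download, here is a minimal sketch using `snapshot_download` from the `huggingface_hub` package; the `models/` target folder is an assumption that matches the directory layout shown later in this section:

```python
# Minimal sketch: pre-download a Hugging Face model for offline use.
# Assumes `pip install huggingface_hub`; the models/ target folder is
# an assumption matching the directory layout shown later in this section.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bert-base-cased",
    local_dir="models/bert-base-cased",
)
```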
In the CortexBench experiments, only the partial fine-tuning methods require loading pre-trained models, specifically R3M, MVP, VC-1, Voltron, and MPI. Note that for the R3M and MVP pre-trained models, we use the Voltron reimplementations.
For a better understanding of these methods, refer to the relevant tutorials. Of course, referring to the original code is the best approach!
If you want the code to automatically download the required files during execution, set `load_path` to `None`, as shown below:
```python
if cfg.policy.embedding in ['r3m-rn50', 'r3m-small']:
    # Set load_path to None to let load_r3m download the checkpoint automatically.
    load_path = os.path.join(cfg.policy.embedding_dir, 'r3m', cfg.policy.embedding)  # -> None
    self.feature_extractor = load_r3m("r-r3m-vit", load_path=load_path, only_return_model=True)
    if cfg.train.ft_method == 'partial_ft':
        # Freeze the backbone so only the downstream head is trained.
        for param in self.feature_extractor.parameters():
            param.requires_grad = False
    self.vector_extractor = instantiate_extractor(self.feature_extractor)()
else:
    raise ValueError('R3M model type is wrong! This repo only supports "r3m-rn50" and "r3m-small".')
```
We recommend downloading these models into the same folder and setting `load_path` accordingly (a sanity-check sketch follows the directory tree below).
The directory structure should be as follows:
```
models
│
├── bert-base-cased
│
├── distilbert-base-uncased
│
├── mpi
│   └── mpi-small
│       ├── MPI-small-state_dict.pt
│       └── MPI-small.json
│
├── mvp
│   └── mvp-small
│       ├── r-mvp.json
│       └── r-mvp.pt
│
├── r3m
│   └── r3m-small
│       ├── r-r3m-vit.json
│       └── r-r3m-vit.pt
│
├── vc-1
│   └── vc1_vitb.pth
│
└── voltron
    └── v-cond-small
        ├── v-cond.json
        └── v-cond.pt
```
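Before launching training, you can quickly confirm that everything landed in the right place. The sketch below is a hypothetical helper, not part of the repo; the file list simply mirrors the tree above, and `MODEL_ROOT` assumes `models/` sits in the working directory:

```python
# Hypothetical sanity check: verify that the expected checkpoint files
# exist under models/ before training. The paths mirror the tree above.
import os

MODEL_ROOT = "models"  # assumption: models/ sits in the working directory

EXPECTED_FILES = [
    "mpi/mpi-small/MPI-small-state_dict.pt",
    "mpi/mpi-small/MPI-small.json",
    "mvp/mvp-small/r-mvp.pt",
    "mvp/mvp-small/r-mvp.json",
    "r3m/r3m-small/r-r3m-vit.pt",
    "r3m/r3m-small/r-r3m-vit.json",
    "vc-1/vc1_vitb.pth",
    "voltron/v-cond-small/v-cond.pt",
    "voltron/v-cond-small/v-cond.json",
]

missing = [f for f in EXPECTED_FILES if not os.path.isfile(os.path.join(MODEL_ROOT, f))]
print("All checkpoints found." if not missing else f"Missing: {missing}")
```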
We have also summarized the checkpoint links for all methods, collected from their source repositories:
| Method | GitHub | Model |
|---|---|---|
| R3M | link | ViT-S [ checkpoint \| config ] |
| MVP | link | ViT-S [ checkpoint \| config ] |
| VC-1 | link | ViT-B [ checkpoint ] |
| Voltron | link | ViT-S [ checkpoint \| config ] |
| MPI | link | ViT-S [ checkpoint \| config ] |
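After downloading, you can spot-check that a checkpoint deserializes cleanly. A minimal sketch, assuming PyTorch is installed and the files follow the layout above:

```python
# Sketch: confirm a downloaded checkpoint loads; the path follows the
# directory layout above, so adjust it if your files live elsewhere.
import torch

ckpt = torch.load("models/vc-1/vc1_vitb.pth", map_location="cpu")
# Checkpoints are typically a state_dict or a dict wrapping one.
print(type(ckpt), list(ckpt.keys())[:5] if isinstance(ckpt, dict) else "")
```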
Additionally, the MPI initialization model requires a pre-trained language model, distilbert-base-uncased. We recommend downloading it manually beforehand as well; the `snapshot_download` sketch at the top of this section works the same way with `repo_id="distilbert-base-uncased"`.