Enhanced Language-guided Robot Navigation with Panoramic Semantic Depth Perception and Cross-modal Fusion
- Please install MatterPort3D simulator from here.
- Install requirements (please use the specified versions, as version mismatches can cause errors):

```
pip install -r requirements.txt
```
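Since version mismatches are a common source of errors here, the sketch below (our own illustration, not part of this repo) compares installed packages against the plain `name==version` pins in `requirements.txt`; it deliberately ignores extras and environment markers.

```python
# Minimal sketch (not from this repo): report packages whose installed
# version differs from the "name==version" pin in requirements.txt.
from importlib.metadata import PackageNotFoundError, version


def check_pins(lines):
    """Return (package, pinned, installed) triples for mismatched pins."""
    mismatches = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip comments and unpinned entries
        name, pinned = line.split("==", 1)
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package not installed at all
        if installed != pinned:
            mismatches.append((name, pinned, installed))
    return mismatches


if __name__ == "__main__":
    import os
    if os.path.exists("requirements.txt"):
        with open("requirements.txt") as f:
            for name, pinned, installed in check_pins(f):
                print(f"{name}: pinned {pinned}, installed {installed}")
```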
- Download datasets, features, and models:
- Annotations from here.
- Features and trained weights (for both pre-trained and fine-tuned) from here.
- Download METER (optional, only needed if you want to pre-train SEAT based on METER) from here. The METER model we used is `meter_clip16_224_roberta_pretrain.ckpt`.
- Download EnvEdit weights from here.
- In some environments it may not be possible to reach Hugging Face models directly, especially when loading RoBERTa's tokenizer. In that case, we recommend downloading the required files from the Hugging Face website to a local directory such as `datasets/pretrained/roberta`.
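As an illustration of the local-copy workflow (a sketch of our own, assuming the `transformers` library; the helper names are ours, not the repo's), the files can be fetched once and then loaded purely from disk:

```python
# Hypothetical helpers (not from this repo): cache roberta-base locally so
# the tokenizer/model load without contacting huggingface.co afterwards.
LOCAL_DIR = "datasets/pretrained/roberta"  # path used in this repo's layout


def download_roberta(local_dir=LOCAL_DIR):
    """One-time download; requires network access and `transformers`."""
    from transformers import RobertaModel, RobertaTokenizer
    RobertaTokenizer.from_pretrained("roberta-base").save_pretrained(local_dir)
    RobertaModel.from_pretrained("roberta-base").save_pretrained(local_dir)


def load_local_roberta(local_dir=LOCAL_DIR):
    """Later loads read only from the local directory."""
    from transformers import RobertaModel, RobertaTokenizer
    tokenizer = RobertaTokenizer.from_pretrained(local_dir)
    model = RobertaModel.from_pretrained(local_dir)
    return tokenizer, model
```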
The structure should be as follows (using R2R as a detailed example):
```
datasets
├── R2R
│   ├── annotations
│   ├── features
│   ├── navigator
│   ├── pretrain
│   └── id_paths.json
├── REVERIE
├── SOON
├── RQ
└── EnvEdit
```
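Before launching training, a small sanity check (our own sketch, not part of the repo) can confirm the layout; the expected entries mirror the R2R tree above, with the other benchmark folders checked only at the top level.

```python
import os

# Entries expected under datasets/ (per the R2R tree above).
EXPECTED = [
    "R2R/annotations",
    "R2R/features",
    "R2R/navigator",
    "R2R/pretrain",
    "R2R/id_paths.json",
    "REVERIE",
    "SOON",
    "RQ",
    "EnvEdit",
]


def missing_entries(root="datasets"):
    """Return the expected entries that do not exist under `root`."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_entries()
    print("missing entries:", ", ".join(missing) if missing else "none")
```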
Use the following commands to pre-train:

```
cd rq_train
bash run_rq.sh
```

```
cd pretrain_src
bash run_r2r_seat.sh
```
Use the following command to fine-tune:

```
cd map_nav_src
bash scripts/run_r2r.sh
```
Use the following command to validate:

```
cd map_nav_src
bash scripts/run_r2r_valid.sh
```
We thank the authors for their awesome work and sharing: