# Talk2BEV-Dataset from Scratch

## Installation

To generate captions, set up the baselines using the following commands:

### LLaVA

```shell
git clone https://github.com/haotian-liu/LLaVA parent-folder
mv parent-folder/llava ./
rm -rf parent-folder
```

Please download the preprocessed weights for vicuna-13b.
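LLaVA v0 distributed its Vicuna weights as deltas on top of the base LLaMA weights. A sketch of the preparation step, assuming you already have the base LLaMA-13B weights locally (all paths are placeholders; check the LLaVA repository for the exact delta name matching the checkpoint you need):

```shell
# Sketch: apply the LLaVA-13B delta to base LLaMA-13B weights.
# <path-to-llama-13b> and <path-to-save-vicuna-13b> are placeholders.
python3 -m llava.model.apply_delta \
    --base <path-to-llama-13b> \
    --target <path-to-save-vicuna-13b> \
    --delta liuhaotian/LLaVA-13b-delta-v0
```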

### MiniGPT-4 (optional)

```shell
git clone https://github.com/Vision-CAIR/MiniGPT-4 parent-folder
mv parent-folder/minigpt4 ./
rm -rf parent-folder
```

Please download the preprocessed weights for Vicuna. After downloading them, change line 16 of `minigpt4/configs/models/minigpt4.yaml`:

```yaml
llama_model: "path-to-llama-preprocessed-weights"
```

Please download the pretrained MiniGPT-4 weights and change line 11 of `eval_configs/minigpt4_eval.yaml`:

```yaml
ckpt: 'path-to-prerained_minigpt4_7b-weights'
```
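If you prefer to script the two config edits, a sketch using `sed` (the patterns assume the default formatting of the MiniGPT-4 config files, and the `/weights/...` paths are placeholders for your own):

```shell
# Placeholder paths: replace /weights/... with your actual locations.
sed -i 's#llama_model: .*#llama_model: "/weights/vicuna-13b"#' \
    minigpt4/configs/models/minigpt4.yaml
sed -i "s#ckpt: .*#ckpt: '/weights/prerained_minigpt4_7b.pth'#" \
    eval_configs/minigpt4_eval.yaml
```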

### FastSAM

```shell
git clone https://github.com/CASIA-IVA-Lab/FastSAM parent-folder
mv parent-folder/FastSAM/fastsam ./
rm -rf parent-folder
```

Download the FastSAM weights from the link in the FastSAM repository.

### Install SAM (optional)

```shell
pip3 install segment-anything
```

Download the SAM weights.
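To the best of my knowledge, the segment-anything repository publishes the default ViT-H checkpoint at a stable URL; a typical download (verify the filename and URL against the repository before relying on it):

```shell
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
```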

## Base

To generate the base, please run the following commands (`--bev` takes `pred` for predicted or `gt` for ground-truth BEV):

```shell
cd data
python3 generate_base.py --data_path <path-to-nuscenes-v1.0-trainval> --save_path <path-to-save> --bev <pred-or-gt>
```
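For example, with the nuScenes trainval set unpacked under `/data/nuscenes` and ground-truth BEV selected (both paths here are hypothetical):

```shell
cd data
python3 generate_base.py \
    --data_path /data/nuscenes \
    --save_path /data/talk2bev-base \
    --bev gt
```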

## Captioning

To generate the captions for each scene object, please run the following command (`--json_name` likewise takes `pred` or `gt`, and `--start`/`--end` select the index range to process):

```shell
python3 generate_captions.py --model <captioning-model> --data_path <path-to-base-folder> --json_name <pred-or-gt> --start <start-index> --end <end-index>
```
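For example, to caption the first 100 indices of the ground-truth base with LLaVA (the paths and the `llava` model name are illustrative; check the script's accepted `--model` values):

```shell
python3 generate_captions.py \
    --model llava \
    --data_path /data/talk2bev-base \
    --json_name gt \
    --start 0 \
    --end 100
```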