BoxMOTS Usage

Dataset Preparation

KITTI MOTS

  • Download: Go to the official homepage to download the images and labels, and check the label format.
  • Convert format: Use the code mentioned in the issue to convert the labels to the COCO format. The converted labels are provided here.
  • Evaluation: Use the MOTS Tools to evaluate the MOTS performance.
KITTI MOTS dataset structure sample:
├── KITTI_MOTS
│   ├── annotations
│       ├── train_in_trainval_gt_as_coco_instances.json
│       ├── val_in_trainval_gt_as_coco_instances.json
│   ├── imgs
│       ├── train_in_trainval
│           ├── 0000
│               ├── 000000.png
│               ├── 000001.png
│               ├── ...
│               ├── 000153.png
│           ├── 0001
│           ├── 0003
│           ├── ...
│           ├── 0020
│       ├── val_in_trainval
│           ├── 0002
│           ├── 0006
│           ├── 0007
│           ├── ...
│           ├── 0018
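Before training, you can sanity-check the converted labels. Below is a minimal sketch (assuming the converted JSON follows the standard COCO instances format and the layout above; the path is illustrative, adjust it to your setup):

```python
# Minimal sanity check of the converted KITTI MOTS labels with pycocotools.
# The path below is a placeholder matching the directory sample above.
from pycocotools.coco import COCO

ann_file = "KITTI_MOTS/annotations/train_in_trainval_gt_as_coco_instances.json"
coco = COCO(ann_file)

print("images:     ", len(coco.getImgIds()))
print("annotations:", len(coco.getAnnIds()))
print("categories: ", [c["name"] for c in coco.loadCats(coco.getCatIds())])

# Spot-check one image entry; file_name should resolve under imgs/train_in_trainval.
img = coco.loadImgs(coco.getImgIds()[0])[0]
print("first image:", img["file_name"], img["height"], img["width"])
```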

BDD100K MOTS

  • Download: Go to the BDD100K official homepage to download MOTS images and labels, and check the label format.
  • Dataset toolkit and modification: Check this description for official BDD toolkit download and modification.
  • Convert format: Check this page to see how to convert the labels to the COCO label format. The converted labels (produced with the modified toolkit) are provided here.
  • Evaluation: Check this page to see how to evaluate the MOTS performance on this dataset.
BDD100K MOTS dataset structure sample:
├── bdd100k
│   ├── images
│       ├── seg_track_20
│           ├── train
│               ├── 000d4f89-3bcbe37a
│                   ├── 000d4f89-3bcbe37a-0000001.jpg
│                   ├── ...
│               ├── 000d35d3-41990aa4
│               ├── ...
│           ├── val
│   ├── labels
│       ├── seg_track_20
│           ├── bitmasks
│           ├── colormaps
│           ├── from_rles
│               ├── train_seg_track.json
│               ├── val_seg_track.json
│           ├── polygons
│           ├── rles
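Similarly, a quick check that the BDD100K MOTS images are laid out as expected (a minimal sketch; the root path follows the directory sample above and may need adjusting):

```python
# Count frames per sequence under the BDD100K MOTS training split.
# The root path is a placeholder matching the directory sample above.
from pathlib import Path

img_root = Path("bdd100k/images/seg_track_20/train")

for seq_dir in sorted(p for p in img_root.iterdir() if p.is_dir()):
    n_frames = len(list(seq_dir.glob("*.jpg")))
    print(f"{seq_dir.name}: {n_frames} frames")
```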

Pipeline

  1. Run GMA to extract the optical flow information of KITTI/BDD.
  2. Train the BoxMOTS model with the code in the boxmots folder. Check Training for details.
  3. Run SSIS to get the shadow detection results of KITTI/BDD.
  4. Combine the shadow detection results with the model's segmentation results to refine the masks: use this code for KITTI and this code for BDD (an illustrative sketch follows this list).
  5. Run StrongSORT for the data association. This step generates mask-based trajectories, which are the final output of the MOTS task.
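The exact combination logic lives in the linked scripts; the sketch below only illustrates the core idea of step 4, i.e. removing detected shadow pixels from each predicted instance mask. The function name and the assumption that both masks are COCO-style RLE dicts are illustrative, not the repository's API.

```python
# Illustrative mask refinement: subtract the shadow mask from an instance mask.
# Both inputs are assumed to be COCO-style RLE dicts; this is a sketch of the
# idea, not the repository's exact combination code.
import numpy as np
from pycocotools import mask as mask_utils

def remove_shadow(instance_rle, shadow_rle):
    """Zero out shadow pixels inside an instance mask and re-encode as RLE."""
    inst = mask_utils.decode(instance_rle).astype(bool)
    shadow = mask_utils.decode(shadow_rle).astype(bool)
    refined = np.logical_and(inst, np.logical_not(shadow))
    rle = mask_utils.encode(np.asfortranarray(refined.astype(np.uint8)))
    rle["counts"] = rle["counts"].decode("ascii")  # keep the RLE JSON-serializable
    return rle
```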

Installation

Please check the my_install doc for installation details for the boxmots folder.

Training

  1. Optical flow data. Make sure you have obtained the optical flow information of the dataset with GMA before training.

  2. BoxInst weights. BoxMOTS uses BoxInst for instance segmentation. Check its official page to download the BoxInst_MS_R_50_3x.pth model weights. It will be loaded to initialize BoxMOTS when the training starts.

  3. KITTI MOTS. Please use the my_script_kitti script, and change the config file in the script to this config to train the full model (using optical flow information) on KITTI MOTS. Alternatively, you can first train a base model without optical flow using the base config, and then train the full model initialized from the base-model weights by modifying the WEIGHTS entry in the config; you may also need to adjust other hyper-parameters, such as loss weights and learning-rate decay steps (see this config as a sample). This usually gives a more stable training process and possibly slightly better performance. A configuration sketch follows this list.

  4. BDD100K MOTS. Use the my_script_bdd script, and change the config file in the script to this config to train the full model (using optical flow information) on BDD100K MOTS.

  5. BoxMOTS weights. The trained BoxMOTS models are provided here: the model trained on KITTI, and the model trained on BDD.
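As a rough illustration of steps 2 and 3, the sketch below shows how the downloaded BoxInst weights and other hyper-parameters would typically be set in a detectron2-style config (which BoxInst itself uses). The config path and the override values are placeholders; in practice, use the my_script_kitti / my_script_bdd scripts and the configs referenced above.

```python
# Hedged sketch: point a detectron2-style config at the BoxInst weights before
# launching training. Paths and values below are placeholders.
from detectron2.config import get_cfg

cfg = get_cfg()
# cfg.merge_from_file("path/to/full_model_config.yaml")  # placeholder config path
cfg.MODEL.WEIGHTS = "BoxInst_MS_R_50_3x.pth"  # initialize BoxMOTS from BoxInst weights
cfg.SOLVER.STEPS = (8000, 10000)              # example: learning-rate decay steps
print(cfg.MODEL.WEIGHTS)
```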

Inference

Note that we evaluate on the validation set during training and save both segmentation and embedding results at each evaluation. You can therefore use these saved results directly as the inference outputs of the model trained for a given number of iterations, without explicitly running a separate inference step.

If you only want to run inference with a pretrained model, check this demo on YouTube-VIS 2019 for reference.

Notes

  • Recommended project organization: use four separate conda environments, one for each of the four components (BoxMOTS, GMA, SSIS, StrongSORT), to avoid package conflicts.