Merge pull request #17 from SimarKareer/merged_branch
Merged branch (accel + discrim)
SimarKareer authored Jan 27, 2024
2 parents 391bd61 + a1a8f23 commit 62b1142
Showing 200 changed files with 34,863 additions and 3,138 deletions.
35 changes: 31 additions & 4 deletions .gitignore
@@ -193,7 +193,34 @@ tools/exps/mic_viper/mic_viper_v3a2.sh
tools/exps/mic_viper/mic_viper_v3overcap.sh
tools/exps/lwarp2
tools/exps/lwarp
*_4.sh
*_6.sh
*_1.sh
*_*.sh
tools/testing
*4.sh
*6.sh
*1.sh
*5.sh
# *_*.sh
computationGraph.dot
computationGraph.png
computationGraph2.png
computationGraph2.svg
missingkeys.txt
multInher.py
tools/slurm_train_analysis_salloc.sh
tools/exps/debug/speedDebug.sh
tools/exps/filter_eval/eval.sh
tools/exps/lwarpv3/lwarp copy.sh
tools/exps/lwarpv3/lwarp_debugPreempt.sh
tools/exps/lwarpv3/lwarp_overcapDebug.sh
tools/exps/lwarpv3/lwarp_sourceonly.sh
tools/exps/lwarpv3/lwarp_warp1e-1mix1-FILL-PLWeight_8.sh
tools/exps/lwarpv4/lwarp_debug_2.sh
tools/exps/lwarpv4/lwarp.sh
tools/exps/mmdg/mmdg.sh
tools/exps/mmv1/mm_daformer_04-20-2023-16-21-17.sh
tools/exps/mmv1/mmHRDA.sh
tools/exps/mmv3/mmv3RGB.sh
tools/exps/sourceEval/sourceEval.sh
tools/exps/sourceEval/sourceEval2.sh
tools/testing/
tools/exps/*
testing/
271 changes: 270 additions & 1 deletion README.md
@@ -1,7 +1,7 @@
# VideoDA
Domain adaptation for semantic segmentation using video!

This repo is built off of mmseg. I used the [MIC repo](https://github.com/lhoyer/MIC/tree/master)
This repo is built off of mmsegmentation, using the [MIC repo](https://github.com/lhoyer/MIC/tree/master) as a starting point.

## Installation
Modification of these [instructions](https://github.com/lhoyer/MIC/tree/master/seg).
@@ -12,3 +12,272 @@
- `git submodule update --recursive` will pull my mmcv submodule
- Simply run `MMCV_WITH_OPS=1 pip install -e . -v` inside the `submodules/mmcv` directory
4. `pip install -e .` inside mmseg root dir


## Key Contributions to mmsegmentation Repo
We have made a number of key contributions to this open-source mmsegmentation repo to support video domain adaptive segmentation experiments, for future researchers to build on.

First, we consolidated the HRDA + MIC works into the mmsegmentation repository. By adding this SOTA ImageDA work to the repository, researchers can easily switch between models, backbones, segmentation heads, and architectures for experimentation and ablation studies.

We added key datasets for the VideoDA benchmark (ViperSeq -> CityscapesSeq, SynthiaSeq -> CityscapesSeq) to mmsegmentation, along with our own constructed shifts (ViperSeq -> BDDVid, SynthiaSeq -> BDDVid), and added the capability to load consecutive images along with the corresponding optical flow at a specified frame distance. This enables researchers to easily start work on VideoDA-related problems or benchmark current ImageDA approaches in this setting.
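
As a rough illustration of the consecutive-frame loading described above, the sketch below pairs a frame with a neighbor at a chosen frame distance and locates the corresponding flow file. The function name and file layout are hypothetical; the actual logic lives in `mmseg/datasets/seqUtils.py` and the dataset classes listed further down.

```python
# Hypothetical sketch of consecutive-frame + flow loading; names and file layout
# are assumptions, the real implementation is in mmseg/datasets/seqUtils.py.
from pathlib import Path

def build_seq_sample(img_dir, flow_dir, frame_id, frame_distance=1):
    """Pair frame t with frame t - frame_distance and the flow between them."""
    cur_img = Path(img_dir) / f"{frame_id:06d}.png"
    adj_img = Path(img_dir) / f"{frame_id - frame_distance:06d}.png"
    flow = Path(flow_dir) / f"{frame_id:06d}.flo"  # assumed flow file naming
    return {"img": cur_img, "img_adjacent": adj_img, "flow": flow}

sample = build_seq_sample("data/train/img", "data/train/flow", frame_id=42, frame_distance=2)
```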

In addition, we provide implementations of common VideoDA techniques such as video discriminators, ACCEL architectures + consistent mixup, and a variety of pseudo-label refinement strategies.
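
For intuition, the simplest of these refinement strategies, the consistency filter used in the experiment tables below, can be sketched as follows. This is a minimal illustration that assumes the previous frame's pseudo-label has already been warped into the current frame with optical flow; it is not the repo's exact implementation (see `mmseg/models/uda/dacs.py`).

```python
import torch

def consistency_filter(pl_cur, pl_prev_warped, ignore_index=255):
    """Keep a pixel's pseudo-label only where the current-frame prediction agrees
    with the previous frame's prediction warped into the current frame."""
    refined = pl_cur.clone()
    refined[pl_cur != pl_prev_warped] = ignore_index  # disagreements are ignored in the loss
    return refined

# Toy usage: (H, W) tensors of class indices.
pl_cur = torch.randint(0, 19, (4, 4))
pl_prev_warped = torch.randint(0, 19, (4, 4))
print(consistency_filter(pl_cur, pl_prev_warped))
```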

All experiments we report in our paper are available in the repository, with each experiment's corresponding bash script to help with reproducibility. The next section covers these scripts.

The following files are where key changes were made:

**VideoDA Dataset Support**
- `mmseg/datasets/viperSeq.py`
- `mmseg/datasets/cityscapesSeq.py`
- `mmseg/datasets/SynthiaSeq.py`
- `mmseg/datasets/bddSeq.py`

**Consecutive Frame/Optical Flow Support**
- `mmseg/datasets/seqUtils.py`
- `tools/aggregate_flows/flow/my_utils.py`
- `tools/aggregate_flows/flow/util_flow.py`

**VideoDA techniques**
- Video Discriminator:
- `mmseg/models/uda/dacsAdvseg.py`
- PL Refinement:
- `mmseg/models/uda/dacs.py`
- ACCEL + Consistent Mixup:
- `mmseg/models/segmentors/accel_hrda_encoder_decoder.py`
- `mmseg/models/utils/dacs_transforms.py`

**Dataset and Model Configurations**
- `configs/_base_/datasets/*`
- `configs/mic/*`

**Experiment Scripts**
- `tools/experiments/*`

## Dataset Setup (Cityscapes-Seq, Synthia-Seq, Viper)

Please download the following datasets, which will be used in the Video-DAS experiments:

Datasets:
* [Cityscapes-Seq](https://www.cityscapes-dataset.com/)
```bash
mmseg/datasets/cityscapesSeq/ % dataset root
mmseg/datasets/cityscapesSeq/leftImg8bit_sequence % leftImg8bit_sequence_trainvaltest.zip
mmseg/datasets/cityscapesSeq/gtFine % gtFine_trainvaltest.zip
```

* [Viper](https://playing-for-benchmarks.org/download/):
```bash
mmseg/datasets/viper/ % dataset root
mmseg/datasets/viper/train/img % Images: Frames: *0, *1, *[2-9]; Sequences: 01-77; Format: jpg
mmseg/datasets/viper/train/cls % Semantic Class Labels: Frames: *0, *1, *[2-9]; Sequences: 01-77; Format: png
```

* [Synthia-Seq](http://synthia-dataset.cvc.uab.cat/SYNTHIA_SEQS/SYNTHIA-SEQS-04-DAWN.rar)

We will use `SEQS-04-DAWN/RGB/Stereo_Left/Omni_F` and `SEQS-04-DAWN/GT/LABELS/Stereo_Left/Omni_F`
```bash
mmseg/datasets/SynthiaSeq/ % SYNTHIA-Seq dataset root
mmseg/datasets/SynthiaSeq/SYNTHIA-SEQS-04-DAWN % SYNTHIA-SEQS-04-DAWN
```

After downloading all datasets, we must generate sample class statistics on our source datasets (Viper, Synthia-Seq) and convert class labels into Cityscapes-Seq classes.

For both Viper and Synthia-Seq, perform the following:
```bash
python tools/convert_datasets/viper.py datasets/viper --gt-dir train/cls/

python tools/convert_datasets/synthiaSeq.py datasets/SynthiaSeq/SYNTHIA-SEQS-04-DAWN --gt-dir GT/LABELS/Stereo_Left/Omni_F
```

## Creating BDDVid VideoDA Shift
We introduce support for a new target-domain dataset derived from BDD10k, which to our knowledge has not been studied previously in the context of Video-DAS. BDD10k contains 10,000 driving images across a variety of conditions. Of these 10,000 images, we identify 3,429 with valid corresponding video clips in the BDD100k dataset, making this subset suitable for Video-DAS. We refer to this subset as BDDVid. Next, we split these 3,429 images into 2,999 train samples and 430 evaluation samples. In BDD10k, the labeled frame is generally the 10th second of the 40-second clip, but not always. To mitigate this, we ultimately only evaluate on images in BDD10k that perfectly correspond with the segmentation annotation, while at training time we use frames directly extracted from BDD100k video clips. The instructions below detail how to set up the BDDVid dataset.

1. **Download Segmentation Labels for BDD10k (https://bdd-data.berkeley.edu/portal.html#download) images.**

2. **Download all BDD100k video parts:**

```bash
cd datasets/BDDVid/setup/download
python download.py --file_name bdd100k_video_links.txt
```

Note: Make sure to specify the correct output directory in `download.py` for where you want the video zips to be stored.

3. **Unzip all video files**

```bash
cd ../unzip
python unzip.py
```

Note: In `unzip.py`, make sure to specify the directory where the video zips are stored and the output directory where the files should be unzipped.

4. **Unpack each video sequence and extract the corresponding frame**

```bash
cd ../unpack_video
```

Create a text file with paths to each video unzipped. Refer to `video_path_train.txt` and `video_path_val.txt` as an example.
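
One way to generate such a list is sketched below; the unzip root and the `.mov` extension are assumptions, so adjust them to match your layout.

```python
# Hypothetical helper for building video_path_train.txt; adjust videos_root and
# the file extension to wherever unzip.py placed the videos.
from pathlib import Path

videos_root = Path("datasets/BDDVid/videos/train")
with open("video_path_train.txt", "w") as f:
    for video in sorted(videos_root.rglob("*.mov")):
        f.write(f"{video}\n")
```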

```bash
python unpack_video.py
```

Note: You will run the script twice, once per split (train and val). Edit the `split` variable to specify train or val, and the `file_path` variable, which points to the list of all video paths for the given split.

Also, note that through experimentation and analysis, we determined that frame 307 of each video is the closest to the labeled image in the BDD10k dataset. We deal with the possible slight label mismatch in later steps.

5. **Download [BDD10k](https://bdd-data.berkeley.edu/portal.html#download) ("10k Images") and its labels ("Segmentation" tab), and unzip them.**

6. **Copy Segmentation labels for train and val in BDDVid**

```bash
cd ../bdd10k_img_labels
python grab_labels.py
```

Note: Run this twice, once for each split (train, val). Edit `orig_dataset` with the path to the original BDD10k dataset train split, which was downloaded in step 5.

7. **Fix Image-Label Mismatch**

We will create two new folders to deal with the image-label mismatch at frame 307 described in step 4.

(1) `train_orig_10k`
- Same as `train`, but frame 307 is taken from the original BDD10k dataset. Use this directory for supervised BDD jobs.

(2) `val_orig_10k`
- Same as `val`, but frame 307 is taken from the original BDD10k dataset. *ALWAYS* use this split, since validation should be computed over the actual image and label.

```bash
python get_orig_images.py
```

Note: Run this twice, once for each split (train, val). Edit `orig_dataset` with the path to the original BDD10k dataset train split, which was downloaded in step 5.


BDDVid is now fully set up! For UDA jobs, use the `train` and `val_orig_10k` splits. For supervised jobs with BDDVid, use `train_orig_10k` and `val_orig_10k`.


## Generated Flows Dataset

In order to leverage temporal information in videos, many of our approaches rely heavily on optical flow between sequential frames. For each dataset, we generated flows using [FlowFormer](https://github.com/drinkingcoder/FlowFormer-Official). We have hosted all our generated flows [here](https://huggingface.co/datasets/hoffman-lab/Unified-VideoDA-Generated-Flows) and provide instructions for downloading each dataset's flows.
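
For example, the hosted flows can be fetched with `huggingface_hub`; the local directory and the optional pattern filter below are assumptions, so choose whatever fits your setup.

```python
# Sketch of downloading the hosted flows; local_dir and allow_patterns are placeholders.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="hoffman-lab/Unified-VideoDA-Generated-Flows",
    repo_type="dataset",
    local_dir="datasets/flows",
    # allow_patterns=["cityscapesSeq/*"],  # optionally restrict to one dataset's flows
)
```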


## Reproducing Experiment Results

All experiments conducted in the paper have corresponding scripts inside the repository for reproducibility.

All experiment scripts are located in `tools/experiments/*`, with scripts being separated by the different shifts and VideoDA techniques.

### Viper -> CityscapesSeq
<br>

<ins>**Table 2: ImageDA methods on VideoDA benchmarks:**</ins>

Segformer Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC| `./tools/experiments/viper_csSeq/baselines/viper_csseq_mic_hrda.sh` |
| HRDA | `./tools/experiments/viper_csSeq/baselines/viper_csseq_hrda.sh` |
| Target Only| `./tools/experiments/csSeq/supervised/csSeq_supervised_hrda.sh` |
| Source Only | `./tools/experiments/viper_csSeq/baselines/viper_source_hrda.sh` |

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC| `./tools/experiments/viper_csSeq/baselines/viper_csseq_mic_hrda_dlv2.sh` |
| HRDA | `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_hrda_dlv2_consis.sh` |
| Target Only| `./tools/experiments/csSeq/supervised/csSeq_supervised_hrda_dlv2.sh` |
| Source Only| `./tools/experiments/viper_csSeq/baselines/viper_source_hrda_dlv2.sh` |
<br>

<ins>**Table 3: HRDA + MIC Ablation Study**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA - MRFusion| `./tools/experiments/viper_csSeq/mic_hrda_component_ablation/viper_csseq_hrda_dlv2_no_MRFusion.sh` |
| HRDA - MRFusion - Rare class sampling | `./tools/experiments/viper_csSeq/mic_hrda_component_ablation/viper_csseq_hrda_dlv2_no_MRFusion_no_rcs.sh` |
| HRDA - MRFusion - Rare class sampling - ImgNet feature distance reg| `./tools/experiments/viper_csSeq/mic_hrda_component_ablation/viper_csseq_hrda_dlv2_no_MRFusion_no_rcs_no_imnet.sh` |

<br>

<ins>**Table 4: Combining existing Video-DA methods with HRDA**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| Source Only| `./tools/experiments/viper_csSeq/baselines/viper_source_hrda_dlv2.sh` |
| HRDA | `./tools/experiments/viper_csSeq/baselines/viper_csseq_mic_hrda_dlv2.sh` |
| (TPS) HRDA + Accel + Consis Mixup + PL Refine Warp Frame | `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_warp_frame.sh` |
| (DAVSN) HRDA + Accel + Consis Mixup + Video Discrim + PL Refine Max confidence| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_video_discrim_consis_filter.sh` |
| (UDA-VSS) HRDA + Accel + Video Discrim + PL Refine Consis Filter| `tools/experiments/viper_csSeq/video_discrim/viper_csseq_hrda_dlv2_video_discrim_consis.sh` |
| (MOM) HRDA + Accel + Consis Mixup + PL Refine Consis Filter| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_consis_filter.sh` |
| HRDA + Video Discrim. | `./tools/experiments/viper_csSeq/video_discrim/viper_csseq_hrda_dlv2_video_discrim.sh` |
| HRDA + Accel + Consis Mixup| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup.sh` |
| HRDA + PL refine Consis Filter| `tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_mic_hrda_dlv2_consis.sh` |
| HRDA + Accel + Consis Mixup + Video Discrim| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_video_discrim.sh` |
| HRDA + Accel + Consis Mixup + Video Discrim + PL Refine Consis Filter| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_video_discrim_consis_filter.sh` |
| Target Only| `./tools/experiments/csSeq/supervised/csSeq_supervised_hrda_dlv2.sh` |

<br>
<br>


Note: [For Tables 5-8] To train with forward or backward flow, edit `FRAME_OFFSET` (positive values = forward, negative values = backward) in `configs/_base_/datasets/uda_viper_CSSeq.py`, along with `cs_train_flow_dir` and `cs_val_flow_dir`.
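
For instance, the relevant settings in `configs/_base_/datasets/uda_viper_CSSeq.py` look roughly like the sketch below; the flow directory paths are placeholders and the surrounding config is omitted.

```python
# Sketch of the frame-offset / flow-direction settings; paths are placeholders.
FRAME_OFFSET = 1  # positive = forward flow, negative = backward flow

cs_train_flow_dir = "datasets/cityscapesSeq/flow/forward/train"
cs_val_flow_dir = "datasets/cityscapesSeq/flow/forward/val"
```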

<ins>**Table 5: Pseudo-label Refinement on HRDA + MIC, Segformer Backbone**</ins>

Segformer Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_mic_hrda_max_conf.sh` |
| HRDA + MIC + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/rare_class_filter/viper_csseq_mic_hrda_rare_class_filter.sh` |
| HRDA + MIC + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_mic_hrda_warp_frame.sh` |
| HRDA + MIC + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_mic_hrda_oracle.sh` |
<br>

<ins>**Table 6: Pseudo-label Refinement on HRDA, Segformer Backbone**</ins>

Segformer Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_hrda_consis.sh` |
| HRDA + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_hrda_max_conf.sh` |
| HRDA + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_hrda_warp_frame.sh` |
| HRDA + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_hrda_oracle.sh` |
<br>

<ins>**Table 7: Pseudo-label Refinement on HRDA + MIC, DLV2 Backbone**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_mic_hrda_dlv2_consis.sh` |
| HRDA + MIC + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_mic_hrda_dlv2_max_conf.sh` |
| HRDA + MIC + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_mic_hrda_dlv2_warp_frame.sh` |
| HRDA + MIC + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_mic_hrda_dlv2_oracle.sh` |

<br>

<ins>**Table 8: Pseudo-label Refinement on HRDA, DLV2 Backbone**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_hrda_dlv2_consis.sh` |
| HRDA + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_hrda_dlv2_max_conf.sh` |
| HRDA + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_hrda_dlv2_warp_frame.sh` |
| HRDA + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_hrda_dlv2_oracle.sh` |

### Other Shifts

The SynthiaSeq -> CityscapesSeq, SynthiaSeq -> BDDVid, and ViperSeq -> BDDVid experiment scripts follow the same directory structure as the Viper experiments. You can find all relevant experiments reported in the paper in `tools/experiments/*`.


