Merge pull request #17 from SimarKareer/merged_branch
Merged branch (accel + discrim)
SimarKareer authored Jan 27, 2024
2 parents 391bd61 + a1a8f23 commit 62b1142
Showing 200 changed files with 34,863 additions and 3,138 deletions.
35 changes: 31 additions & 4 deletions .gitignore
@@ -193,7 +193,34 @@ tools/exps/mic_viper/mic_viper_v3a2.sh
tools/exps/mic_viper/mic_viper_v3overcap.sh
tools/exps/lwarp2
tools/exps/lwarp
*_4.sh
*_6.sh
*_1.sh
*_*.sh
tools/testing
*4.sh
*6.sh
*1.sh
*5.sh
# *_*.sh
computationGraph.dot
computationGraph.png
computationGraph2.png
computationGraph2.svg
missingkeys.txt
multInher.py
tools/slurm_train_analysis_salloc.sh
tools/exps/debug/speedDebug.sh
tools/exps/filter_eval/eval.sh
tools/exps/lwarpv3/lwarp copy.sh
tools/exps/lwarpv3/lwarp_debugPreempt.sh
tools/exps/lwarpv3/lwarp_overcapDebug.sh
tools/exps/lwarpv3/lwarp_sourceonly.sh
tools/exps/lwarpv3/lwarp_warp1e-1mix1-FILL-PLWeight_8.sh
tools/exps/lwarpv4/lwarp_debug_2.sh
tools/exps/lwarpv4/lwarp.sh
tools/exps/mmdg/mmdg.sh
tools/exps/mmv1/mm_daformer_04-20-2023-16-21-17.sh
tools/exps/mmv1/mmHRDA.sh
tools/exps/mmv3/mmv3RGB.sh
tools/exps/sourceEval/sourceEval.sh
tools/exps/sourceEval/sourceEval2.sh
tools/testing/
tools/exps/*
testing/
271 changes: 270 additions & 1 deletion README.md
@@ -1,7 +1,7 @@
# VideoDA
Domain adaptation for semantic segmentation using video!

This repo is built off of mmseg. I used the [MIC repo](https://github.com/lhoyer/MIC/tree/master)
This repo is built off of mmsegmentation, using the [MIC repo](https://github.com/lhoyer/MIC/tree/master) as a starting point.

## Installation
Modification of these [instructions](https://github.com/lhoyer/MIC/tree/master/seg).
@@ -12,3 +12,272 @@
- `git submodule update --recursive` will pull my mmcv submodule
- Simply run `MMCV_WITH_OPS=1 pip install -e . -v` inside the `submodules/mmcv` directory
4. `pip install -e .` inside mmseg root dir


## Key Contributions to mmsegmentation Repo
We have made a number of key contributions to this open-source mmsegmentation repo to support video domain adaptive segmentation experiments, for future researchers to build on.

First, we consolidated the HRDA + MIC works into the mmsegmentation repository. By adding this SOTA ImageDA work to the repository, researchers can easily switch between models, backbones, segmentation heads, and architectures for experimentation and ablation studies.

We added key datasets for the VideoDA benchmark (ViperSeq -> CityscapesSeq, SynthiaSeq -> CityscapesSeq) to mmsegmentation, along with our own constructed shifts (ViperSeq -> BDDVid, SynthiaSeq -> BDDVid), and added the capability to load consecutive images along with the corresponding optical flow at a specified frame distance. This enables researchers to easily start work on VideoDA-related problems or benchmark current ImageDA approaches in this setting.
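
As a rough illustration of the consecutive-frame loading described above, the sketch below pairs a frame with a neighbor at a chosen frame distance and locates the corresponding flow file. The function name and file layout are hypothetical; the actual logic lives in `mmseg/datasets/seqUtils.py` and the dataset classes listed further down.

```python
# Hypothetical sketch of consecutive-frame + flow loading; names and file layout
# are assumptions, the real implementation is in mmseg/datasets/seqUtils.py.
from pathlib import Path

def build_seq_sample(img_dir, flow_dir, frame_id, frame_distance=1):
    """Pair frame t with frame t - frame_distance and the flow between them."""
    cur_img = Path(img_dir) / f"{frame_id:06d}.png"
    adj_img = Path(img_dir) / f"{frame_id - frame_distance:06d}.png"
    flow = Path(flow_dir) / f"{frame_id:06d}.flo"  # assumed flow file naming
    return {"img": cur_img, "img_adjacent": adj_img, "flow": flow}

sample = build_seq_sample("data/train/img", "data/train/flow", frame_id=42, frame_distance=2)
```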

In addition, we provide implementations of common VideoDA techniques such as video discriminators, ACCEL architectures + consistent mixup, and a variety of pseudo-label refinement strategies.
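
For intuition, the simplest of these refinement strategies, the consistency filter used in the experiment tables below, can be sketched as follows. This is a minimal illustration that assumes the previous frame's pseudo-label has already been warped into the current frame with optical flow; it is not the repo's exact implementation (see `mmseg/models/uda/dacs.py`).

```python
import torch

def consistency_filter(pl_cur, pl_prev_warped, ignore_index=255):
    """Keep a pixel's pseudo-label only where the current-frame prediction agrees
    with the previous frame's prediction warped into the current frame."""
    refined = pl_cur.clone()
    refined[pl_cur != pl_prev_warped] = ignore_index  # disagreements are ignored in the loss
    return refined

# Toy usage: (H, W) tensors of class indices.
pl_cur = torch.randint(0, 19, (4, 4))
pl_prev_warped = torch.randint(0, 19, (4, 4))
print(consistency_filter(pl_cur, pl_prev_warped))
```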

All experiments we report in our paper are available in the repository, with each experiment's corresponding bash script to help with reproducibility. The next section covers these scripts.

The following files are where key changes were made:

**VideoDA Dataset Support**
- `mmseg/datasets/viperSeq.py`
- `mmseg/datasets/cityscapesSeq.py`
- `mmseg/datasets/SynthiaSeq.py`
- `mmseg/datasets/bddSeq.py`

**Consecutive Frame/Optical Flow Support**
- `mmseg/datasets/seqUtils.py`
- `tools/aggregate_flows/flow/my_utils.py`
- `tools/aggregate_flows/flow/util_flow.py`

**VideoDA techniques**
- Video Discriminator:
- `mmseg/models/uda/dacsAdvseg.py`
- PL Refinement:
- `mmseg/models/uda/dacs.py`
- ACCEL + Consistent Mixup:
- `mmseg/models/segmentors/accel_hrda_encoder_decoder.py`
- `mmseg/models/utils/dacs_transforms.py`

**Dataset and Model Configurations**
- `configs/_base_/datasets/*`
- `configs/mic/*`

**Experiment Scripts**
- `tools/experiments/*`

## Dataset Setup (Cityscapes-Seq, Synthia-Seq, Viper)

Please download the following datasets, which will be used in the Video-DAS experiments:

Datasets:
* [Cityscapes-Seq](https://www.cityscapes-dataset.com/)
```bash
mmseg/datasets/cityscapesSeq/ % dataset root
mmseg/datasets/cityscapesSeq/leftImg8bit_sequence % leftImg8bit_sequence_trainvaltest.zip
mmseg/datasets/cityscapesSeq/gtFine % gtFine_trainvaltest.zip
```

* [Viper](https://playing-for-benchmarks.org/download/):
```bash
mmseg/datasets/viper/ % dataset root
mmseg/datasets/viper/train/img % Images: Frames: *0, *1, *[2-9]; Sequences: 01-77; Format: jpg
mmseg/datasets/viper/train/cls % Semantic Class Labels: Frames: *0, *1, *[2-9]; Sequences: 01-77; Format: png
```

* [Synthia-Seq](http://synthia-dataset.cvc.uab.cat/SYNTHIA_SEQS/SYNTHIA-SEQS-04-DAWN.rar)

We will use `SEQS-04-DAWN/RGB/Stereo_Left/Omni_F` and `SEQS-04-DAWN/GT/LABELS/Stereo_Left/Omni_F`
```bash
mmseg/datasets/SynthiaSeq/ % SYNTHIA-Seq dataset root
mmseg/datasets/SynthiaSeq/SYNTHIA-SEQS-04-DAWN % SYNTHIA-SEQS-04-DAWN
```

After downloading all datasets, we must generate sample class statistics on our source datasets (Viper, Synthia-Seq) and convert class labels into Cityscapes-Seq classes.

For both Viper and Synthia-Seq, perform the following:
```bash
python tools/convert_datasets/viper.py datasets/viper --gt-dir train/cls/

python tools/convert_datasets/synthiaSeq.py datasets/SynthiaSeq/SYNTHIA-SEQS-04-DAWN --gt-dir GT/LABELS/Stereo_Left/Omni_F
```

## Creating BDDVid VideoDA Shift
We introduce support for a new target-domain dataset derived from BDD10k, which to our knowledge has not been studied previously in the context of Video-DAS. BDD10k contains 10,000 driving images across a variety of conditions. Of these 10,000 images, we identify 3,429 with valid corresponding video clips in the BDD100k dataset, making this subset suitable for Video-DAS. We refer to this subset as BDDVid. Next, we split these 3,429 images into 2,999 train samples and 430 evaluation samples. In BDD10k, the labeled frame is generally the 10th second of the 40-second clip, but not always. To mitigate this, we ultimately only evaluate on images in BDD10k that perfectly correspond with the segmentation annotation, while at training time we use frames directly extracted from BDD100k video clips. The instructions below detail how to set up the BDDVid dataset.

1. **Download Segmentation Labels for BDD10k (https://bdd-data.berkeley.edu/portal.html#download) images.**

2. **Download all BDD100k video parts:**

```bash
cd datasets/BDDVid/setup/download
python download.py --file_name bdd100k_video_links.txt
```

Note: Make sure to specify the correct output directory in `download.py` for where you want the video zips to be stored.

3. **Unzip all video files**

```bash
cd ../unzip
python unzip.py
```

Note: In `unzip.py`, make sure to specify the directory where the video zips are stored and the output directory where the files should be unzipped.

4. **Unpack each video sequence and extract the corresponding frame**

```bash
cd ../unpack_video
```

Create a text file with paths to each video unzipped. Refer to `video_path_train.txt` and `video_path_val.txt` as an example.
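
One way to generate such a list is sketched below; the unzip root and the `.mov` extension are assumptions, so adjust them to match your layout.

```python
# Hypothetical helper for building video_path_train.txt; adjust videos_root and
# the file extension to wherever unzip.py placed the videos.
from pathlib import Path

videos_root = Path("datasets/BDDVid/videos/train")
with open("video_path_train.txt", "w") as f:
    for video in sorted(videos_root.rglob("*.mov")):
        f.write(f"{video}\n")
```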

```bash
python unpack_video.py
```

Note: You will run the script twice, once per split (train and val). Edit the `split` variable to specify train or val, and the `file_path` variable, which points to the list of all video paths for the given split.

Also, note that through experimentation and analysis, we determined that frame 307 of each video is the closest to the labeled image in the BDD10k dataset. We deal with the possible slight label mismatch in later steps.

5. **Download [BDD10k](https://bdd-data.berkeley.edu/portal.html#download) ("10k Images") and its labels ("Segmentation" tab), and unzip them.**

6. **Copy Segmentation labels for train and val in BDDVid**

```bash
cd ../bdd10k_img_labels
python grab_labels.py
```

Note: Run this twice, once for each split (train, val). Edit `orig_dataset` with the path to the original BDD10k dataset train split, which was downloaded in step 5.

7. **Fix Image-Label Mismatch**

We will create two new folders to deal with the image-label mismatch at frame 307 described in step 4.

(1) `train_orig_10k`
- Same as `train`, but frame 307 is taken from the original BDD10k dataset. Use this directory for supervised BDD jobs.

(2) `val_orig_10k`
- Same as `val`, but frame 307 is taken from the original BDD10k dataset. *ALWAYS* use this split, since validation should be computed over the actual image and label.

```bash
python get_orig_images.py
```

Note: Run this twice, once for each split (train, val). Edit `orig_dataset` with the path to the original BDD10k dataset train split, which was downloaded in step 5.


BDDVid is now fully set up! For UDA jobs, use the `train` and `val_orig_10k` splits. For supervised jobs with BDDVid, use `train_orig_10k` and `val_orig_10k`.


## Generated Flows Dataset

In order to leverage temporal information in videos, many of our approaches rely heavily on optical flow between sequential frames. For each dataset, we generated flows using [FlowFormer](https://github.com/drinkingcoder/FlowFormer-Official). We have hosted all our generated flows [here](https://huggingface.co/datasets/hoffman-lab/Unified-VideoDA-Generated-Flows) and provide instructions for downloading each dataset's flows.
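
For example, the hosted flows can be fetched with `huggingface_hub`; the local directory and the optional pattern filter below are assumptions, so choose whatever fits your setup.

```python
# Sketch of downloading the hosted flows; local_dir and allow_patterns are placeholders.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="hoffman-lab/Unified-VideoDA-Generated-Flows",
    repo_type="dataset",
    local_dir="datasets/flows",
    # allow_patterns=["cityscapesSeq/*"],  # optionally restrict to one dataset's flows
)
```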


## Reproducing Experiment Results

All experiments conducted in the paper have corresponding scripts inside the repository for reproducibility.

All experiment scripts are located in `tools/experiments/*`, with scripts being separated by the different shifts and VideoDA techniques.

### Viper -> CityscapesSeq
<br>

<ins>**Table 2: ImageDA methods on VideoDA benchmarks:**</ins>

Segformer Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC| `./tools/experiments/viper_csSeq/baselines/viper_csseq_mic_hrda.sh` |
| HRDA | `./tools/experiments/viper_csSeq/baselines/viper_csseq_hrda.sh` |
| Target Only| `./tools/experiments/csSeq/supervised/csSeq_supervised_hrda.sh` |
| Source Only | `./tools/experiments/viper_csSeq/baselines/viper_source_hrda.sh` |

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC| `./tools/experiments/viper_csSeq/baselines/viper_csseq_mic_hrda_dlv2.sh` |
| HRDA | `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_hrda_dlv2_consis.sh` |
| Target Only| `./tools/experiments/csSeq/supervised/csSeq_supervised_hrda_dlv2.sh` |
| Source Only| `./tools/experiments/viper_csSeq/baselines/viper_source_hrda_dlv2.sh` |
<br>

<ins>**Table 3: HRDA + MIC Ablation Study**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA - MRFusion| `./tools/experiments/viper_csSeq/mic_hrda_component_ablation/viper_csseq_hrda_dlv2_no_MRFusion.sh` |
| HRDA - MRFusion - Rare class sampling | `./tools/experiments/viper_csSeq/mic_hrda_component_ablation/viper_csseq_hrda_dlv2_no_MRFusion_no_rcs.sh` |
| HRDA - MRFusion - Rare class sampling - ImgNet feature distance reg| `./tools/experiments/viper_csSeq/mic_hrda_component_ablation/viper_csseq_hrda_dlv2_no_MRFusion_no_rcs_no_imnet.sh` |

<br>

<ins>**Table 4: Combining existing Video-DA methods with HRDA**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| Source Only| `./tools/experiments/viper_csSeq/baselines/viper_source_hrda_dlv2.sh` |
| HRDA | `./tools/experiments/viper_csSeq/baselines/viper_csseq_mic_hrda_dlv2.sh` |
| (TPS) HRDA + Accel + Consis Mixup + PL Refine Warp Frame | `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_warp_frame.sh` |
| (DAVSN) HRDA + Accel + Consis Mixup + Video Discrim + PL Refine Max confidence| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_video_discrim_consis_filter.sh` |
| (UDA-VSS) HRDA + Accel + Video Discrim + PL Refine Consis Filter| `tools/experiments/viper_csSeq/video_discrim/viper_csseq_hrda_dlv2_video_discrim_consis.sh` |
| (MOM) HRDA + Accel + Consis Mixup + PL Refine Consis Filter| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_consis_filter.sh` |
| HRDA + Video Discrim. | `./tools/experiments/viper_csSeq/video_discrim/viper_csseq_hrda_dlv2_video_discrim.sh` |
| HRDA + Accel + Consis Mixup| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup.sh` |
| HRDA + PL refine Consis Filter| `tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_mic_hrda_dlv2_consis.sh` |
| HRDA + Accel + Consis Mixup + Video Discrim| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_video_discrim.sh` |
| HRDA + Accel + Consis Mixup + Video Discrim + PL Refine Consis Filter| `tools/experiments/viper_csSeq/accel/viper_csseq_hrda_dlv2_accel_consis_mixup_video_discrim_consis_filter.sh` |
| Target Only| `./tools/experiments/csSeq/supervised/csSeq_supervised_hrda_dlv2.sh` |

<br>
<br>


Note: [For Tables 5-8] To train with forward or backward flow, edit `FRAME_OFFSET` (positive values = forward, negative values = backward) in `configs/_base_/datasets/uda_viper_CSSeq.py`, along with `cs_train_flow_dir` and `cs_val_flow_dir`.
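
For instance, the relevant settings in `configs/_base_/datasets/uda_viper_CSSeq.py` look roughly like the sketch below; the flow directory paths are placeholders and the surrounding config is omitted.

```python
# Sketch of the frame-offset / flow-direction settings; paths are placeholders.
FRAME_OFFSET = 1  # positive = forward flow, negative = backward flow

cs_train_flow_dir = "datasets/cityscapesSeq/flow/forward/train"
cs_val_flow_dir = "datasets/cityscapesSeq/flow/forward/val"
```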

<ins>**Table 5: Pseudo-label Refinement on HRDA + MIC, Segformer Backbone**</ins>

Segformer Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_mic_hrda_max_conf.sh` |
| HRDA + MIC + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/rare_class_filter/viper_csseq_mic_hrda_rare_class_filter.sh` |
| HRDA + MIC + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_mic_hrda_warp_frame.sh` |
| HRDA + MIC + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_mic_hrda_oracle.sh` |
<br>

<ins>**Table 6: Pseudo-label Refinement on HRDA, Segformer Backbone**</ins>

Segformer Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_hrda_consis.sh` |
| HRDA + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_hrda_max_conf.sh` |
| HRDA + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_hrda_warp_frame.sh` |
| HRDA + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_hrda_oracle.sh` |
<br>

<ins>**Table 7: Pseudo-label Refinement on HRDA + MIC, DLV2 Backbone**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + MIC + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_mic_hrda_dlv2_consis.sh` |
| HRDA + MIC + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_mic_hrda_dlv2_max_conf.sh` |
| HRDA + MIC + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_mic_hrda_dlv2_warp_frame.sh` |
| HRDA + MIC + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_mic_hrda_dlv2_oracle.sh` |

<br>

<ins>**Table 8: Pseudo-label Refinement on HRDA, DLV2 Backbone**</ins>

DLV2 Backbone:
| Experiment | Training Script |
| ------------- | ------------- |
| HRDA + PL Refine Consis Filter| `./tools/experiments/viper_csSeq/pl_refinement/consis/viper_csseq_hrda_dlv2_consis.sh` |
| HRDA + PL Refine Max Confidence | `./tools/experiments/viper_csSeq/pl_refinement/max_conf/viper_csseq_hrda_dlv2_max_conf.sh` |
| HRDA + PL Refine Warp Frame| `./tools/experiments/viper_csSeq/pl_refinement/warp_frame/viper_csseq_hrda_dlv2_warp_frame.sh` |
| HRDA + PL Refine Oracle | `./tools/experiments/viper_csSeq/pl_refinement/oracle/viper_csseq_hrda_dlv2_oracle.sh` |

### Other Shifts

The SynthiaSeq -> CityscapesSeq, SynthiaSeq -> BDDVid, and ViperSeq -> BDDVid experiment scripts follow the same directory structure as the Viper experiments. You can find all relevant experiments reported in the paper in `tools/experiments/*`.


