Merge pull request #6 from TillBeemelmanns/main
Major Update - Support for nuScenes dataset, Semantic Kitti, update to Tensorflow 2.9
TillBeemelmanns authored Jul 20, 2022
2 parents 3264853 + 044132b commit 8fbc021
Showing 124 changed files with 2,338 additions and 182 deletions.
117 changes: 73 additions & 44 deletions README.md
[![DOI](https://zenodo.org/badge/346761395.svg)](https://zenodo.org/badge/latestdoi/346761395)
## Semantic Segmentation of LiDAR Point Clouds in Tensorflow 2.9.1 with SqueezeSeg

![](assets/video2.gif)

This repository contains implementations of SqueezeSegV2 [1], Darknet53 [2] and Darknet21 [2] for semantic point cloud
segmentation in Keras/Tensorflow 2.9.1. The repository contains the model architectures, training, evaluation and
visualisation scripts. We also provide scripts to load and train on the public datasets
[Semantic Kitti](http://www.semantic-kitti.org/) and [NuScenes](https://www.nuscenes.org/).

## Usage


#### Installation
All required libraries are listed in the `requirements.txt` file. You may install them within a
[virtual environment](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/#creating-a-virtual-environment)
with:
```bash
pip install -r requirements.txt
```

For visualizations using matplotlib, you will need to install `tkinter`:

`sudo apt-get install python3-tk`

#### Data Format
This repository relies on the data format as in [1]. A dataset has the following file structure:
```
.
├── ImageSet
│   ├── train.txt
│   ├── val.txt
│   └── test.txt
├── train
├── val
└── test
```
The data samples are located in the directories `train`, `val` and `test`.

A data sample is stored as a numpy `*.npy` file. Each file contains
a tensor of size `height X width X 6`. The 6 channels correspond to

0. X-Coordinate in [m]
1. Y-Coordinate in [m]
2. Z-Coordinate in [m]
3. Intensity
4. Depth in [m]
5. Label ID

For points in the point cloud that are not present (e.g. due to no reflection), all channel values are set to zero.
A sample dataset can be found in the directory `data`.
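To make the sample layout concrete, here is a minimal sketch (not part of the repository) that loads one such sample with NumPy and separates the five input channels from the label channel; the file name is a placeholder.

```python
import numpy as np

# Load one data sample; the array has shape (height, width, 6).
sample = np.load("sample.npy")  # placeholder file name

# Channels 0-4 (X, Y, Z, intensity, depth) are the network inputs,
# channel 5 holds the label ID for every pixel of the range image.
inputs = sample[:, :, :5]
labels = sample[:, :, 5].astype(np.int32)

print(inputs.shape, labels.shape)
```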

#### Sample Dataset
This repository provides several sample datasets which can be used as templates for your own dataset. The directory
`dataset_samples` contains the directories
```
.
├── nuscenes
├── sample_dataset
├── semantic_kitti
```
Each directory in turn contains a `train` and `val` split with 32 train samples and 3 validation samples.

#### Data Normalization
For a proper data normalization it is necessary to iterate over the training set and determine the __mean__ and __std__
values for each of the input fields. The script `preprocessing/inspect_training_data.py` computes these statistics:
```bash
# pclsegmentation/pcl_segmentation
$ python3 preprocessing/inspect_training_data.py \
--input_dir="../sample_dataset/train/" \
--output_dir="../sample_dataset/ImageSet"
--input_dir="../dataset_samples/sample_dataset/train/" \
--output_dir="../dataset_samples/sample_dataset/ImageSet"
```
The glob pattern `*.npy` is applied to the `input_dir` path. The script computes and prints the mean and std values
for the five input fields. These values should be set in the configuration files in
[pcl_segmentation/configs](pcl_segmentation/configs) as the arrays `mc.INPUT_MEAN` and `mc.INPUT_STD`.
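As a rough illustration of what this normalization step computes, the following sketch accumulates the per-channel mean and std over all training samples. It is not the repository's `inspect_training_data.py`; the directory path is only an example and empty points are not masked out.

```python
import glob
import numpy as np

# Gather all training samples via the same *.npy glob pattern.
files = glob.glob("../dataset_samples/sample_dataset/train/*.npy")

# Stack the five input channels of every sample into one (N, 5) array.
channels = np.concatenate(
    [np.load(f)[:, :, :5].reshape(-1, 5) for f in files], axis=0
)

print("mc.INPUT_MEAN:", channels.mean(axis=0))
print("mc.INPUT_STD: ", channels.std(axis=0))
```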

#### Training
The training of the segmentation networks can be invoked with the `train.py` script. It is possible to choose between
three different network architectures: `squeezesegv2` [1], `darknet21` [2] and `darknet53` [2].
The training script uses the dataset splits `train` and `val`. The metrics for both splits are constantly computed
during training. The Tensorboard callback also uses the `val` split for visualisation of the current model prediction.
```bash
$ python3 train.py \
    ...
```

Inference can be run with the `inference.py` script:
```bash
$ python3 inference.py \
    ...
```

## Docker
We also provide a Docker environment for __training__, __evaluation__ and __inference__. All scripts can be found in the
directory `docker`.

First, build the Docker environment with the build script provided in the directory `docker`.

For inference on the sample dataset execute:
```bash
./docker_inference.sh
```

## Datasets
In the directory [dataset_convert](dataset_convert) you will find conversion scripts which convert the following datasets
into a format that can be read by the data pipeline implemented in this repository.

### NuScenes
Make sure that you have installed `nuscenes-devkit` and that you have downloaded the nuScenes dataset correctly. Then
execute the script `nu_dataset.py`:
```bash
# dataset_convert
$ python3 nu_dataset.py \
--dataset /root/path/nuscenes \
--output_dir /root/path/nuscenes/converted
```
The script will generate `*.npy` files in the directory `converted` and automatically create a train/val split.
Make sure to create the two empty directories `train` and `val` beforehand. The current implementation will also
perform a class reduction.


### Semantic Kitti
The [Semantic Kitti](http://www.semantic-kitti.org/) dataset can be converted with the script `semantic_kitti.py`.
```bash
# dataset_convert
$ python3 semantic_kitti.py \
--dataset /root/path/semantic_kitti \
--output_dir /root/path/semantic_kitti/converted
```
The script will generate `*.npy` files in the directory `converted` and automatically create a train/val split.
Make sure to create the two empty directories `train` and `val` beforehand. The current implementation will also
perform a class reduction.

### Generic PCD dataset
The script [`pcd_dataset.py`](dataset_convert/pcd_dataset.py) allows the conversion of a labeled `*.pcd` dataset.
As the input dataset, define the directory that contains all `*.pcd` files. The `*.pcd` files need to have the field
`label`. Check the script for more details.

```bash
# dataset_convert
$ python3 pcd_dataset.py \
--dataset /root/path/pcd_dataset \
--output_dir /root/path/pcd_dataset/converted
```
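For orientation, here is a minimal sketch of how such a labeled `*.pcd` file could be parsed with plain NumPy. It assumes an ASCII-encoded PCD whose `FIELDS` line includes `label` and is not the repository's implementation; the file name is a placeholder.

```python
import numpy as np

def read_ascii_pcd(path):
    """Parse an ASCII *.pcd file and return a dict of per-point field arrays."""
    with open(path) as f:
        lines = f.readlines()
    fields, data_start = [], 0
    for i, line in enumerate(lines):
        if line.startswith("FIELDS"):
            fields = line.split()[1:]  # e.g. ['x', 'y', 'z', 'intensity', 'label']
        if line.startswith("DATA"):
            assert line.split()[1] == "ascii", "this sketch handles ASCII PCDs only"
            data_start = i + 1
            break
    points = np.loadtxt(lines[data_start:])
    return {name: points[:, idx] for idx, name in enumerate(fields)}

pcd = read_ascii_pcd("cloud.pcd")       # placeholder file name
labels = pcd["label"].astype(np.int32)  # per-point semantic label
```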

## Tensorboard
The implementation also contains a Tensorboard callback which visualizes the most important metrics such as the __confusion
matrix__, __IoUs__, __MIoU__, __Recalls__, __Precisions__, __Learning Rates__, different __losses__ and the current
model prediction.

The network architectures are based on
- [1] [SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a
LiDAR Point Cloud](https://github.com/xuanyuzhou98/SqueezeSegV2)
- [2] [RangeNet++: Fast and Accurate LiDAR Semantic Segmentation](https://github.com/PRBonn/lidar-bonnetal)
- [3] [Semantic Kitti](http://www.semantic-kitti.org/)
- [4] [nuScenes](https://www.nuscenes.org/)

### TODO
- [x] Faster input pipeline using TFRecords preprocessing
- [x] Docker support
- [ ] Implement CRF Postprocessing for SqueezeSegV2
- [x] Implement dataloader for Semantic Kitti dataset
- [x] Implement dataloader for nuScenes dataset
- [ ] None class handling
- [ ] Add results for Semantic Kitti and nuScenes
- [x] Update to Tensorflow 2.9

### Author of this Repository
[Till Beemelmanns](https://github.com/TillBeemelmanns)

Mail: `till.beemelmanns (at) ika.rwth-aachen.de`

### Citation

We hope the provided code can help in your research. If this is the case, please cite:

```
@misc{Beemelmanns2021,
author = {Till Beemelmanns},
title = {Semantic Segmentation of LiDAR Point Clouds in Tensorflow 2.6},
year = 2021,
url = {https://github.com/ika-rwth-aachen/PCLSegmentation},
doi={10.5281/zenodo.4665751}
}
```
139 changes: 139 additions & 0 deletions dataset_convert/README.md
## nuScenes dataset

This directory contains a script `nu_dataset.py` which converts pointcloud samples in the [nuScenes dataset](https://www.nuscenes.org/nuscenes#lidarseg) to a data format which can be used to train the neural networks developed for PCL segmentation. Moreover, `laserscan.py` includes definitions of some of the classes used in `nu_dataset.py`.

The directory should have the file structure:
```
├── ImageSet
│   ├── train.txt
│   ├── val.txt
│   └── test.txt
├── train
├── val
├── test
├── nu_dataset.py
└── laserscan.py
```
The data samples are located in the directories `train`, `val` and `test`. The `*.txt` files within the directory `ImageSet` contain the filenames for the corresponding samples in data directories.
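As a small illustration of how such a split file could be generated, here is a sketch under the assumption that the `*.txt` files simply list the sample names, one per line, without the `.npy` extension.

```python
import glob
import os

# Write the basenames of all training samples into ImageSet/train.txt.
# Assumption: entries are the sample names without the .npy extension.
samples = sorted(glob.glob("train/*.npy"))
with open(os.path.join("ImageSet", "train.txt"), "w") as f:
    for path in samples:
        f.write(os.path.splitext(os.path.basename(path))[0] + "\n")
```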

### Conversion Script

The conversion script `nu_dataset.py` uses some of the functions and classes defined in the [nuScenes devkit](https://github.com/nutonomy/nuscenes-devkit) and in the [API for SemanticKITTI](https://github.com/PRBonn/semantic-kitti-api#readme). It opens pointcloud scans, spherically projects the points in these scans into 2D and stores these projections as `*.npy` files. Each of these `*.npy` files contains a tensor of size `32 X 1024 X 6`. The 6 channels correspond to

0. X-Coordinate in [m]
1. Y-Coordinate in [m]
2. Z-Coordinate in [m]
3. Intensity
4. Depth in [m]
5. Label ID

The script stores the projections in the `nuscenes_dataset/train` or `nuscenes_dataset/val` directory of the PCL segmentation repository.

```bash
./nu_dataset.py --dataset /path/to/nuScenes/dataset/ --output_dir /path/to/PCLSegmentation/ -v
```
where:
- `dataset` is the path to the nuScenes dataset, i.e. where the `/data/sets/nuscenes` directory is located.
- `output_dir` is the output path to the PCL segmentation repository.
- `v` is a flag. If it is set, the projections are stored in the validation set, otherwise they are stored in the training set.

Instructions on how to use the nuScenes devkit and how to download the dataset can be found [here](https://github.com/nutonomy/nuscenes-devkit#nuscenes-lidarseg).
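For intuition about the projection step described above, the sketch below maps a point cloud onto a `32 X 1024` grid. It is not the implementation in `nu_dataset.py`; the vertical field-of-view limits are illustrative assumptions.

```python
import numpy as np

def spherical_projection(points, labels, H=32, W=1024, fov_up=10.0, fov_down=-30.0):
    """Project (N, 4) points [x, y, z, intensity] into an (H, W, 6) range image."""
    x, y, z, intensity = points.T
    depth = np.linalg.norm(points[:, :3], axis=1)

    yaw = np.arctan2(y, x)                          # azimuth angle
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))  # elevation angle

    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * W                         # column index from azimuth
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * H  # row index from elevation

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    image = np.zeros((H, W, 6), dtype=np.float32)
    image[v, u, 0], image[v, u, 1], image[v, u, 2] = x, y, z
    image[v, u, 3], image[v, u, 4], image[v, u, 5] = intensity, depth, labels
    return image
```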


## SemanticKITTI dataset

This directory contains the script `semantic_kitti.py` which converts pointcloud samples in the [SemanticKITTI dataset](http://www.semantic-kitti.org/) to a data format which can be used to train the neural networks developed for PCL segmentation. It also includes a small `train` and `val` split with 20 samples and 2 samples, respectively.

### Conversion Scripts

The scripts use some of the functions and classes defined in [API for SemanticKITTI](https://github.com/PRBonn/semantic-kitti-api#readme). They open pointcloud scans, project the points in these scans into 2D and store these projections as `*.npy` files. Each of these `*.npy` files contains a tensor of size `64 X 1024 X 6`. The 6 channels correspond to

0. X-Coordinate in [m]
1. Y-Coordinate in [m]
2. Z-Coordinate in [m]
3. Intensity (with range [0-255])
4. Depth in [m]
5. Label ID

#### Downloading SemanticKITTI dataset

Before running the scripts, the SemanticKITTI dataset must be downloaded. Information about the files in this dataset and how to download it is provided [here](http://www.semantic-kitti.org/dataset.html).

The SemanticKITTI dataset is organized in the following format:

```
/kitti/dataset/
└── sequences/
    ├── 00/
    │   ├── poses.txt
    │   ├── image_2/
    │   ├── image_3/
    │   ├── labels/
    │   │   ├── 000000.label
    │   │   └── 000001.label
    │   ├── voxels/
    │   │   ├── 000000.bin
    │   │   ├── 000000.label
    │   │   ├── 000000.occluded
    │   │   ├── 000000.invalid
    │   │   ├── 000001.bin
    │   │   ├── 000001.label
    │   │   ├── 000001.occluded
    │   │   └── 000001.invalid
    │   └── velodyne/
    │       ├── 000000.bin
    │       └── 000001.bin
    ├── 01/
    ├── 02/
    .
    .
    .
    └── 21/
```
#### Using API for SemanticKITTI

##### semantic_kitti_sequence.py

The script projects the scans in the specified sequence and stores the projections in the `semantic_kitti_dataset/train` or `semantic_kitti_dataset/val` directory of the PCL segmentation repository.

```bash
./semantic_kitti_sequence.py --sequence 00 --dataset /path/to/kitti/dataset/ --output_dir /path/to/PCLSegmentation/ -v
```
where:
- `sequence` is the sequence to be accessed (optional, default value is 00).
- `dataset` is the path to the kitti dataset where the `sequences` directory is located.
- `output_dir` is the output path to the PCL segmentation repository.
- `v` is a flag. If it is set, the projections are stored in the validation set, otherwise they are stored in the training set.

##### semantic_kitti.py

The script randomly picks a specified number of scans from all sequences and stores their projections in the `semantic_kitti_dataset/train` and `semantic_kitti_dataset/val` directories of the PCL segmentation repository.

```bash
./semantic_kitti.py --dataset /path/to/kitti/dataset/ --output_dir /path/to/PCLSegmentation/ -n 20 -s 0.8 -v
```
where:
- `dataset` is the path to the kitti dataset where the `sequences` directory is located.
- `output_dir` is the output path to the PCL segmentation repository.
- `n` is the number of samples (projections) to be used in the training and validation sets. Maximum is 23201. Default is 20.
- `s` is the split ratio of samples between the training and validation sets. It should be between 0 and 1. Default is 0.9.
- `v` is a flag. If it is set, the projections consist of 32 layers instead of 64. The script extracts 32 specified layers from the SemanticKITTI projections, which are 64-layered.

### Generalization to VLP-32 Data

The ultimate goal is to have a network which is trained on the higher-resolution KITTI dataset and performs semantic segmentation on VLP-32 lidar data. The KITTI pointcloud projections used in training should therefore be modified in a way that makes them similar to VLP-32 data. One method is to extract 32 specified layers from the KITTI point cloud projections. However, the network has not been able to generalize well to VLP-32 data yet. The tested layer configurations are as follows.

#### Tested Layer Configurations

The `layers` array, which is defined in the `conversion_3.py` script, specifies the 32 layers that will be extracted from the KITTI projections; a small extraction sketch is shown after the list below.

- layers = np.arange(16,48)
- configuration 3 is used, but intensity is not used as a feature in semantic segmentation.
- layers = np.concatenate([np.array([14, 15, 17, 24, 26, 30, 31, 34, 36, 37, 39, 41, 43, 45]), np.arange(46, 64)])
- layers = [0, 4, 8, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 30, 31, 32, 33, 35, 37, 39, 41, 43, 45, 47, 49, 50, 51, 52, 55, 59, 63]
- directly projecting KITTI point clouds into 32-layered projections instead of extracting 32 layers from 64-layered projections.
- layers = np.concatenate([np.array([14, 15, 16, 17, 25, 26, 27, 31, 33, 36, 39, 41, 43, 45]), np.arange(46, 64)])
- layers = np.arange(1,64,2)
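
As a small illustration of this layer extraction, the sketch below keeps every second of the 64 KITTI layers (the last configuration in the list above); the projection file name is a placeholder and the `64 X 1024 X 6` shape follows the format described earlier.

```python
import numpy as np

# One of the tested configurations: every second of the 64 KITTI layers.
layers = np.arange(1, 64, 2)

projection_64 = np.load("kitti_projection.npy")  # placeholder, shape (64, 1024, 6)
projection_32 = projection_64[layers, :, :]      # resulting shape (32, 1024, 6)

np.save("kitti_projection_32.npy", projection_32)
```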

