Merge pull request #6 from TillBeemelmanns/main
Major Update - Support for nuScenes dataset, Semantic Kitti, update to Tensorflow 2.9
TillBeemelmanns authored Jul 20, 2022
2 parents 3264853 + 044132b commit 8fbc021
Showing 124 changed files with 2,338 additions and 182 deletions.
117 changes: 73 additions & 44 deletions README.md
[![DOI](https://zenodo.org/badge/346761395.svg)](https://zenodo.org/badge/latestdoi/346761395)
## Semantic Segmentation of LiDAR Point Clouds in Tensorflow 2.9.1 with SqueezeSeg

![](assets/video2.gif)

This repository contains implementations of SqueezeSegV2 [1], Darknet53 [2] and Darknet21 [2] for semantic point cloud
segmentation in Keras/Tensorflow 2.9.1. The repository contains the model architectures, training, evaluation and
visualisation scripts. We also provide scripts to load and train on the public datasets
[Semantic Kitti](http://www.semantic-kitti.org/) and [NuScenes](https://www.nuscenes.org/).

## Usage


#### Installation
All required libraries are listed in the `requirements.txt` file. You may install them within a
[virtual environment](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/#creating-a-virtual-environment)
with:
```bash
pip install -r requirements.txt
```

For visualizations using matplotlib, you will need to install `tkinter`:

`sudo apt-get install python3-tk`

#### Data Format
This repository relies on the data format as in [1]. A dataset has the following file structure:
```
.
├── ImageSet
│   ├── train.txt
│   ├── val.txt
│   └── test.txt
├── train
├── val
└── test
```
The data samples are located in the directories `train`, `val` and `test`.

A data sample is stored as a numpy `*.npy` file. Each file contains
a tensor of size `height X width X 6`. The 6 channels correspond to

0. X-Coordinate in [m]
1. Y-Coordinate in [m]
2. Z-Coordinate in [m]
3. Intensity
4. Depth in [m]
5. Label ID

For points in the point cloud that are not present (e.g. due to no reflection), all channel values are set to zero.
A sample dataset can be found in the directory `data`.
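To make the sample layout concrete, here is a minimal sketch (not part of the repository) that loads one such sample with NumPy and separates the five input channels from the label channel; the file name is a placeholder.

```python
import numpy as np

# Load one data sample; the array has shape (height, width, 6).
sample = np.load("sample.npy")  # placeholder file name

# Channels 0-4 (X, Y, Z, intensity, depth) are the network inputs,
# channel 5 holds the label ID for every pixel of the range image.
inputs = sample[:, :, :5]
labels = sample[:, :, 5].astype(np.int32)

print(inputs.shape, labels.shape)
```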

#### Sample Dataset
This repository provides several sample datasets which can be used as templates for your own dataset. The directory
`dataset_samples` contains the directories
```
.
├── nuscenes
├── sample_dataset
├── semantic_kitti
```
Each directory in turn contains a `train` and `val` split with 32 train samples and 3 validation samples.

#### Data Normalization
For a proper data normalization it is necessary to iterate over the training set and determine the __mean__ and __std__
values for each of the input fields. The script `preprocessing/inspect_training_data.py` computes these statistics:
```bash
# pclsegmentation/pcl_segmentation
$ python3 preprocessing/inspect_training_data.py \
--input_dir="../sample_dataset/train/" \
--output_dir="../sample_dataset/ImageSet"
--input_dir="../dataset_samples/sample_dataset/train/" \
--output_dir="../dataset_samples/sample_dataset/ImageSet"
```
The glob pattern `*.npy` is applied to the `input_dir` path. The script computes and prints the mean and std values
for the five input fields. These values should be set in the configuration files in
[pcl_segmentation/configs](pcl_segmentation/configs) as the arrays `mc.INPUT_MEAN` and `mc.INPUT_STD`.
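As a rough illustration of what this normalization step computes, the following sketch accumulates the per-channel mean and std over all training samples. It is not the repository's `inspect_training_data.py`; the directory path is only an example and empty points are not masked out.

```python
import glob
import numpy as np

# Gather all training samples via the same *.npy glob pattern.
files = glob.glob("../dataset_samples/sample_dataset/train/*.npy")

# Stack the five input channels of every sample into one (N, 5) array.
channels = np.concatenate(
    [np.load(f)[:, :, :5].reshape(-1, 5) for f in files], axis=0
)

print("mc.INPUT_MEAN:", channels.mean(axis=0))
print("mc.INPUT_STD: ", channels.std(axis=0))
```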

#### Training
The training of the segmentation networks can be invoked with the `train.py` script. It is possible to choose between
three different network architectures: `squeezesegv2` [1], `darknet21` [2] and `darknet53` [2].
The training script uses the dataset splits `train` and `val`. The metrics for both splits are constantly computed
during training. The Tensorboard callback also uses the `val` split for visualisation of the current model prediction.
```bash
$ python3 train.py \
    ...
```

Inference can be run with the `inference.py` script:
```bash
$ python3 inference.py \
    ...
```

## Docker
We also provide a Docker environment for __training__, __evaluation__ and __inference__. All scripts can be found in the
directory `docker`.

First, build the Docker environment with the build script provided in the directory `docker`.

For inference on the sample dataset execute:
```bash
./docker_inference.sh
```

## Datasets
In the directory [dataset_convert](dataset_convert) you will find conversion scripts which convert the following datasets
into a format that can be read by the data pipeline implemented in this repository.

### NuScenes
Make sure that you have installed `nuscenes-devkit` and that you have downloaded the nuScenes dataset correctly. Then
execute the script `nu_dataset.py`:
```bash
# dataset_convert
$ python3 nu_dataset.py \
--dataset /root/path/nuscenes \
--output_dir /root/path/nuscenes/converted
```
The script will generate `*.npy` files in the directory `converted` and automatically create a train/val split.
Make sure to create the two empty directories `train` and `val` beforehand. The current implementation will also
perform a class reduction.


### Semantic Kitti
The [Semantic Kitti](http://www.semantic-kitti.org/) dataset can be converted with the script `semantic_kitti.py`.
```bash
# dataset_convert
$ python3 semantic_kitti.py \
--dataset /root/path/semantic_kitti \
--output_dir /root/path/semantic_kitti/converted
```
The script will generate `*.npy` files in the directory `converted` and automatically create a train/val split.
Make sure to create the two empty directories `train` and `val` beforehand. The current implementation will also
perform a class reduction.

### Generic PCD dataset
The script [`pcd_dataset.py`](dataset_convert/pcd_dataset.py) allows the conversion of a labeled `*.pcd` dataset.
As the input dataset, define the directory that contains all `*.pcd` files. The `*.pcd` files need to have the field
`label`. Check the script for more details.

```bash
# dataset_convert
$ python3 pcd_dataset.py \
--dataset /root/path/pcd_dataset \
--output_dir /root/path/pcd_dataset/converted
```
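For orientation, here is a minimal sketch of how such a labeled `*.pcd` file could be parsed with plain NumPy. It assumes an ASCII-encoded PCD whose `FIELDS` line includes `label` and is not the repository's implementation; the file name is a placeholder.

```python
import numpy as np

def read_ascii_pcd(path):
    """Parse an ASCII *.pcd file and return a dict of per-point field arrays."""
    with open(path) as f:
        lines = f.readlines()
    fields, data_start = [], 0
    for i, line in enumerate(lines):
        if line.startswith("FIELDS"):
            fields = line.split()[1:]  # e.g. ['x', 'y', 'z', 'intensity', 'label']
        if line.startswith("DATA"):
            assert line.split()[1] == "ascii", "this sketch handles ASCII PCDs only"
            data_start = i + 1
            break
    points = np.loadtxt(lines[data_start:])
    return {name: points[:, idx] for idx, name in enumerate(fields)}

pcd = read_ascii_pcd("cloud.pcd")       # placeholder file name
labels = pcd["label"].astype(np.int32)  # per-point semantic label
```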

## Tensorboard
The implementation also contains a Tensorboard callback which visualizes the most important metrics such as the __confusion
matrix__, __IoUs__, __MIoU__, __Recalls__, __Precisions__, __Learning Rates__, different __losses__ and the current
model prediction.

The network architectures are based on
- [1] [SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a
LiDAR Point Cloud](https://github.com/xuanyuzhou98/SqueezeSegV2)
- [2] [RangeNet++: Fast and Accurate LiDAR Semantic Segmentation](https://github.com/PRBonn/lidar-bonnetal)
- [3] [Semantic Kitti](http://www.semantic-kitti.org/)
- [4] [nuScenes](https://www.nuscenes.org/)

### TODO
- [x] Faster input pipeline using TFRecords preprocessing
- [x] Docker support
- [ ] Implement CRF Postprocessing for SqueezeSegV2
- [x] Implement dataloader for Semantic Kitti dataset
- [x] Implement dataloader for nuScenes dataset
- [ ] None class handling
- [ ] Add results for Semantic Kitti and nuScenes
- [x] Update to Tensorflow 2.9

### Author of this Repository
[Till Beemelmanns](https://github.com/TillBeemelmanns)

Mail: `till.beemelmanns (at) ika.rwth-aachen.de`

### Citation

We hope the provided code can help in your research. If this is the case, please cite:

```
@misc{Beemelmanns2021,
author = {Till Beemelmanns},
title = {Semantic Segmentation of LiDAR Point Clouds in Tensorflow 2.6},
year = 2021,
url = {https://github.com/ika-rwth-aachen/PCLSegmentation},
doi={10.5281/zenodo.4665751}
}
```
139 changes: 139 additions & 0 deletions dataset_convert/README.md
## nuScenes dataset

This directory contains a script `nu_dataset.py` which converts pointcloud samples in the [nuScenes dataset](https://www.nuscenes.org/nuscenes#lidarseg) to a data format which can be used to train the neural networks developed for PCL segmentation. Moreover, `laserscan.py` includes definitions of some of the classes used in `nu_dataset.py`.

The directory should have the file structure:
```
├── ImageSet
│   ├── train.txt
│   ├── val.txt
│   └── test.txt
├── train
├── val
├── test
├── nu_dataset.py
└── laserscan.py
```
The data samples are located in the directories `train`, `val` and `test`. The `*.txt` files within the directory `ImageSet` contain the filenames for the corresponding samples in data directories.
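As a small illustration of how such a split file could be generated, here is a sketch under the assumption that the `*.txt` files simply list the sample names, one per line, without the `.npy` extension.

```python
import glob
import os

# Write the basenames of all training samples into ImageSet/train.txt.
# Assumption: entries are the sample names without the .npy extension.
samples = sorted(glob.glob("train/*.npy"))
with open(os.path.join("ImageSet", "train.txt"), "w") as f:
    for path in samples:
        f.write(os.path.splitext(os.path.basename(path))[0] + "\n")
```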

### Conversion Script

The conversion script `nu_dataset.py` uses some of the functions and classes defined in the [nuScenes devkit](https://github.com/nutonomy/nuscenes-devkit) and in the [API for SemanticKITTI](https://github.com/PRBonn/semantic-kitti-api#readme). It opens pointcloud scans, spherically projects the points in these scans into 2D and stores these projections as `*.npy` files. Each of these `*.npy` files contains a tensor of size `32 X 1024 X 6`. The 6 channels correspond to

0. X-Coordinate in [m]
1. Y-Coordinate in [m]
2. Z-Coordinate in [m]
3. Intensity
4. Depth in [m]
5. Label ID

The script stores the projections in the `nuscenes_dataset/train` or `nuscenes_dataset/val` directory of the PCL segmentation repository.

```bash
./nu_dataset.py --dataset /path/to/nuScenes/dataset/ --output_dir /path/to/PCLSegmentation/ -v
```
where:
- `dataset` is the path to the nuScenes dataset, i.e. where the `/data/sets/nuscenes` directory is located.
- `output_dir` is the output path to the PCL segmentation repository.
- `v` is a flag. If it is set, the projections are stored in the validation set, otherwise they are stored in the training set.

Instructions on how to use the nuScenes devkit and how to download the dataset can be found [here](https://github.com/nutonomy/nuscenes-devkit#nuscenes-lidarseg).
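For intuition about the projection step described above, the sketch below maps a point cloud onto a `32 X 1024` grid. It is not the implementation in `nu_dataset.py`; the vertical field-of-view limits are illustrative assumptions.

```python
import numpy as np

def spherical_projection(points, labels, H=32, W=1024, fov_up=10.0, fov_down=-30.0):
    """Project (N, 4) points [x, y, z, intensity] into an (H, W, 6) range image."""
    x, y, z, intensity = points.T
    depth = np.linalg.norm(points[:, :3], axis=1)

    yaw = np.arctan2(y, x)                          # azimuth angle
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))  # elevation angle

    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * W                         # column index from azimuth
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * H  # row index from elevation

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    image = np.zeros((H, W, 6), dtype=np.float32)
    image[v, u, 0], image[v, u, 1], image[v, u, 2] = x, y, z
    image[v, u, 3], image[v, u, 4], image[v, u, 5] = intensity, depth, labels
    return image
```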


## SemanticKITTI dataset

This directory contains the script `semantic_kitti.py` which converts pointcloud samples in the [SemanticKITTI dataset](http://www.semantic-kitti.org/) to a data format which can be used to train the neural networks developed for PCL segmentation. It also includes a small `train` and `val` split with 20 samples and 2 samples, respectively.

### Conversion Scripts

The scripts use some of the functions and classes defined in [API for SemanticKITTI](https://github.com/PRBonn/semantic-kitti-api#readme). They open pointcloud scans, project the points in these scans into 2D and store these projections as `*.npy` files. Each of these `*.npy` files contains a tensor of size `64 X 1024 X 6`. The 6 channels correspond to

0. X-Coordinate in [m]
1. Y-Coordinate in [m]
2. Z-Coordinate in [m]
3. Intensity (with range [0-255])
4. Depth in [m]
5. Label ID

#### Downloading SemanticKITTI dataset

Before running the scripts, the SemanticKITTI dataset must be downloaded. Information about the files in this dataset and how to download it is provided [here](http://www.semantic-kitti.org/dataset.html).

The SemanticKITTI dataset is organized in the following format:

```
/kitti/dataset/
└── sequences/
    ├── 00/
    │   ├── poses.txt
    │   ├── image_2/
    │   ├── image_3/
    │   ├── labels/
    │   │   ├── 000000.label
    │   │   └── 000001.label
    │   ├── voxels/
    │   │   ├── 000000.bin
    │   │   ├── 000000.label
    │   │   ├── 000000.occluded
    │   │   ├── 000000.invalid
    │   │   ├── 000001.bin
    │   │   ├── 000001.label
    │   │   ├── 000001.occluded
    │   │   └── 000001.invalid
    │   └── velodyne/
    │       ├── 000000.bin
    │       └── 000001.bin
    ├── 01/
    ├── 02/
    .
    .
    .
    └── 21/
```
#### Using API for SemanticKITTI

##### semantic_kitti_sequence.py

The script projects the scans in the specified sequence and stores the projections in the `semantic_kitti_dataset/train` or `semantic_kitti_dataset/val` directory of the PCL segmentation repository.

```bash
./semantic_kitti_sequence.py --sequence 00 --dataset /path/to/kitti/dataset/ --output_dir /path/to/PCLSegmentation/ -v
```
where:
- `sequence` is the sequence to be accessed (optional, default value is 00).
- `dataset` is the path to the kitti dataset where the `sequences` directory is located.
- `output_dir` is the output path to the PCL segmentation repository.
- `v` is a flag. If it is set, the projections are stored in the validation set, otherwise they are stored in the training set.

##### semantic_kitti.py

The script randomly picks a specified number of scans from all sequences and stores their projections in the `semantic_kitti_dataset/train` and `semantic_kitti_dataset/val` directories of the PCL segmentation repository.

```bash
./semantic_kitti.py --dataset /path/to/kitti/dataset/ --output_dir /path/to/PCLSegmentation/ -n 20 -s 0.8 -v
```
where:
- `dataset` is the path to the kitti dataset where the `sequences` directory is located.
- `output_dir` is the output path to the PCL segmentation repository.
- `n` is the number of samples (projections) to be used in the training and validation sets. Maximum is 23201. Default is 20.
- `s` is the split ratio of samples between the training and validation sets. It should be between 0 and 1. Default is 0.9.
- `v` is a flag. If it is set, the projections consist of 32 layers instead of 64. The script extracts 32 specified layers from the SemanticKITTI projections, which are 64-layered.

### Generalization to VLP-32 Data

The ultimate goal is to have a network which is trained on the higher-resolution KITTI dataset and performs semantic segmentation on VLP-32 lidar data. The KITTI pointcloud projections used in training should therefore be modified in a way that makes them similar to VLP-32 data. One method is to extract 32 specified layers from the KITTI point cloud projections. However, the network has not been able to generalize well to VLP-32 data yet. The tested layer configurations are as follows.

#### Tested Layer Configurations

The `layers` array, which is defined in the `conversion_3.py` script, specifies the 32 layers that will be extracted from the KITTI projections; a small extraction sketch is shown after the list below.

- layers = np.arange(16,48)
- configuration 3 is used, but intensity is not used as a feature in semantic segmentation.
- layers = np.concatenate([np.array([14, 15, 17, 24, 26, 30, 31, 34, 36, 37, 39, 41, 43, 45]), np.arange(46, 64)])
- layers = [0, 4, 8, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 30, 31, 32, 33, 35, 37, 39, 41, 43, 45, 47, 49, 50, 51, 52, 55, 59, 63]
- directly projecting KITTI point clouds into 32-layered projections instead of extracting 32 layers from 64-layered projections.
- layers = np.concatenate([np.array([14, 15, 16, 17, 25, 26, 27, 31, 33, 36, 39, 41, 43, 45]), np.arange(46, 64)])
- layers = np.arange(1,64,2)
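
As a small illustration of this layer extraction, the sketch below keeps every second of the 64 KITTI layers (the last configuration in the list above); the projection file name is a placeholder and the `64 X 1024 X 6` shape follows the format described earlier.

```python
import numpy as np

# One of the tested configurations: every second of the 64 KITTI layers.
layers = np.arange(1, 64, 2)

projection_64 = np.load("kitti_projection.npy")  # placeholder, shape (64, 1024, 6)
projection_32 = projection_64[layers, :, :]      # resulting shape (32, 1024, 6)

np.save("kitti_projection_32.npy", projection_32)
```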

