Official PyTorch implementation of StixelNExT, from the following paper:
Toward Monocular Low-Weight Perception for Object Segmentation and Free Space Detection. IV 2024.
Marcel Vosshans, Omar Ait-Aider, Youcef Mezouar and Markus Enzweiler
University of Esslingen, UCA Sigma Clermont
[Xplore][arXiv]
If you find our work useful in your research, please consider citing our paper:
```bibtex
@INPROCEEDINGS{StixelNExT,
  author    = {Vosshans, Marcel and Ait-Aider, Omar and Mezouar, Youcef and Enzweiler, Markus},
  title     = {StixelNExT: Toward Monocular Low-Weight Perception for Object Segmentation and Free Space Detection},
  booktitle = {2024 IEEE Intelligent Vehicles Symposium (IV)},
  year      = {2024},
  pages     = {2154-2161},
  keywords  = {Training;Space vehicles;Adaptation models;Laser radar;Image recognition;Intelligent vehicles;Training data},
  doi       = {10.1109/IV55156.2024.10588680}
}
```
StixelNExT is a low-weight CNN with roughly 1.5 million parameters that segments obstacles in the 2D image plane and divides them into multiple objects. It is trainable within ~10 epochs without pre-trained weights.
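The stixel representation itself is not spelled out in this README; as an illustration, a single stixel can be modeled as a thin vertical image segment with a top and bottom row. The class and field names below are assumptions for illustration, not the repository's actual types:

```python
from dataclasses import dataclass

@dataclass
class Stixel:
    """A thin vertical image segment (hypothetical sketch, not the repo's actual class)."""
    column: int   # x-position of the stixel column in the image
    top: int      # first row (inclusive) covered by the obstacle
    bottom: int   # last row (inclusive), typically near the ground contact point

    def height(self) -> int:
        # Number of pixel rows the stixel spans.
        return self.bottom - self.top + 1

# A 2D obstacle segmentation is then a collection of stixels across image columns.
stixels = [Stixel(column=42, top=120, bottom=310)]
print(stixels[0].height())  # 191
```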
We recommend a fresh Python venv (Python >= 3.7); you can install the dependencies with:

```shell
sudo apt-get install python3-venv
python3 -m venv venv    # on project folder level
source venv/bin/activate
pip install -r requirements.txt
```

We ran our experiments with PyTorch 2.1.2, CUDA 11.8, Python 3.8.10 and Ubuntu 20.04.5 LTS.
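Since the setup requires Python >= 3.7, a quick interpreter check before installing can save a failed `pip install`. This is a minimal sketch; the repository does not ship such a script:

```python
import sys

# Fail early if the interpreter is older than the project's minimum (>= 3.7).
MIN_VERSION = (3, 7)
if sys.version_info < MIN_VERSION:
    raise SystemExit(
        f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]}+ required, "
        f"found {sys.version.split()[0]}"
    )
print("Python version OK:", sys.version.split()[0])
```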
You can predict a single image with the following script; all you need is a target image and weights:

```shell
python predict_single_img.py --image_path test_image.png --weights StixelNExT_prime-sunset-157_epoch-8_test-error-0.23861433565616608
python predict_single_img.py    # or run with default values
```

Pretrained model weights (used in our paper) can be downloaded here (KITTI).
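Only the `--image_path` and `--weights` flags are documented above; a hedged sketch of how such a CLI with default values could be wired up (the default paths below are illustrative assumptions, not the script's actual defaults):

```python
import argparse

def build_parser():
    # Mirrors the two documented flags; default values are illustrative only.
    parser = argparse.ArgumentParser(
        description="Predict stixels for a single image."
    )
    parser.add_argument("--image_path", default="test_image.png",
                        help="Path to the input image")
    parser.add_argument("--weights", default="saved_models/StixelNExT.pth",
                        help="Path to the model checkpoint")
    return parser

# Passing an empty list parses no CLI args, so the defaults are used.
args = build_parser().parse_args([])
print(args.image_path, args.weights)
```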
We also published our pipeline for generating ground truth from any dataset (a dataloader is necessary). A camera, a LiDAR and the corresponding projection are mandatory: StixelGENerator.
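Ground-truth generation hinges on projecting LiDAR points into the camera image. A minimal pinhole-projection sketch of that step (the intrinsic values are made up, not a KITTI calibration, and StixelGENerator's actual interface may differ):

```python
# Project a 3D point in camera coordinates onto the image plane (pinhole model).
# Intrinsics below are illustrative placeholders, not real calibration data.
fx, fy = 700.0, 700.0   # focal lengths in pixels
cx, cy = 640.0, 360.0   # principal point

def project(x, y, z):
    """Map a camera-frame point (x right, y down, z forward) to pixel (u, v)."""
    if z <= 0:
        raise ValueError("point is behind the camera")
    return fx * x / z + cx, fy * y / z + cy

u, v = project(1.0, 0.5, 10.0)
print(u, v)  # 710.0 395.0
```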
We also provide an already generated dataset based on the publicly available KITTI dataset. It can be downloaded here (35.48 GB).
We used Weights & Biases to organize our trainings, so check your W&B Python API key login or write a workaround.
- Use the config.yaml file to configure your paths and settings for the training and copy it to project level
- Run train.py (in your IDE, or with `source venv/bin/activate && python train.py`; in that case, don't forget to change the file permissions: `chmod +x your_script.py`)
- Get your weights (checkpoints) from the saved_models folder and test them with predict_single_img.py
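The keys of config.yaml are not listed in this README; the following is only an assumed sketch of what a paths-and-training configuration might contain. All key names and values are hypothetical, so check the config.yaml shipped with the repository for the actual schema:

```yaml
# Hypothetical config.yaml sketch - key names are assumptions, not the repo's schema.
data_path: /data/kitti_stixels      # root of the generated StixelGENerator dataset
weights_path: saved_models/         # where checkpoints are written
epochs: 10                          # the README reports convergence within ~10 epochs
batch_size: 16
learning_rate: 0.001
wandb:
  project: StixelNExT               # requires a logged-in W&B API key
```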
For evaluation, we provide another repository with both metrics, "The Naive" and "The Fairer": StixelNExT-Eval.
The next step is the addition of depth estimation: future research will focus on incorporating end-to-end monocular depth estimation into StixelNExT (StixelNExT Pro).