Commit
0 parents
commit f45b5da
Showing 30 changed files with 8,012 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
outputs/ | ||
lightning_logs/ | ||
.env | ||
mlruns/ | ||
.neptune/ | ||
logs/ | ||
__pycache__/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,178 @@ | ||
# Acoustic Traffic Simulation and Counting | ||
Welcome to the repository of the acoustic-based traffic monitoring system for [DCASE 2024 Task 10](https://dcase.community/challenge2024/). | ||
|
||
This code also accompanies the ICASSP 2024 paper | ||
"[Can Synthetic Data Boost the Training of Deep Acoustic Vehicle Counting Networks?](https://arxiv.org/abs/2401.09308)" by S. Damiano, L. Bondi, S. Ghaffarzadegan, A. Guntoro, T. van Waterschoot. | ||
|
||
|
||
## Purpose of the project | ||
|
||
This software is a research prototype, solely developed for and published as per the scope above. It will neither be maintained nor monitored in any way. | ||
|
||
## Introduction | ||
|
||
Acoustic traffic counting based on microphone arrays offers an alternative to vision, coil, and radar-based systems. When traffic density grows above a few vehicles per minute, data-driven solutions are often required to satisfy the precision requirements of end users. As with any data-driven system, more data leads to better performance, and often to higher costs. | ||
|
||
This project aims at replacing large-scale acoustic data collection from real traffic scenarios with large-scale traffic simulation. Synthetic data is used to pre-train a model, and a small amount of real-world data is used to fine-tune the model. | ||
|
||
The framework we propose is summarized by the following diagram. | ||
|
||
<img src="assets/framework.svg"> | ||
|
||
Reading the diagram from left to right: | ||
|
||
- Engine sounds are simulated via [enginesound](https://github.com/DasEtwas/enginesound) and stored. A pool of pre-computed engine sounds is made available on [Zenodo](https://zenodo.org/records/10700792). | ||
- The sound of a moving vehicle in front of a microphone array (single pass-by) is simulated from the combination of engine sound and simulated road-tire interactions. Acoustic propagation, dependent on the geometry of a specific site, is used to generate a pool of single pass-bys. A single pass-by is a multi-channel audio file that simulates a linear trajectory of a vehicle at constant speed. A pool of pre-computed pass-bys is made available on [Zenodo](https://zenodo.org/records/10700792). | ||
- Traffic statistics, e.g. number of vehicles per hour and road speed limit, are used to generate a traffic model, i.e. a distribution of pass-by events over time for different types of vehicles and direction of travel. | ||
- A model is pre-trained on 60-second segments generated online from the pool of single pass-by events and the traffic model. | ||
- The model is fine-tuned on real data. | ||
- Model inference and evaluation are performed on real data. | ||
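The traffic-model step in the list above can be illustrated with a short sketch. This is not the `atsc.modeling.traffic` implementation: the helper name `sample_pass_bys`, the Poisson-arrival assumption, the speed spread around the limit, and the 30% commercial-vehicle share are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class PassBy:
    time_s: float       # event time, in seconds from the start of the hour
    vehicle_type: str   # "car" or "cv"
    direction: str      # "left" or "right"
    speed_kmh: float


def sample_pass_bys(vehicles_per_hour, speed_limit_kmh, min_gap_s=2.0,
                    cv_fraction=0.3, seed=0):
    """Draw one hour of pass-by events (hypothetical helper).

    Assumes Poisson arrivals: inter-arrival times are exponential with mean
    3600 / vehicles_per_hour. min_gap_s is added to every draw, so events are
    never closer than the minimum gap; this lowers the effective rate
    slightly below vehicles_per_hour.
    """
    if vehicles_per_hour <= 0:
        return []
    rng = random.Random(seed)
    rate_per_s = vehicles_per_hour / 3600.0
    events, t = [], 0.0
    while True:
        t += min_gap_s + rng.expovariate(rate_per_s)
        if t >= 3600.0:
            break
        events.append(PassBy(
            time_s=t,
            vehicle_type="cv" if rng.random() < cv_fraction else "car",
            direction=rng.choice(["left", "right"]),
            # Assumed spread: 80% to 110% of the speed limit.
            speed_kmh=rng.uniform(0.8, 1.1) * speed_limit_kmh,
        ))
    return events


events = sample_pass_bys(vehicles_per_hour=120, speed_limit_kmh=50)
print(len(events), events[0])
```

Fixing the seed makes the generated hour reproducible, which mirrors the role of the `seed` entries in the configuration.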
|
||
We consider traffic scenarios with two-lane roads and two categories of vehicles: cars and commercial vehicles (cv). | ||
The task is to count how many vehicles pass by in 60 seconds of audio, per direction of travel (left, right) and vehicle type (car, cv). | ||
|
||
## Conceptual modules | ||
|
||
The current repository is structured around three modules: | ||
|
||
1. <strong>Modeling</strong>: from site specifications to traffic model (`atsc.modeling`) | ||
- Input: site specifications, including traffic statistics and geometry information. | ||
- Output: traffic model in the form of a list of pass-by events for different vehicle types, at different speed and direction of travel. | ||
2. <strong>Simulation</strong>: from site specifications to acoustic simulation (`atsc.simulation`) | ||
- Input: site specifications, including traffic statistics and geometry information. | ||
- Output: simulated acoustic pass-bys audio files for each vehicle type and direction of travel, specific to a site location. | ||
3. <strong>Counting</strong>: training, inference, and evaluation of acoustic traffic counting system. (`atsc.counting`) | ||
- Input: real or simulated acoustic data. | ||
- Output: counting of vehicles in each audio segment divided by vehicle type and direction of travel | ||
(`car_left`, `car_right`, `cv_left`, `cv_right`). | ||
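As an illustration of the counting output described above, the sketch below derives the four counts for one 60-second segment from a list of pass-by events. The tuple layout of `events` and the rule that an event belongs to the segment containing its timestamp are assumptions made for the example, not a description of how `atsc.counting` builds its labels.

```python
from collections import Counter

LABELS = ("car_left", "car_right", "cv_left", "cv_right")


def segment_counts(events, segment_start_s, segment_duration_s=60.0):
    """Count pass-bys per label inside one segment.

    `events` is an iterable of (time_s, vehicle_type, direction) tuples;
    this layout is an assumption made for the example.
    """
    segment_end_s = segment_start_s + segment_duration_s
    counts = Counter(
        f"{vehicle_type}_{direction}"
        for time_s, vehicle_type, direction in events
        if segment_start_s <= time_s < segment_end_s
    )
    # Report every label, including those with zero pass-bys.
    return {label: counts.get(label, 0) for label in LABELS}


events = [(3.0, "car", "left"), (12.5, "car", "left"),
          (40.0, "cv", "right"), (75.0, "car", "right")]
print(segment_counts(events, segment_start_s=0.0))
# {'car_left': 2, 'car_right': 0, 'cv_left': 0, 'cv_right': 1}
```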
|
||
## Data and directory structure | ||
|
||
We have released real-world data from six different locations covering various traffic conditions. | ||
For each site, a pool of engine sounds, and single pass-by events are available for download | ||
from [Zenodo](https://zenodo.org/records/10700792). | ||
|
||
This repository works with three main folders, configurable in [dev.yaml](atsc/configs/env/dev.yaml): | ||
|
||
- `real_root`: real-world data, divided per location. | ||
``` | ||
real_root | ||
├── locX | ||
│ ├── meta.json [contains meta information of traffic condition and sensor setup corresponding to the location] | ||
│ ├── test [test flac files inside] | ||
│ ├── test.csv | ||
│ ├── train [train flac files inside] | ||
│ ├── train.csv | ||
│ ├── val [val flac files inside] | ||
│ └── val.csv | ||
``` | ||
|
||
- `engine_sound_dir`: engine sounds for each vehicle type. | ||
``` | ||
engine_sound_dir | ||
├── car [car engine sounds] | ||
└── cv [commercial vehicles engine sounds] | ||
``` | ||
|
||
- `work_folder`: folder where all artifacts and outputs are stored. This includes the `simulation` subfolder with simulated single pass-bys. You can download an example of simulated samples from Zenodo or simulate your own samples with the `atsc.simulation.events` module. | ||
|
||
``` | ||
work_folder | ||
├── modeling [output of atsc.modeling.traffic module] | ||
│ ├── locX | ||
│ │ ├── train.csv [traffic model for simulated data, training set] | ||
│ │ └── val.csv [traffic model for simulated data, validation set] | ||
├── simulation [output of atsc.simulation.events module, available pre-computed on Zenodo] | ||
│ ├── locX | ||
│ │ ├── car | ||
│ │ │ ├── left [flac files with simulated single pass-bys inside] | ||
│ │ │ └── right [flac files with simulated single pass-bys inside] | ||
├── counting [output of atsc.counting module] | ||
│ ├── locX | ||
│ │ ├── alias_a | ||
│ │ │ ├── checkpoints [output of atsc.counting.training module] | ||
│ │ │ │ ├── best.ckpt [checkpoint with smallest validation loss] | ||
│ │ │ │ └── last.ckpt [last checkpoint saved] | ||
│ │ │ ├── inference [output of atsc.counting.inference module] | ||
│ │ │ │ └── test.csv [inference results on test split] | ||
│ │ │ ├── evaluation [output of atsc.counting.evaluation module] | ||
│ │ │ │ └── test.csv [evaluation results on test split] | ||
``` | ||
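To make the `counting` part of the tree concrete, here is a small sketch that builds the expected artifact paths for one run. The helper name `counting_paths` is hypothetical; the path components follow the tree above.

```python
from pathlib import Path


def counting_paths(work_folder, site, alias, split="test"):
    """Return the expected artifact paths of one training run (hypothetical helper)."""
    run_dir = Path(work_folder) / "counting" / site / alias
    return {
        "best_checkpoint": run_dir / "checkpoints" / "best.ckpt",
        "last_checkpoint": run_dir / "checkpoints" / "last.ckpt",
        "inference": run_dir / "inference" / f"{split}.csv",
        "evaluation": run_dir / "evaluation" / f"{split}.csv",
    }


# "loc1" and "finetune" are example values for the site and the run alias.
for name, path in counting_paths("./workfolder", "loc1", "finetune").items():
    print(f"{name}: {path}")
```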
|
||
## Getting started | ||
|
||
To set up the code, you have to do the following: | ||
|
||
1) Clone this repository. | ||
2) Make sure poetry is installed following [this](https://python-poetry.org/docs/) guide. | ||
3) Set up and activate the Python environment: | ||
``` | ||
poetry install | ||
poetry shell | ||
``` | ||
4) Download the dataset from [Zenodo](https://zenodo.org/records/10700792) and decompress `locX.zip` in `real_root/` and `engine-sounds.zip` in `engine_sound_dir/`, respectively. | ||
|
||
## Replicating the baseline | ||
For the baseline model, you need to: | ||
1) **Configure**: Adjust folders in [dev.yaml](atsc/configs/env/dev.yaml). | ||
2) **Modeling**: Create a traffic model for each location using the metadata provided for each site: | ||
```bash | ||
poetry run python -m atsc.modeling.traffic site=<locX> | ||
``` | ||
The outputs of this step are two CSV files, `train.csv` and `val.csv`, saved for each location under `<work_folder>/modeling/<locX>`. Each file contains a list of pass-by events, each defined by four parameters: | ||
``` | ||
`vehicle_type`: car or commercial vehicle (cv), | ||
`timestamp`: date and time of day, | ||
`direction`: direction of travel, right or left, | ||
`speed`: speed of the vehicle. | ||
``` | ||
3) **Acoustic simulation**: Based on the traffic models generated in the previous step, generate a pool of single pass-by events per site. | ||
```bash | ||
poetry run python -m atsc.simulation.events site=<locX> simulation.num_workers=8 | ||
``` | ||
The outputs of this stage are audio files of single vehicle pass-by events, one set per vehicle type and travel direction, specific to each site and saved at `<work_folder>/simulation`. | ||
|
||
This step can take quite some time. You can download pre-simulated data `simulation.zip` from | ||
[Zenodo](https://zenodo.org/records/10700792) and extract it in `<work_folder>/simulation`. | ||
|
||
4) **Training**: Train a regression model to count the number of vehicles in four categories: `car_left`, `car_right`, `cv_left`, `cv_right`. As an example, | ||
you can train your model on simulated data and later fine-tune it on real data. | ||
Training on simulation: | ||
```bash | ||
poetry run python -m atsc.counting.training site=<locX> training.learning_rate=0.001 training/train_dataset=sim training/val_dataset=sim training.tags=["train-sim","val-sim"] training.alias=pretrain | ||
``` | ||
Fine-tuning on real data: | ||
```bash | ||
poetry run python -m atsc.counting.training site=<locX> training.learning_rate=0.0001 training/train_dataset=real training/val_dataset=real training.tags=["train-real","val-real","finetune"] training.pretrained_model=pretrain training.alias=finetune | ||
``` | ||
|
||
The output of this step is saved at `<work_folder>/counting` | ||
|
||
5) **Model inference**: Run model inference on real data and the desired split (train, val, test). | ||
```bash | ||
poetry run python -m atsc.counting.inference site=<locX> inference.alias=finetune | ||
``` | ||
|
||
The output of this step is saved at `<work_folder>/counting` | ||
|
||
6) **Evaluation**: Calculate final metrics for your model: | ||
```bash | ||
poetry run python -m atsc.counting.evaluation site=<locX> inference.alias=finetune | ||
``` | ||
|
||
The output of this step is saved at `<work_folder>/counting` | ||
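The six steps above can also be collected in a small driver script. The sketch below only assembles the commands listed in this section as argument lists; the function name `baseline_commands` and the site name `loc1` are assumptions made for the example.

```python
import shlex


def baseline_commands(site, num_workers=8):
    """Return the baseline commands, in order, as argument lists (hypothetical helper)."""
    base = ["poetry", "run", "python", "-m"]
    return [
        base + ["atsc.modeling.traffic", f"site={site}"],
        base + ["atsc.simulation.events", f"site={site}",
                f"simulation.num_workers={num_workers}"],
        # Pre-training on simulated data.
        base + ["atsc.counting.training", f"site={site}",
                "training.learning_rate=0.001",
                "training/train_dataset=sim", "training/val_dataset=sim",
                'training.tags=["train-sim","val-sim"]',
                "training.alias=pretrain"],
        # Fine-tuning on real data, starting from the pre-trained run.
        base + ["atsc.counting.training", f"site={site}",
                "training.learning_rate=0.0001",
                "training/train_dataset=real", "training/val_dataset=real",
                'training.tags=["train-real","val-real","finetune"]',
                "training.pretrained_model=pretrain",
                "training.alias=finetune"],
        base + ["atsc.counting.inference", f"site={site}",
                "inference.alias=finetune"],
        base + ["atsc.counting.evaluation", f"site={site}",
                "inference.alias=finetune"],
    ]


for command in baseline_commands("loc1"):
    # Print a shell-quoted version of each command for inspection.
    print(shlex.join(command))
```

To actually run the pipeline, each list can be passed to `subprocess.run(command, check=True)`. Passing argument lists instead of a single shell string avoids quoting problems with the brackets and quotes in `training.tags`.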
|
||
## Contributors | ||
|
||
The code in this repository was curated by: | ||
|
||
- Luca Bondi <luca.bondi@us.bosch.com> | ||
- Stefano Damiano <stefano.damiano@ekuleuven.be> | ||
- Winston Lin <winston.lin@us.bosch.com> | ||
- Shabnam Ghaffarzadegan <shabnam.ghaffarzadegan@us.bosch.com> | ||
|
||
## License | ||
|
||
Acoustic Traffic Simulation and Counting is open-sourced under the GPL-3.0 license. See the [LICENSE](LICENSE) file for details. | ||
|
||
For a list of other open source components included in Acoustic Traffic Simulation and Counting, see the file [3rd-party-licenses.txt](3rd-party-licenses.txt). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Copyright 2024 Robert Bosch GmbH. | ||
# SPDX-License-Identifier: GPL-3.0-only | ||
|
||
"""Acoustic Traffic Simulation and Counting package.""" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
--- | ||
# Please take a look at Hydra documentation for more information | ||
# https://hydra.cc/docs/tutorials/basic/your_first_app/defaults/ | ||
defaults: | ||
- env: dev # reference to environment specific config in atsc/configs/env/ | ||
- model: baseline # reference to model config in atsc/configs/model/ | ||
- training/train_dataset: real # reference to train dataset config in atsc/configs/training/train_dataset/ | ||
- training/val_dataset: real # reference to val dataset config in atsc/configs/training/val_dataset/ | ||
- _self_ # values in this file overwrite the defaults from files above | ||
|
||
# Site name, user input requested | ||
site: ??? | ||
|
||
# --- Traffic flow modeling configuration --- | ||
traffic: | ||
|
||
seed: 151 | ||
min_time_between_consecutive_events: 2s | ||
|
||
# Traffic statistics are available as number of vehicles per hour, not per vehicle type. | ||
# The following parameters are used to generate the maximum amount of cars and commercial vehicles | ||
# starting from the traffic statistics. | ||
car_max_fraction: 0.9 | ||
cv_max_fraction: 0.3 | ||
|
||
train: | ||
num_hours: 24 # Number of hours to simulate traffic for | ||
start_datetime: 2024-01-01T00:00:00 # initial timestamp for simulation | ||
output_path: ${env.work_folder}/modeling/${site}/train.csv # Path where the generated traffic flow is saved | ||
val: | ||
num_hours: 24 | ||
start_datetime: 2024-02-01T00:00:00 | ||
output_path: ${env.work_folder}/modeling/${site}/val.csv | ||
|
||
# --- Acoustic simulation configuration --- | ||
simulation: | ||
# Parameter for generation of single pass-by events for the site | ||
seed: 152 # Seed to control events generation | ||
vehicle_types: # Vehicle types to simulate | ||
- car | ||
- cv | ||
directions: # Vehicle directions to simulate | ||
- left | ||
- right | ||
init_counter: 0 # Start simulation at this index | ||
num_events: 50 # Simulate this many events | ||
num_workers: 4 # Number of parallel workers in generation | ||
lane_width: 3.5 # Width of the road lane [m] | ||
event_duration: 30 # Duration of the generated events [s] | ||
source_model: hm+bd # Source model for the acoustic simulation. See atsc/simulation/events.py for available models. | ||
|
||
# Output folder for generated events. Events are generated in a vehicle_type/direction/ folder structure. | ||
output_folder: ${env.work_folder}/simulation/${site} | ||
|
||
|
||
# --- Training configuration --- | ||
|
||
training: | ||
|
||
# Alias for the training run. If null it is created automatically with coolname | ||
alias: | ||
|
||
# Output directory for the training run | ||
output_folder: ${env.work_folder}/counting/${site}/${training.alias} | ||
|
||
# List of tags added to the logger | ||
tags: | ||
|
||
# Random seed | ||
seed: 42 | ||
|
||
# Batch size | ||
batch_size: 64 | ||
|
||
# Number of data loading workers | ||
num_workers: 2 | ||
|
||
# Learning rate | ||
learning_rate: 0.001 | ||
|
||
# Alias of pre-trained run, or path to pretrained model checkpoint | ||
pretrained_model: | ||
|
||
# Parameters passed to lightning.Trainer | ||
# https://lightning.ai/docs/pytorch/stable/common/trainer.html | ||
trainer: | ||
max_epochs: 100 | ||
log_every_n_steps: 5 | ||
accelerator: auto | ||
|
||
callbacks: | ||
# Parameters passed to lightning.pytorch.callbacks.EarlyStopping | ||
# https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.EarlyStopping.html | ||
early_stopping: | ||
monitor: val/loss_epoch | ||
verbose: true | ||
patience: 20 | ||
# Parameters passed to lightning.pytorch.callbacks.ModelCheckpoint | ||
# https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html | ||
model_checkpoint: | ||
monitor: val/loss_epoch | ||
verbose: true | ||
save_last: true | ||
filename: best | ||
dirpath: ${training.output_folder}/checkpoints/ | ||
|
||
# --- Inference configuration --- | ||
inference: | ||
|
||
# Accelerator used by lightning.Trainer | ||
accelerator: auto | ||
|
||
# Trained model alias to read the best checkpoint from | ||
alias: ${training.alias} | ||
|
||
# Batch size | ||
batch_size: 64 | ||
|
||
# Number of data loading workers | ||
num_workers: 2 | ||
|
||
# Split to run inference on | ||
split: test | ||
|
||
# Parameters passed to atsc.counting.data.TrafficCountDataset | ||
dataset: | ||
root: ${env.real_root}/${site} | ||
index: ${env.real_root}/${site}/${inference.split}.csv | ||
|
||
# Output path for inference results (csv) | ||
output_path: ${env.work_folder}/counting/${site}/${inference.alias}/inference/${inference.split}.csv | ||
|
||
# --- Evaluation configuration --- | ||
evaluation: | ||
|
||
# Output path for evaluation results (csv) | ||
output_path: ${env.work_folder}/counting/${site}/${inference.alias}/evaluation/${inference.split}.csv |
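The `${...}` references in the configuration above are interpolations that are resolved when the configuration is read. The sketch below mimics that behaviour with a deliberately simplified resolver, to show how an output path is assembled. It is a stand-in written for this example, not the OmegaConf implementation that Hydra relies on, and the site name `loc1` is an assumption.

```python
import re

# Matches the innermost ${dotted.key} reference in a string.
_PATTERN = re.compile(r"\$\{([^${}]+)\}")


def lookup(config, dotted_key):
    """Follow a dotted key such as 'env.work_folder' through nested dicts."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(config, value):
    """Substitute ${a.b} references one at a time until none remain."""
    while True:
        match = _PATTERN.search(value)
        if match is None:
            return value
        replacement = str(lookup(config, match.group(1)))
        value = value[:match.start()] + replacement + value[match.end():]


config = {
    "site": "loc1",  # hypothetical site name
    "env": {"work_folder": "./workfolder"},
    "training": {"alias": "finetune"},
    # inference.alias falls back to training.alias, as in the file above.
    "inference": {"alias": "${training.alias}", "split": "test"},
}
template = ("${env.work_folder}/counting/${site}/"
            "${inference.alias}/inference/${inference.split}.csv")
print(resolve(config, template))
# ./workfolder/counting/loc1/finetune/inference/test.csv
```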
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
# Root folder of real-world data | ||
real_root: ./real-data | ||
|
||
# Folder of engine sounds | ||
engine_sound_dir: ./engine-sounds | ||
|
||
# Work folder for traffic model, generated events, trained models, inference results, evaluation outputs | ||
work_folder: ./workfolder |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
--- | ||
# Model instantiated with hydra.utils.instantiate | ||
# https://hydra.cc/docs/advanced/instantiate_objects/overview/ | ||
|
||
# Must be an instance of lightning.LightningModule | ||
# https://lightning.ai/docs/pytorch/stable/common/lightning_module.html | ||
|
||
_target_: atsc.counting.models.baseline.Baseline | ||
n_channels: 4 | ||
n_mels: 96 | ||
n_gcc: 48 | ||
stft_params: # https://pytorch.org/docs/stable/generated/torch.stft.html | ||
hop_length: 160 | ||
melscale_params: # https://pytorch.org/audio/main/generated/torchaudio.transforms.MelScale.html | ||
n_mels: ${model.n_mels} | ||
optimizer: | ||
_partial_: true | ||
_target_: torch.optim.Adam | ||
lr: ${training.learning_rate} |
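The `_partial_: true` flag in the `optimizer` node tells `hydra.utils.instantiate` to return a `functools.partial` instead of a constructed object, because the model parameters are not known when the configuration is read. The sketch below shows the same pattern with a stand-in class so that it runs without PyTorch. `FakeAdam` is hypothetical, and the point at which the model completes the call is an assumption.

```python
from functools import partial


class FakeAdam:
    """Stand-in for torch.optim.Adam, used only to keep the example dependency-free."""

    def __init__(self, params, lr=0.001):
        self.params = list(params)
        self.lr = lr


# Roughly what instantiating the `optimizer` node yields: the target with
# `lr` already bound, still waiting for the model parameters.
optimizer_factory = partial(FakeAdam, lr=0.0001)

# Later, for example inside the LightningModule's configure_optimizers(),
# the parameters are available and the optimizer can be built.
model_parameters = [1.0, 2.0, 3.0]  # placeholder for model.parameters()
optimizer = optimizer_factory(model_parameters)
print(type(optimizer).__name__, optimizer.lr, len(optimizer.params))
```

Because `lr` is interpolated from `training.learning_rate`, overriding that single value on the command line is enough to change the optimizer's learning rate.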
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
--- | ||
# Real dataset | ||
_target_: atsc.counting.data.TrafficCountDataset | ||
root: ${env.real_root}/${site} | ||
index: ${env.real_root}/${site}/train.csv |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
--- | ||
# Synthetic dataset, segments generated on the fly | ||
_target_: atsc.counting.data.SyntheticTrafficDataset | ||
sim_events_root: ${simulation.output_folder} | ||
event_duration: ${simulation.event_duration} | ||
random: true | ||
traffic_model_path: ${traffic.train.output_path} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
--- | ||
# Real dataset | ||
_target_: atsc.counting.data.TrafficCountDataset | ||
root: ${env.real_root}/${site} | ||
index: ${env.real_root}/${site}/val.csv |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
--- | ||
# Synthetic dataset, segments generated on the fly | ||
_target_: atsc.counting.data.SyntheticTrafficDataset | ||
sim_events_root: ${simulation.output_folder} | ||
event_duration: ${simulation.event_duration} | ||
random: false | ||
traffic_model_path: ${traffic.val.output_path} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Copyright 2024 Robert Bosch GmbH. | ||
# SPDX-License-Identifier: GPL-3.0-only | ||
|
||
"""Acoustic traffic counting subpackage.""" |