Commit f45b5da: initial version (root commit)
luca-bondi committed Mar 22, 2024

Showing 30 changed files with 8,012 additions and 0 deletions.
7 changes: 7 additions & 0 deletions .gitignore
@@ -0,0 +1,7 @@
outputs/
lightning_logs/
.env
mlruns/
.neptune/
logs/
__pycache__/
651 changes: 651 additions & 0 deletions 3rd-party-licenses.txt

Large diffs are not rendered by default.

540 changes: 540 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

178 changes: 178 additions & 0 deletions README.md
@@ -0,0 +1,178 @@
# Acoustic Traffic Simulation and Counting
Welcome to the repository of the acoustic-based traffic monitoring system for the [DCASE 2024 Task 10](https://dcase.community/challenge2024/).

This code also accompanies the ICASSP 2024 paper
"[Can Synthetic Data Boost the Training of Deep Acoustic Vehicle Counting Networks?](https://arxiv.org/abs/2401.09308)" by S. Damiano, L. Bondi, S. Ghaffarzadegan, A. Guntoro, and T. van Waterschoot.


## Purpose of the project

This software is a research prototype, developed solely for and published as part of the scope described above. It will neither be maintained nor monitored in any way.

## Introduction

Acoustic traffic counting based on microphone arrays offers an alternative to vision-, coil-, and radar-based systems. When traffic density grows above a few vehicles per minute, data-driven solutions are often required to satisfy the precision requirements of end users. As with any data-driven system, more data leads to better performance, and often to higher costs.

This project aims to replace large-scale acoustic data collection from real traffic scenarios with large-scale traffic simulation. Synthetic data is used to pre-train a model, and a small amount of real-world data is used to fine-tune the model.

The framework we propose is summarized by the following diagram.

<img src="assets/framework.svg">

Reading the diagram from left to right:

- Engine sounds are simulated via [enginesound](https://github.com/DasEtwas/enginesound) and stored. A pool of pre-computed engine sounds is made available on [Zenodo](https://zenodo.org/records/10700792).
- The sound of a moving vehicle in front of a microphone array (single pass-by) is simulated from the combination of engine sound and simulated road-tire interactions. Acoustic propagation, dependent on the geometry of a specific site, is used to generate a pool of single pass-bys. A single pass-by is a multi-channel audio file that simulates a linear trajectory of a vehicle at constant speed. A pool of pre-computed pass-bys is made available on [Zenodo](https://zenodo.org/records/10700792).
- Traffic statistics, e.g., the number of vehicles per hour and the road speed limit, are used to generate a traffic model, i.e., a distribution of pass-by events over time for different vehicle types and directions of travel.
- A model is pre-trained on 60-second-long segments generated online from the pool of single pass-by events and the traffic model.
- The model is fine-tuned on real data.
- Model inference and evaluation are performed on real data.

We consider traffic scenarios with two-lane roads and two categories of vehicles: cars and commercial vehicles (cv).
The task is to count how many vehicles of each type (car, cv) and direction of travel (left, right) pass by in 60 seconds of audio.
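
Concretely, each 60-second segment maps to four non-negative counts, one per category. An illustrative sketch of the counting target (the values are made up):

```python
# Counting target for one 60-second audio segment: a count per
# (vehicle type, direction of travel) pair. Values are illustrative only.
segment_counts = {
    "car_left": 3,   # cars travelling left
    "car_right": 5,  # cars travelling right
    "cv_left": 0,    # commercial vehicles travelling left
    "cv_right": 1,   # commercial vehicles travelling right
}
```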

## Conceptual modules

The current repository is structured around three modules:

1. <strong>Modeling</strong>: from site specifications to traffic model (`atsc.modeling`)
- Input: site specifications, including traffic statistics and geometry information.
- Output: traffic model in the form of a list of pass-by events for different vehicle types, at different speeds and directions of travel.
2. <strong>Simulation</strong>: from site specifications to acoustic simulation (`atsc.simulation`)
- Input: site specifications, including traffic statistics and geometry information.
- Output: simulated acoustic pass-by audio files for each vehicle type and direction of travel, specific to a site location.
3. <strong>Counting</strong>: training, inference, and evaluation of the acoustic traffic counting system (`atsc.counting`)
- Input: real or simulated acoustic data.
- Output: vehicle counts for each audio segment, divided by vehicle type and direction of travel
(`car_left`, `car_right`, `cv_left`, `cv_right`).

## Data and directory structure

We have released real-world data from six different locations covering various traffic conditions.
A pool of engine sounds and, for each site, pre-computed single pass-by events are available for download
from [Zenodo](https://zenodo.org/records/10700792).

This repository works with three main folders, configurable in [dev.yaml](atsc/configs/env/dev.yaml); a short sketch after the listing shows how to navigate them:

- `real_root`: data collected from the real world, divided per location.
```
real_root
├── locX
│ ├── meta.json [meta information on traffic conditions and sensor setup for this location]
│ ├── test [test flac files inside]
│ ├── test.csv
│ ├── train [train flac files inside]
│ ├── train.csv
│ ├── val [val flac files inside]
│ └── val.csv
```

- `engine_sound_dir`: engine sounds for each vehicle type.
```
engine_sound_dir
├── car [car engine sounds]
└── cv [commercial vehicle engine sounds]
```

- `work_folder`: folder where all artifacts and outputs are stored. This includes the `simulation` subfolder with simulated single pass-bys. You can download an example of simulated samples from Zenodo or simulate your own samples with the `atsc.simulation.events` module.

```
work_folder
├── modeling [output of atsc.modeling.traffic module]
│ ├── locX
│ │ ├── train.csv [traffic model for simulated data, training set]
│ │ └── val.csv [traffic model for simulated data, validation set]
├── simulation [output of atsc.simulation.events module, available pre-computed on Zenodo]
│ ├── locX
│ │ ├── car
│ │ │ ├── left [flac files with simulated single pass-bys inside]
│ │ │ └── right [flac files with simulated single pass-bys inside]
├── counting [output of atsc.counting module]
│ ├── locX
│ │ ├── alias_a
│ │ │ ├── checkpoints [output of atsc.counting.training module]
│ │ │ │ ├── best.ckpt [checkpoint with smallest validation loss]
│ │ │ │ └── last.ckpt [last checkpoint saved]
│ │ │ ├── inference [output of atsc.counting.inference module]
│ │ │ │ └── test.csv [inference results on test split]
│ │ │ ├── evaluation [output of atsc.counting.evaluation module]
│ │ │ │ └── test.csv [evaluation results on test split]
```
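
As referenced above, here is a minimal Python sketch that navigates these folders for one site. It assumes only the layout shown above; the site name `loc1` and the run alias `finetune` are hypothetical:

```python
import json
from pathlib import Path

# Folder roots as configured in atsc/configs/env/dev.yaml.
real_root = Path("./real-data")
work_folder = Path("./workfolder")

site, alias = "loc1", "finetune"  # hypothetical site name and run alias

# Real data: per-site metadata plus flac files and an index csv per split.
meta = json.loads((real_root / site / "meta.json").read_text())
print(sorted(meta))  # keys of the site metadata
for split in ("train", "val", "test"):
    n_files = len(list((real_root / site / split).glob("*.flac")))
    print(f"{split}: {n_files} recordings")

# Work folder: artifacts of one counting run live under counting/<site>/<alias>/.
run_dir = work_folder / "counting" / site / alias
best_ckpt = run_dir / "checkpoints" / "best.ckpt"  # checkpoint with smallest validation loss
predictions = run_dir / "inference" / "test.csv"   # inference results on the test split
metrics = run_dir / "evaluation" / "test.csv"      # evaluation results on the test split
print(best_ckpt, predictions, metrics, sep="\n")
```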

## Getting started

To set up the code, you have to do the following:

1) Clone this repository.
2) Make sure Poetry is installed, following [this](https://python-poetry.org/docs/) guide.
3) Set up and activate the Python environment:
```
poetry install
poetry shell
```
4) Download the dataset from [Zenodo](https://zenodo.org/records/10700792) and decompress `locX.zip` in `real_root/` and `engine-sounds.zip` in `engine_sound_dir/`, respectively.

## Replicating the baseline
For the baseline model, you need to:
1) **Configure**: Adjust the folders in [dev.yaml](atsc/configs/env/dev.yaml).
2) **Modeling**: Generate a traffic model for each location, using the metadata provided for each site:
```bash
poetry run python -m atsc.modeling.traffic site=<locX>
```
The outputs of this step are two CSV files, `train.csv` and `val.csv`, per location, saved at `<work_folder>/modeling`. Each file contains a list of pass-by events defined by four parameters:
```
`vehicle_type`: car or commercial vehicle (cv)
`timestamp`: date and time of the event
`direction`: direction of travel (left or right)
`speed`: speed of the vehicle
```
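To sanity-check a generated traffic model, you can summarize the events per category. A minimal sketch assuming only the four columns listed above (the path and site name are hypothetical):

```python
import pandas as pd

# Traffic model produced by atsc.modeling.traffic for one site.
events = pd.read_csv("workfolder/modeling/loc1/train.csv")

# Number of pass-by events and mean speed per vehicle type and direction.
summary = events.groupby(["vehicle_type", "direction"]).agg(
    n_events=("timestamp", "count"),
    mean_speed=("speed", "mean"),
)
print(summary)
```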
3) **Acoustic simulation**: Based on the traffic models generated in the previous step, generate a pool of single pass-by events per site.
```bash
poetry run python -m atsc.simulation.events site=<locX> simulation.num_workers=8
```
The outputs of this stage are audio files of single vehicle pass-by events for each vehicle type and travel direction, specific to each site, saved at `<work_folder>/simulation`.

This step can take quite some time. Alternatively, you can download the pre-simulated data `simulation.zip` from
[Zenodo](https://zenodo.org/records/10700792) and extract it in `<work_folder>/simulation`.
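
Either way, a quick way to verify the resulting pool is to count files per `vehicle_type/direction` folder. A sketch under the directory layout shown earlier (the site name is hypothetical):

```python
from pathlib import Path

sim_root = Path("./workfolder/simulation/loc1")  # <work_folder>/simulation/<site>
for vehicle_type in ("car", "cv"):
    for direction in ("left", "right"):
        n = len(list((sim_root / vehicle_type / direction).glob("*.flac")))
        print(f"{vehicle_type}/{direction}: {n} simulated pass-bys")
```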

4) **Training**: Train a regression model to count the number of vehicles in the four categories `car_left`, `car_right`, `cv_left`, and `cv_right`. For example,
you can pre-train your model on simulated data and later fine-tune it on real data.
Training on simulation:
```bash
poetry run python -m atsc.counting.training site=<locX> training.learning_rate=0.001 training/train_dataset=sim training/val_dataset=sim training.tags=["train-sim","val-sim"] training.alias=pretrain
```
Fine-tuning on real data:
```bash
poetry run python -m atsc.counting.training site=<locX> training.learning_rate=0.0001 training/train_dataset=real training/val_dataset=real training.tags=["train-real","val-real","finetune"] training.pretrained_model=pretrain training.alias=finetune
```

The output of this step is saved at `<work_folder>/counting`.
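
The overrides in the commands above follow Hydra's syntax. If you want to inspect the composed configuration without launching a run, you can use Hydra's compose API; a sketch, with the config path assumed from the repository layout and a hypothetical site:

```python
from hydra import compose, initialize
from omegaconf import OmegaConf

# Compose the same configuration the training entry point would see.
with initialize(version_base=None, config_path="atsc/configs"):
    cfg = compose(
        config_name="baseline",
        overrides=[
            "site=loc1",  # hypothetical site
            "training.learning_rate=0.001",
            "training/train_dataset=sim",
            "training/val_dataset=sim",
        ],
    )
print(OmegaConf.to_yaml(cfg.training))
```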

5) **Model inference**: Run model inference on real data for the desired split (train, val, or test).
```bash
poetry run python -m atsc.counting.inference site=<locX> inference.alias=finetune
```

The output of this step is saved at `<work_folder>/counting`.

6) **Evaluation**: Calculate the final metrics for your model:
```bash
poetry run python -m atsc.counting.evaluation site=<locX> inference.alias=finetune
```

The output of this step is saved at `<work_folder>/counting`.
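
For a quick manual check you can also compare the inference output against the real test index directly. A sketch assuming both CSV files carry the four count columns (the column layout is an assumption, and RMSE is just an illustrative error measure, not necessarily the challenge metric):

```python
import pandas as pd

CATEGORIES = ["car_left", "car_right", "cv_left", "cv_right"]

# Hypothetical paths; column layout of both files is an assumption.
pred = pd.read_csv("workfolder/counting/loc1/finetune/inference/test.csv")
true = pd.read_csv("real-data/loc1/test.csv")

# Per-category root-mean-square error between predicted and true counts.
for cat in CATEGORIES:
    rmse = ((pred[cat] - true[cat]) ** 2).mean() ** 0.5
    print(f"{cat}: RMSE = {rmse:.2f}")
```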

## Contributors

The code in this repository was curated by:

- Luca Bondi <luca.bondi@us.bosch.com>
- Stefano Damiano <stefano.damiano@ekuleuven.be>
- Winston Lin <winston.lin@us.bosch.com>
- Shabnam Ghaffarzadegan <shabnam.ghaffarzadegan@us.bosch.com>

## License

Acoustic Traffic Simulation and Counting is open-sourced under the GPL-3.0 license. See the [LICENSE](LICENSE) file for details.

For a list of other open source components included in Acoustic Traffic Simulation and Counting, see the file [3rd-party-licenses.txt](3rd-party-licenses.txt).
3 changes: 3 additions & 0 deletions assets/framework.svg
4 changes: 4 additions & 0 deletions atsc/__init__.py
@@ -0,0 +1,4 @@
# Copyright 2024 Robert Bosch GmbH.
# SPDX-License-Identifier: GPL-3.0-only

"""Acoustic Traffic Simulation and Counting package."""
137 changes: 137 additions & 0 deletions atsc/configs/baseline.yaml
@@ -0,0 +1,137 @@
---
# Please take a look at Hydra documentation for more information
# https://hydra.cc/docs/tutorials/basic/your_first_app/defaults/
defaults:
  - env: dev # reference to environment specific config in atsc/configs/env/
  - model: baseline # reference to model config in atsc/configs/model/
  - training/train_dataset: real # reference to train dataset config in atsc/configs/training/train_dataset/
  - training/val_dataset: real # reference to val dataset config in atsc/configs/training/val_dataset/
  - _self_ # values in this file overwrite the defaults from files above

# Site name, user input requested
site: ???

# --- Traffic flow modeling configuration ---
traffic:

  seed: 151
  min_time_between_consecutive_events: 2s

  # Traffic statistics are available as number of vehicles per hour, not per vehicle type.
  # The following parameters are used to generate the maximum amount of cars and commercial vehicles
  # starting from the traffic statistics.
  car_max_fraction: 0.9
  cv_max_fraction: 0.3

  train:
    num_hours: 24 # Number of hours to simulate traffic for
    start_datetime: 2024-01-01T00:00:00 # initial timestamp for simulation
    output_path: ${env.work_folder}/modeling/${site}/train.csv # Path where the generated traffic flow is saved
  val:
    num_hours: 24
    start_datetime: 2024-02-01T00:00:00
    output_path: ${env.work_folder}/modeling/${site}/val.csv

# --- Acoustic simulation configuration ---
simulation:
  # Parameter for generation of single pass-by events for the site
  seed: 152 # Seed to control events generation
  vehicle_types: # Vehicle types to simulate
    - car
    - cv
  directions: # Vehicle directions to simulate
    - left
    - right
  init_counter: 0 # Start simulation at this index
  num_events: 50 # Simulate this many events
  num_workers: 4 # Number of parallel workers in generation
  lane_width: 3.5 # Width of the road lane [m]
  event_duration: 30 # Duration of the generated events [s]
  source_model: hm+bd # Source model for the acoustic simulation. See atsc/simulation/events.py for available models.

  # Output folder for generated events. Events are generated in a vehicle_type/direction/ folder structure.
  output_folder: ${env.work_folder}/simulation/${site}


# --- Training configuration ---

training:

  # Alias for the training run. If null it is created automatically with coolname
  alias:

  # Output directory for the training run
  output_folder: ${env.work_folder}/counting/${site}/${training.alias}

  # List of tags added to the logger
  tags:

  # Random seed
  seed: 42

  # Batch size
  batch_size: 64

  # Number of data loading workers
  num_workers: 2

  # Learning rate
  learning_rate: 0.001

  # Alias of pre-trained run, or path to pretrained model checkpoint
  pretrained_model:

  # Parameters passed to lightning.Trainer
  # https://lightning.ai/docs/pytorch/stable/common/trainer.html
  trainer:
    max_epochs: 100
    log_every_n_steps: 5
    accelerator: auto

  callbacks:
    # Parameters passed to lightning.pytorch.callbacks.EarlyStopping
    # https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.EarlyStopping.html
    early_stopping:
      monitor: val/loss_epoch
      verbose: true
      patience: 20
    # Parameters passed to lightning.pytorch.callbacks.ModelCheckpoint
    # https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html
    model_checkpoint:
      monitor: val/loss_epoch
      verbose: true
      save_last: true
      filename: best
      dirpath: ${training.output_folder}/checkpoints/

# --- Inference configuration ---
inference:

  # Accelerator used by lightning.Trainer
  accelerator: auto

  # Trained model alias to read the best checkpoint from
  alias: ${training.alias}

  # Batch size
  batch_size: 64

  # Number of data loading workers
  num_workers: 2

  # Split to run inference on
  split: test

  # Parameters passed to atsc.counting.data.TrafficCountDataset
  dataset:
    root: ${env.real_root}/${site}
    index: ${env.real_root}/${site}/${inference.split}.csv

  # Output path for inference results (csv)
  output_path: ${env.work_folder}/counting/${site}/${inference.alias}/inference/${inference.split}.csv

# --- Evaluation configuration ---
evaluation:

  # Output path for evaluation results (csv)
  output_path: ${env.work_folder}/counting/${site}/${inference.alias}/evaluation/${inference.split}.csv
9 changes: 9 additions & 0 deletions atsc/configs/env/dev.yaml
@@ -0,0 +1,9 @@
---
# Root folder of real-world data
real_root: ./real-data

# Folder of engine sounds
engine_sound_dir: ./engine-sounds

# Work folder for traffic model, generated events, trained models, inference results, evaluation outputs
work_folder: ./workfolder
19 changes: 19 additions & 0 deletions atsc/configs/model/baseline.yaml
@@ -0,0 +1,19 @@
---
# Model instantiated with hydra.utils.instantiate
# https://hydra.cc/docs/advanced/instantiate_objects/overview/

# Must be an instance of lightning.LightningModule
# https://lightning.ai/docs/pytorch/stable/common/lightning_module.html

_target_: atsc.counting.models.baseline.Baseline
n_channels: 4
n_mels: 96
n_gcc: 48
stft_params: # https://pytorch.org/docs/stable/generated/torch.stft.html
  hop_length: 160
melscale_params: # https://pytorch.org/audio/main/generated/torchaudio.transforms.MelScale.html
  n_mels: ${model.n_mels}
optimizer:
  _partial_: true
  _target_: torch.optim.Adam
  lr: ${training.learning_rate}
5 changes: 5 additions & 0 deletions atsc/configs/training/train_dataset/real.yaml
@@ -0,0 +1,5 @@
---
# Real dataset
_target_: atsc.counting.data.TrafficCountDataset
root: ${env.real_root}/${site}
index: ${env.real_root}/${site}/train.csv
7 changes: 7 additions & 0 deletions atsc/configs/training/train_dataset/sim.yaml
@@ -0,0 +1,7 @@
---
# Synthetic dataset, segments generated on the fly
_target_: atsc.counting.data.SyntheticTrafficDataset
sim_events_root: ${simulation.output_folder}
event_duration: ${simulation.event_duration}
random: true
traffic_model_path: ${traffic.train.output_path}
5 changes: 5 additions & 0 deletions atsc/configs/training/val_dataset/real.yaml
@@ -0,0 +1,5 @@
---
# Real dataset
_target_: atsc.counting.data.TrafficCountDataset
root: ${env.real_root}/${site}
index: ${env.real_root}/${site}/val.csv
7 changes: 7 additions & 0 deletions atsc/configs/training/val_dataset/sim.yaml
@@ -0,0 +1,7 @@
---
# Synthetic dataset, segments generated on the fly
_target_: atsc.counting.data.SyntheticTrafficDataset
sim_events_root: ${simulation.output_folder}
event_duration: ${simulation.event_duration}
random: false
traffic_model_path: ${traffic.val.output_path}
4 changes: 4 additions & 0 deletions atsc/counting/__init__.py
@@ -0,0 +1,4 @@
# Copyright 2024 Robert Bosch GmbH.
# SPDX-License-Identifier: GPL-3.0-only

"""Acoustic traffic counting subpackage."""