Ph.D. candidate in ICT working on Computational Auditory Scene Analysis (CASA) & Deep Learning for Audio.
Researcher | Developer | Electroacoustic Musician
My research focuses on neural audio representations, audio tagging, acoustic scene classification, sound event detection, and efficient inference. Alongside research, I am an electroacoustic music composer and live performer working at the intersection of machine learning, sound, digital signal processing, and artistic practice: someone who designs, studies, and uses sound both as a scientific object and as an artistic material.
- Research: Computational Auditory Scene Analysis, Deep Learning, Machine Learning, Audio Tagging, Acoustic Scene Classification, Sound Event Detection, Neural Audio Embeddings, Music Information Retrieval, Digital Signal Processing.
- Artistic Practice: Electroacoustic music composition, live performance, sound design, interactive and immersive audio systems, spatial sound, soundscapes.
| Category | Technologies |
|---|---|
| ML / DL | |
| DSP & Analysis | |
| Audio & Code | |
| Publishing | |
- e2panns: From Large Scale Audio Tagging to Real-Time Explainable Emergency Vehicle Siren Detection.
- audioset-tools: A Python/PyTorch utility framework for taxonomy-aware AudioSet curation, dataset preparation, and reproducible audio research.
- torch-audio-embeddings: A standardized framework for deployment and inference with audio neural network embedding models.
- StrumKANet: A Kolmogorov-Arnold Network (KAN) NAS framework for strumming pattern recognition.
- torch_amt: A PyTorch port of the MATLAB Auditory Modeling Toolbox.
Selected Publications
- S. Giacomelli, M. Giordano, C. Rinaldi, and F. Graziosi, "AudioSet-Tools: A Python Framework for Taxonomy-Aware AudioSet Curation and Reproducible Audio Research," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2026, no. 2, 2026. DOI: 10.1186/s13636-025-00436-z
- M. Giordano, S. Giacomelli, C. Rinaldi, and F. Graziosi, "Real-Time Emergency Vehicle Siren Detection with Efficient CNNs on Embedded Hardware," in 2025 IEEE 6th International Symposium on the Internet of Sounds (IS2), 2025, pp. 1-10. DOI: 10.1109/IS264627.2025.11284671
- M. Pennese, S. Giacomelli, and C. Rinaldi, "A Kolmogorov Arnold Network NAS Framework for Strumming Pattern Recognition in Technology-Enhanced Pop/Rock Music Education," in 2025 IEEE 6th International Symposium on the Internet of Sounds (IS2), 2025, pp. 1-10. DOI: 10.1109/IS264627.2025.11284580
- S. Giacomelli, M. Giordano, and C. Rinaldi, "The OCON model: an Old but Gold Solution for Distributable Supervised Classification," in 2024 IEEE Symposium on Computers and Communications (ISCC), 2024, pp. 1-7. DOI: 10.1109/ISCC61673.2024.10733621
Selected Datasets & Toolkits
- AudioSet-EV v2: A refined AudioSet-derived distribution of emergency vehicle siren sounds. Zenodo: 10.5281/zenodo.18668076
- The Strummin' Dataset: A curated international selection of pop/rock audio for strumming pattern recognition. Zenodo: 10.5281/zenodo.15862786
- Audio deep learning models for scene/event analysis.
- Efficient convolutional and transformer-based neural networks for audio inference.
- Dataset curation and benchmarking pipelines.
- Artistic research in electroacoustic performance and composition (electroacoustic feedback systems).
- Room acoustics and feature extraction.
- Sound source separation and semantic sound scene segmentation.
- Website: Personal Homepage
- Google Scholar: Stefano Giacomelli
- LinkedIn: Stefano Giacomelli
- ORCID: 0009-0009-0438-1748
- Institutional Email: stefano.giacomelli@graduate.univaq.it