This project implements a multi-label classifier that detects the songs of 10 bird species; a single recording can be tagged with several species at once. The supported species are:
- AMRO - American Robin (Turdus migratorius)
- BHCO - Brown-headed Cowbird (Molothrus ater)
- CHSW - Chimney Swift (Chaetura pelagica)
- EUST - European Starling (Sturnus vulgaris)
- GRCA - Gray Catbird (Dumetella carolinensis)
- HOSP - House Sparrow (Passer domesticus)
- HOWR - House Wren (Troglodytes aedon)
- NOCA - Northern Cardinal (Cardinalis cardinalis)
- RBGU - Ring-billed Gull (Larus delawarensis)
- RWBL - Red-winged Blackbird (Agelaius phoeniceus)
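
Because a recording can contain several species, each target is naturally a 10-element multi-hot vector rather than a single class index. The sketch below shows one way to build such a vector from the four-letter codes listed above. The function name `encode_labels` and the ordering of `SPECIES` are assumptions made for illustration, not the project's actual code.

```python
# Species codes in the order listed above (the ordering is an assumption).
SPECIES = ["AMRO", "BHCO", "CHSW", "EUST", "GRCA",
           "HOSP", "HOWR", "NOCA", "RBGU", "RWBL"]


def encode_labels(present):
    """Return a multi-hot vector marking which species codes are in `present`."""
    unknown = set(present) - set(SPECIES)
    if unknown:
        raise ValueError(f"Unsupported species codes: {sorted(unknown)}")
    return [1 if code in present else 0 for code in SPECIES]


# A hypothetical recording containing an American Robin and a House Wren.
example = encode_labels({"AMRO", "HOWR"})
print(example)  # [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
```

With this encoding, the classifier produces one independent yes/no decision per species instead of picking a single winner.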
The data used for the project was collected over three years: 2021, 2022, and 2023. Each year's data included nearly 3000 1-minute recordings of these birds from 11 different locations. Each recording contains one or more of the species listed above.
NOTE: The data was provided in .mp3 format for the audio recordings and in .png format for the spectrograms.
For a detailed explanation of the problem please read the following document.
To improve the model's performance, the team pre-processed the given audio data before training. This included removing the background noise from the audio and normalizing the bird sounds. These pre-processing scripts are in the pre-processing scripts folder.
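
As a rough illustration of the two steps named above (noise removal and normalization), here is a minimal sketch that operates on a spectrogram stored as a list of frequency-bin rows. It is not the team's script: the median-based noise floor, the peak normalization, and the function names `subtract_noise_floor` and `peak_normalize` are all assumptions chosen to keep the example small.

```python
from statistics import median


def subtract_noise_floor(spectrogram):
    """Subtract each frequency bin's median magnitude and clip at zero.

    `spectrogram` is a list of rows, one per frequency bin, where each row
    holds that bin's magnitudes over time. The median is used as a simple
    estimate of steady background noise (an assumption, not the team's method).
    """
    cleaned = []
    for row in spectrogram:
        floor = median(row)
        cleaned.append([max(value - floor, 0.0) for value in row])
    return cleaned


def peak_normalize(spectrogram):
    """Scale all magnitudes so the loudest value becomes 1.0."""
    peak = max(max(row) for row in spectrogram)
    if peak == 0:
        return [list(row) for row in spectrogram]
    return [[value / peak for value in row] for row in spectrogram]


# Toy input: one bin with a loud event over a steady hum, one bin of pure hum.
toy = [
    [1.0, 1.0, 5.0, 1.0],
    [0.2, 0.2, 0.2, 0.2],
]
cleaned = peak_normalize(subtract_noise_floor(toy))
print(cleaned)
```

In the toy input, the steady background in both bins is removed and only the loud event remains, scaled to 1.0.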
Here is an example of the pre-processing results:
| Original Spectrogram | Updated Spectrogram |
| --- | --- |
| ![]() | ![]() |
For the solution, the team decided to utilize Feed-Forward Neural Networks (FNNs) and Vision Transformers. Here is an overview of our design:
For a detailed description of our design, please review the following project report.
To view our models' detailed implementation, please navigate to the following bird_classifer Jupyter notebook!
The team performed 5-fold cross-validation on the data and was able to obtain a mean F1 score of 0.592 with a standard deviation of 0.012.
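
For readers unfamiliar with how such a summary is produced, the sketch below computes an F1 score on multi-hot predictions and then reports the mean and standard deviation across folds. The source does not say which F1 averaging scheme or which standard deviation formula the team used, so the micro-averaged F1 and the population standard deviation here are assumptions, and the fold scores are toy values rather than the project's results.

```python
from statistics import mean, pstdev


def k_fold_indices(n_samples, k=5):
    """Split sample indices 0..n_samples-1 into k interleaved folds."""
    return [list(range(start, n_samples, k)) for start in range(k)]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (recording, species) pairs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def summarize_folds(fold_scores):
    """Return (mean, population standard deviation) of per-fold scores."""
    return mean(fold_scores), pstdev(fold_scores)


# Toy labels for two recordings and three species (not project data).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(micro_f1(y_true, y_pred))

# Toy per-fold scores, one per held-out fold.
print(summarize_folds([0.5, 0.6, 0.7, 0.6, 0.6]))
```

In a real run, each fold from `k_fold_indices` would serve once as the held-out set, the model would be trained on the other four, and the five resulting F1 values would be passed to `summarize_folds`.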
Here is a graph detailing the performance metrics of each individual model:
The team also ran a blind test on the 2023 data and obtained an F1 score of 0.5916, which is very close to the cross-validation estimate of 0.592. This resulted in the team obtaining a precision score of 39.862.