A high-performance audio fingerprinting engine implemented in C++17 for music recognition.
- AudioDecoder: Converts WAV/FLAC/OGG files to mono 44.1kHz float arrays using libsndfile
- SpectrogramGenerator: Generates spectrograms using FFTW3 (4096-sample window)
- PeakExtractor: Extracts constellation points and generates fingerprint hashes
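
The framing step that feeds the FFT can be sketched as below. This is a minimal illustration, not the project's actual SpectrogramGenerator code. It takes the 4096-sample Hann window and 50% overlap from the algorithm summary in this README. The function names (`makeHannWindow`, `frameCount`, `windowedFrame`), the symmetric form of the window, and the choice to drop a trailing partial frame are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr std::size_t kWindowSize = 4096;          // window length stated in this README
constexpr std::size_t kHopSize = kWindowSize / 2;  // 50% overlap

// Symmetric Hann window: w[n] = 0.5 * (1 - cos(2*pi*n / (N - 1))).
// Assumption: the real implementation may use the periodic form instead.
std::vector<float> makeHannWindow(std::size_t size) {
    assert(size > 1);
    const double pi = std::acos(-1.0);
    std::vector<float> window(size);
    for (std::size_t n = 0; n < size; ++n) {
        window[n] = static_cast<float>(
            0.5 * (1.0 - std::cos(2.0 * pi * n / (size - 1))));
    }
    return window;
}

// Number of complete frames in `sampleCount` samples.
// Assumption: a trailing partial frame is dropped rather than zero-padded.
std::size_t frameCount(std::size_t sampleCount, std::size_t windowSize,
                       std::size_t hopSize) {
    if (sampleCount < windowSize) {
        return 0;
    }
    return 1 + (sampleCount - windowSize) / hopSize;
}

// Copies frame `index` out of `samples` and applies the window to it.
// The result is what would be handed to the FFT.
std::vector<float> windowedFrame(const std::vector<float>& samples,
                                 const std::vector<float>& window,
                                 std::size_t index, std::size_t hopSize) {
    const std::size_t start = index * hopSize;
    assert(start + window.size() <= samples.size());
    std::vector<float> frame(window.size());
    for (std::size_t n = 0; n < window.size(); ++n) {
        frame[n] = samples[start + n] * window[n];
    }
    return frame;
}
```

Under these assumptions, a 20-second mono clip at 44.1kHz (882,000 samples) yields `frameCount(882000, kWindowSize, kHopSize) == 429` frames.
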
- C++17 compiler (GCC 7+, Clang 5+, MSVC 2017+)
- CMake 3.16+
- FFTW3 library
- libsndfile library
# Open MSYS2 MINGW64 terminal and install dependencies
pacman -S mingw-w64-x86_64-gcc mingw-w64-x86_64-cmake mingw-w64-x86_64-fftw mingw-w64-x86_64-libsndfile mingw-w64-x86_64-pkg-config

# From MSYS2 MINGW64 terminal
cd "/c/Users/Aradhya/Desktop/Code/Software project"
mkdir -p build && cd build
cmake -G "MinGW Makefiles" ..
cmake --build .

./shazam_cli fingerprint path/to/audio.wav
./shazam_cli info path/to/audio.wav
./shazam_cli test

./benchmark
./benchmark --extended # Include longer duration tests

├── CMakeLists.txt # Build configuration
├── include/
│ ├── Types.hpp # Common types and constants
│ ├── AudioDecoder.hpp # Audio file decoder
│ ├── SpectrogramGenerator.hpp # FFT spectrogram
│ └── PeakExtractor.hpp # Peak extraction & hashing
├── src/
│ ├── AudioDecoder.cpp
│ ├── SpectrogramGenerator.cpp
│ ├── PeakExtractor.cpp
│ ├── main.cpp # CLI interface
│ └── benchmark.cpp # Performance benchmarks
├── build/ # Build output
└── tests/ # Test files
| Requirement | Description | Status |
|---|---|---|
| FR-01 | Audio recording up to 20 seconds | ✓ |
| FR-02 | PCM conversion at 44.1kHz mono | ✓ |
| FR-03 | FFT with 4096-sample window | ✓ |
| FR-04 | Peak extraction with silence detection | ✓ |
| NFR-01 | Response time < 3 seconds | ✓ |
| NFR-02 | Memory usage < 512MB | ✓ |
Based on Avery Wang's paper "An Industrial-Strength Audio Search Algorithm" (ISMIR 2003):
- Decode: Convert audio to mono PCM at 44.1kHz
- FFT: Apply windowed FFT (Hann window, 4096 samples, 50% overlap)
- Peaks: Find local maxima in spectrogram (constellation points)
- Hash: Create combinatorial pairs within target zone
- Hash = (freq1 << 20) | (freq2 << 10) | time_delta
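
The hash layout above can be sketched as a pair of pack/unpack helpers. This is an illustration under stated assumptions, not the project's PeakExtractor code. The names `packHash`, `unpackHash`, and `HashFields` are hypothetical. The sketch also assumes each field has been quantised to 10 bits before packing, and it masks to enforce that, because with shifts of 20 and 10 any wider value for `freq2` or `time_delta` would spill into the neighbouring field.

```cpp
#include <cstdint>

// Assumption: every field occupies 10 bits, matching the shifts of 20 and 10.
constexpr std::uint32_t kFieldBits = 10;
constexpr std::uint32_t kFieldMask = (1u << kFieldBits) - 1;  // 0x3FF

// Hypothetical container for the three decoded fields.
struct HashFields {
    std::uint32_t freq1;       // anchor peak frequency bin
    std::uint32_t freq2;       // target peak frequency bin
    std::uint32_t time_delta;  // frame offset between the two peaks
};

// Packs the fields as (freq1 << 20) | (freq2 << 10) | time_delta.
// Each input is masked to 10 bits so the fields cannot overlap.
constexpr std::uint32_t packHash(std::uint32_t freq1, std::uint32_t freq2,
                                 std::uint32_t timeDelta) {
    return ((freq1 & kFieldMask) << 20) |
           ((freq2 & kFieldMask) << 10) |
           (timeDelta & kFieldMask);
}

// Recovers the three 10-bit fields from a packed hash.
constexpr HashFields unpackHash(std::uint32_t hash) {
    return HashFields{(hash >> 20) & kFieldMask,
                      (hash >> 10) & kFieldMask,
                      hash & kFieldMask};
}
```

For example, `packHash(1, 2, 3)` evaluates to `1050627`, and `unpackHash` returns the original three values for any inputs below 1024. A 4096-sample FFT produces more than 1024 frequency bins, so a real implementation would need to quantise or restrict the bin range before packing. How this project does that is not stated here.
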
MIT License - See LICENSE file