High-performance audio fingerprinting engine in C++17 for music recognition

AradhyaChhabdi/shazam-cpp

Shazam-CPP: Audio Fingerprinting Engine

A high-performance audio fingerprinting engine implemented in C++17 for music recognition.

Features

  • AudioDecoder: Converts WAV/FLAC/OGG files to mono 44.1kHz float arrays using libsndfile
  • SpectrogramGenerator: Generates spectrograms using FFTW3 (4096-sample window)
  • PeakExtractor: Extracts constellation points and generates fingerprint hashes
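As an illustration of the decoding step, here is a minimal sketch of the mono downmix. libsndfile delivers multi-channel audio as interleaved frames, so one plausible approach is to average the channels of each frame. The helper name `downmixToMono` is hypothetical and not taken from this repository, and resampling to 44.1 kHz is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repository): averages interleaved
// frames (L R L R ...) into one mono sample per frame.
std::vector<float> downmixToMono(const std::vector<float>& interleaved,
                                 std::size_t channels) {
    if (channels == 0) {
        return {};  // nothing sensible to do without a channel count
    }
    const std::size_t frames = interleaved.size() / channels;
    std::vector<float> mono(frames);
    for (std::size_t f = 0; f < frames; ++f) {
        float sum = 0.0f;
        for (std::size_t c = 0; c < channels; ++c) {
            sum += interleaved[f * channels + c];
        }
        mono[f] = sum / static_cast<float>(channels);
    }
    return mono;
}
```

Averaging keeps the result within the original sample range, whereas summing the channels could clip. The actual AudioDecoder may handle this differently.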

Requirements

  • C++17 compiler (GCC 7+, Clang 5+, MSVC 2017+)
  • CMake 3.16+
  • FFTW3 library
  • libsndfile library

Installation (MSYS2 on Windows)

# Open MSYS2 MINGW64 terminal and install dependencies
pacman -S mingw-w64-x86_64-gcc mingw-w64-x86_64-cmake mingw-w64-x86_64-fftw mingw-w64-x86_64-libsndfile mingw-w64-x86_64-pkg-config

Building

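# From the MSYS2 MINGW64 terminal, change to your clone of the repository
# (the path below is the author's local checkout; substitute your own)
cd "/c/Users/Aradhya/Desktop/Code/Software project"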
mkdir -p build && cd build
cmake -G "MinGW Makefiles" ..
cmake --build .

Usage

Fingerprint an audio file

./shazam_cli fingerprint path/to/audio.wav

Show audio file info

./shazam_cli info path/to/audio.wav

Run self-test

./shazam_cli test

Run benchmarks (for trial results)

./benchmark
./benchmark --extended  # Include longer duration tests

Project Structure

├── CMakeLists.txt           # Build configuration
├── include/
│   ├── Types.hpp            # Common types and constants
│   ├── AudioDecoder.hpp     # Audio file decoder
│   ├── SpectrogramGenerator.hpp  # FFT spectrogram
│   └── PeakExtractor.hpp    # Peak extraction & hashing
├── src/
│   ├── AudioDecoder.cpp
│   ├── SpectrogramGenerator.cpp
│   ├── PeakExtractor.cpp
│   ├── main.cpp             # CLI interface
│   └── benchmark.cpp        # Performance benchmarks
├── build/                   # Build output
└── tests/                   # Test files

SRS Compliance

| Requirement | Description | Status |
|-------------|-------------|--------|
| FR-01 | Audio recording up to 20 seconds | |
| FR-02 | PCM conversion at 44.1 kHz mono | |
| FR-03 | FFT with 4096-sample window | |
| FR-04 | Peak extraction with silence detection | |
| NFR-01 | Response time < 3 seconds | |
| NFR-02 | Memory usage < 512 MB | |
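To put the recording length and window size in perspective, the following sketch estimates how many FFT frames a 20-second clip produces and how much memory a magnitude spectrogram would need. It uses the 50% overlap stated in the Algorithm Overview and assumes that a partial final frame is dropped and that each bin is stored as a `float`. All names are hypothetical, not the repository's.

```cpp
#include <cstddef>

// Parameters from this README; the constant names are illustrative only.
constexpr std::size_t kSampleRate = 44100;          // Hz, mono
constexpr std::size_t kWindowSize = 4096;           // samples per FFT frame
constexpr std::size_t kHopSize = kWindowSize / 2;   // 50% overlap
constexpr std::size_t kBins = kWindowSize / 2 + 1;  // bins of a real-input FFT

// Number of full frames, assuming no padding of a partial final frame.
constexpr std::size_t frameCount(std::size_t samples) {
    return samples < kWindowSize ? 0
                                 : (samples - kWindowSize) / kHopSize + 1;
}

// Bytes for a magnitude spectrogram stored as one float per bin.
constexpr std::size_t spectrogramBytes(std::size_t samples) {
    return frameCount(samples) * kBins * sizeof(float);
}
```

Under these assumptions, `frameCount(20 * kSampleRate)` is 429 frames, and with 4-byte floats the spectrogram takes about 3.5 MB. This covers only the spectrogram buffer, not the engine's total memory use.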

Algorithm Overview

Based on Avery Wang's 2003 paper "An Industrial-Strength Audio Search Algorithm":

  1. Decode: Convert audio to mono PCM at 44.1kHz
  2. FFT: Apply a windowed FFT (Hann window, 4096 samples, 50% overlap, i.e. a hop of 2048 samples)
  3. Peaks: Find local maxima in spectrogram (constellation points)
  4. Hash: Pair each anchor peak with the peaks inside its target zone and pack each pair into a hash
    • Hash = (freq1 << 20) | (freq2 << 10) | time_delta
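
The hash layout above can be sketched as a pack/unpack pair. The shifts imply 10 bits each for `freq2` and `time_delta`, which leaves 12 bits for `freq1` in a 32-bit value. The masking below is an assumption added so that out-of-range inputs cannot spill into neighbouring fields. The README does not say how the bins of a 4096-sample FFT are quantized to fit these widths, so that step is left out. `PeakPair`, `packHash`, and `unpackHash` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical container for one anchor/target pair.
struct PeakPair {
    std::uint32_t freq1;      // anchor frequency (quantized)
    std::uint32_t freq2;      // target frequency (quantized)
    std::uint32_t timeDelta;  // frames between anchor and target
};

constexpr std::uint32_t kMask10 = 0x3FF;  // 10-bit field
constexpr std::uint32_t kMask12 = 0xFFF;  // 12 bits left for freq1

// Hash = (freq1 << 20) | (freq2 << 10) | time_delta, with each field
// masked to its width (the masking is an assumption, not from the README).
constexpr std::uint32_t packHash(const PeakPair& p) {
    return ((p.freq1 & kMask12) << 20) |
           ((p.freq2 & kMask10) << 10) |
           (p.timeDelta & kMask10);
}

// Inverse of packHash for values that fit their fields.
constexpr PeakPair unpackHash(std::uint32_t hash) {
    return PeakPair{hash >> 20, (hash >> 10) & kMask10, hash & kMask10};
}
```

For example, `unpackHash(packHash({300, 500, 42}))` returns the same three values. Masking silently wraps values that are too large, so a real implementation might prefer to clamp or reject them instead.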

License

MIT License - See LICENSE file
