RipThatSet is a Python script made to gather songs from a DJ Set through Shazam and return a predicted track list.

b1scoito/ripthatset

RipThatSet

RipThatSet is a command-line tool that uses the Shazam API to identify the tracks in an audio file. It is particularly useful for DJ sets, live recordings, and other long-form audio content.

Features

  • Identifies multiple tracks in a single audio file
  • Shows timestamps for each identified track
  • Detects and reports unidentified segments (gaps)
  • Supports rotating IP proxies for better rate limit handling
  • Configurable segment length and matching criteria
  • JSON output for further processing
  • Concurrent processing that scales with the available CPU cores
  • Progress tracking with ETA

Installation

# Clone the repository
git clone https://github.com/b1scoito/ripthatset.git
cd ripthatset

# Install with Poetry
poetry install

Usage

Basic usage:

poetry run ripthatset your_audio_file.mp3

With rotating proxy (recommended):

poetry run ripthatset your_audio_file.mp3 --proxy "http://customer:pass@pr.oxylabs.io:7777"

Save results to JSON:

poetry run ripthatset your_audio_file.mp3 --json-output results.json

Important Note About Proxies

Using a rotating IP proxy service is HIGHLY RECOMMENDED! The Shazam API enforces strict rate limits, so sending every request from a single IP address can slow the analysis significantly or cause it to fail. A rotating proxy service (like OxyLabs) helps by:

  • Distributing requests across multiple IPs
  • Avoiding rate limit issues
  • Significantly improving processing speed
  • Providing more reliable results

We recommend using OxyLabs or similar services that offer:

  • High rotation frequency
  • Large IP pool
  • Reliable uptime
  • Good geographic distribution

Options

Arguments:
  audio_file              Audio file to analyze [required]

Options:
  --segment-length        Segment length in milliseconds [default: 12000]
  --proxy                 HTTP/HTTPS proxy URL
  --json-output           Save results to JSON file
  --min-matches           Minimum segment matches required [default: 2]
  --min-confidence        Minimum confidence score (0-1) [default: 0.5]
  --max-gap               Maximum segment gap in cluster [default: 3]
  --min-cluster           Minimum segments in cluster [default: 2]
  --show-gaps             Show unidentified gaps [default: True]
  --min-gap-duration      Minimum gap duration in seconds [default: 30]
  --verbose               Enable verbose output
  --cpu-count             Number of CPU cores to use
  --help                  Show help message

Advanced Configuration

Segment Length

  • Default is 12 seconds (12000ms)
  • Shorter segments (8-10s) might catch more tracks but increase processing time
  • Longer segments (15-20s) are faster but might miss short tracks
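
To get a feel for the trade-off, the sketch below estimates how many segments a file is split into at different segment lengths. It assumes fixed-length, non-overlapping segments and one lookup per segment, which this README does not confirm; segment_count is a hypothetical helper, not part of RipThatSet.

```python
import math


def segment_count(duration_seconds: float, segment_length_ms: int = 12000) -> int:
    """Number of fixed-length segments needed to cover the whole file.

    Assumption: segments do not overlap and the last one may be shorter.
    """
    if segment_length_ms <= 0:
        raise ValueError("segment_length_ms must be positive")
    return math.ceil(duration_seconds * 1000 / segment_length_ms)


# A 60-minute set at three segment lengths
for length_ms in (8000, 12000, 20000):
    print(f"{length_ms} ms -> {segment_count(3600, length_ms)} segments")
# 8000 ms -> 450 segments
# 12000 ms -> 300 segments
# 20000 ms -> 180 segments
```

Under these assumptions, going from 12 s to 8 s segments increases the number of lookups by half, which is where the extra processing time would come from.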

Match Criteria

  • min-matches: Minimum times a track must be detected (default: 2)
  • min-confidence: Minimum confidence score for matches (default: 0.5)
  • max-gap: Maximum segments between matches in a cluster (default: 3)
  • min-cluster: Minimum segments in a cluster (default: 2)
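
One plausible reading of max-gap and min-cluster is sketched below: segment numbers where the same track matched are grouped into clusters, a new cluster starts when the jump between neighbouring segment numbers exceeds max-gap, and clusters smaller than min-cluster are dropped. This is an illustration of the idea only; cluster_segments is a hypothetical function and the tool's actual logic may differ.

```python
from typing import List


def cluster_segments(
    indices: List[int], max_gap: int = 3, min_cluster: int = 2
) -> List[List[int]]:
    """Group segment numbers into clusters of nearby matches.

    Assumption: two matches belong to the same cluster when their
    segment numbers differ by at most max_gap.
    """
    clusters: List[List[int]] = []
    for idx in sorted(set(indices)):
        if clusters and idx - clusters[-1][-1] <= max_gap:
            clusters[-1].append(idx)   # close enough: extend current cluster
        else:
            clusters.append([idx])     # too far away: start a new cluster
    # Drop clusters that are too small to count as a detection
    return [c for c in clusters if len(c) >= min_cluster]


print(cluster_segments([1, 2, 3, 9, 30, 31, 32]))
# [[1, 2, 3], [30, 31, 32]]
```

In this example the isolated match in segment 9 is discarded because its cluster has only one segment.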

Gap Detection

  • show-gaps: Show unidentified sections (default: True)
  • min-gap-duration: Minimum duration to report a gap (default: 30s)
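
As a rough sketch of how gap reporting could work, the code below walks over the time spans covered by identified tracks and reports any uncovered stretch of at least min-gap-duration seconds. The span representation and the find_gaps and mmss helpers are assumptions made for illustration, not RipThatSet's internals.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds)


def find_gaps(
    track_spans: List[Span], total_duration: float, min_gap_duration: float = 30
) -> List[Span]:
    """Return the stretches not covered by any identified track."""
    gaps: List[Span] = []
    cursor = 0.0  # end of the audio covered so far
    for start, end in sorted(track_spans):
        if start - cursor >= min_gap_duration:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Unidentified tail after the last track
    if total_duration - cursor >= min_gap_duration:
        gaps.append((cursor, total_duration))
    return gaps


def mmss(seconds: float) -> str:
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# One track covering 00:00-03:24, another covering 09:00-10:00
for start, end in find_gaps([(0, 204), (540, 600)], total_duration=600):
    print(f"ID - ID ({mmss(start)}) [duration: {mmss(end - start)}]")
# ID - ID (03:24) [duration: 05:36]
```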

Performance

  • cpu-count: Control CPU usage (default: auto-detected)
  • Uses smart batching based on system resources
  • Adjusts concurrent processing based on available CPUs
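
The README does not describe the batching logic, so the following is only a minimal sketch of how a --cpu-count value might be reconciled with the cores actually available. resolve_cpu_count is a hypothetical helper; only os.cpu_count() is a real standard-library call.

```python
import os
from typing import Optional


def resolve_cpu_count(requested: Optional[int] = None) -> int:
    """Pick a worker count: auto-detect when unset, otherwise clamp.

    Assumption: a requested value is limited to the range
    1..available cores.
    """
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    if requested is None:
        return available
    return max(1, min(requested, available))


print(resolve_cpu_count())   # auto-detected core count
print(resolve_cpu_count(2))  # at most 2 workers
```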

Output Example

1. Artist One - Track Title (00:00) [segments: 1,2,3, confidence: 1.00, total matches: 5]
2. ID - ID (03:24) [duration: 05:36]
3. Artist Two - Another Track (09:00) [segments: 30,31,32, confidence: 0.95, total matches: 4]

Output Format Explanation

  • Timestamp: Position in the audio where the track or gap starts (MM:SS)
  • Segments: Segment numbers in which the track was detected
  • Confidence: Match confidence score (0-1)
  • Total matches: Number of times the track was detected
  • For gaps: Shows duration instead of match details
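
If you want to post-process the text output rather than the JSON file, the lines in the example above can be parsed with a regular expression. This is a sketch based only on the three example lines shown here; parse_line and the dictionary keys are hypothetical, and the pattern assumes timestamps always use two digits for minutes and seconds.

```python
import re
from typing import Optional

# "<n>. <title> (<MM:SS>) [<details>]"
LINE_RE = re.compile(r"^(\d+)\. (.+) \((\d{2}:\d{2})\) \[(.+)\]$")
DETAILS_RE = re.compile(
    r"^segments: ([\d,]+), confidence: ([\d.]+), total matches: (\d+)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Parse one line of the track list; return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    position, title, timestamp, details = m.groups()
    entry = {"position": int(position), "title": title, "timestamp": timestamp}
    if details.startswith("duration: "):
        # Gap line: only a duration is reported
        entry["gap"] = True
        entry["duration"] = details[len("duration: "):]
        return entry
    d = DETAILS_RE.match(details)
    if d is None:
        return None
    entry["gap"] = False
    entry["segments"] = [int(s) for s in d.group(1).split(",")]
    entry["confidence"] = float(d.group(2))
    entry["total_matches"] = int(d.group(3))
    return entry


print(parse_line("2. ID - ID (03:24) [duration: 05:36]"))
# {'position': 2, 'title': 'ID - ID', 'timestamp': '03:24', 'gap': True, 'duration': '05:36'}
```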

Development

Contributions are welcome! Please ensure you:

  1. Write tests for new features
  2. Follow the existing code style
  3. Update documentation as needed

License

GNU GPLv3 - see the LICENSE file for details
