Update for librosa 0.7 #7

bmcfee · 2019-07-08T13:36:32Z

Leaving a marker here that the benchmarks should be rerun on librosa 0.7.0 (and should probably include version numbers more generally).

Quick summary of changes:

uses soundfile by default now
falls back on audioread only if libsndfile can't decode (e.g. for mp3, but maybe other codecs depending on your site build of libsndfile)
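The fallback described above can be sketched roughly as follows. This illustrates the pattern only and is not librosa's actual implementation: `load_with_fallback` and its `primary`/`fallback` parameters are hypothetical names, and the sketch assumes the primary decoder signals an unsupported file by raising `RuntimeError`.

```python
import warnings


def load_with_fallback(path, primary, fallback):
    """Try the fast decoder first; use the slower one only on failure.

    `primary` and `fallback` are hypothetical callables that take a path
    and return decoded audio. Assumption: `primary` raises RuntimeError
    when it cannot decode the file.
    """
    try:
        return primary(path)
    except RuntimeError:
        warnings.warn(f"primary decoder failed for {path!r}; trying fallback")
        return fallback(path)
```

One likely consequence for benchmarking is that timings for a single loader can mix two code paths, depending on the file format.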

As an aside, we always had API support for excerpts and seeking. It wasn't terribly efficient because audioread didn't support that universally, but it should be almost no overhead relative to soundfile now. The only additional overhead would be downmixing or resample-on-load, but those shouldn't be included in the benchmarks anyway.
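A minimal sketch of loading an excerpt without the extra processing mentioned above, assuming a librosa 0.7-style `librosa.load`. `benchmark_load_kwargs` and `load_excerpt` are hypothetical helper names, not part of librosa or the benchmark.

```python
def benchmark_load_kwargs(offset=0.0, duration=None):
    """Keyword arguments for librosa.load that skip extra processing.

    sr=None keeps the file's native sample rate (no resample-on-load),
    and mono=False keeps all channels (no downmixing).
    offset and duration are given in seconds.
    """
    return {"sr": None, "mono": False, "offset": offset, "duration": duration}


def load_excerpt(path, offset=0.0, duration=None):
    """Load `duration` seconds of audio starting `offset` seconds in."""
    import librosa  # imported lazily so the helper above works without it

    y, sr = librosa.load(path, **benchmark_load_kwargs(offset, duration))
    return y, sr
```

For example, `load_excerpt("song.wav", offset=30.0, duration=5.0)` (with a hypothetical file name) would read five seconds starting at the 30-second mark.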

faroit · 2019-07-08T13:40:55Z

Thanks, I will rerun the benchmark ASAP. Can't wait to have fast audio loading in librosa 👍

> As an aside, we always had API support for excerpts and seeking. It wasn't terribly efficient because audioread didn't support that universally, but it should be almost no overhead relative to soundfile now.

I just updated the table; I had overlooked the seek support.

bmcfee · 2019-07-08T14:26:03Z

Great, thanks! As an aside, we also have some helpers for metadata (duration, samplerate). The same conditions apply there -- sndfile by default, falling back to audioread if necessary. So I would expect it to produce a somewhat jagged set of plots.

faroit · 2019-09-30T08:46:37Z

Sorry for the delay. I am about to add tf 2 and tf.io support as well, and will rerun the benchmark once for all of these.
In the meantime, I added version numbers to the readme table so users can see that these were computed with an old version of librosa.

faroit · 2020-01-21T19:55:25Z

@bmcfee #8 is almost finished. Took quite some time to get things right for tensorflow-io. Anyway, concerning librosa, it looks as you predicted:

Any idea why the soundfile backend is even faster than using soundfile directly, i.e. is there anything I could optimize?

python_audio_loading_benchmark/loaders.py

Lines 59 to 61 in d71fbe6

    
    def load_soundfile(fp):
        sig, rate = sf.read(fp)
        return sig

bmcfee · 2020-01-21T20:41:09Z

> Any idea why the soundfile backend is even faster than using soundfile directly, i.e. is there anything I could optimize?

It shouldn't be faster, but it looks like the differences are within the error bars. (Viz suggestion: use dots/swarms instead of bar plots.) Probably this is down to cache effects and general system load fluctuations.

If I understand correctly, it looks like you're implementing your own benchmark code by calling time.time() and loading a bunch of files in sequence:

python_audio_loading_benchmark/benchmark_np.py

Lines 85 to 91 in d71fbe6

    
    start = time.time()
    for fp in dataset.audio_files:
        audio = dataset.loader_function(fp)
        np.max(audio)
    end = time.time()

I guess the point here is to average over a bunch of file loads to get a sense of the average behavior, but I think you could do a little bit better by using the timeit utility separately for each file. If you load each file individually several times, that can neutralize caching and warm-start effects that could linger from a previous run. The statistics get a bit more involved, and it will take longer, but it should cut down on the variance.
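A rough sketch of the per-file timing idea, using only the standard library. `time_loader_per_file` and `summarize` are hypothetical helpers, not code from the benchmark repository, and the `repeat`/`number` defaults are arbitrary. `loader` stands in for something like `dataset.loader_function`.

```python
import statistics
import timeit


def time_loader_per_file(loader, paths, repeat=5, number=1):
    """Time `loader` on each file separately with timeit.

    Returns a dict mapping each path to a list of `repeat` timings,
    each being the seconds per call averaged over `number` calls.
    """
    results = {}
    for fp in paths:
        # Warm-up call (an assumption of this sketch), excluded from timings.
        loader(fp)
        raw = timeit.repeat(lambda: loader(fp), repeat=repeat, number=number)
        results[fp] = [t / number for t in raw]
    return results


def summarize(results):
    """Reduce per-file timings to (min, median) seconds per file."""
    return {
        fp: (min(times), statistics.median(times))
        for fp, times in results.items()
    }
```

Keeping the raw per-file timings, rather than a single total, also makes it straightforward to plot individual points instead of bars.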

faroit added a commit that referenced this issue Jul 8, 2019

librosa does indeed support seeking (See #7)

022be62

faroit self-assigned this Jul 8, 2019

faroit added the enhancement New feature or request label Jul 8, 2019

nryant mentioned this issue Apr 16, 2020

.mp3 files not supported for SAD pyannote/pyannote-audio#347

Closed

faroit mentioned this issue May 12, 2020

tensorflow2, librosa8, better plots #8

Merged

10 tasks
