
Update for librosa 0.7 #7

Open
bmcfee opened this issue Jul 8, 2019 · 5 comments

Comments

@bmcfee
Contributor

bmcfee commented Jul 8, 2019

Leaving a marker here that the benchmarks should be rerun on librosa 0.7.0 (and should probably include version numbers more generally).

Quick summary of changes:

  • uses soundfile by default now
  • falls back on audioread only if libsndfile can't decode (e.g. for mp3, but maybe other codecs depending on your site build of libsndfile)
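The default-then-fallback behavior described in the two points above can be sketched generically. This is not librosa's actual implementation: `load_with_fallback`, `primary`, and `fallback` are hypothetical names, and the sketch assumes the primary decoder signals an unsupported file by raising `RuntimeError`.

```python
import warnings


def load_with_fallback(path, primary, fallback, primary_errors=(RuntimeError,)):
    """Try the primary decoder first; use the fallback only if it fails.

    `primary` and `fallback` are any callables that take a path and return
    decoded audio. Treating RuntimeError as "cannot decode" is an assumption.
    """
    try:
        return primary(path)
    except primary_errors:
        # The primary decoder could not handle this file (e.g. an mp3).
        warnings.warn("Primary decoder failed; trying the fallback decoder.")
        return fallback(path)
```

In a benchmark, `primary` might wrap a soundfile read and `fallback` an audioread-based decoder, which is why files that hit the fallback path can be expected to time differently.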

As an aside, we always had API support for excerpts and seeking. It wasn't terribly efficient because audioread didn't support that universally, but it should be almost no overhead relative to soundfile now. The only additional overhead would be downmixing or resample-on-load, but those shouldn't be included in the benchmarks anyway.
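A minimal sketch of loading an excerpt without the extra processing mentioned above, assuming the `sr`, `mono`, `offset`, and `duration` parameters of `librosa.load`. The helper names `excerpt_load_kwargs` and `load_excerpt` are made up for illustration.

```python
def excerpt_load_kwargs(offset_s, duration_s):
    """Keyword arguments for librosa.load that read an excerpt as-is."""
    return {
        "sr": None,              # keep the native sample rate (no resample-on-load)
        "mono": False,           # keep all channels (no downmixing)
        "offset": offset_s,      # start reading this many seconds into the file
        "duration": duration_s,  # read only this many seconds
    }


def load_excerpt(path, offset_s, duration_s):
    # Imported lazily so the helper above can be used without librosa installed.
    import librosa

    # librosa.load returns a (samples, sample_rate) pair.
    return librosa.load(path, **excerpt_load_kwargs(offset_s, duration_s))
```

Passing `sr=None` and `mono=False` keeps resampling and downmixing out of the measured time, so the comparison with a plain soundfile read stays like for like.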

@faroit faroit self-assigned this Jul 8, 2019
@faroit
Owner

faroit commented Jul 8, 2019

Thanks, I will rerun the benchmark asap. Can't wait to have fast audio loading in librosa 👍

> As an aside, we always had API support for excerpts and seeking. It wasn't terribly efficient because audioread didn't support that universally, but it should be almost no overhead relative to soundfile now.

I just updated the table; I had overlooked the seek support.

@faroit faroit added the enhancement New feature or request label Jul 8, 2019
@bmcfee
Contributor Author

bmcfee commented Jul 8, 2019

Great, thanks! As an aside, we also have some helpers for metadata (duration, samplerate). The same conditions apply there -- sndfile by default, backing out to audioread if necessary. So I would expect it to produce a somewhat jagged set of plots.

@faroit
Owner

faroit commented Sep 30, 2019

Sorry for the delay. I am about to add tf 2 and tf.io support as well, and will rerun the benchmark once for all of these.
In the meantime, I added version numbers to the readme table so users can see that these were computed with an old version of librosa.
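Version numbers could also be collected automatically at benchmark time rather than written by hand. The following is a sketch only, assuming Python 3.8+ (for `importlib.metadata`); `collect_versions` is a hypothetical helper, and the distribution names passed to it are assumptions that may differ between environments.

```python
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment; record that explicitly.
            versions[name] = None
    return versions


# Distribution names here are assumptions; adjust them to the benchmarked libraries.
for name, version in collect_versions(["librosa", "soundfile", "audioread"]).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```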

@faroit
Owner

faroit commented Jan 21, 2020

@bmcfee #8 is almost finished. Took quite some time to get things right for tensorflow-io. Anyway, concerning librosa, it looks as you predicted:

Any idea why the soundfile backend is even faster than using soundfile directly, i.e. is there anything I could optimize?

import soundfile as sf

def load_soundfile(fp):
    # Read the whole file; the sample rate is discarded.
    sig, rate = sf.read(fp)
    return sig

@bmcfee
Contributor Author

bmcfee commented Jan 21, 2020

> Any idea why the soundfile backend is even faster than using soundfile directly, i.e. is there anything I could optimize?

It shouldn't be faster, but it looks like the differences are within the error bars. (Viz suggestion: use dots/swarms instead of bar plots.) Probably this is down to cache effects and general system load fluctuations.
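One rough way to check whether two loaders differ by more than their spread is sketched below. It assumes the error bars are one sample standard deviation, which may not match how the plot's bars were computed; `summarize` and `within_error_bars` are illustrative names.

```python
import statistics


def summarize(timings):
    """Return (mean, sample standard deviation) for a list of timings."""
    mean = statistics.mean(timings)
    spread = statistics.stdev(timings) if len(timings) > 1 else 0.0
    return mean, spread


def within_error_bars(timings_a, timings_b):
    """True if the two means differ by no more than the sum of the spreads."""
    mean_a, sd_a = summarize(timings_a)
    mean_b, sd_b = summarize(timings_b)
    return abs(mean_a - mean_b) <= sd_a + sd_b
```

This is a heuristic rather than a significance test, but it is enough to flag differences that are plausibly just noise.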

If I understand correctly, it looks like you're implementing your own benchmark code by calling time.time() and loading a bunch of files in sequence:

start = time.time()
for fp in dataset.audio_files:
    audio = dataset.loader_function(fp)
    np.max(audio)
end = time.time()

I guess the point here is to average over a bunch of file loads to get a sense of the average behavior, but I think you could do a little bit better by using the timeit utility separately for each file. If you load each file individually several times, that can neutralize caching and warm-start effects that could linger from a previous run. The statistics get a bit more involved, and it will take longer, but it should cut down on the variance.
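The per-file approach could look roughly like the sketch below, using `timeit.repeat` from the standard library. `time_per_file` and its result layout are illustrative choices, not part of the benchmark repository, and `loader` stands in for any of the loader functions being compared.

```python
import statistics
import timeit


def time_per_file(loader, paths, repeat=5, number=1):
    """Time `loader` on each file separately, repeating each measurement.

    Returns a dict mapping each path to its per-call timings in seconds,
    along with the minimum and median of those timings.
    """
    results = {}
    for fp in paths:
        # Each of the `repeat` measurements calls the loader `number` times.
        runs = timeit.repeat(lambda fp=fp: loader(fp), repeat=repeat, number=number)
        per_call = [t / number for t in runs]
        results[fp] = {
            "runs": per_call,
            "min": min(per_call),
            "median": statistics.median(per_call),
        }
    return results
```

Something like `time_per_file(load_soundfile, dataset.audio_files)` would then give one distribution per file, which can be plotted as dots or swarms instead of a single aggregated bar.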
