AST for multi-label audio tagging? #142
Antoine101 started this conversation in General
Hi,
I am trying to describe acoustic scenes in audio samples by listing all the sound sources present, drawn from a fixed set of learnt labels.
I have read your paper and used your model, mainly through the Hugging Face hub, for single-label classification.
Does it work for multi-label classification as well (one audio sample = possibly multiple labels)?
Here you say this checkpoint is able to classify an audio clip into one of the AudioSet classes.

In your paper, however, you mention results obtained on the FSD50K dataset, which is a multi-label dataset (correct me if I'm wrong).

I have come across the LwLRAP metric, which seems well suited to multi-label tasks. Did you use this metric specifically when fine-tuning your model on FSD50K, or did you tweak FSD50K to turn it into a single-label dataset?
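
For what it's worth, here is how I currently understand LwLRAP (label-weighted label-ranking average precision). This is just my own sketch written from the metric's definition, not taken from your code, so please correct me if I have it wrong:

```python
import numpy as np

def lwlrap(truth, scores):
    """Label-weighted label-ranking average precision (my understanding of it).

    truth:  (n_samples, n_classes) binary matrix of ground-truth labels
    scores: (n_samples, n_classes) real-valued model scores
    """
    truth = np.asarray(truth, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    precisions = []  # one value per (sample, positive-label) pair
    for y, s in zip(truth, scores):
        if not y.any():
            continue
        ranks = np.argsort(np.argsort(-s)) + 1   # rank 1 = highest score
        for c in np.flatnonzero(y):
            # fraction of labels ranked at or above class c that are true positives
            hits = np.sum(y & (ranks <= ranks[c]))
            precisions.append(hits / ranks[c])
    # averaging over all (sample, positive-label) pairs weights each class
    # by its number of positive examples
    return float(np.mean(precisions))

# quick sanity check on a toy example: perfect rankings should give 1.0
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0]])
y_score = np.array([[0.9, 0.1, 0.8, 0.2],
                    [0.3, 0.6, 0.1, 0.2]])
print(lwlrap(y_true, y_score))  # -> 1.0
```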
And finally, would it be possible to fine-tune AST on my multi-label downstream task starting from the Hugging Face checkpoint? Does it only require supplying multi-hot label arrays and an appropriate loss/metric, or is there more to it?
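
To make the question concrete, here is roughly how I imagine setting it up with the transformers library. The checkpoint name, the label count, and the use of a sigmoid/BCE loss are all my assumptions, so please correct me if this is not how it should be done:

```python
import torch
from transformers import ASTFeatureExtractor, ASTForAudioClassification

# Assumption: start from the AudioSet-pretrained checkpoint and replace the
# classification head with one sized for my own label set
# (ignore_mismatched_sizes reinitialises the head).
checkpoint = "MIT/ast-finetuned-audioset-10-10-0.4593"
num_labels = 12  # hypothetical size of my own label set

feature_extractor = ASTFeatureExtractor.from_pretrained(checkpoint)
model = ASTForAudioClassification.from_pretrained(
    checkpoint,
    num_labels=num_labels,
    ignore_mismatched_sizes=True,
)

# Dummy batch: two 10 s mono clips at 16 kHz, with multi-hot targets.
waveforms = [torch.randn(160000).numpy(), torch.randn(160000).numpy()]
inputs = feature_extractor(waveforms, sampling_rate=16000, return_tensors="pt")
targets = torch.zeros(2, num_labels)
targets[0, [1, 4]] = 1.0   # first clip contains two sources
targets[1, [0]] = 1.0      # second clip contains one

# Multi-label: treat each class independently with a sigmoid + BCE loss
logits = model(**inputs).logits              # shape (batch, num_labels)
loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, targets)
loss.backward()
```

If that is roughly right, then the only real changes versus single-label fine-tuning would be the multi-hot targets, the BCE loss instead of cross-entropy, and the evaluation metric. Is that all there is to it?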
Thanks a lot in advance.
Antoine