What is the objective when pretraining? #107

Open
Young973 opened this issue Aug 3, 2023 · 3 comments
Labels
question Further information is requested

Comments

@Young973

Young973 commented Aug 3, 2023

TBH, I'm a little confused about what the pretraining objective is for AST. It doesn't seem to be stated in the paper. BTW, when pretraining SSAST, the discriminative objective is classification with InfoNCE and the generative objective is reconstruction. But what is it in AST?

@YuanGongND YuanGongND added the question Further information is requested label Aug 3, 2023
@YuanGongND
Owner

Hi there,

It is just ImageNet pretraining, i.e., an ImageNet-pretrained DeiT is used as the initial weights for AST.

if model_size == 'tiny224':
    self.v = timm.create_model('vit_deit_tiny_distilled_patch16_224', pretrained=imagenet_pretrain)
elif model_size == 'small224':
    self.v = timm.create_model('vit_deit_small_distilled_patch16_224', pretrained=imagenet_pretrain)
elif model_size == 'base224':
    self.v = timm.create_model('vit_deit_base_distilled_patch16_224', pretrained=imagenet_pretrain)
elif model_size == 'base384':
    self.v = timm.create_model('vit_deit_base_distilled_patch16_384', pretrained=imagenet_pretrain)
else:
    ...

-Yuan
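
For readers following along, the branch in this comment amounts to a lookup from a size key to a timm model name, with `pretrained=True` asking timm for the ImageNet weights. A minimal sketch, assuming only the names quoted in the comment: `DEIT_CHECKPOINTS`, `deit_checkpoint_name`, and `load_deit_backbone` are hypothetical helpers, not part of the AST repository, and the exact `vit_deit_*` strings may depend on the installed timm version.

```python
# Hypothetical helpers for illustration; not part of the AST repository.
# The model names are copied from the comment above.
DEIT_CHECKPOINTS = {
    "tiny224": "vit_deit_tiny_distilled_patch16_224",
    "small224": "vit_deit_small_distilled_patch16_224",
    "base224": "vit_deit_base_distilled_patch16_224",
    "base384": "vit_deit_base_distilled_patch16_384",
}


def deit_checkpoint_name(model_size: str) -> str:
    """Return the timm model name for a size key, or raise on an unknown key."""
    try:
        return DEIT_CHECKPOINTS[model_size]
    except KeyError:
        raise ValueError(
            f"unknown model_size {model_size!r}; "
            f"expected one of {sorted(DEIT_CHECKPOINTS)}"
        ) from None


def load_deit_backbone(model_size: str, imagenet_pretrain: bool = True):
    """Create the DeiT backbone; pretrained=True loads ImageNet weights."""
    import timm  # imported lazily so the name lookup works without timm

    return timm.create_model(
        deit_checkpoint_name(model_size), pretrained=imagenet_pretrain
    )
```

Calling `load_deit_backbone("base384")` would download the ImageNet-pretrained weights on first use; passing `imagenet_pretrain=False` gives the same architecture with random initialization.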

@YuanGongND
Owner

Some modifications to the DeiT model are needed. See https://github.com/YuanGongND/ast/blob/master/src/models/ast_models.py.

@YuanGongND
Owner

YuanGongND commented Aug 3, 2023

If you mean audio-domain pretraining, that just means training AST on AudioSet (starting from the ImageNet initialization) with a BCE loss for the classification task. You can then use that model for other audio tasks (e.g., ESC-50).
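
To make that objective concrete: an AudioSet clip can carry several labels at once, so each class is scored as an independent binary decision and the loss is the binary cross-entropy averaged over classes. A minimal pure-Python sketch, assuming the model emits one raw logit per class; `bce_with_logits` is a hypothetical name, and a real training loop would more likely use a framework loss such as PyTorch's `torch.nn.BCEWithLogitsLoss`.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over classes for one clip.

    logits:  raw per-class scores (before the sigmoid)
    targets: 0/1 per class; several entries may be 1 (multi-label)
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    if not logits:
        raise ValueError("at least one class is required")
    total = 0.0
    for x, y in zip(logits, targets):
        # Numerically stable form of
        # -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# Toy example with 3 classes, two of which are present in the clip.
loss = bce_with_logits([2.0, -1.0, 0.5], [1, 0, 1])
```

Because every class contributes its own term, the loss does not force the per-class probabilities to sum to one, which is what distinguishes it from a softmax cross-entropy on single-label data.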
