-
It is possible, but it depends on the design of the algorithm. Some algorithms freeze the backbone and use it purely as a feature extractor, while others continue to fine-tune the backbone. For example, memory-bank-based approaches such as PaDiM and PatchCore do not train the backbone:

anomalib/src/anomalib/models/patchcore/torch_model.py Lines 65 to 66 in 161d66b

Other models do continue to train the backbone. For example, knowledge-distillation models such as STFPM (student-teacher) train the backbone of the student model further, as shown below:

anomalib/src/anomalib/models/stfpm/torch_model.py Lines 34 to 41 in 161d66b

Overall, it comes down to the design of the algorithm itself. Let us know if you have any other questions. Thanks!
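To make the contrast concrete, here is a minimal sketch in plain PyTorch/torchvision (not the actual anomalib code; the model choice, learning rate, and variable names are illustrative assumptions):

```python
import torch
import torchvision.models as models

# Memory-bank style (e.g. PatchCore / PaDiM): the backbone is used purely as a
# feature extractor, so its parameters are frozen and never updated.
frozen_backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
frozen_backbone.eval()
for param in frozen_backbone.parameters():
    param.requires_grad = False  # no gradients flow into this backbone

# Knowledge-distillation style (e.g. STFPM): the teacher is frozen, while the
# student backbone keeps requires_grad=True and is updated by the optimizer.
teacher = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
teacher.eval()
for param in teacher.parameters():
    param.requires_grad = False

student = models.resnet18(weights=None)  # randomly initialised and trainable
optimizer = torch.optim.SGD(student.parameters(), lr=0.01)  # backprops through the student backbone
```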
-
I'm moving this to the Q&A section in Discussions. We could continue our discussion there.
-
I was just wondering if it's possible to train the backbone as well. In my understanding, when we set `pretrained: true`, pretrained ImageNet weights are loaded for, e.g., ResNet. And even if we set `pretrained: false`, the weights seem to be randomly initialized and then frozen, so we get a feature extractor that is not useful unless we separately load other weights offline. Is this accurate? Is it possible to train the backbone as well while training the anomaly model, i.e., backpropagate all the way through the backbone network during training?
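For instance, by "frozen" I mean something like the following check (an illustrative sketch with plain torchvision, not anomalib's actual config handling):

```python
import torchvision.models as models

# Load an ImageNet-pretrained ResNet and freeze it, as a feature extractor would.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for param in backbone.parameters():
    param.requires_grad = False

# Parameters with requires_grad=False never receive gradients, so backprop
# stops at the backbone and its weights are never updated.
trainable = [name for name, p in backbone.named_parameters() if p.requires_grad]
print(f"Trainable backbone parameters: {len(trainable)}")  # 0 when frozen
```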