Add Mask2Former to SMP #1044
Thanks for opening an issue @caxel-ap 🤗 It might be the first instance segmentation model in the library; let's see if anyone would be eager to contribute. I suppose it will be super impactful 👍
Even just semantic segmentation would be great to have in here someday. I've had good results using it in transformers for semantic segmentation with https://huggingface.co/facebook/mask2former-swin-large-ade-semantic
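For reference, a minimal sketch of that kind of usage with the transformers library and the checkpoint linked above (the sample image URL is an arbitrary placeholder, not from this thread):

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

checkpoint = "facebook/mask2former-swin-large-ade-semantic"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

# Any RGB image works; this COCO sample URL is just a stand-in
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Merge the predicted (mask, label) pairs into a per-pixel semantic map
semantic_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]  # (H, W) tensor of ADE20K class ids
```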
Hey @qubvel! I would like to work on it. Is there a guideline on how to contribute to this repo? My todos would be:
Thanks!
Hey @ariG23498! That's super cool, thanks for your interest 🤗 At the moment there are no guidelines, but you can take inspiration from any of the existing models. The code for existing models is relatively small, so you can just copy one and adapt it.
Just let me know which questions you run into and I will try to answer them, and then add the answers to the docs 🤗
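For illustration, here is a hypothetical skeleton of how a new model could mirror the structure of existing SMP models. The class, the placeholder decoder, and all parameter choices below are assumptions for the sketch, not actual repo code:

```python
import torch.nn as nn
from segmentation_models_pytorch.base import SegmentationHead, SegmentationModel
from segmentation_models_pytorch.encoders import get_encoder


class _PlaceholderDecoder(nn.Module):
    # Stand-in for Mask2Former's pixel decoder + masked-attention Transformer
    # decoder; it just passes the deepest encoder feature map through.
    def forward(self, *features):
        if len(features) == 1 and isinstance(features[0], (list, tuple)):
            features = features[0]
        return features[-1]


class Mask2Former(SegmentationModel):  # hypothetical class, not in SMP yet
    def __init__(self, encoder_name="resnet34", encoder_weights="imagenet",
                 in_channels=3, classes=1):
        super().__init__()
        # Reuse SMP's encoder factory, as existing models (Unet, FPN, ...) do
        self.encoder = get_encoder(
            encoder_name, in_channels=in_channels, depth=5, weights=encoder_weights
        )
        self.decoder = _PlaceholderDecoder()
        self.segmentation_head = SegmentationHead(
            in_channels=self.encoder.out_channels[-1], out_channels=classes
        )
        self.classification_head = None
        self.name = f"mask2former-{encoder_name}"
        self.initialize()
```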
The Mask2Former model was introduced in the paper Masked-attention Mask Transformer for Universal Image Segmentation and first released in the authors' repository.
Mask2Former addresses instance, semantic, and panoptic segmentation with the same paradigm: predicting a set of masks and corresponding labels. Hence, all three tasks are treated as if they were instance segmentation. Mask2Former outperforms the previous SOTA, MaskFormer, in both performance and efficiency by (i) replacing the pixel decoder with a more advanced multi-scale deformable attention Transformer, (ii) adopting a Transformer decoder with masked attention to boost performance without introducing additional computation, and (iii) improving training efficiency by calculating the loss on subsampled points instead of whole masks.
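To make the mask-classification paradigm concrete, here is a small sketch with assumed tensor shapes, showing how a per-pixel semantic map can be recovered from the predicted set of masks and labels:

```python
import torch

N, K, H, W = 100, 150, 128, 128           # queries, classes, spatial size (assumed)
class_logits = torch.randn(N, K + 1)      # K classes + 1 "no object" class
mask_logits = torch.randn(N, H, W)        # one binary mask logit map per query

class_probs = class_logits.softmax(-1)[..., :-1]  # drop the "no object" class
mask_probs = mask_logits.sigmoid()

# semantic_scores[k, h, w] = sum_n class_probs[n, k] * mask_probs[n, h, w]
semantic_scores = torch.einsum("nk,nhw->khw", class_probs, mask_probs)
semantic_map = semantic_scores.argmax(dim=0)  # (H, W) per-pixel class ids
```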
Papers with Code:
https://paperswithcode.com/paper/masked-attention-mask-transformer-for
Paper:
https://arxiv.org/abs/2112.01527
HF Reference implementation:
https://huggingface.co/docs/transformers/main/en/model_doc/mask2former
https://github.com/huggingface/transformers/blob/main/src/transformers/models/mask2former/modeling_mask2former.py