[WIP][Add] Mask2Former #1059

Draft
wants to merge 1 commit into
base: main


ariG23498

Fix #1044

@ariG23498
Author

Hi @qubvel, a few questions before I start contributing the core part of the model:

  1. This model seems to have three parts: a pixel encoder, a pixel decoder, and a transformer decoder. As I understand it, I do not have to write the pixel encoder, since it can be taken directly from timm, so the only parts I need to contribute right now are the two decoders. Is that right?
  2. Do we want this to be inference-only at first? If so, my workflow would be to copy the code from the transformers implementation, convert the weights (only if required), and pass an image through the model to check that the segmentation maps come out correctly.
  3. I am unsure about the processor. Do you think we should concentrate on that as well?
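
The weight conversion mentioned in point 2 usually comes down to renaming checkpoint keys so they match the new module layout. Below is a minimal sketch of that idea in plain Python. The prefixes in `PREFIX_MAP` and the function name `convert_state_dict` are hypothetical placeholders for illustration; they are not the real parameter names in transformers or in this PR.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical (source prefix -> target prefix) pairs. The real names depend on
# how the source checkpoint and the new model name their submodules.
PREFIX_MAP: List[Tuple[str, str]] = [
    ("src.pixel_encoder.", "encoder."),
    ("src.pixel_decoder.", "pixel_decoder."),
    ("src.transformer_decoder.", "transformer_decoder."),
]


def convert_state_dict(
    state_dict: Dict[str, Any],
    prefix_map: List[Tuple[str, str]],
) -> Tuple[Dict[str, Any], List[str]]:
    """Rename keys by prefix; return the converted dict and any unmatched keys."""
    converted: Dict[str, Any] = {}
    unmatched: List[str] = []
    for key, value in state_dict.items():
        for old_prefix, new_prefix in prefix_map:
            if key.startswith(old_prefix):
                # Swap the prefix and keep the rest of the key unchanged.
                converted[new_prefix + key[len(old_prefix):]] = value
                break
        else:
            # No prefix matched: report the key instead of dropping it silently.
            unmatched.append(key)
    return converted, unmatched
```

Returning the unmatched keys makes it easy to spot parameters that the mapping does not cover yet before loading the converted weights into the new model.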

Let me know what you think.

@qubvel
Collaborator

qubvel commented Feb 13, 2025

Hey @ariG23498, thanks for the questions. I've tried to answer them below; let me know if anything is unclear.

  1. Yes
  2. AFAIU, the semantic segmentation model should be trainable with the existing tutorials; otherwise, we can write a new tutorial (if there are any nuances).
  3. We can use Albumentations for preprocessing, similar to what I used for SegFormer, and create a notebook showing how to run inference. See also the snippet for SegFormer: https://huggingface.co/smp-hub/segformer-b3-512x512-ade-160k

Successfully merging this pull request may close these issues.

Add Mask2Former to SMP
2 participants