Comparison with Transfusion #12

NickGao96 · 2024-08-27T07:39:37Z

Transfusion also seems to be an AR + diffusion multimodal model (https://huggingface.co/papers/2408.11039). Are you using similar techniques? Are there any major differences?

Sierkinhane · 2024-08-28T08:06:07Z

Hi, sorry for the late reply. There are two main differences: i) representations for multimodal understanding: CLIP-ViT and MAGVIT-v2 (ours) vs. VAE (Transfusion); ii) representations for generation: MAGVIT-v2 (ours) vs. VAE (Transfusion). More details can be found in our paper. By the way, you are welcome to star our repository.

