Hi @zsyOAOA ! Please let me explain why it is not the case that our method is simply a "flow matching version of DifFace."

DifFace inserts the posterior mean somewhere along the reverse process of a pre-trained diffusion model, but that model never observed such posterior mean predictions during training (it observed noisy versions of natural images). Hence, when inserting the posterior mean predictions in this way, DifFace needs to add sufficiently severe noise so that the diffusion model cannot distinguish between noisy natural images and noisy posterior mean predictions. As discussed in the DifFace paper (before and after equation (12)), this ensures that the reconstructed images are of high perceptual quality, but it clearly comes at the cost of hindered distortion (e.g., PSNR, SSIM).

In contrast, PMRF trains a rectified flow model from scratch (namely, not just some out-of-the-box flow matching model), and our model is specifically tailored to transport the posterior mean predictions to the ground-truth image distribution. Unlike DifFace, we show that PMRF enjoys compelling theoretical guarantees. Practically, our approach leads to far better distortion, with either better or on-par perceptual quality compared to DifFace, and this is achieved with a significantly smaller model and a smaller number of flow steps.

I hope this clarifies things for you. If you don't mind, I am moving this to "Discussions", as this is not a code issue. Thanks!
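To make the distinction concrete, here is a minimal sketch of the rectified-flow training objective described above: the source samples are posterior-mean predictions and the targets are ground-truth images, and a velocity model is regressed onto the constant velocity of the straight path between them. This is an illustrative toy in NumPy, not PMRF's actual implementation; the 2-D data, the `oracle` velocity field, and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def rectified_flow_loss(v_theta, x0, x1, t):
    """Flow-matching loss on the straight path from x0 to x1.

    x0 : source samples (here, stand-ins for posterior-mean predictions)
    x1 : target samples (here, stand-ins for ground-truth images)
    t  : per-sample times in [0, 1]
    """
    xt = (1.0 - t)[:, None] * x0 + t[:, None] * x1   # point on the straight path
    target = x1 - x0                                  # constant velocity of that path
    pred = v_theta(xt, t)
    return np.mean(np.sum((pred - target) ** 2, axis=1))

# Toy 2-D example: every target is the source shifted by a fixed vector,
# and a hypothetical "oracle" velocity field returns exactly that shift,
# so the loss is zero by construction.
x0 = rng.normal(size=(8, 2))              # stand-in for posterior-mean outputs
x1 = x0 + np.array([1.0, -0.5])           # stand-in for ground-truth images
t = rng.uniform(size=8)
oracle = lambda xt, t: np.tile([1.0, -0.5], (xt.shape[0], 1))
loss = rectified_flow_loss(oracle, x0, x1, t)  # → 0.0
```

Because the flow is trained on posterior-mean inputs directly, no extra noise injection is needed at inference, which is where the distortion advantage over DifFace's noise-then-denoise insertion comes from.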
-
Hi, congratulations on such an effective and powerful work!
I have carefully read this paper. As I understand it, the difference from DifFace is that PMRF replaces the DDPM-based diffusion model in DifFace with an advanced version based on flow matching. It is amazing that the flow-matching framework could yield such appealing performance improvements.