2024.11.08 Pretrained weights can be downloaded here
2024.02.05 Pre-Release code 🥳 🥳
- Our implementation uses the text2image pre-trained weights from the Latent Diffusion Model (LDM). Please download the pre-trained weights from their official GitHub repository. Alternatively, you can load them directly from the Diffusers library (version 0.3.0).
- Please note that our method requires cropped text. You may use EAST to crop the text, or crop it manually.
- To train LDM with SynText, please download SynText.
You can download our SynText dataset here. Alternatively, you can generate it yourself by following srnet. We slightly modified the original code; please refer to the generate-syntext directory.
Please go to the generate-syntext/ directory and run
python datagen.py
Before training the noise model, please initialize its weights from the pretrained text2img Latent Diffusion Model. Then go to the outer-loop/ directory and run
python finetune.py
For the text recognition model, please use the original weights from ABINet. After that, please run
python gradio_dbest.py
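Since the method expects cropped text, a detector result has to be turned into a crop rectangle before the image is passed on. The sketch below is only an illustration, not part of this repository: `quad_to_crop_box` and its `pad` parameter are hypothetical names, and it assumes the detector (for example EAST) returns a quadrilateral as four `(x, y)` points in pixel coordinates.

```python
import math


def quad_to_crop_box(quad, width, height, pad=2):
    """Convert a 4-point text quadrilateral into an axis-aligned crop box.

    quad   : four (x, y) points in pixel coordinates (assumed detector output)
    width  : image width in pixels
    height : image height in pixels
    pad    : hypothetical margin, in pixels, added around the text
    Returns (left, top, right, bottom), clamped to the image bounds.
    """
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    left = max(0, math.floor(min(xs)) - pad)
    top = max(0, math.floor(min(ys)) - pad)
    right = min(width, math.ceil(max(xs)) + pad)
    bottom = min(height, math.ceil(max(ys)) + pad)
    return left, top, right, bottom


box = quad_to_crop_box(
    [(10.2, 5.7), (50.9, 6.1), (50.1, 20.3), (9.8, 19.9)],
    width=60,
    height=22,
)
print(box)
```

The returned tuple follows the `(left, upper, right, lower)` order that Pillow's `Image.crop` accepts, so it can be passed to that method directly if Pillow is used for loading images.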
Q: What is the sampling method used in this paper?
A: We use Denoising Diffusion Implicit Models (DDIM) and Pseudo Numerical Methods for Diffusion Models on Manifolds (PNDM), as implemented in diffusers, with a normalized guidance scale (gs) in the range 0 to 1.
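For readers unfamiliar with guidance scales, the sketch below shows the standard classifier-free guidance combination of two noise predictions. It is an assumption that this is the form used here: the answer above does not say how the normalized gs maps onto the sampler, and `apply_guidance` is a hypothetical helper that works on plain lists rather than tensors.

```python
def apply_guidance(eps_uncond, eps_cond, gs):
    """Combine unconditional and conditional noise predictions.

    Standard classifier-free guidance form:
        eps = eps_uncond + gs * (eps_cond - eps_uncond)
    With gs = 0 the result is the unconditional prediction, and with
    gs = 1 it is the conditional prediction.
    """
    return [u + gs * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values, not real model outputs.
print(apply_guidance([0.0, 1.0], [1.0, 3.0], 0.5))
```

Under this form, values of gs between 0 and 1 interpolate between the two predictions.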
Q: Is there any post-processing step?
A: Yes. We apply the color transfer method to enhance the final result's quality.
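The answer above does not name the specific color transfer method. As an illustration only, the sketch below matches the mean and standard deviation of one color channel to a reference, in the spirit of statistics-matching color transfer. `transfer_channel` is a hypothetical helper, it works on a flat list of pixel values for a single channel, and it operates directly on 0-255 values for simplicity; it should not be read as this repository's post-processing code.

```python
from statistics import fmean, pstdev


def transfer_channel(source, reference, eps=1e-8):
    """Shift and scale `source` so its mean and std match `reference`.

    source, reference : flat lists of values for one color channel (0-255)
    eps               : small constant that avoids division by zero
    Returns a new list, clipped to the 0-255 range.
    """
    s_mean, s_std = fmean(source), pstdev(source)
    r_mean, r_std = fmean(reference), pstdev(reference)
    scale = r_std / (s_std + eps)
    return [min(255.0, max(0.0, (v - s_mean) * scale + r_mean)) for v in source]


# Toy channel values, not real image data.
matched = transfer_channel([10.0, 20.0, 30.0], [100.0, 120.0, 140.0])
print(matched)
```

Applying the function to each channel of the generated crop, with the original crop as the reference, is one simple way to pull the output colors back toward the input.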
Q: How do I generate the SynText dataset?
A: Please refer to the original srnet repository. We relied heavily on their code and only changed the parameters.
Q: Why is the result broken?
A: Check your diffusers version. I tried to re-implement my code with the newest version of diffusers and found that some of it needs to be tuned. My code uses an earlier diffusers release, and I overrode some of its functions myself. If you have any questions, please send me an email or open an issue.
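A quick way to check the installed version before debugging further is sketched below. It uses only the standard library; `version_matches` and `check_diffusers` are hypothetical helper names, and the default of 0.3.0 is taken from the version mentioned earlier in this README.

```python
from importlib.metadata import PackageNotFoundError, version


def version_matches(installed, expected):
    """True if `installed` begins with the dot-separated parts of `expected`."""
    want = expected.split(".")
    have = installed.split(".")
    return have[:len(want)] == want


def check_diffusers(expected="0.3.0"):
    """Return (installed_version_or_None, matches_expected)."""
    try:
        installed = version("diffusers")
    except PackageNotFoundError:
        return None, False
    return installed, version_matches(installed, expected)


installed, ok = check_diffusers()
print(f"diffusers installed: {installed}, matches 0.3.0: {ok}")
```

Comparing the dot-separated parts rather than raw string prefixes keeps a release such as 0.30.0 from being mistaken for 0.3.0.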
Q: How can I reach you for discussion or feedback?
A: I am open to any discussion and feedback. Please send me an email at janojoshua_at_gmail_dot_com.