KnobGen is a dual-pathway framework that empowers sketch-based image-generation diffusion models by seamlessly adapting to varying levels of sketch complexity and user skill. KnobGen employs a Coarse-Grained Controller (CGC) module, which leverages high-level semantics from both the textual and the sketch input during the early stages of generation, and a Fine-Grained Controller (FGC) module, which performs detailed refinement later in the process.
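To illustrate the idea, here is a minimal conceptual sketch of how the two pathways could be blended over the denoising schedule. All names (`cgc`, `fgc`, `knob`, `blended_condition`) are hypothetical placeholders used only for illustration; they are not the repository's actual API.

```python
# Illustrative sketch only: `cgc` and `fgc` stand in for the coarse- and
# fine-grained controllers, and `knob` is the user-facing control value.
# This is NOT the KnobGen implementation, just the high-level intuition.

def blended_condition(cgc, fgc, text_emb, sketch, t, total_steps, knob=0.5):
    """Return (coarse, fine) conditioning signals for denoising step t.

    t counts down from total_steps to 0; the fine-grained signal is ramped
    in as generation progresses and is scaled by `knob` in [0, 1].
    """
    coarse = cgc(text_emb, sketch)               # high-level semantics, dominant early
    progress = 1.0 - t / total_steps             # 0 at the first step, 1 at the last
    fine = [knob * progress * f for f in fgc(sketch)]  # detailed refinement, emphasized late
    return coarse, fine
```

A lower `knob` value in this sketch lets a rough novice drawing act mostly as a semantic hint, while a higher value keeps the output closely tied to the exact strokes of a detailed sketch.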
More details are available in our paper.
- [2024-09-27] 🔥 Initial release of KnobGen code!
- [2024-10-02] 🔥 The paper is released on arXiv.
Follow the three steps below (environment setup, training, and inference) to run our pipeline.
To set up the environment, please follow these steps in the terminal:
```bash
git clone https://github.com/aminK8/KnobGen.git
cd KnobGen
conda env create -f environment.yml
conda activate knobgen
```
We utilized the MultiGen-20M dataset, originally introduced by UniControl.
To run training, use the command appropriate for your model:
```bash
# For T2I-Adapter:
bash job_adapter_training.sh

# For ControlNet:
bash job_controlnet_training.sh
```
To run inference, use the command appropriate for your model:
```bash
# For T2I-Adapter:
bash job_adapter_inference.sh

# For ControlNet:
bash job_controlnet_inference.sh
```
Our method democratizes sketch-based image generation by effectively handling a broad spectrum of sketch complexity and user drawing ability—from novice sketches to those made by seasoned artists—while maintaining the natural appearance of the image.
- More demos
- Comparison with baseline
- The effect of our Knob mechanism
If you find our paper useful, please consider citing it:
```bibtex
@misc{navardknobgen,
  title={KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models},
  author={Pouyan Navard and Amin Karimi Monsefi and Mengxi Zhou and Wei-Lun Chao and Alper Yilmaz and Rajiv Ramnath},
  year={2024},
  eprint={2410.01595},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2410.01595},
}
```