This is the official implementation of the LOTS adapter from the paper "LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing", published as an Oral at ICCV 2025 in Honolulu.
To access the Sketchy dataset, refer to the Hugging Face repository.
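If you prefer fetching the data from the command line, a download via the Hugging Face CLI could look like the sketch below; the repository id is a placeholder, so substitute the actual id from the dataset page.

```bash
# <org>/sketchy is a placeholder: replace it with the actual dataset repository id.
huggingface-cli download <org>/sketchy --repo-type dataset --local-dir data/sketchy
```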
- Code release
- Weights release
- Platform release
`ckpts` folder
- Contains the pre-trained weights of the LOTS adapter.

`scripts` folder
- Contains all the scripts for training and inference with LOTS on Sketchy.

`src` folder
- Contains all the source code for the classes, models, and dataloaders used in the scripts.
We advise creating a Conda environment as follows:

```bash
conda create -n lots python=3.12
conda activate lots
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
pip install -e .
```
Unzip the pre-trained weights and config:

```bash
cd ckpts
unzip lots.zip
cd ..
```
We provide the script to train LOTS on our Sketchy dataset in `scripts/lots/train_lots.py`.
For an example of usage, check `run_train.sh`, which contains the default parameters used in our experiments.
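As a minimal sketch of a launch, assuming `run_train.sh` sits at the repository root (the actual arguments and defaults live inside the script itself):

```bash
# Launch training with the default parameters from run_train.sh (assumed location).
bash run_train.sh

# Or inspect the training script's actual CLI arguments directly:
python scripts/lots/train_lots.py --help
```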
You can test our pre-trained model with the inference script in `scripts/lots/inference_lots.py`.
For an example, check `run_inference.sh`.
This script generates an image for each item in the test split of Sketchy and saves them in a structured folder, with each item identified by its unique ID.
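A quick way to sanity-check a run is to list the generated files. The output directory and file extension below are assumptions; use whatever path `run_inference.sh` configures.

```bash
# "outputs/" and the .png extension are assumptions; adjust them to your
# run_inference.sh settings. Each filename encodes the Sketchy item's unique ID.
find outputs -type f -name '*.png' | head
```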
If you find our work useful, please cite our paper:
```bibtex
@inproceedings{girella2025lots,
  author    = {Girella, Federico and Talon, Davide and Liu, Ziyue and Ruan, Zanxi and Wang, Yiming and Cristani, Marco},
  title     = {LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing},
  booktitle = {Proceedings of the International Conference on Computer Vision},
  year      = {2025},
}
```
