A PyTorch model based on AttnGAN, described in the paper AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks, with a noise-reducing encoding from the paper A Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts.
The noise-reduction idea in the latter paper is an FC layer applied to the raw embeddings, but that discards information about each separate word. So I decided to apply this idea only to the part of the Text Encoder that produces sentence features. In the original paper, the sentence feature is the concatenation of the final hidden states from both directions of the LSTM. I propose an FC layer on top of it to reduce the dimensionality and obtain a noise-resistant representation of the sentence features. This approach should be more noise resistant because, in an LSTM, noise in the first word affects all subsequent hidden states. All other blocks are the same as in the original paper.
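The proposed change can be sketched as follows. This is an illustrative PyTorch module, not the repository's actual text encoder: the class name, vocabulary size, and dimensions are assumptions, and the repository uses its own encoder implementation.

```python
import torch
import torch.nn as nn

class NoiseReducedSentenceEncoder(nn.Module):
    """Sketch of the proposed text encoder: word features are the
    per-step biLSTM hidden states (as in AttnGAN), while the sentence
    feature, the concatenated final hidden states of both directions,
    is passed through an extra FC layer to get a lower-dimensional,
    noise-resistant representation. All sizes are illustrative."""

    def __init__(self, vocab_size=5000, emb_dim=300,
                 hidden_dim=128, sent_dim=64):
        super(NoiseReducedSentenceEncoder, self).__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # The proposed FC layer that compresses the sentence feature
        self.fc = nn.Linear(2 * hidden_dim, sent_dim)

    def forward(self, captions):
        emb = self.embed(captions)              # (B, T, emb_dim)
        words, (h_n, _) = self.lstm(emb)        # h_n: (2, B, hidden_dim)
        word_feats = words                      # (B, T, 2*hidden_dim)
        # Concatenate the final hidden states of both directions,
        # then reduce dimensionality with the FC layer
        sent = torch.cat((h_n[0], h_n[1]), dim=1)   # (B, 2*hidden_dim)
        sent_feats = self.fc(sent)                  # (B, sent_dim)
        return word_feats, sent_feats
```

The word features keep their full per-word detail for the attention mechanism; only the sentence-level path is compressed, so per-word noise is averaged out before conditioning the generator.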
Dependencies
- Python 2.7
- PyTorch
In addition, please add the project folder to PYTHONPATH and pip install the following packages:
- python-dateutil
- easydict
- pandas
- torchfile
- nltk
- scikit-image
Data
- Download our preprocessed metadata for birds and coco and save them to data/
- Download the birds image data and extract it to data/birds/
- Download the coco dataset and extract the images to data/coco/
Training
- Pre-train DAMSM models:
  - For the bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
  - For the coco dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1
- Train AttnGAN models:
  - For the bird dataset: python main.py --cfg cfg/bird_attn2.yml --gpu 2
  - For the coco dataset: python main.py --cfg cfg/coco_attn2.yml --gpu 3
*.yml files are example configuration files for training/evaluating our models.
Pretrained Model
- DAMSM for bird: download and save it to DAMSMencoders/
- DAMSM for coco: download and save it to DAMSMencoders/
- AttnGAN for bird: download and save it to models/
- AttnGAN for coco: download and save it to models/
- AttnDCGAN for bird: download and save it to models/. This is a variant of AttnGAN which applies the proposed attention mechanisms to the DCGAN framework.
Sampling
- Run python main.py --cfg cfg/eval_bird.yml --gpu 1 to generate examples from captions in the files listed in "./data/birds/example_filenames.txt". Results are saved to DAMSMencoders/.
- Change the eval_*.yml files to generate images from other pre-trained models.
- Input your own sentences in "./data/birds/example_captions.txt" if you want to generate images from customized sentences.
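For customized sentences, the captions file can be prepared with a short script like the one below. The file path comes from the README; the captions themselves are made-up examples in the style of bird descriptions.

```python
import os

# Path from the README; captions are illustrative examples only
out_path = "./data/birds/example_captions.txt"
captions = [
    "this bird has a red head and a short yellow beak",
    "a small blue bird with a white belly and black wings",
]

# Create the data directory if it does not exist yet
out_dir = os.path.dirname(out_path)
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)

# One caption per line, as the sampling script expects
with open(out_path, "w") as f:
    f.write("\n".join(captions) + "\n")
```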
Validation
- To generate images for all captions in the validation dataset, change B_VALIDATION to True in eval_*.yml, and then run python main.py --cfg cfg/eval_bird.yml --gpu 1
- We compute the inception score for models trained on birds using StackGAN-inception-model.
- We compute the inception score for models trained on coco using improved-gan/inception_score.
Examples generated by AttnGAN [Blog]
| bird example | coco example |
|---|---|
| ![]() | ![]() |
Evaluation code embedded into a callable containerized API is included in the eval/ folder.
If you find AttnGAN useful in your research, please consider citing:
@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  year      = {2018},
  booktitle = {{CVPR}}
}
Reference
- AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
- A Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts