This repository has been archived by the owner on May 20, 2024. It is now read-only.
EDGAN: StackGAN with Embedding Distance Training

yao-zhao/EDGAN

EDGAN

This repository modifies the original StackGAN code from GitHub.

Dataset

Use the MSCOCO dataset.

Get the dataset and the preprocessed model

  • Download the MSCOCO dataset and annotations, including captions and instances.
  • Download the pretrained char-CNN-RNN embedding of MSCOCO.
  • misc/preprocess_mscoco.py preprocesses the images into different sizes for the selected supercategory and writes them into a tfrecords file along with the corresponding caption embeddings.

New features

Data input pipeline

  • use the MSCOCO Python API
  • dataloader that loads tfrecords generated from MSCOCO
  • image augmentation including cropping, flipping, and standardization (images are downsampled with the INTER_AREA method)
  • sampling from multiple caption embeddings, and visualizing embedding distributions
  • negative examples (use the inner product of caption embeddings; see the CLSGAN method)
  • filter images selectively based on object classes and their areas
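
The negative-example idea above can be sketched in plain Python. This is only an illustration, not the repository's code: the function names, the choice to L2-normalize embeddings, and the "shift the batch by one" pairing are all assumptions made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumption: similarities are cosine-like)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def make_negative_pairs(embeddings, shift=1):
    """Pair each caption embedding with a 'wrong' one from the same batch.

    Returns a list of (right_index, wrong_index, similarity) tuples, where
    similarity is the inner product of the two normalized embeddings.
    The shift-based pairing is a hypothetical choice for this sketch.
    """
    n = len(embeddings)
    if n < 2 or shift % n == 0:
        raise ValueError("need at least two embeddings and a non-trivial shift")
    normed = [l2_normalize(e) for e in embeddings]
    pairs = []
    for i in range(n):
        j = (i + shift) % n  # index of the mismatched caption
        pairs.append((i, j, inner_product(normed[i], normed[j])))
    return pairs


batch = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
negative_pairs = make_negative_pairs(batch)
```

Each tuple carries the similarity between the right and the wrong caption, which a CLSGAN-style objective could use as a soft target instead of a hard 0 label.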

Modification of GAN network

  • enlarge the capacity of the generator network by adding 3 residual blocks
  • change ReLU to leaky ReLU
  • option to disable batch norm in the discriminator
  • option to increase or reduce the discriminator's final dimension
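
For readers unfamiliar with the ReLU to leaky ReLU change, a minimal scalar sketch follows. The slope `alpha=0.2` is an assumed value commonly used in GAN code, not one taken from this repository.

```python
def relu(x):
    """Standard ReLU: negative inputs are zeroed, so their gradient is zero."""
    return x if x > 0 else 0.0


def leaky_relu(x, alpha=0.2):
    """Leaky ReLU: negative inputs are scaled by alpha instead of zeroed.

    alpha=0.2 is an assumption for this sketch.
    """
    return x if x > 0 else alpha * x


activations = [leaky_relu(v) for v in (-1.0, 0.0, 2.0)]
```

Keeping a small slope on the negative side lets gradients flow through units that plain ReLU would switch off.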

Multiple training methods of GAN

  • Option to train with vanilla GAN
  • Option to train with WGAN (weight clipping is not applied to batch norm parameters)
  • Option to train with LSGAN
  • Option to train with CLSGAN, a continuous least-squares GAN that estimates the inner products between right caption embeddings and wrong caption embeddings.
  • Option to train with BGAN (not implemented yet)
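
The discriminator objectives behind these options can be compared side by side in a small pure-Python sketch. The vanilla, WGAN, and LSGAN forms below are the textbook losses. The `clsgan_wrong_pair_loss` function is only one possible reading of the CLSGAN description above (regressing the score of a mismatched pair toward the embedding inner product), and all function names are hypothetical rather than taken from the repository.

```python
import math


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def vanilla_d_loss(real_logits, fake_logits):
    """Sigmoid cross-entropy: -log D(real) - log(1 - D(fake))."""
    return (_mean(_softplus(-r) for r in real_logits)
            + _mean(_softplus(f) for f in fake_logits))


def wgan_d_loss(real_scores, fake_scores):
    """Critic loss: mean fake score minus mean real score."""
    return _mean(fake_scores) - _mean(real_scores)


def lsgan_d_loss(real_scores, fake_scores):
    """Least squares: push real scores to 1 and fake scores to 0."""
    return (0.5 * _mean((r - 1.0) ** 2 for r in real_scores)
            + 0.5 * _mean(f ** 2 for f in fake_scores))


def clsgan_wrong_pair_loss(scores, similarities):
    """Assumed CLSGAN term: regress the score of an (image, wrong caption)
    pair toward the inner product of the right and wrong embeddings."""
    return 0.5 * _mean((s - t) ** 2 for s, t in zip(scores, similarities))


example_loss = clsgan_wrong_pair_loss([0.9, 0.1], [0.7, 0.0])
```

Under this reading, a wrong caption that is close to the right one is penalized less than an unrelated caption, which is the "embedding distance" idea in the project title.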

Classification Transferring from ImageNet to MSCOCO (for a future 3-stage GAN)

  • Label each image in MSCOCO with multiple labels, one for each object whose area is larger than the threshold
  • Transfer ResNet from Caffe to TensorFlow
  • Train ResNet to classify the 80 object categories in MSCOCO
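
The multi-label step can be sketched as follows. The annotation fields `image_id`, `category_id`, and `area` follow the MSCOCO instances annotation format; the function name, the threshold value, and the tiny example data are assumptions for illustration only.

```python
def build_multi_hot_labels(annotations, category_ids, min_area):
    """Build one multi-hot label vector per image.

    An entry is set to 1 when the image contains at least one object of that
    category whose area exceeds min_area. MSCOCO category ids are not
    contiguous, so they are first mapped to indices 0..len(category_ids)-1.
    """
    index_of = {cat_id: i for i, cat_id in enumerate(sorted(category_ids))}
    labels = {}
    for ann in annotations:
        vec = labels.setdefault(ann["image_id"], [0] * len(index_of))
        if ann["area"] > min_area and ann["category_id"] in index_of:
            vec[index_of[ann["category_id"]]] = 1
    return labels


# Hypothetical annotations; real ones come from the MSCOCO instances file.
example_annotations = [
    {"image_id": 1, "category_id": 18, "area": 5000.0},
    {"image_id": 1, "category_id": 1, "area": 10.0},   # too small, ignored
    {"image_id": 2, "category_id": 1, "area": 900.0},
]
labels = build_multi_hot_labels(example_annotations, [1, 18], min_area=100.0)
```

With all 80 MSCOCO categories passed as `category_ids`, each vector would have 80 entries and could serve as a target for a multi-label classifier.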
