Bug fix for: "ImportError: cannot import name 'FileWriter' from 'tensorboard'" #29

Open
wants to merge 90 commits into
base: master
f18de8a
Bug fix for
May 5, 2023
c59afae
Support for Python3.8 and Python 2.7
May 5, 2023
1fef73c
- A script to generate custom dataset for StackGAN
May 9, 2023
72b0aad
- generate custome dataset: filenames, classids
soumentechnoexponent May 10, 2023
07a6117
- filenames and embedding is created
May 10, 2023
6bc6321
- Proper embeddings, classids, and filenames are generated
May 11, 2023
55fb7ca
- Stage-1 and Stage-2 ran successfully
May 12, 2023
4799fba
- code cleanup
May 15, 2023
8666e6f
- install fasttext through conda
May 15, 2023
b0f5e17
install pytorch through conda
May 15, 2023
ba7c4c5
- Doc update
May 16, 2023
e7bad61
- script to check CUDA
May 16, 2023
6737182
- script to check CUDA support version
May 16, 2023
af3aaa3
- Cuda checks
May 19, 2023
39f7e6f
- Generate data form sqlite
May 19, 2023
b7e3c82
-pascal voc tool update
May 19, 2023
f2b27b7
- generate dataset form sqlite
May 19, 2023
d7fbb14
configuration for Sixray500 stage-1
May 19, 2023
14f378d
configuration for Sixray500 stage-2
May 19, 2023
2fa9fef
Retraining configuration for stage-1 with 1000 epoch
May 19, 2023
730de24
Retraining configuration for stage-2 with 1000 epoch
May 19, 2023
e4ef52f
Retraining configuration for stage-2 with 1000 epoch
May 19, 2023
a04ff37
configuration for 600 captions
May 19, 2023
0b97033
Stop config for finetune
May 19, 2023
fa2761b
Config for stage-2 600 captions
May 19, 2023
e81d89b
Experiment-3 1107 captions and 432 files
May 22, 2023
29705d2
Experiment-3 1107 captions and 432 files, stage-2
May 22, 2023
7dcabf1
Trainer image logging update
May 22, 2023
c307aa3
Setup config for experiment-3
May 22, 2023
d0d93e6
- support to train fasttext on caption dataset
May 23, 2023
77b4fd2
- Updating scripts and configuration
May 23, 2023
1cd65cc
- save image at last epoch
May 23, 2023
dc4f39d
- config for stage -1
May 23, 2023
2b8271a
- config for stage -2
May 23, 2023
9f39fd7
- config for stage -1 for 1347 captions
May 24, 2023
cf41c01
- one script for entire pipeline
May 24, 2023
5e6ba5b
- generate test dataset
soumentechnoexponent May 25, 2023
19eacbb
- testing script
soumentechnoexponent May 25, 2023
d46deaf
- script update to generate data with fasttext encoding
May 25, 2023
7ec642c
- config update for express train and test
May 25, 2023
cbe8d8c
- increase epoch form 1 to 500
May 25, 2023
a827877
- script generation update
May 26, 2023
5cf8a42
Preparing for training with 2183 caption 1000D
May 26, 2023
0ab31e1
- config for 2500 captions
May 29, 2023
c5d5af2
- Fasttext algorithm change
May 29, 2023
7b7ffc5
- Aspect ratio resize transform
May 29, 2023
58bafec
- cbow 2048D, 200 epoch, zdim 200
May 29, 2023
ad61d1e
- resize bug fix
May 29, 2023
5829e84
- config for cbow1024D no crop 2500 caption and zdim 100
May 30, 2023
0a12f0d
- minimum caption length=3
May 30, 2023
fffbd1e
- script update foe experiment 11: sixray_2500_ftt_1024D_skipgram_nocrop
Jun 1, 2023
33f7e1a
WIP for openai access
Jun 1, 2023
b277140
Bulk OpenAPI embedding text processor
Jun 1, 2023
ce01d5d
- bulk caption loading
Jun 2, 2023
30d7183
- single caption loading
Jun 2, 2023
c01ca24
- Embedding and POC with OpenAI is done
Jun 2, 2023
65022cf
- file check in progress
Jun 2, 2023
310d355
- config update for openai embedding
Jun 2, 2023
9695908
- FP32 to FP64 auto adaption
Jun 3, 2023
c7660bb
- experiment12
Jun 3, 2023
4708a20
texts to captions
Jun 3, 2023
af0752b
texts to captions
Jun 3, 2023
fd44939
- ready for experiment12
Jun 3, 2023
e82cee4
run with 1000 epoch openai embeddings
Jun 3, 2023
41b47ec
ignore pycache
Jun 3, 2023
86d44f2
floating point precision control
Jun 3, 2023
bf1f3ca
11805 caption and 2381 images is available
Jun 8, 2023
d0fb9cc
config update for 2381 images
Jun 8, 2023
a9e4fe5
Prepare for SQLite data generation
Jun 8, 2023
a049450
32 batch size
Jun 8, 2023
a3b6717
16 batch size
Jun 8, 2023
c31192a
64 batch size
Jun 8, 2023
a40a7b6
epoch 1000
Jun 8, 2023
ec7d45a
- Generate test images during training
Jun 8, 2023
c859f7e
- CUDA clean in test phase
Jun 8, 2023
aff251d
- PYTORCH_CUDA_ALLOC_CONF
Jun 8, 2023
c9d4d3b
- bug fix during memory cleaning
Jun 8, 2023
1bd338e
Fixing CUDA PyTorch Future deprecation waring
Jun 8, 2023
fc46af1
- capping captions to 4 as a few files are inconsistently contains up…
soumentechnoexponent Jun 9, 2023
fc7804d
- Hyperparameter double
Jun 12, 2023
d70f377
- Hyperparameter double
Jun 12, 2023
3afcf2b
- copy config
Jun 12, 2023
50aeb70
- copy config
Jun 12, 2023
213f178
- Finetune Experiment15.
Jun 13, 2023
58f1e98
- fix memory count
Jun 13, 2023
53e3081
- mild code update for WorkRNN models
Jul 7, 2023
fcb0a58
- saving git checksum and generating a config
Jul 8, 2023
ac85b9d
- config update
Jul 8, 2023
c961de2
buf gix
Jul 8, 2023
42dbacf
config for cnn-rnn word embedding
Jul 10, 2023
7 changes: 6 additions & 1 deletion .gitignore
@@ -3,4 +3,9 @@ code/*.pyc
code/miscc/*.pyc
backup
.DS_Store
.idea/
.idea/
/.ipynb_checkpoints/
__pycache__
/test_stage1.sh
/test_stage2.sh
/train_stage2.sh
109 changes: 77 additions & 32 deletions README.md
@@ -1,65 +1,110 @@
# StackGAN-pytorch

Original Repository [hanzhanggit/StackGAN-Pytorch](https://github.com/hanzhanggit/StackGAN-Pytorch)

- [Tensorflow implementation](https://github.com/hanzhanggit/StackGAN)

- [Inception score evaluation](https://github.com/hanzhanggit/StackGAN-inception-model)

- [StackGAN-v2-pytorch](https://github.com/hanzhanggit/StackGAN-v2)

Pytorch implementation for reproducing COCO results in the paper [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/pdf/1612.03242v2.pdf) by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas. The network structure is slightly different from the tensorflow implementation.
Pytorch implementation for reproducing COCO results in the
paper [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/pdf/1612.03242v2.pdf)
by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas. The network
structure is slightly different from the tensorflow implementation.

<img src="examples/framework.jpg" width="850px" height="370px"/>

# Environment Setup (Linux)

### Dependencies
python 2.7
## Install conda (if not available)

Pytorch
- `git clone https://github.com/Redcof/StackGAN-Pytorch.git`
- `wget https://repo.anaconda.com/miniconda/Miniconda3-py38_23.3.1-0-Linux-x86_64.sh`
- `bash Miniconda3-py38_23.3.1-0-Linux-x86_64.sh -b`
- `$HOME/miniconda3/bin/conda init`
- `source $HOME/.bashrc`

In addition, please add the project folder to PYTHONPATH and `pip install` the following packages:
- `tensorboard`
- `python-dateutil`
- `easydict`
- `pandas`
- `torchfile`
## Create environment

- `conda create -n ganenv python=3.8`
- `conda activate ganenv`

## Install dependencies

**Data**
- `pip install -r requirements.txt`
- `conda install -c conda-forge fasttext`
- `conda install pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia`

1. Download our preprocessed char-CNN-RNN text embeddings for [training coco](https://drive.google.com/open?id=0B3y_msrWZaXLQXVzOENCY2E3TlU) and [evaluating coco](https://drive.google.com/open?id=0B3y_msrWZaXLeEs5MTg0RC1fa0U), save them to `data/coco`.
- [Optional] Follow the instructions [reedscot/icml2016](https://github.com/reedscot/icml2016) to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
2. Download the [coco](http://cocodataset.org/#download) image data. Extract them to `data/coco/`.
## Install CUDA drivers (if not available)

**How to check?**

```cmd
python cuda_test.py # should return True
```
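
The contents of `cuda_test.py` are not shown in this diff. As a rough sketch of what such a check could look like, the snippet below asks PyTorch whether a CUDA device is visible. The function name `cuda_status` and the returned keys are hypothetical, chosen only for illustration; the `torch` calls themselves are standard PyTorch APIs.

```python
def cuda_status():
    """Return a small dict describing whether PyTorch can see a CUDA device.

    Hypothetical helper: the real cuda_test.py in this PR may differ.
    """
    try:
        import torch  # imported lazily so the check still runs without PyTorch
    except ImportError:
        return {
            "torch_installed": False,
            "cuda_available": False,
            "device_count": 0,
            "cuda_build": None,
        }

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "cuda_available": available,
        # Only query the device count when CUDA is actually usable.
        "device_count": torch.cuda.device_count() if available else 0,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
    }


if __name__ == "__main__":
    status = cuda_status()
    print(status["cuda_available"])
    print(status)
```

If `cuda_available` is `False` while `cuda_build` is not `None`, the PyTorch build supports CUDA but the driver or device is probably not visible, which is the case the driver installation steps in this section are meant to address.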

**Training**
- The steps to train a StackGAN model on the COCO dataset using our preprocessed embeddings.
- Step 1: train Stage-I GAN (e.g., for 120 epochs) `python main.py --cfg cfg/coco_s1.yml --gpu 0`
- Step 2: train Stage-II GAN (e.g., for another 120 epochs) `python main.py --cfg cfg/coco_s2.yml --gpu 1`
- `*.yml` files are example configuration files for training/evaluating our models.
- If you want to try your own datasets, [here](https://github.com/soumith/ganhacks) are some good tips about how to train GAN. Also, we encourage to try different hyper-parameters and architectures, especially for more complex datasets.
**Check OS architecture**
`cat /etc/os-release` returns the OS name, and `uname -m` returns the machine architecture. For us, it was `x86_64`.

**Downloading Toolkit**
[https://developer.nvidia.com/cuda-11-7-0-download-archive?target_os=Linux](https://developer.nvidia.com/cuda-11-7-0-download-archive?target_os=Linux)

We chose the online (network) installation:
```commandline
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
sudo dnf clean all
sudo dnf -y module install nvidia-driver:latest-dkms
sudo dnf -y install cuda
```

**Pretrained Model**
- [StackGAN for coco](https://drive.google.com/open?id=0B3y_msrWZaXLYjNra2ZSSmtVQlE). Download and save it to `models/coco`.
- **Our current implementation has a higher inception score(10.62±0.19) than reported in the StackGAN paper**
**Data - Text**

1. Download our preprocessed char-CNN-RNN text embeddings
for [training coco](https://drive.google.com/open?id=0B3y_msrWZaXLQXVzOENCY2E3TlU)
and [evaluating coco](https://drive.google.com/open?id=0B3y_msrWZaXLeEs5MTg0RC1fa0U), save them to `data/coco`.

2. [Optional] Follow the instructions [reedscot/icml2016](https://github.com/reedscot/icml2016) to download the
pretrained char-CNN-RNN text encoders and extract text embeddings.

**Data - Image**

1. Download the [coco](http://cocodataset.org/#download) image data. Extract them to `data/coco/`.

**Custom Dataset**

1. See `data/README.md` file

**Training COCO**

- The steps to train a StackGAN model on the COCO dataset using our preprocessed embeddings.
- Step 1: train Stage-I GAN (e.g., for 120 epochs) `python code/main.py --cfg cfg/coco_s1.yml --gpu 0`
- Step 2: train Stage-II GAN (e.g., for another 120 epochs) `python code/main.py --cfg cfg/coco_s2.yml --gpu 1`
- `*.yml` files are example configuration files for training/evaluating our models.
- If you want to try your own datasets, [here](https://github.com/soumith/ganhacks) are some good tips about how to
train GANs. Also, we encourage you to try different hyper-parameters and architectures, especially for more complex
datasets.

**Pretrained Model**

- [StackGAN for coco](https://drive.google.com/open?id=0B3y_msrWZaXLYjNra2ZSSmtVQlE). Download and save it
to `models/coco`.
- **Our current implementation has a higher inception score (10.62±0.19) than reported in the StackGAN paper**

**Evaluating**
- Run `python main.py --cfg cfg/coco_eval.yml --gpu 2` to generate samples from captions in COCO validation set.

- Run `python code/main.py --cfg cfg/coco_eval.yml --gpu 2` to generate samples from captions in COCO validation set.

Examples for COCO:

![](examples/coco_2.png)
![](examples/coco_3.png)

Save your favorite pictures generated by our models since the randomness from noise z and conditioning augmentation makes them creative enough to generate objects with different poses and viewpoints from the same discription :smiley:


Save your favorite pictures generated by our models since the randomness from noise z and conditioning augmentation
makes them creative enough to generate objects with different poses and viewpoints from the same description :smiley:

### Citing StackGAN

If you find StackGAN useful in your research, please consider citing:

```
@@ -71,14 +116,14 @@ booktitle = {{ICCV}},
}
```


**Our follow-up work**

- [StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/abs/1710.10916)
- [AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks](https://arxiv.org/abs/1711.10485) [[supplementary]](https://1drv.ms/b/s!Aj4exx_cRA4ghK5-kUG-EqH7hgknUA)[[code]](https://github.com/taoxugit/AttnGAN)


**References**

- Generative Adversarial Text-to-Image Synthesis [Paper](https://arxiv.org/abs/1605.05396) [Code](https://github.com/reedscot/icml2016)
- Learning Deep Representations of Fine-grained Visual Descriptions [Paper](https://arxiv.org/abs/1605.05395) [Code](https://github.com/reedscot/cvpr2016)
- Generative Adversarial Text-to-Image
Synthesis [Paper](https://arxiv.org/abs/1605.05396) [Code](https://github.com/reedscot/icml2016)
- Learning Deep Representations of Fine-grained Visual
Descriptions [Paper](https://arxiv.org/abs/1605.05395) [Code](https://github.com/reedscot/cvpr2016)
7 changes: 7 additions & 0 deletions TODO.txt
@@ -0,0 +1,7 @@
May-25
[+] Share image with - Tejas
[*] Generate data for testing and test it
[ ] Image superresolution with S1 images
[ ] Run training with fasttext self-trained + latest captions
[ ] VQGAN
[ ] CLIP
Binary file added code/__pycache__/model.cpython-38.pyc
Binary file not shown.
Binary file added code/__pycache__/trainer.cpython-38.pyc
Binary file not shown.
34 changes: 34 additions & 0 deletions code/cfg/sixray_500_s1.yml
@@ -0,0 +1,34 @@
CONFIG_NAME: 'stage_1'

DATASET_NAME: 'sixray_2381_charcnnrnn_1536_jemb'
EMBEDDING_TYPE: 'embedding_bulk_word_1536_jemb.pickle'
GPU_ID: '0,1'
Z_DIM: 200
DATA_DIR: 'data/sixray_2381'
IMSIZE: 64
WORKERS: 4
STAGE: 1
TRAIN:
FLAG: True
BATCH_SIZE: 64
BATCH_DROP_LAST: True
MAX_EPOCH: 1000
LR_DECAY_EPOCH: 20
SNAPSHOT_INTERVAL: 5
DISCRIMINATOR_LR: 0.0004
GENERATOR_LR: 0.0004
COEFF:
KL: 2.0
FINETUNE:
FLAG: False
EPOCH_START: 1001
NET_G: '/home/icmore_acc/Downloads/StackGAN-Pytorch/output/sixray_500_stage_1_2023_05_19_15_38_51/Model/netG_epoch_300.pth'
NET_D: '/home/icmore_acc/Downloads/StackGAN-Pytorch/output/sixray_500_stage_1_2023_05_19_15_38_51/Model/netD_epoch_last.pth'

GAN:
CONDITION_DIM: 128
DF_DIM: 96
GF_DIM: 192

TEXT:
DIMENSION: 1536
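
A note on `MAX_EPOCH: 1000` combined with `LR_DECAY_EPOCH: 20`: assuming this fork keeps the step schedule that the upstream StackGAN-Pytorch trainer appears to use (both learning rates are halved every `LR_DECAY_EPOCH` epochs), the rate shrinks very quickly over a 1000-epoch run. The sketch below only illustrates that arithmetic; `lr_at_epoch` is a hypothetical helper, not a function from this repository.

```python
def lr_at_epoch(base_lr, epoch, decay_every, factor=0.5):
    """Learning rate in effect during `epoch` (0-based) under step decay.

    Assumes the rate is multiplied by `factor` each time the epoch index
    reaches a positive multiple of `decay_every`.
    """
    if decay_every <= 0:
        raise ValueError("decay_every must be positive")
    return base_lr * factor ** (epoch // decay_every)


if __name__ == "__main__":
    # Values taken from the stage-1 config above: LR 0.0004, decay every 20 epochs.
    for epoch in (0, 20, 100, 200, 999):
        print(epoch, lr_at_epoch(0.0004, epoch, decay_every=20))
```

Under that assumption the rate has been halved 49 times by the final epoch, which leaves it many orders of magnitude below its starting value, so most of the later epochs would change the weights very little. If the fork changed the schedule, this does not apply.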
36 changes: 36 additions & 0 deletions code/cfg/sixray_500_s2.yml
@@ -0,0 +1,36 @@
CONFIG_NAME: 'stage_2'

DATASET_NAME: 'sixray_2381_charcnnrnn_1536_jemb'
EMBEDDING_TYPE: 'embedding_bulk_word_1536_jemb.pickle'
GPU_ID: '0,1'
Z_DIM: 200
DATA_DIR: 'data/sixray_2381'
IMSIZE: 256
WORKERS: 4
STAGE: 2
STAGE1_G: ''
TRAIN:
FLAG: True
BATCH_SIZE: 64
BATCH_DROP_LAST: True
MAX_EPOCH: 1000
LR_DECAY_EPOCH: 20
SNAPSHOT_INTERVAL: 5
DISCRIMINATOR_LR: 0.0008
GENERATOR_LR: 0.0008
COEFF:
KL: 1.0
FINETUNE:
FLAG: False
EPOCH_START: 1001
NET_G: 'output/experiment15/sixray_2381_ftt_1024D_cbow_nocrop_batch_double_stage_2_train_2023_06_12_18_16_28/Model/netG_epoch_1000.pth'
NET_D: 'output/experiment15/sixray_2381_ftt_1024D_cbow_nocrop_batch_double_stage_2_train_2023_06_12_18_16_28/Model/netD_epoch_last.pth'

GAN:
CONDITION_DIM: 128
DF_DIM: 96
GF_DIM: 192
R_NUM: 2

TEXT:
DIMENSION: 1536
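
Both `sixray_500` configs set `FINETUNE` `FLAG: False` while still carrying `NET_G`/`NET_D` checkpoint paths, and the stage-2 config leaves `STAGE1_G` empty. Whether that is intentional depends on how the fork's trainer reads these keys, which this diff does not show. A minimal sketch of a pre-flight check over an already-parsed config dict is given below; the rules, the function names, and the assumption that `FINETUNE` sits either under `TRAIN` or at the top level are all illustrative.

```python
def find_finetune(cfg):
    """Locate the FINETUNE block whether it sits under TRAIN or at top level."""
    train = cfg.get("TRAIN") or {}
    return train.get("FINETUNE") or cfg.get("FINETUNE") or {}


def check_config(cfg):
    """Return a list of warnings for a parsed StackGAN-style config dict.

    The rules are hypothetical sanity checks, not behavior of the trainer.
    """
    warnings = []
    stage = cfg.get("STAGE")
    # Image sizes as used by the stage-1 / stage-2 configs in this PR.
    expected_size = {1: 64, 2: 256}.get(stage)
    if expected_size is None:
        warnings.append(f"unexpected STAGE value: {stage!r}")
    elif cfg.get("IMSIZE") != expected_size:
        warnings.append(
            f"stage {stage} usually pairs with IMSIZE {expected_size}, "
            f"got {cfg.get('IMSIZE')!r}"
        )

    finetune = find_finetune(cfg)
    has_checkpoint = bool(finetune.get("NET_G") or finetune.get("NET_D"))
    # Stage 2 needs some generator weights to start from.
    if stage == 2 and not cfg.get("STAGE1_G") and not finetune.get("NET_G"):
        warnings.append("stage 2 has no generator weights to start from")
    # Paths that are set while fine-tuning is off may be silently ignored.
    if not finetune.get("FLAG") and has_checkpoint:
        warnings.append("FINETUNE.FLAG is off but checkpoint paths are set")
    return warnings


if __name__ == "__main__":
    stage2 = {
        "STAGE": 2,
        "IMSIZE": 256,
        "STAGE1_G": "",
        "TRAIN": {"FINETUNE": {"FLAG": False, "NET_G": "netG.pth", "NET_D": "netD.pth"}},
    }
    for message in check_config(stage2):
        print("warning:", message)
```

The function takes a plain dict so it stays independent of how the YAML is loaded; a check like this could run right after the config file is parsed and before any model is built.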
34 changes: 34 additions & 0 deletions code/cfg/sixray_s1.yml
@@ -0,0 +1,34 @@
CONFIG_NAME: 'stage1'

DATASET_NAME: 'sixray_sample'
EMBEDDING_TYPE: 'embeddings_cc.en.300.bin_300D.pickle'
GPU_ID: '0,1'
Z_DIM: 100
DATA_DIR: '../data/sixray_sample'
IMSIZE: 64
WORKERS: 4
STAGE: 1
TRAIN:
FLAG: True
BATCH_SIZE: 6
BATCH_DROP_LAST: True
MAX_EPOCH: 300
LR_DECAY_EPOCH: 20
SNAPSHOT_INTERVAL: 10
DISCRIMINATOR_LR: 0.0002
GENERATOR_LR: 0.0002
COEFF:
KL: 2.0
FINETUNE:
FLAG: False
EPOCH_START: 0
NET_G: ''
NET_D: ''

GAN:
CONDITION_DIM: 128
DF_DIM: 96
GF_DIM: 192

TEXT:
DIMENSION: 300
36 changes: 36 additions & 0 deletions code/cfg/sixray_s2.yml
@@ -0,0 +1,36 @@
CONFIG_NAME: 'stage2'

DATASET_NAME: 'sixray_sample'
EMBEDDING_TYPE: 'embeddings_cc.en.300.bin_300D.pickle'
GPU_ID: '0,1'
Z_DIM: 100
STAGE1_G: 'output/sixray_sample_stage1_2023_05_12_19_17_04/Model/netG_epoch_300.pth'
DATA_DIR: 'data/sixray_sample'
WORKERS: 4
IMSIZE: 256
STAGE: 2
TRAIN:
FLAG: True
BATCH_SIZE: 6
BATCH_DROP_LAST: True
MAX_EPOCH: 500
LR_DECAY_EPOCH: 20
SNAPSHOT_INTERVAL: 5
DISCRIMINATOR_LR: 0.0002
GENERATOR_LR: 0.0002
COEFF:
KL: 2.0
FINETUNE:
FLAG: False
EPOCH_START: 0
NET_G: ''
NET_D: ''

GAN:
CONDITION_DIM: 128
DF_DIM: 96
GF_DIM: 192
R_NUM: 2

TEXT:
DIMENSION: 300