How to resume training #9
Using the option
Thank you very much. I have another question: why does G_loss have a negative value? After a period of time, it outputs a negative value and then changes back to a positive value.
Hi Yi Wang, can you share the Places2 pretrained model? Thank you very much.
It happens because the loss for updating the generator in WGAN-GP can be negative.
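To make this concrete, here is a minimal sketch of the WGAN-GP objectives (TF 1.x style to match this repo; the critic scores are made-up stand-ins, and this is not the repository's exact code). Since the critic's output is an unbounded real-valued score, its score on generated images can be positive, which makes the generator loss negative:

```python
import tensorflow as tf

# Stand-in critic scores for illustration only.
d_real = tf.constant([1.2, 0.8, 1.5])  # critic scores on real images
d_fake = tf.constant([0.6, 0.9, 0.3])  # critic scores on generated images

# WGAN-GP generator loss: -mean(D(G(z))); negative whenever the critic
# assigns positive scores to generated samples.
g_loss = -tf.reduce_mean(d_fake)
# Critic loss (a gradient-penalty term is added to this in practice).
d_loss = tf.reduce_mean(d_fake) - tf.reduce_mean(d_real)

with tf.Session() as sess:
    print(sess.run([g_loss, d_loss]))  # g_loss = -0.6 here
```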
The model you require (Places2) can be downloaded from here. You can use it by
When I was testing, I met another problem.
The image size of Places2 is 256×256. How do you use 256×512 for training? Do you directly change the resolution of the 256×256 images to 256×512?
When using the pretrained model on Places2, please set the image and mask shapes by
Can this program be fine-tuned? For example, import a pre-trained model, freeze certain layers, and then train.
Sure. If you want to freeze some specific layers in the generator, you can remove them by their names from g_vars; those parameters will then not be updated in the subsequent training.
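A rough sketch of what that could look like (TF 1.x style; the scope and layer names below are hypothetical stand-ins, not the exact names used in this repository):

```python
import tensorflow as tf

# Stand-in generator graph so the example is self-contained; in practice you
# would restore the real pretrained generator instead.
with tf.variable_scope('generator'):
    x = tf.placeholder(tf.float32, [None, 8])
    h = tf.layers.dense(x, 8, name='conv1')
    out = tf.layers.dense(h, 1, name='conv2')
g_loss = tf.reduce_mean(tf.square(out))  # stand-in generator loss

# Collect the generator's trainable variables, then drop the layers to freeze.
g_vars = [v for v in tf.trainable_variables() if v.name.startswith('generator')]
frozen_keywords = ['conv1']  # example: keep these layers at their pretrained weights
g_vars_to_train = [v for v in g_vars
                   if not any(k in v.name for k in frozen_keywords)]

# Only variables in var_list receive gradient updates; frozen layers stay fixed.
g_train_op = tf.train.AdamOptimizer(1e-4).minimize(g_loss, var_list=g_vars_to_train)
```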
Can you give me an example? Thank you very much.
Hi Yi Wang. What applications do you think this technology has in real life? For example, what kind of practical problems can it solve?
I am training on a newly selected dataset. At what point can the first stage of training be considered converged? Thanks for your help.
Extending images or videos naturally to fit the display device could benefit from such technology. Someone has explored its video application here.
We can consider the training converged when the reconstruction loss appears stable in the first stage. Quantitatively, for a relatively small-scale dataset (e.g., Paris street view or Cityscapes, containing 2k–12k training images), 80,000 iterations with batch size 16 should be enough (a larger batch size may require fewer training iterations). The two-stage training is actually a compromise because the network used has only a small capacity (no more than 4M parameters). If you train a large-capacity model equipped with residual blocks (like SPADE or Pix2pixHD) and newer GAN stabilization tricks (spectral norm, multi-scale discriminators, PatchGAN, conditional projection, etc.), it may be trained directly with the VGG loss and adversarial loss from scratch.
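For the "appears stable" part, one informal way to check is to compare the average reconstruction loss over the most recent window of iterations with the window before it, and end the first stage once the relative improvement becomes negligible. A small helper along these lines (not part of this repository, thresholds are arbitrary examples) could look like:

```python
import numpy as np

def rec_loss_converged(loss_history, window=2000, tol=0.01):
    """Return True once the mean reconstruction loss over the last `window`
    iterations improves by less than `tol` (relative) over the previous window."""
    if len(loss_history) < 2 * window:
        return False
    prev = np.mean(loss_history[-2 * window:-window])
    last = np.mean(loss_history[-window:])
    return (prev - last) / max(abs(prev), 1e-8) < tol

# Usage: append the reconstruction loss every iteration and check periodically,
# e.g. switch to the second (adversarial) stage once this returns True.
```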
Can you send me all the loss curves of your training at that time? Thanks.
I will search my server for these data and get back to you later.
At least the loss of the discriminator should tend to oscillate rather than converge. Note that it is better to train with aligned data (or data with similar layouts, e.g., aligned faces or cityscape-like data). If not, use a bigger model, pretraining, or GAN stabilization tricks for training.
I had been training for a long time and stopped it. Can you tell me how to resume training? And should pretrain_network be equal to 1 or 0?