
Commit loss is negative #43

Open
ZekaiGalaxy opened this issue Dec 28, 2023 · 9 comments
@ZekaiGalaxy


When I train on several objects for several epochs, the commit loss starts to become negative. The overall loss keeps going down, but neither the recon loss nor the reconstruction quality improves.

I wonder whether a negative commit loss is normal, and what it implies.

@MarcusLoppe
Contributor

Try lowering the diversity_gamma from 1.0 to 0.1 - 0.3.
The quantizer uses this weight to discount the loss; at 1.0 it can trick itself into appearing more diverse, since exploration is penalized less. As you can see, your commit loss is near -1, which comes from the code below. I think high diversity is good at the start of training, but toward the end it may do more harm than good.

Part of the commit loss calculation in LFQ:
entropy_aux_loss = per_sample_entropy - self.diversity_gamma * codebook_entropy

autoencoder = MeshAutoencoder(
    num_discrete_coors = 128,
    rlfq_kwargs = {"diversity_gamma": 0.2}
)
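
To illustrate why this term can push the reported commit loss below zero, here is a minimal numeric sketch (not the library's actual code; the toy probabilities and the entropy helper are assumptions for illustration):

import torch

def entropy(p, eps = 1e-9):
    # Shannon entropy of a probability distribution (simplified helper).
    return -(p * (p + eps).log()).sum(dim = -1)

# Toy per-sample code probabilities: each sample is confident about its own code
# (low per-sample entropy), but different samples pick different codes
# (high codebook entropy).
probs = torch.tensor([
    [0.90, 0.05, 0.03, 0.02],
    [0.05, 0.90, 0.03, 0.02],
    [0.03, 0.02, 0.90, 0.05],
])

per_sample_entropy = entropy(probs).mean()         # low: each sample is confident
codebook_entropy   = entropy(probs.mean(dim = 0))  # high: usage is spread over codes

for diversity_gamma in (1.0, 0.2):
    entropy_aux_loss = per_sample_entropy - diversity_gamma * codebook_entropy
    print(diversity_gamma, entropy_aux_loss.item())

# With diversity_gamma = 1.0 the codebook-entropy reward dominates and the
# aux loss goes negative; with 0.2 it stays positive.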

@qixuema
Contributor

qixuema commented Dec 28, 2023


Hi @MarcusLoppe

Happy New Year! 🎉🎉🎉

I attempted to train an autoencoder using 20 different chairs as training samples and encountered the same issue where the commit loss was negative.

This is the commit loss curve during my training process.

[W&B chart: commit loss during training, 2023-12-28]

I will reduce the diversity_gamma from 1.0 to between 0.1 and 0.3 to see what changes occur in the commit loss.

Best regards,
Xueqi Ma

@MarcusLoppe
Contributor


Experiment a little, since I only discovered this yesterday and haven't tested it out fully :)
Diversity is good at the start of training, but having the codes change too much at the end of training isn't very good.

Please let me know what you find out.

@qixuema
Contributor

qixuema commented Dec 29, 2023

@MarcusLoppe

I tried reducing the diversity_gamma from 1.0 to 0.2 and retrained on the data. The current commit loss curve is shown in the following image.

[W&B chart: commit loss with diversity_gamma = 0.2, 2023-12-29]

@ZekaiGalaxy
Author

Thank you @MarcusLoppe. From my perspective, maybe we could use a 'decaying' gamma, since we want the model to explore at the beginning but converge at the end.

I also notice that in @qixuema's experiment with gamma = 0.2, the commit loss does drop, but there are some extreme commit-loss values. Does that mean the model overfits to certain types of shapes or codes and can't handle rare cases well?
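
A minimal sketch of the 'decaying' gamma idea above, assuming the autoencoder's residual quantizer exposes its LFQ layers and their diversity_gamma attribute (the attribute path here is an assumption; adjust to your version of the library):

def diversity_gamma_at(step, total_steps, start = 1.0, end = 0.2):
    # Linearly decay diversity_gamma from `start` to `end` over training.
    t = min(step / max(total_steps, 1), 1.0)
    return start + (end - start) * t

# Hypothetical use inside the training loop:
# for lfq_layer in autoencoder.quantizer.layers:
#     lfq_layer.diversity_gamma = diversity_gamma_at(step, total_steps)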

@ZekaiGalaxy
Author

Also, @qixuema, how is your recon loss going? I find that although my recon loss is going down (~0.32), I still can't reconstruct the training data with the autoencoder when training on multiple objects.

@qixuema
Contributor

qixuema commented Dec 29, 2023

Hi, @ZekaiGalaxy

The following are my recon_loss and total_loss.

[W&B charts: recon_loss and total_loss, 2023-12-29]

@MarcusLoppe
Contributor

@lucidrains

Hi all, this issue resolves itself when training on a large dataset. Using a 300 × 50 augmented dataset, the commit loss was around 3-14 at the start and then settled down, matching the recon loss at around 0.6.

@avramdj
Contributor

avramdj commented Dec 4, 2024

Hey, isn't this still a problem with the commit loss implementation itself? Any plans on fixing it?
