
Gumbel max trick does not seem to make sense in here #108

Open
ivallesp opened this issue Feb 23, 2024 · 0 comments
ivallesp commented Feb 23, 2024

Hi all,

I want to ask a question regarding some concerns I have after looking at the usage of the gumbel_sample method when reinmax=False.

embed_ind, embed_onehot = self.gumbel_sample(dist, dim = -1, temperature = sample_codebook_temp, training = self.training)

First, this sampling technique is mathematically equivalent to sampling from the categorical distribution directly, so the Gumbel noise contributes nothing here beyond plain sampling, and the argmax makes the operation non-differentiable (I know STE is applied later).
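To make the equivalence claim concrete, here is a quick standalone NumPy check (an illustration, not code from this repo): adding Gumbel(0, 1) noise to logits and taking the argmax produces the same distribution as sampling the categorical distribution defined by the softmax of those logits.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([0.5, 1.0, 2.0])
n = 200_000

# Direct categorical sampling from the softmax probabilities.
probs = np.exp(logits) / np.exp(logits).sum()
direct = rng.choice(len(logits), size=n, p=probs)

# Gumbel-max: perturb the logits with Gumbel(0, 1) noise, take the argmax.
gumbel = -np.log(-np.log(rng.random((n, len(logits)))))
gmax = (logits + gumbel).argmax(axis=1)

# The two empirical distributions agree up to sampling error.
freq_direct = np.bincount(direct, minlength=3) / n
freq_gmax = np.bincount(gmax, minlength=3) / n
print(freq_direct)
print(freq_gmax)
```

So the "trick" only matters when one wants a reparameterized (differentiable) relaxation; as a pure sampler it is interchangeable with categorical sampling.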

if training and stochastic and temperature > 0:
    sampling_logits = (logits / temperature) + gumbel_noise(logits)
else:
    sampling_logits = logits

ind = sampling_logits.argmax(dim = dim)

Additionally, the logits are the codebook distances (dist in the first snippet above). They are always positive, which means the sampling is biased because the logits are bounded at zero. No gradients flow backwards from the sampling operation (because it is a Gumbel max, not a Gumbel softmax), hence the logits' magnitude never gets adjusted to improve the sampling.
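The no-gradient point is easy to verify in isolation: argmax returns an integer index tensor, which is detached from the autograd graph, so nothing from the sampling step can reach the logits. A minimal sketch (standalone, mirroring the snippet above but not taken from the repo):

```python
import torch

logits = torch.tensor([0.5, 1.0, 2.0], requires_grad=True)
temperature = 0.9

# Hard Gumbel-max sample, as in the snippet above.
gumbel = -torch.log(-torch.log(torch.rand(3)))
ind = ((logits / temperature) + gumbel).argmax(dim=-1)

# argmax yields an integer tensor with no grad_fn: the sampling step is
# cut off from the graph, so no gradient ever reaches the logits from it.
print(ind.requires_grad)   # False
one_hot = torch.nn.functional.one_hot(ind, num_classes=3)
print(one_hot.requires_grad)  # False
```

Any gradient the codebook sees must therefore come from the straight-through path, not from the sampling itself.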

It seems to me that this just takes a hidden variable (the distance matrix), normalizes it by an arbitrary temperature parameter, and samples from it, adding biased noise to the straight-through relaxation... What am I missing?
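For contrast, the differentiable variant alluded to above (Gumbel softmax rather than Gumbel max) does let gradients reach the logits. A minimal sketch using PyTorch's built-in F.gumbel_softmax (a standalone illustration, not code from this repo; the loss is an arbitrary stand-in):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.tensor([0.5, 1.0, 2.0], requires_grad=True)

# Gumbel-softmax: replace the hard argmax with a temperature-controlled
# softmax over Gumbel-perturbed logits, keeping the sample differentiable.
soft_sample = F.gumbel_softmax(logits, tau=0.9, hard=False)

# Arbitrary downstream loss; gradients flow back to the logits.
loss = (soft_sample * torch.arange(3.0)).sum()
loss.backward()
print(logits.grad)  # non-zero gradient tensor
```

With hard=True, F.gumbel_softmax instead returns a one-hot sample but routes gradients through the soft sample via a straight-through estimator.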
