Why use centroids for training and generation #15

ChaofanTao · 2022-01-22T09:41:17Z

Hi,

Thanks for your implementation of image-gpt.

I wonder whether quantizing the input to centroids is optional for both training and generation, and what the advantages of using centroids are. Thanks again.

teddykoker · 2022-01-22T15:07:28Z

Hi! See this passage from the original paper:

An IR of 32^2 × 3 is still quite computationally intensive.
While working at even lower resolutions is tempting, prior
work has demonstrated human performance on image classification
begins to drop rapidly below this size (Torralba et al.,
2008). Instead, motivated by early color display palettes,
we create our own 9-bit color palette by clustering (R, G,
B) pixel values using k-means with k = 512. Using this
palette yields an input sequence length 3 times shorter than
the standard (R, G, B) palette, while still encoding color
faithfully. A similar approach was applied to spatial patches
by Ranzato et al. (2014). We call the resulting context length
(32^2 or 48^2 or 64^2) the model resolution (MR). Note that
this reduction breaks permutation invariance of the color
channels, but keeps the model spatially invariant.

The idea is that by mapping each pixel's three RGB values to a single palette index (its nearest k-means centroid), you reduce the sequence length by a factor of 3. This greatly reduces compute, since the attention mechanism has quadratic complexity with respect to sequence length.
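
To make the sequence-length argument concrete, here is a minimal sketch of the quantization step. It assumes NumPy and a `centroids` array of shape (k, 3) that has already been computed with k-means; the function names `quantize` and `unquantize` are illustrative and not necessarily those used in this repository, and the random palette below is only a stand-in for a real one.

```python
import numpy as np

def quantize(images, centroids):
    """Map each RGB pixel to the index of its nearest centroid.

    images:    (N, H, W, 3) float array
    centroids: (K, 3) float array, e.g. the output of k-means
    returns:   (N, H*W) integer array of palette indices
    """
    n = images.shape[0]
    pixels = images.reshape(-1, 3)  # (N*H*W, 3)
    # Squared Euclidean distance from every pixel to every centroid.
    dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1).reshape(n, -1)

def unquantize(tokens, centroids, height, width):
    """Look up each palette index to recover an approximate RGB image."""
    return centroids[tokens].reshape(tokens.shape[0], height, width, 3)

rng = np.random.default_rng(0)
images = rng.random((2, 32, 32, 3))   # two 32x32 RGB images in [0, 1)
centroids = rng.random((512, 3))      # stand-in palette; use k-means in practice

tokens = quantize(images, centroids)
raw_len = 32 * 32 * 3                 # one token per color channel
palette_len = tokens.shape[1]         # one token per pixel
print(raw_len, palette_len, raw_len // palette_len)
```

With one token per pixel instead of one per channel, the sequence is 3 times shorter, so the number of pairwise attention scores per layer drops by roughly a factor of 9.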

ChaofanTao · 2022-01-23T09:49:22Z

Thanks, I get it! Do you have any experiments on the effect of the value of 'k' in k-means, especially for a large dataset such as ImageNet?
