Hey,
In the paper it is stated that the batch size is 350 seconds of audio. Is that per GPU, or the total batch size? Also, if I follow the hyper-parameters of the HuBERT repo to pre-train the model, I get a total batch size of 2800 seconds on 32 V100 GPUs.
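For context, here is how I arrive at the 2800-second figure. This is just a sketch of my arithmetic, assuming the repo's config sets `max_tokens` to 1,400,000 samples per GPU and the audio is sampled at 16 kHz; please correct me if either assumption is wrong:

```python
# Batch-size arithmetic (assumes max_tokens = 1_400_000 samples
# per GPU and 16 kHz audio, as I read the fairseq HuBERT config).
SAMPLE_RATE = 16_000     # samples per second
MAX_TOKENS = 1_400_000   # samples per GPU per batch
NUM_GPUS = 32

seconds_per_gpu = MAX_TOKENS / SAMPLE_RATE   # 87.5 seconds per GPU
total_seconds = seconds_per_gpu * NUM_GPUS   # 2800 seconds total
print(seconds_per_gpu, total_seconds)        # -> 87.5 2800.0
```

So per GPU this is 87.5 seconds, which is well above 350 seconds only when summed over all 32 GPUs.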
Kind Regards,
Goksenin Yuksel