Hello,
Thanks for your great work :)
I plan to fine-tune your pretrained model on another dataset. My idea is to implement a LoRA version of the flow transformers and train only those adapters while keeping the rest of the network frozen. What do you think about that? What is the minimum GPU setup I would need to train the transformers (one at a time)?
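To make the idea concrete, here is a rough sketch of what I have in mind, in plain PyTorch. `LoRALinear` and `add_lora` are hypothetical helpers of my own, not anything from your repo, and I'm assuming the flow transformers are mostly built from `nn.Linear` layers (I haven't verified this; `nn.MultiheadAttention`, for instance, would need different handling). The rank and alpha values are placeholders.

```python
import math

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.scale = alpha / rank
        # A gets a random init, B starts at zero, so the wrapped layer
        # initially computes exactly the same output as the original.
        self.lora_a = nn.Parameter(torch.empty(rank, base.in_features))
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        update = x @ self.lora_a.t() @ self.lora_b.t()
        return self.base(x) + self.scale * update


def add_lora(module: nn.Module, rank: int = 8) -> nn.Module:
    """Freeze all parameters, then swap each nn.Linear for a LoRALinear."""
    for p in module.parameters():
        p.requires_grad = False
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear):
            setattr(module, name, LoRALinear(child, rank=rank))
        else:
            add_lora(child, rank=rank)
    return module


if __name__ == "__main__":
    # Toy stand-in for one transformer block (sizes are made up).
    block = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))
    add_lora(block, rank=4)
    trainable = sum(p.numel() for p in block.parameters() if p.requires_grad)
    total = sum(p.numel() for p in block.parameters())
    print(f"trainable parameters: {trainable} of {total}")
```

Only the `lora_a` / `lora_b` matrices would be passed to the optimizer, so gradients and optimizer state are kept for a small fraction of the weights. That is why I'm hoping a modest GPU could be enough, though the activations of the frozen layers still have to fit in memory.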
Thanks a lot!