Thank you for your question!
The choice to focus more on the R2R dataset compared to the other datasets is based on two main reasons: 1) The R2R dataset provides more detailed and diverse instructions, which helps in learning more effective modality alignment. 2) In other datasets, a single instruction often corresponds to multiple trajectories, which means the amount of unique data is actually smaller.
There may be a small mistake in point 2), though. Isn't it the other way around, with one trajectory corresponding to many instructions? The ratio is around 1:10 in some datasets. I noticed that too!
So, in your opinion, unique data is very important.
Hi,
I noticed that you pay a lot of attention to the R2R dataset and less attention to the CVDN dataset, with a ratio of about 20:1:5:5:5.
Could you please tell me why you chose that? Is it based on experiments, or do you have some intuition behind this ratio?
Thank you for your help!
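
For context, here is a minimal sketch of what a 20:1:5:5:5 ratio would mean as per-dataset sampling probabilities. It assumes that 20 is the R2R weight and 1 is the CVDN weight, as the question implies. The names `other_a`, `other_b`, `other_c` are placeholders for the three datasets not named in this thread, and `mixing_probabilities` is a hypothetical helper, not a function from the repository.

```python
from fractions import Fraction


def mixing_probabilities(weights):
    """Normalize integer ratio weights into exact sampling probabilities."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: Fraction(w, total) for name, w in weights.items()}


# Assumed mapping of the 20:1:5:5:5 ratio; only R2R and CVDN are named in the thread.
weights = {"R2R": 20, "CVDN": 1, "other_a": 5, "other_b": 5, "other_c": 5}
probs = mixing_probabilities(weights)

for name, p in probs.items():
    print(f"{name}: {p} (about {float(p):.3f})")
```

Under these assumptions, R2R would be drawn in 20 of every 36 samples (5/9) and CVDN in 1 of every 36.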