add source #3
Comments
Group A:
Are you talking about cleaning the data itself or the metadata (lang codes)?
About the cleaning: I meant more the tags like
Can you clarify why facebookresearch/flores#61 is solved? I don't see any update in their data.
@laubonghaudoi For my project (GlotLID), the issue is resolved because I removed yue from my copy of the Flores benchmark. GlotLID trains a better language identification system, and Flores-200 is one of the benchmarks I use. To answer your question in general, though: the issue is not resolved at its root in Flores-200. A separate project now maintains Flores, https://github.com/openlanguagedata/flores, but it does not address this issue either. Someone may need to raise it again in the new project.
Group A: Please add here any ideas for obtaining cleaner sources and evaluation data.
Group B: Please add any possible new sources here, especially for languages not yet included.