Create dataset loader for VlogQA #621

Closed
SamuelCahyawijaya opened this issue Apr 8, 2024 · 4 comments · Fixed by #673
Labels
pr-ready A PR that closes this issue is Ready to be reviewed

Comments

@SamuelCahyawijaya
Collaborator

Dataloader name: vlogqa/vlogqa.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?vlogqa

Dataset: vlogqa
Description: VlogQA is a Vietnamese spoken language corpus for machine reading comprehension. It consists of 10,076 question-answer pairs based on 1,230 transcript documents sourced from YouTube videos around food and travel.
Subsets: -
Languages: vie
Tasks: Question Answering
License: Other (other)
Homepage: https://github.com/sonlam1102/vlogqa/tree/main
HF URL: -
Paper URL: -
@akhdanfadh
Collaborator

@holylovenia If I may, I want to work on this dataset. But it requires a dataset user agreement. Can I submit on behalf of the SEACrowd organization? I'm also unsure if I can receive the dataset before the dataloader implementation.

@holylovenia
Contributor

holylovenia commented May 13, 2024

> @holylovenia If I may, I want to work on this dataset. But it requires a dataset user agreement. Can I submit on behalf of the SEACrowd organization? I'm also unsure if I can receive the dataset before the dataloader implementation.

Sure @akhdanfadh, you can try to submit the user agreement first, then we can discuss further if you receive the dataset after the dataloader implementation.

@akhdanfadh
Collaborator

I just received the dataset, working on it now.

@akhdanfadh
Collaborator

#self-assign

akhdanfadh added the pr-ready label May 14, 2024
sabilmakbar pushed a commit that referenced this issue May 30, 2024
* implement vlogqa dataloader

* Update seacrowd/sea_datasets/vlogqa/vlogqa.py

fix typo on desc
Projects
Status: Done

3 participants