QA-eval-Greek-dataset

This repository contains the QAEvalGreek dataset, a small, manually created dataset for evaluating language models on question answering (QA) in Greek.

The annotation process of the QAEvalGreek dataset is similar to that of SQuAD (Rajpurkar et al., EMNLP 2016). The dataset follows the SQuAD format, and the annotation comprised the following steps:

We selected 50 random articles from the Greek Wikipedia and extracted a single paragraph from each article.
For each paragraph, an annotator wrote 3 relevant questions. The annotator was encouraged to phrase the questions in their own words rather than copy phrases from the paragraph. The annotator was also instructed to highlight each answer in the paragraph, so that its exact position is unambiguous when the answer's text occurs more than once.
Two other annotators were then given each paragraph with its questions and answers, and checked the validity of every question-answer pair. They accepted all the questions, which were then included in QAEvalGreek.
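
The role of the highlighted answer position can be illustrated with a minimal Python sketch. It assumes that each line of QAevalGreek.jsonl is one JSON record with SQuAD-style fields `context`, `question`, and `answers` (each answer carrying `text` and a character offset `answer_start`). These field names are an assumption and have not been checked against the file, and the sample record below is made up for illustration.

```python
import json


def parse_jsonl(text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def span_matches(context, answer_text, answer_start):
    """Return True if answer_text occurs in context exactly at answer_start."""
    return context[answer_start:answer_start + len(answer_text)] == answer_text


# Hypothetical record: the field names are assumed (SQuAD-style),
# not confirmed against QAevalGreek.jsonl.
context = "Η Αθήνα είναι η πρωτεύουσα της Ελλάδας. Η Αθήνα έχει μακρά ιστορία."
answer = "Αθήνα"
first = context.find(answer)              # offset of the first occurrence
second = context.find(answer, first + 1)  # offset of the second occurrence

record = {
    "context": context,
    "question": "Ποια πόλη έχει μακρά ιστορία;",
    # The answer text appears twice; answer_start says which one is meant.
    "answers": [{"text": answer, "answer_start": second}],
}
jsonl_text = json.dumps(record, ensure_ascii=False) + "\n"

# Check every stored offset against its paragraph.
results = [
    (rec["question"], ans["text"], ans["answer_start"],
     span_matches(rec["context"], ans["text"], ans["answer_start"]))
    for rec in parse_jsonl(jsonl_text)
    for ans in rec["answers"]
]
```

To try this on the real data, read the file with `open("QAevalGreek.jsonl", encoding="utf-8")`, pass its contents to `parse_jsonl`, and adjust the field names to whatever the records actually contain.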
