An Evaluation Dataset for Identifying Communicative Functions of Sentences in English Scholarly Papers
The dataset consists of three sets of TSV (tab-separated values) files. Their usage is explained in our paper.
- ID
- Targeted sentence [s0]
- Correct choice [s1]
- Wrong choice [s2]
- Core FE (formulaic expression) for s0
- Core FE for s1
- Core FE for s2
- Communicative function for s0 and s1
- Communicative function for s2
- Paper/sentence ID for s0
- Paper/sentence ID for s1
- Paper/sentence ID for s2
- Accuracy of human annotation
- Communicative function
- Core FE
- Sentence
- Sentence ID (PaperID_SentID; identical to the ID in AASC)
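The two column layouts listed above can be read with a few lines of Python. The following is only a minimal sketch under several assumptions: the column names (`PAIR_COLUMNS`, `SENTENCE_COLUMNS`) are our own labels for the listed fields in the listed order, the files are assumed to be UTF-8 with no header row, and the sentence ID is assumed to split at its last underscore. Please check these against the actual files and the paper before relying on them.

```python
import csv
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical labels for the 13 fields of the sentence-triple layout,
# in the order listed above. They are NOT headers shipped with the files.
PAIR_COLUMNS = [
    "id", "s0", "s1", "s2",
    "fe_s0", "fe_s1", "fe_s2",
    "cf_s0_s1", "cf_s2",
    "src_s0", "src_s1", "src_s2",
    "human_accuracy",
]

# Hypothetical labels for the 4 fields of the single-sentence layout.
SENTENCE_COLUMNS = ["cf", "core_fe", "sentence", "sentence_id"]


def read_rows(lines: Iterable[str], columns: List[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per TSV row; raise if a row has an unexpected field count."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, fields in enumerate(reader, start=1):
        if len(fields) != len(columns):
            raise ValueError(
                f"row {line_no}: expected {len(columns)} fields, got {len(fields)}"
            )
        yield dict(zip(columns, fields))


def split_sentence_id(sentence_id: str) -> Tuple[str, str]:
    """Split 'PaperID_SentID' at the last underscore (an assumption)."""
    paper_id, _, sent_id = sentence_id.rpartition("_")
    return paper_id, sent_id


# Usage with a real file (the path is a placeholder):
#   with open("path/to/file.tsv", encoding="utf-8", newline="") as f:
#       for row in read_rows(f, PAIR_COLUMNS):
#           print(row["s0"], row["cf_s0_s1"])
```

If the files turn out to contain a header row, skip the first row before interpreting the fields. Raising on a field-count mismatch is a deliberate choice here, so that layout differences surface immediately instead of silently dropping rows.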
This dataset is licensed under Creative Commons BY-NC-SA 3.0. If you use the dataset, please cite our paper (see below).
This dataset uses the ACL Anthology Sentence Corpus (AASC), which consists of papers retrieved from the ACL Anthology.
© 1979-2018 Association for Computational Linguistics
Licensed under Creative Commons BY-NC-SA 3.0 (up to 2015) and Creative Commons BY 4.0 (2016 onward)
Licensed under the Creative Commons BY-NC-SA 3.0
Iwatsuki, K., Boudin, F., & Aizawa, A. (2020). An Evaluation Dataset for Identifying Communicative Functions of Sentences in English Scholarly Papers. In Proceedings of The 12th Language Resources and Evaluation Conference, 1712–1720.
@InProceedings{Iwatsuki2020LREC,
author = {Iwatsuki, Kenichi and Boudin, Florian and Aizawa, Akiko},
title = {An Evaluation Dataset for Identifying Communicative Functions of Sentences in English Scholarly Papers},
booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
month = {May},
year = {2020},
address = {Marseille, France},
publisher = {European Language Resources Association},
pages = {1712--1720},
url = {http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.212.pdf}
}