UCCA-Annotated English Corpus: The Little Prince

Version 2.0 (February 9, 2021)

This bundle contains 99 sentences (1312 tokens in total) annotated according to the foundational layer of UCCA.

Corpus:

The corpus is the English text of "The Little Prince" (Le Petit Prince), a classic novel written in French by Antoine de Saint-Exupéry and first published in 1943. The same English text is used in the AMR Little Prince corpus (https://amr.isi.edu/download.html).

Format and Source Code:

Information about the format of the XML files, along with source code for reading and manipulating them, is available at https://universalconceptualcognitiveannotation.github.io/.
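
For a quick look at the files before installing the official tools, the sketch below uses only the Python standard library to count element tags across a directory of XML files. It makes no assumption about the UCCA schema; the function names and the directory layout (one or more `*.xml` files in a single folder) are hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_tags(xml_path):
    """Return a Counter of element tag names found in one XML file."""
    tree = ET.parse(xml_path)
    return Counter(elem.tag for elem in tree.iter())


def summarize_directory(directory):
    """Aggregate tag counts over every *.xml file directly inside `directory`.

    Returns a (number_of_files, Counter) pair. The directory layout is an
    assumption; adjust the glob pattern to match the actual bundle.
    """
    totals = Counter()
    n_files = 0
    for path in sorted(Path(directory).glob("*.xml")):
        totals.update(count_tags(path))
        n_files += 1
    return n_files, totals
```

Calling `summarize_directory("xml")`, where `"xml"` stands in for wherever the files are stored, returns the number of files read and the tag counts, which gives a rough picture of the structure. For anything beyond inspection, the source code linked above is the reference for interpreting the annotation.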

Citation:

The annotation was conducted at the Hebrew University of Jerusalem. If you use this corpus, please cite:

@inproceedings{Oep:Abe:Abz:20,
  author = {Oepen, Stephan and Abend, Omri and Abzianidze, Lasha and
            Bos, Johan and Haji\v{c}, Jan and Hershcovich, Daniel and
            Li, Bin and O'Gorman, Tim and Xue, Nianwen and Zeman, Daniel},
  title = {{MRP}~2020: {T}he {S}econd {S}hared {T}ask on
           {C}ross-Framework and {C}ross-{L}inguistic
           {M}eaning {R}epresentation {P}arsing},
  booktitle = {Proc. of CoNLL Shared Task},
  year = 2020
}

Licensing:

The UCCA annotation is distributed under the "Attribution-ShareAlike 3.0 Unported" license (http://creativecommons.org/licenses/by-sa/3.0/). Please follow the link for exact details.