- Camps, Jean-Baptiste, Clérice, Thibault, Duval, Frédéric, Kanaoka, Naomi & Pinche, Ariane (2021). Corpus and Models for Lemmatisation and POS-tagging of Old French, arXiv preprint arXiv:2109.11442, https://arxiv.org/abs/2109.11442.
- [Chrestien]: Kunstmann, Pierre (éd.), Chrétien de Troyes: Cligès, Erec, Lancelot, Perceval, Yvain – manuscrit P (BnF fr. 794), 2009, http://www.atilf.fr/dect.
- [Code]: Duval and Pastore, in progress.
- [DocLing]: Gleßgen, Martin Dietrich (dir.), et al., Les plus anciens documents linguistiques de la France, 2016, http://www.rose.uzh.ch/docling/, 3e édition.
- [Geste]: Camps, Jean-Baptiste (dir.), Geste: un corpus de chansons de geste, 2016-… (v02), École nationale des chartes, Paris, 2019, http://doi.org/10.5281/zenodo.2630574, textes du domaine public, développements CC-BY-SA.
- [Lancelot]: Ing, Lucence, Disparitions lexicales en diachronie: traitements automatiques sur le Lancelot en prose, thèse de doct. en préparation, dir. F. Duval, codir. J.B. Camps, École nationale des chartes, Université PSL, Paris.
- [WauchierSConf]: Pinche, Ariane, Édition nativement numérique du recueil hagiographique ‘Li Seint Confessor’ de Wauchier de Denain d’après le manuscrit fr. 412 de la Bibliothèque nationale de France, thèse de doctorat dir. C. Pierreville et B. Bureau, Université de Lyon, Lyon, 2021.
The [Varia] texts are short excerpts taken from student work at the École des chartes. They were annotated in 2020 as part of the assessment for the course "Initiation à la philologie romane : introduction au moyen français", taught by Lucence Ing and Jean-Baptiste Camps (thematic dossier on the plague and medicine, during the first lockdown of the COVID-19 pandemic in 2020).
Texts from:
- Chroniques de Froissart, after Paris ms. fr. 2663, 168v-169r (Online Froissart: P63, SHF 1-318)
- Chroniques de Froissart, after London Arundel 67 (vol. 1), 360r-360v (Online Froissart: L67, SHF 1-330)
- Great Surgery by Guy de Chauliac, from the edition by Nicaise, Edouard (1890), p. 167 ff.
- Poésies de Gilles li Muisis, published for the first time, according to the manuscript of Lord Ashburnham, by Baron Kervyn de Lettenhove, Louvain, 1882, https://archive.org/details/posiesdegilles01lemuuoft/page/78/mode/2up
Category | Different | Total | Values with 1 occurrence only |
---|---|---|---|
Forms | 47,661 | 1,183,960 | 23,851 |
Lemma | 11,295 | 1,183,960 | 3,852 |
POS | 66 | 1,183,960 | 6 |
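The three columns of the table above (distinct values, total tokens, values occurring only once) could be recomputed from any annotation column along the following lines. This is a minimal sketch, not the script used to produce the figures: the function name `column_stats` and the toy lemma list are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def column_stats(values):
    """Return distinct-value, total, and single-occurrence counts for one column."""
    counts = Counter(values)
    return {
        "different": len(counts),                               # distinct values
        "total": sum(counts.values()),                          # all tokens
        "hapax": sum(1 for c in counts.values() if c == 1),     # values seen once
    }


# Toy lemma column for illustration only (hypothetical data, not from the corpus).
toy_lemmas = ["estre", "avoir", "estre", "roi", "dire", "estre", "roi"]
stats = column_stats(toy_lemmas)
print(stats)
```

Applied to the real form, lemma, and POS columns, the same three numbers would correspond to the "Different", "Total", and "Values with 1 occurrence only" columns.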
A non-x value means that the category actually applies to the token. For example, a verb has a DEGRE annotation of x, because verbs cannot carry a degree.
Category | Different | Total | Non-x values |
---|---|---|---|
Mode | 6 | 478,657 | 60,740 |
Temps | 5 | 478,657 | 57,367 |
Personne | 5 | 478,657 | 106,566 |
Nombre | 3 | 478,657 | 290,326 |
Genre | 4 | 478,657 | 226,996 |
Cas | 4 | 478,657 | 229,586 |
Degre | 5 | 478,657 | 42,949 |
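The "Non-x values" column can be cross-checked against the per-value counts given in this document. The sketch below does so for MODE, using the MODE counts listed further down. The helper name `summarize` is hypothetical, and the code is only an illustration of the x convention, not the tooling used to build the tables.

```python
# MODE counts as listed in this document; "x" marks tokens the category does not apply to.
mode_counts = {
    "x": 417_917,
    "ind": 51_951,
    "sub": 5_416,
    "imp": 2_061,
    "con": 1_311,
    "cond": 1,
}


def summarize(counts, not_applicable="x"):
    """Return (distinct values, total tokens, tokens with a non-x value)."""
    total = sum(counts.values())
    non_x = total - counts.get(not_applicable, 0)
    return len(counts), total, non_x


different, total, non_x = summarize(mode_counts)
print(different, total, non_x)
```

Note that x itself counts as one of the distinct values, which is why MODE is listed with 6 different values.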
Value | Count |
---|---|
NOMcom | 160,410 |
VERcjg | 156,630 |
PROper | 96,533 |
PRE | 91,586 |
PONfbl | 79,784 |
ADVgen | 79,578 |
CONcoo | 66,658 |
DETdef | 57,655 |
PONfrt | 42,489 |
CONsub | 40,120 |
VERppe | 35,647 |
ADJqua | 31,675 |
VERinf | 28,218 |
NOMpro | 27,872 |
ADVneg | 25,947 |
PROrel | 25,542 |
DETpos | 22,367 |
PROadv | 15,003 |
PRE.DETdef | 14,836 |
PROdem | 14,327 |
PROind | 11,661 |
DETind | 10,985 |
PONpga | 7,707 |
DETndf | 7,076 |
DETdem | 6,057 |
PONpdr | 4,842 |
DETcar | 3,229 |
VERppa | 2,784 |
ADJind | 2,575 |
PROimp | 2,036 |
PROcar | 1,855 |
ADJcar | 1,277 |
ADJpos | 1,049 |
PROint | 1,014 |
PONpxx | 1,012 |
ADVneg.PROper | 952 |
PROpos | 669 |
ADJord | 636 |
ADVsub | 592 |
INJ | 549 |
ADVint | 506 |
DETrel | 448 |
PROord | 327 |
PROper.PROper | 311 |
ADVgen.PROper | 271 |
DETint | 225 |
PRE.PROdem | 151 |
DETcom | 52 |
PRE.PROper | 47 |
PROrel.PROper | 46 |
RED | 34 |
ETR | 33 |
CONsub.PROper | 18 |
ADVgen.CONsub | 16 |
PRE.DETcom | 12 |
DETord | 8 |
ADJqua.NOMcom | 7 |
PRE.PROrel | 4 |
ADVing | 2 |
ADVneg.PROadv | 2 |
PROint.PROper | 1 |
CONsubs | 1 |
ADVgen.PROadv | 1 |
NomPro | 1 |
PRE.DETrel | 1 |
CONsub.DETdef | 1 |
Value | Count |
---|---|
MODE=x | 417,917 |
MODE=ind | 51,951 |
MODE=sub | 5,416 |
MODE=imp | 2,061 |
MODE=con | 1,311 |
MODE=cond | 1 |
Value | Count |
---|---|
TEMPS=x | 421,290 |
TEMPS=pst | 29,150 |
TEMPS=psp | 14,882 |
TEMPS=ipf | 9,012 |
TEMPS=fut | 4,323 |
Value | Count |
---|---|
PERS.=x | 372,091 |
PERS.=3 | 76,497 |
PERS.=1 | 18,377 |
PERS.=2 | 11,455 |
PERS.=0 | 237 |
Value | Count |
---|---|
NOMB.=s | 218,952 |
NOMB.=x | 188,331 |
NOMB.=p | 71,374 |
Value | Count |
---|---|
GENRE=x | 251,661 |
GENRE=m | 155,955 |
GENRE=f | 63,962 |
GENRE=n | 7,079 |
Value | Count |
---|---|
CAS=x | 249,071 |
CAS=r | 145,693 |
CAS=n | 75,652 |
CAS=i | 8,241 |
Value | Count |
---|---|
DEGRE=x | 435,708 |
DEGRE=- | 24,947 |
DEGRE=p | 16,622 |
DEGRE=c | 910 |
DEGRE=s | 470 |