CLDF dataset derived from List and Prokić's "Benchmark Database of Phonetic Alignments" from 2014

lexibank/bdpa

How to cite

If you use these data, please cite:

  • the original source

    List, Johann-Mattis and Jelena Prokić. (2014). A benchmark database of phonetic alignments in historical linguistics and dialectology. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC), 26–31 May 2014, Reykjavik. 288–294.

  • the derived dataset using the DOI of the particular released version you were using

Description

This dataset is licensed under a CC-BY-4.0 license.

Available online at https://zenodo.org/record/11880/files/germanic.zip

Statistics

Glottolog: 86%, Concepticon: 76%, Source: 100%, BIPA: 100%, CLTS SoundClass: 100%

  • Varieties: 538 (linked to 61 different Glottocodes)
  • Concepts: 590 (linked to 297 different Concepticon concept sets)
  • Lexemes: 50,095
  • Sources: 11
  • Synonymy: 1.06
  • Cognacy: 50,095 cognates in 750 cognate sets (0 singletons)
  • Cognate Diversity: 0.00
  • Invalid lexemes: 0
  • Tokens: 216,493
  • Segments: 753 (2 BIPA errors, 2 CLTS sound class errors, 752 CLTS modified)
  • Inventory size (avg): 51.73

Contributors

Name | GitHub user | Description | Role
Johann-Mattis List | @LinguList | maintainer | Author
Jelena Prokić | | DataCollector | Author

CLDF Datasets

The following CLDF datasets are available in cldf: