CLDF dataset derived from Lieberherr and Bodt's "Comparative Wordlists of Kho-Bwa" from 2017

lexibank/lieberherrkhobwa

How to cite

If you use these data please cite

Description

This dataset is licensed under a CC-BY-4.0 license

Available online at https://doi.org/10.5281/zenodo.1154518

Conceptlists in Concepticon:

Notes

This dataset consists of lexical entries for one hundred concepts, based on the concept lists of Haspelmath and Tadmor (2009) and Swadesh (1971). Entries were translated into twenty-two languages of the Kho-Bwa subgroup of the Sino-Tibetan language family and annotated for cognacy.

A tutorial accompanying this data set and providing first steps towards an analysis can be found here.

Statistics

Glottolog: 100%, Concepticon: 100%, Source: 100%, BIPA: 100%, CLTS SoundClass: 100%

  • Varieties: 22 (linked to 21 different Glottocodes)
  • Concepts: 100 (linked to 100 different Concepticon concept sets)
  • Lexemes: 2,144
  • Sources: 3
  • Synonymy: 1.01
  • Cognacy: 2,144 cognates in 310 cognate sets (67 singletons)
  • Cognate Diversity: 0.10
  • Invalid lexemes: 0
  • Tokens: 9,146
  • Segments: 164 (0 BIPA errors, 0 CLTS sound class errors, 164 CLTS modified)
  • Inventory size (avg): 49.41
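
The cognate diversity figure can be related to the other counts in this list. The sketch below assumes one plausible definition, (cognate sets - concepts) / (cognates - concepts); this README does not state the formula, so treat it as an assumption that happens to reproduce the reported value. The function name `cognate_diversity` is hypothetical.

```python
def cognate_diversity(n_cognate_sets: int, n_cognates: int, n_concepts: int) -> float:
    """Return cognate diversity under an ASSUMED definition.

    Assumed formula (not stated in the README):
        (cognate sets - concepts) / (cognates - concepts)
    The result is 0.0 when every concept has exactly one cognate set and
    1.0 when every cognate forms its own set.
    """
    if n_cognates <= n_concepts:
        raise ValueError("need more cognates than concepts")
    return (n_cognate_sets - n_concepts) / (n_cognates - n_concepts)


# Figures taken from the statistics list above.
diversity = cognate_diversity(n_cognate_sets=310, n_cognates=2144, n_concepts=100)
print(f"{diversity:.2f}")  # prints 0.10
```

Under this assumed definition, a value near 0 means that most forms for a concept fall into a single cognate set, which fits a closely related subgroup such as Kho-Bwa.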

Contributors

Name | GitHub user | Description | Role
--- | --- | --- | ---
Tiago Tresoldi | @tresoldi | orthography | Other
Johann-Mattis List | @lingulist | code, orthography, concepts | Editor
Robert Forkel | @xrotwang | code, integration | Editor
Christoph Rzymski | @chrzyki | code, integration | Editor
Ismail Lieberherr | | | DataCurator, Distributor, Author
Timotheus Adrianus Bodt | | | DataCurator, Distributor, Author

CLDF Datasets

The following CLDF datasets are available in cldf: