If you use these data, please cite:
- the original source
Lieberherr, Ismail and Bodt, Timotheus Adrianus (2017): Sub-grouping Kho-Bwa based on shared core vocabulary. Himalayan Linguistics 16(2). 26-63. URL: https://escholarship.org/uc/item/4t27h5fg
- the derived dataset using the DOI of the particular released version you were using
This dataset is licensed under CC-BY-4.0.
Available online at https://doi.org/10.5281/zenodo.1154518
Conceptlists in Concepticon:
This data set consists of lexical entries for one hundred concepts, based on the concept lists of Haspelmath and Tadmor (2009) and Swadesh (1971). The concepts were translated into twenty-two languages of the Kho-Bwa subgroup of the Sino-Tibetan language family, and the entries were annotated for cognacy.
A tutorial accompanying this data set and providing first steps towards an analysis can be found here.
- Varieties: 22 (linked to 21 different Glottocodes)
- Concepts: 100 (linked to 100 different Concepticon concept sets)
- Lexemes: 2,144
- Sources: 3
- Synonymy: 1.01
- Cognacy: 2,144 cognates in 310 cognate sets (67 singletons)
- Cognate Diversity: 0.10
- Invalid lexemes: 0
- Tokens: 9,146
- Segments: 164 (0 BIPA errors, 0 CLTS sound class errors, 164 CLTS modified)
- Inventory size (avg): 49.41
Name | GitHub user | Description | Role |
---|---|---|---|
Tiago Tresoldi | @tresoldi | orthography | Other |
Johann-Mattis List | @lingulist | code, orthography, concepts | Editor |
Robert Forkel | @xrotwang | code, integration | Editor |
Christoph Rzymski | @chrzyki | code, integration | Editor |
Ismail Lieberherr | | | DataCurator, Distributor, Author |
Timotheus Adrianus Bodt | | | DataCurator, Distributor, Author |
The following CLDF datasets are available in cldf:
- CLDF Wordlist at cldf/cldf-metadata.json
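
As a first sanity check, the variety, concept and lexeme counts listed above can be recomputed from the wordlist's form table. The following is a minimal sketch using only the Python standard library. It assumes the forms are stored in a CSV file with the default CLDF column names `Language_ID` and `Parameter_ID`; the path `cldf/forms.csv` is an assumption, and the actual file name is declared in `cldf/cldf-metadata.json`. The helper names `summarize_forms` and `read_forms` and the sample rows are hypothetical and for illustration only.

```python
import csv
import io


def summarize_forms(rows):
    """Count distinct varieties, distinct concepts and lexemes in form rows."""
    rows = list(rows)
    return {
        "varieties": len({row["Language_ID"] for row in rows}),
        "concepts": len({row["Parameter_ID"] for row in rows}),
        "lexemes": len(rows),
    }


def read_forms(path):
    """Read a CLDF form table (CSV) into a list of dicts, one per lexeme."""
    with open(path, encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


# Made-up sample rows (not taken from the dataset), used so that the
# sketch runs without the data files being present.
SAMPLE = """ID,Language_ID,Parameter_ID,Form
1,LangA,1_one,aa
2,LangB,1_one,ab
3,LangA,2_two,ba
"""
sample_rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(summarize_forms(sample_rows))
```

With a local copy of the repository, `summarize_forms(read_forms("cldf/forms.csv"))` (path assumed, see above) should give counts comparable to the statistics listed in this README.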