CLDF dataset accompanying Auderset et al.'s "Subgrouping in a dialect continuum" from 2023

How to cite

If you use these data, please cite

  • the original source

    Auderset, Sandra, Simon J. Greenhill, Christian T. DiCanio, Eric W. Campbell. (2023) "Subgrouping in a 'dialect continuum': A Bayesian phylogenetic analysis of the Mixtecan language family". Journal of Language Evolution 8 (1). 33–63. DOI: https://doi.org/10.1093/jole/lzad004.

  • the derived dataset, using the DOI of the specific released version you used

Description

This dataset is licensed under CC-BY-4.0.

Available online at https://doi.org/10.1093/jole/lzad004

Conceptlists in Concepticon:

Statistics

Glottolog: 79%, Concepticon: 92%, Source: 100%, BIPA: 100%, CLTS SoundClass: 100%

  • Varieties: 110 (linked to 61 different Glottocodes)
  • Concepts: 240 (linked to 204 different Concepticon concept sets)
  • Lexemes: 15,932
  • Sources: 42
  • Synonymy: 1.03
  • Cognacy: 17,876 cognates in 947 cognate sets (336 singletons)
  • Cognate Diversity: 0.05
  • Invalid lexemes: 0
  • Tokens: 80,084
  • Segments: 160 (0 BIPA errors, 0 CLTS sound class errors, 157 CLTS modified)
  • Inventory size (avg): 33.52
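
Several of the counts above can be recomputed from the CLDF tables. The following is a minimal sketch, not the procedure used to generate this README: it assumes rows with the default CLDF Wordlist column names (`Language_ID`, `Parameter_ID`, `Cognateset_ID`), the `summarize` helper is hypothetical, and synonymy is approximated here as lexemes per distinct variety-concept pair, which may differ from the definition behind the figure reported above. The rows are invented toy data, not taken from this dataset.

```python
from collections import Counter


def summarize(forms, cognates):
    """Return basic wordlist counts from CLDF-style rows (lists of dicts).

    Hypothetical helper: column names follow the CLDF Wordlist defaults,
    and the synonymy definition is an assumption (lexemes per filled slot).
    """
    varieties = {row["Language_ID"] for row in forms}
    concepts = {row["Parameter_ID"] for row in forms}
    # A "slot" is one (variety, concept) pair with at least one lexeme.
    slots = {(row["Language_ID"], row["Parameter_ID"]) for row in forms}
    # Size of each cognate set = number of cognate judgments pointing to it.
    set_sizes = Counter(row["Cognateset_ID"] for row in cognates)
    return {
        "varieties": len(varieties),
        "concepts": len(concepts),
        "lexemes": len(forms),
        "synonymy": round(len(forms) / len(slots), 2) if slots else 0.0,
        "cognates": len(cognates),
        "cognate_sets": len(set_sizes),
        "singletons": sum(1 for n in set_sizes.values() if n == 1),
    }


# Toy rows, invented for illustration only.
toy_forms = [
    {"ID": "f1", "Language_ID": "A", "Parameter_ID": "hand"},
    {"ID": "f2", "Language_ID": "B", "Parameter_ID": "hand"},
    {"ID": "f3", "Language_ID": "A", "Parameter_ID": "water"},
    {"ID": "f4", "Language_ID": "A", "Parameter_ID": "water"},
]
toy_cognates = [
    {"Form_ID": "f1", "Cognateset_ID": "c1"},
    {"Form_ID": "f2", "Cognateset_ID": "c1"},
    {"Form_ID": "f3", "Cognateset_ID": "c2"},
    {"Form_ID": "f4", "Cognateset_ID": "c3"},
]

stats = summarize(toy_forms, toy_cognates)
print(stats)
```

For the real data, the rows could be read with `csv.DictReader` from the form and cognate tables in the `cldf` directory; the exact file names are not stated in this README and would need to be checked against the dataset's metadata.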

Contributors

| Name | GitHub user | Role | Description |
| --- | --- | --- | --- |
| Sandra Auderset | @SAuderset | Author | Data collection, cognate coding |
| Johann-Mattis List | @LinguList | Editor | CLDF data conversion, orthography profile creation |
| Johannes English | @johenglisch | Editor | CLDF data conversion |
| Christoph Rzymski | @chrzyki | Editor | CLDF data conversion |
| Eric W. Campbell | | Other | Cognate coding |
| Christian T. DiCanio | | Other | Cognate coding |
| Simon J. Greenhill | @SimonGreenhill | Author, Editor | |

CLDF Datasets

The following CLDF datasets are available in cldf: