CLDF Dataset derived from the Bahnaric data in Sidwell's "Austroasiatic dataset for phylogenetic analysis" from 2015

How to cite

If you use these data, please cite both

the original source:

Sidwell, Paul. 2015. Austroasiatic dataset for phylogenetic analysis: 2015 version. Mon-Khmer Studies (Notes, Reviews, Data-Papers) 44. lxviii-ccclvii.

the derived dataset, using the DOI of the specific released version you used.

Description

This dataset is licensed under a CC-BY-4.0 license.

Conceptlists in Concepticon:

Sidwell-2015-200

Notes

This dataset by Sidwell (2015) was used as a gold-standard benchmark in the study by List et al. (2017) on automated cognate detection. It forms part of the test data used in that study, and it is provided here in the form in which it was prepared for that study.

List, J.-M., S. Greenhill, and R. Gray (2017): The potential of automatic word comparison for historical linguistics. PLOS ONE 12.1. 1-18. DOI: https://doi.org/10.1371/journal.pone.0170046

Statistics

Varieties: 24 (linked to 20 different Glottocodes)
Concepts: 200 (linked to 200 different Concepticon concept sets)
Lexemes: 4,546
Sources: 1
Synonymy: 1.06
Cognacy: 4,546 cognates in 1,055 cognate sets (524 singletons)
Cognate Diversity: 0.20
Invalid lexemes: 0
Tokens: 17,314
Segments: 133 (0 BIPA errors, 0 CLTS sound class errors, 133 CLTS modified)
Inventory size (avg): 47.12
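
The cognate diversity figure can be reproduced from the counts above. The sketch below assumes the diversity measure described in List et al. (2017), that is, the number of cognate sets minus the number of concepts, divided by the number of lexemes minus the number of concepts. The function name `cognate_diversity` is illustrative only and is not part of any library.

```python
def cognate_diversity(cognate_sets, lexemes, concepts):
    """Share of lexemes beyond one per concept that start a new cognate set.

    Assumed formula (after List et al. 2017):
    (cognate sets - concepts) / (lexemes - concepts).
    """
    return (cognate_sets - concepts) / (lexemes - concepts)


# Counts taken from the Statistics section of this README.
diversity = cognate_diversity(cognate_sets=1055, lexemes=4546, concepts=200)
print(f"{diversity:.2f}")  # prints 0.20
```

A value of 0 would mean that all words for each concept are cognate, and a value of 1 would mean that no two words for a concept are cognate.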

Contributors

Name	GitHub user	Description	Role
Johann-Mattis List	@LinguList	maintainer	Editor
Paul Sidwell		data collection	Author

CLDF Datasets

The following CLDF datasets are available in cldf:

CLDF Wordlist at cldf/cldf-metadata.json
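
As a minimal sketch of how the wordlist might be inspected with the Python standard library only, the example below counts forms per variety. It assumes that the form table is stored as cldf/forms.csv next to the metadata file and that it has a Language_ID column (the usual CLDF defaults). Check cldf/cldf-metadata.json for the actual file and column names. The helper `forms_per_language` is hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def forms_per_language(lines):
    """Count rows per variety in a CSV form table given as an iterable of lines.

    Assumes a header row containing a 'Language_ID' column.
    """
    reader = csv.DictReader(lines)
    return Counter(row["Language_ID"] for row in reader)


# Assumed location of the form table; adjust if the metadata says otherwise.
forms_path = Path("cldf") / "forms.csv"
if forms_path.exists():
    with forms_path.open(encoding="utf-8", newline="") as handle:
        for language, count in forms_per_language(handle).most_common():
            print(language, count)
```

For anything beyond simple counts, a dedicated CLDF reader that interprets the metadata file is likely the better choice, since it resolves column names and foreign keys from the metadata instead of relying on defaults.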
