CLDF dataset derived from Greenhill and Gray’s "Bantu Basic Vocabulary Database" from 2015

lexibank/bantubvd

How to cite

If you use these data, please cite:

  • the original source

    Simon Greenhill and Russell Gray, 2015. Bantu Basic Vocabulary Database

  • the derived dataset, using the DOI of the specific released version you used

Description

This dataset is licensed under CC-BY-4.0.

Available online at http://language.psy.auckland.ac.nz/bantu/

Conceptlists in Concepticon:

Notes

The Bantu Basic Vocabulary Database was a small project, which subsequently stalled, to collect basic vocabulary for Bantu languages.

Statistics

Glottolog: 100% · Concepticon: 98% · Source: 88% · BIPA: 100% · CLTS SoundClass: 100%

  • Varieties: 10 (linked to 10 different Glottocodes)
  • Concepts: 430 (linked to 415 different Concepticon concept sets)
  • Lexemes: 4,256
  • Sources: 10
  • Synonymy: 1.09
  • Invalid lexemes: 0
  • Tokens: 18,925
  • Segments: 118 (0 BIPA errors, 0 CLTS sound class errors, 116 CLTS modified)
  • Inventory size (avg): 45.30

Possible Improvements:

  • Entries missing sources: 492/4256 (11.56%)
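
The percentage above follows directly from the two counts reported in this section. As a minimal sketch (the function name `missing_source_share` is hypothetical and not part of any Lexibank or CLDF tooling), the arithmetic can be checked like this:

```python
def missing_source_share(missing: int, total: int) -> float:
    """Return the percentage of entries that lack a source reference."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * missing / total


# Counts taken from the statistics reported in this README.
missing, total = 492, 4256
share = missing_source_share(missing, total)

print(f"Entries missing sources: {missing}/{total} ({share:.2f}%)")
# prints "Entries missing sources: 492/4256 (11.56%)"
print(f"Source coverage: {100 - share:.0f}%")
# prints "Source coverage: 88%"
```

The complement, roughly 88%, agrees with the "Source" figure listed with the other coverage values at the top of the statistics.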

Contributors

Name               | GitHub user     | Description         | Role
Simon J. Greenhill | @SimonGreenhill |                     | Author
Russell Gray       |                 |                     | Author
Christoph Rzymski  | @chrzyki        | patron              | Editor
Johann-Mattis List | @lingulist      | orthography profile | Editor

CLDF Datasets

The following CLDF datasets are available in cldf: