cldf

Wordlist CLDF dataset derived from Sagart et al.'s "Sino-Tibetan Database of Lexical Cognates" from 2019

CLDF Metadata: cldf-metadata.json

Sources: sources.bib

property value
dc:bibliographicCitation Laurent Sagart, Guillaume Jacques, Yunfan Lai, and Johann-Mattis List (2019): Sino-Tibetan Database of Lexical Cognates. Jena: Max Planck Institute for the Science of Human History.
dc:conformsTo CLDF Wordlist
dc:format
  1. http://concepticon.clld.org/contributions/Sagart-2019-250
dc:identifier http://dighl.github.io/sinotibetan/
dc:license https://creativecommons.org/licenses/by/4.0/
dcat:accessURL https://github.com/lexibank/sagartst
prov:wasDerivedFrom
  1. lexibank/sagartst v2.0-1-ga0b483c
  2. Glottolog v5.0
  3. Concepticon v3.2.0
  4. CLTS v2.3.0
prov:wasGeneratedBy
  1. lingpy-rcParams: lingpy-rcParams.json
  2. python: 3.12.4
  3. python-packages: requirements.txt
rdf:ID sagartst
rdf:type http://www.w3.org/ns/dcat#Distribution

Table forms.csv

Raw lexical data item as it can be pulled out of the original datasets.

This is the basis for creating rows in CLDF representations of the data by

  • splitting the lexical item into forms
  • cleaning the forms
  • potentially tokenizing the form
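
The three steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code that produced this dataset: the function names, the separator characters, the bracket handling, and the sample value are all hypothetical, and the real cleaning and segmentation rules are defined by the dataset's conversion workflow.

```python
import re
import unicodedata


def split_item(value, separators=",;/"):
    """Split a raw lexical item into candidate forms on any separator character.

    The separator set is an assumption made for this sketch.
    """
    parts = re.split("[" + re.escape(separators) + "]", value)
    return [part.strip() for part in parts if part.strip()]


def clean_form(form):
    """Drop bracketed remarks such as "(old)" and collapse whitespace."""
    form = re.sub(r"\([^)]*\)|\[[^\]]*\]", "", form)
    return " ".join(form.split())


def tokenize(form):
    """Naive tokenizer: one segment per base character.

    Combining marks (e.g. a nasalization tilde) stay attached to the
    preceding base character. Real segmentation would be more involved.
    """
    segments = []
    for char in unicodedata.normalize("NFD", form):
        if char.isspace():
            continue
        if unicodedata.combining(char) and segments:
            segments[-1] += char
        else:
            segments.append(char)
    return segments


# Hypothetical raw value containing two variants and a bracketed remark.
raw = "ta (old), t\u00e3"
forms = [clean_form(item) for item in split_item(raw)]
segments = [tokenize(form) for form in forms]
```

Each cleaned form would correspond to one row of forms.csv, with the tokenized result stored in the Segments column.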

property value
dc:conformsTo CLDF FormTable
dc:extent 12179

Columns

Name/Property Datatype Description
ID string Primary key
Local_ID string
Language_ID string References languages.csv::ID
Parameter_ID string References parameters.csv::ID
Value string
Form string
Segments list of string (separated by a space)
Comment string
Source list of string (separated by ;) References sources.bib::BibTeX-key
Cognacy string
Loan boolean
Graphemes string
Profile string
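
As a rough sketch of how the list-valued columns above might be read with the standard library alone: Segments is split on whitespace and Source on ";". The sample row, its IDs, and its source keys are invented for illustration, and the "true"/"false" spelling of the boolean Loan column is an assumption about the serialization.

```python
import csv
import io

# Hypothetical one-row excerpt using the column names documented above.
SAMPLE = (
    "ID,Local_ID,Language_ID,Parameter_ID,Value,Form,Segments,Comment,"
    "Source,Cognacy,Loan,Graphemes,Profile\n"
    "Lang1_water-1,1,Lang1,1_water,t\u0255i,t\u0255i,t\u0255 i,,"
    "Key2019;Other2000,42,false,,\n"
)


def parse_form_row(row):
    """Convert the list and boolean columns of one forms.csv row."""
    parsed = dict(row)
    # Segments: list of strings separated by a space.
    parsed["Segments"] = row["Segments"].split()
    # Source: list of BibTeX keys separated by ";".
    parsed["Source"] = [key for key in row["Source"].split(";") if key]
    # Loan: assumed to be written as "true"/"false"; empty cells become None.
    parsed["Loan"] = {"true": True, "false": False}.get(row["Loan"].lower())
    return parsed


rows = [parse_form_row(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
```

For the real file, `io.StringIO(SAMPLE)` would be replaced by `open("forms.csv", encoding="utf-8", newline="")`.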

Table languages.csv

property value
dc:conformsTo CLDF LanguageTable
dc:extent 50

Columns

Name/Property Datatype Description
ID string Primary key
Name string
Glottocode string
Glottolog_Name string
ISO639P3code string
Macroarea string
Latitude decimal ≥ -90, ≤ 90
Longitude decimal ≥ -180, ≤ 180
Family string
Name_in_Text string
Name_in_Source string
SubGroup string
Coverage string
Source string
Number string
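
The range constraints on Latitude and Longitude can be checked with a few lines of Python. The following is a sketch only: the function name and the sample rows are hypothetical, and empty cells are treated as allowed, which is an assumption about how missing coordinates are represented.

```python
from decimal import Decimal, InvalidOperation

# Bounds as documented in the column table above.
BOUNDS = {
    "Latitude": (Decimal(-90), Decimal(90)),
    "Longitude": (Decimal(-180), Decimal(180)),
}


def check_coordinates(row):
    """Return a list of error messages for one language row."""
    errors = []
    for column, (low, high) in BOUNDS.items():
        raw = row.get(column, "")
        if raw == "":
            continue  # assumption: a missing coordinate is not an error
        try:
            value = Decimal(raw)
        except InvalidOperation:
            errors.append(f"{column}: not a decimal: {raw!r}")
            continue
        # is_finite() rejects NaN and infinities before the range comparison.
        if not value.is_finite() or not low <= value <= high:
            errors.append(f"{column}: {raw} outside [{low}, {high}]")
    return errors


# Hypothetical rows: one valid, one with an out-of-range latitude.
ok = check_coordinates({"Latitude": "27.5", "Longitude": "90.4"})
bad = check_coordinates({"Latitude": "95", "Longitude": ""})
```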

Table parameters.csv

property value
dc:conformsTo CLDF ParameterTable
dc:extent 250

Columns

Name/Property Datatype Description
ID string Primary key
Name string
Concepticon_ID string
Concepticon_Gloss string
TBL_ID string
Coverage string
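
Because forms.csv refers to languages.csv and parameters.csv through Language_ID and Parameter_ID, a simple referential check can be useful when working with the raw CSV files. The sketch below uses invented rows and a hypothetical function name; it only illustrates the idea of looking for dangling references.

```python
def dangling_references(forms, languages, parameters):
    """List (form ID, column, value) triples that point to no known row."""
    language_ids = {row["ID"] for row in languages}
    parameter_ids = {row["ID"] for row in parameters}
    problems = []
    for form in forms:
        if form["Language_ID"] not in language_ids:
            problems.append((form["ID"], "Language_ID", form["Language_ID"]))
        if form["Parameter_ID"] not in parameter_ids:
            problems.append((form["ID"], "Parameter_ID", form["Parameter_ID"]))
    return problems


# Hypothetical rows; only the columns needed for the check are included.
languages = [{"ID": "Lang1"}]
parameters = [{"ID": "1_water"}]
forms = [
    {"ID": "f1", "Language_ID": "Lang1", "Parameter_ID": "1_water"},
    {"ID": "f2", "Language_ID": "Lang2", "Parameter_ID": "1_water"},
]
problems = dangling_references(forms, languages, parameters)
```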

Table cognates.csv

property value
dc:conformsTo CLDF CognateTable
dc:extent 12179

Columns

Name/Property Datatype Description
ID string Primary key
Form_ID string References forms.csv::ID
Form string
Cognateset_ID string
Doubt boolean
Cognate_Detection_Method string
Source list of string (separated by ;) References sources.bib::BibTeX-key
Alignment list of string (separated by a space)
Alignment_Method string
Alignment_Source string
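
One way to use the cognate table is to group forms by Cognateset_ID, following Form_ID back to the form rows. The sketch below assumes the rows have already been read into dictionaries; the function name, IDs, and alignments are hypothetical and serve only to show the join.

```python
from collections import defaultdict


def cognate_sets(forms, cognates):
    """Map each Cognateset_ID to (Language_ID, alignment) pairs."""
    forms_by_id = {form["ID"]: form for form in forms}
    sets = defaultdict(list)
    for cognate in cognates:
        form = forms_by_id[cognate["Form_ID"]]
        # Alignment: list of strings separated by a space.
        alignment = cognate["Alignment"].split()
        sets[cognate["Cognateset_ID"]].append((form["Language_ID"], alignment))
    return dict(sets)


# Hypothetical rows with only the columns used here.
forms = [
    {"ID": "f1", "Language_ID": "LangA"},
    {"ID": "f2", "Language_ID": "LangB"},
    {"ID": "f3", "Language_ID": "LangC"},
]
cognates = [
    {"ID": "c1", "Form_ID": "f1", "Cognateset_ID": "7", "Alignment": "t a -"},
    {"ID": "c2", "Form_ID": "f2", "Cognateset_ID": "7", "Alignment": "t a k"},
    {"ID": "c3", "Form_ID": "f3", "Cognateset_ID": "8", "Alignment": "m i"},
]
grouped = cognate_sets(forms, cognates)
```

Each value in `grouped` lists the languages attesting a cognate set together with their aligned segments, which is the shape usually needed for inspecting correspondences.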