updating the orthography profile to achieve full clts comap
lingulist committed Jan 30, 2019
2 parents 4e9965c + 4a33bb1 commit 01c7135
Showing 18 changed files with 5,022 additions and 4,694 deletions.
6 changes: 6 additions & 0 deletions .travis.yml
@@ -0,0 +1,6 @@
language: python
python: "3.6"
cache: pip
before_cache: rm -f $HOME/.cache/pip/log/debug.log
install: pip install pytest-cldf
script: pytest --cldf-metadata=cldf/cldf-metadata.json test.py
36 changes: 36 additions & 0 deletions .zenodo.json
@@ -0,0 +1,36 @@
{
"description": "<p>Original source of the data:</p>\n\n<blockquote>\n<p>Lieberherr and Bodt. Khowba.</p>\n</blockquote>",
"license": "Apache-2.0",
"title": "",
"keywords": [
],
"grants": [
{
"id": "10.13039/501100000780::715618"
}
],
"upload_type": "dataset",
"version": "v0.1",
"communities": [
{
"identifier": "calc"
},
{
"identifier": "lexibank"
}
],
"publication_date": "2019-01-30",
"creators": [
{
"affiliation": "Max Planck Institute for the Science of Human History",
"name": "Tiago Tresoldi",
"orcid": "0000-0002-2863-1467"
},
{
"affiliation": "Max Planck Institute for the Science of Human History",
"name": "Johann-Mattis List",
"orcid": "0000-0003-2133-8919"
}
],
"access_right": "open"
}
408 changes: 408 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

32 changes: 15 additions & 17 deletions README.md
@@ -2,30 +2,28 @@

Cite the source dataset as

>
> Lieberherr, Ismail and Bodt, Timotheus Adrianus (2017): Sub-grouping Kho-Bwa based on shared core vocabulary. Himalayan Linguistics 16(2). 26-63. URL: https://escholarship.org/uc/item/4t27h5fg
## Statistics
This dataset is licensed under a https://creativecommons.org/licenses/by-nc/4.0/ license

Available online at https://doi.org/10.5281/zenodo.1154518

## Statistics


[![Build Status](https://travis-ci.org/lexibank/lieberherrkhobwa.svg?branch=master)](https://travis-ci.org/lexibank/lieberherrkhobwa)
![Glottolog: 100%](https://img.shields.io/badge/Glottolog-100%25-brightgreen.svg "Glottolog: 100%")
![Concepticon: 100%](https://img.shields.io/badge/Concepticon-100%25-brightgreen.svg "Concepticon: 100%")
![Source: 0%](https://img.shields.io/badge/Source-0%25-red.svg "Source: 0%")
![BIPA: 98%](https://img.shields.io/badge/BIPA-98%25-green.svg "BIPA: 98%")
![CLTS SoundClass: 98%](https://img.shields.io/badge/CLTS%20SoundClass-98%25-green.svg "CLTS SoundClass: 98%")
![Source: 100%](https://img.shields.io/badge/Source-100%25-brightgreen.svg "Source: 100%")
![BIPA: 100%](https://img.shields.io/badge/BIPA-100%25-brightgreen.svg "BIPA: 100%")
![CLTS SoundClass: 100%](https://img.shields.io/badge/CLTS%20SoundClass-100%25-brightgreen.svg "CLTS SoundClass: 100%")

- **Varieties:** 20
- **Varieties:** 22
- **Concepts:** 100
- **Lexemes:** 1,935
- **Lexemes:** 2,130
- **Synonymy:** 1.00
- **Cognacy:** 1,874 cognates in 234 cognate sets
- **Cognacy:** 2,063 cognates in 243 cognate sets
- **Invalid lexemes:** 0
- **Tokens:** 7,112
- **Segments:** 143 (3 BIPA errors, 3 CLTS sound class errors, 140 CLTS modified)
- **Inventory size (avg):** 48.65

## Possible Improvements:



- Entries missing sources: 1935/1935 (100.00%)
- **Tokens:** 7,913
- **Segments:** 159 (0 BIPA errors, 0 CLTS sound class errors, 159 CLTS modified)
- **Inventory size (avg):** 48.18
238 changes: 116 additions & 122 deletions TRANSCRIPTION.md
@@ -5,151 +5,167 @@

| Segment | Occurrence | BIPA | CLTS SoundClass |
|:----------|-------------:|:-------|:------------------|
| a | 852 |||
| ŋ | 373 |||
| i | 359 |||
| ʔ | 331 |||
| j | 283 |||
| u | 282 |||
| n | 277 |||
| k | 270 |||
| ə | 226 |||
| e | 225 |||
| m | 224 |||
| h | 213 |||
| r | 200 |||
| l | 177 |||
| b | 176 |||
| ɛ | 128 |||
| a | 945 |||
| ŋ | 421 |||
| i | 389 |||
| k | 351 |||
| u | 295 |||
| n | 293 |||
| j | 277 |||
| ə | 273 |||
| h | 243 |||
| m | 233 |||
| e | 231 |||
| r | 209 |||
| b | 196 |||
| l | 193 |||
| p | 145 |||
| t | 145 |||
| ³³ | 135 |||
| ɛ | 131 |||
|| 127 |||
| p | 118 |||
|| 118 |||
| t | 109 |||
| s | 108 |||
| ɔ | 102 |||
| o | 92 |||
| g | 82 |||
| d | 76 |||
| tɕʰ | 71 |||
|| 119 |||
| ʔ | 113 |||
| o | 111 |||
| s | 110 |||
| ɔ | 103 |||
| g | 91 |||
| d | 81 |||
| tɕʰ | 75 |||
| ⁵⁵ | 72 |||
| w | 68 |||
| y | 68 |||
|| 66 |||
| y | 65 |||
|| 64 |||
| z | 63 |||
| w | 62 |||
| ua | 66 |||
| z | 66 |||
|| 63 |||
| ɛː | 62 |||
|| 61 |||
|| 57 |||
| ɹ | 57 |||
| ai | 50 |||
| ua | 46 |||
|| 41 |||
| ei | 37 |||
| ɲ | 35 |||
|| 31 |||
| ɹ | 61 |||
|| 60 |||
| ai | 59 |||
|| 59 |||
| ⁵³ | 55 |||
| ³¹ | 48 |||
| ei | 46 |||
|| 45 |||
| ɲ | 37 |||
| ui | 35 |||
| f | 34 |||
| ɔː | 31 |||
| f | 30 |||
| ui | 28 |||
| | 30 |||
| ɕ | 29 |||
| ʃ | 28 |||
| N/ɴ | 27 |||
| ɕ | 25 |||
| | 26 |||
|| 23 |||
|| 20 |||
| ĩː | 20 |||
| ɨː | 20 |||
| oᵘ/o | 19 |||
| ts | 19 |||
| aⁱ | 18 | ? | ? |
| ɔ̃ː | 18 |||
|| 17 |||
|| 16 |||
| v | 16 |||
|| 23 |||
| v | 23 |||
| ĩː | 21 |||
| oᵘ/o | 20 |||
| ts | 20 |||
| ɣ | 20 |||
| ɨː | 19 |||
| ɯ | 19 |||
| aⁱ/a | 18 |||
| au | 17 |||
| ɔ̃ː | 17 |||
| ɨ | 15 |||
| õ | 14 |||
| tsʰ | 13 |||
| əː | 13 |||
| au | 12 |||
| eⁱ/e | 11 |||
| k̚/k | 11 |||
| ue | 13 |||
| ʈ | 13 |||
| oⁱ/o | 11 |||
| ue | 11 |||
| aᵘ/a | 10 |||
|| 10 |||
| ou | 10 |||
|| 10 |||
| ɔ̃ | 10 |||
| eⁱ/e | 10 |||
| tsʰ | 10 |||
| æ | 10 |||
| əː | 10 |||
| ia | 9 |||
| ou | 9 |||
|| 9 |||
| ɔ̃ | 9 |||
|| 8 |||
| ie | 8 |||
| x | 8 |||
| ũː | 8 |||
| əu | 8 |||
| c | 7 |||
|| 7 |||
| oᵉ/o | 7 |||
| x | 7 |||
| əu | 7 |||
| | 7 |||
| ʑ | 7 |||
|| 6 |||
| ũː | 6 |||
| ã | 6 |||
| ɬ | 6 |||
| ʒ | 6 |||
|| 5 |||
| yi | 5 |||
| ã | 5 |||
| aʳ/a˞ | 5 |||
| ¹¹ | 5 |||
| ãː | 5 |||
| ý/y | 5 |||
| ɛ̃ | 5 |||
| ɾ | 5 |||
| ūa | 4 |||
| dz | 4 |||
| rᵘ/r | 4 |||
| yi | 4 |||
| ʈʂʰ | 4 |||
| ᵘ/w | 4 |||
| dz | 3 |||
| ʐ | 4 |||
| eoː | 3 |||
| eʴ/e˞ | 3 |||
| iu | 3 |||
|| 3 |||
| iᵒ/i | 3 |||
| ø | 3 |||
| øː | 3 |||
| ɔi | 3 |||
| ūa | 3 |||
| ɛi | 3 |||
| ɛʰ | 3 | ? | ? |
| ɛ̃ | 3 |||
| ɛʰ/ɛ | 3 |||
| ɛ̃ː | 3 |||
| ɦ | 3 |||
| ʂ | 3 |||
| ʐ | 3 |||
| aːu | 2 |||
|| 2 |||
| iᵒ/io | 2 |||
| oa | 2 |||
| oe | 2 |||
| oi | 2 |||
| oːɛ | 2 |||
| p̚/p | 2 |||
|| 2 |||
| uᵉ/ue | 2 |||
|| 2 |||
|| 2 |||
| uᵉ/u | 2 |||
| õa | 2 |||
| õᵘ/õu | 2 |||
| ĩ | 2 |||
| ŋ̊ | 2 |||
| ũ | 2 |||
| ȶ | 2 |||
| ɐ | 2 |||
| ɔi | 2 |||
| əo | 2 |||
| əʳ/ə˞ | 2 |||
| ɨ̃ | 2 |||
| ʑ | 2 |||
| ae | 1 |||
| ao | 1 |||
| aʲ/a | 1 |||
| aʴ/a˞ | 1 |||
| aːʰ/aː | 1 |||
|| 1 |||
| ia | 1 |||
| iu | 1 |||
| eʳ/e˞ | 1 |||
| eᵘ/e | 1 |||
|| 1 |||
| iᵒ/i | 1 |||
| iⁱ/i | 1 |||
|| 1 |||
| uei | 1 | ? | ? |
| uːə | 1 |||
| oʳ/o˞ | 1 |||
| oːɛ | 1 |||
| uⁱ/u | 1 |||
| á | 1 |||
| æʳ/æ˞ | 1 |||
| ç | 1 |||
| õː | 1 |||
| õːe | 1 |||
| ī/i | 1 |||
| õᵘ/õ | 1 |||
| ũə | 1 |||
| ɐ | 1 |||
| ɖʐ | 1 |||
| ə̄/ə | 1 |||
| ū/u | 1 |||
| ɔʳ/ɔ˞ | 1 |||
| əʴ/ə˞ | 1 |||
| ə̌/ə | 1 |||
| ɛĩ | 1 |||
| ẽeᵘ/ẽu | 1 |||
| ɛʴ/ɛ˞ | 1 |||
| ɟ | 1 |||
|| 1 |||
|| 1 |||

(143 rows)
(159 rows)



@@ -165,30 +181,8 @@
## Words with invalid segments (up to 100 only)

| ID | LANGUAGE | CONCEPT | FORM | SEGMENTS |
|:----------|-----------:|----------:|:-------|:------------------|
| 1-1210-1 | 1 | 1210 | laⁱ | l <s> aⁱ </s> |
| 1-1343-1 | 1 | 1343 | hanaⁱ | h a n <s> aⁱ </s> |
| 1-163-1 | 1 | 163 | esaⁱ | e s <s> aⁱ </s> |
| 10-1220-1 | 10 | 1220 | ʔa-nɛʰ | ʔ a n <s> ɛʰ </s> |
| 14-1210-1 | 14 | 1210 | elaⁱ | e l <s> aⁱ </s> |
| 14-1343-1 | 14 | 1343 | hanaⁱ | h a n <s> aⁱ </s> |
| 14-2098-1 | 14 | 2098 | laⁱ | l <s> aⁱ </s> |
| 14-946-1 | 14 | 946 | məfaⁱ | m ə f <s> aⁱ </s> |
| 15-1220-1 | 15 | 1220 | ʔa-nɛʰ | ʔ a n <s> ɛʰ </s> |
| 16-227-1 | 16 | 227 | ʧuei | tʃ <s> uei </s> |
| 20-1210-1 | 20 | 1210 | laⁱ | l <s> aⁱ </s> |
| 20-1343-1 | 20 | 1343 | hanaⁱ | h a n <s> aⁱ </s> |
| 20-163-1 | 20 | 163 | esaⁱ | e s <s> aⁱ </s> |
| 21-1210-1 | 21 | 1210 | elaⁱ | e l <s> aⁱ </s> |
| 21-163-1 | 21 | 163 | esaⁱ | e s <s> aⁱ </s> |
| 21-2009-1 | 21 | 2009 | ʧaⁱ | tʃ <s> aⁱ </s> |
| 6-1220-1 | 6 | 1220 | ʔa-nɛʰ | ʔ a n <s> ɛʰ </s> |
| 7-1210-1 | 7 | 1210 | laⁱ | l <s> aⁱ </s> |
| 7-1343-1 | 7 | 1343 | hanaⁱ | h a n <s> aⁱ </s> |
| 7-2098-1 | 7 | 2098 | laⁱ | l <s> aⁱ </s> |
| 7-221-1 | 7 | 221 | baⁱ | b <s> aⁱ </s> |
| 7-946-1 | 7 | 946 | nəfaⁱ | n ə f <s> aⁱ </s> |

(22 rows)
||

(0 rows)
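
Several rows in the segment table use a slash, as in `aⁱ/a` or `N/ɴ`. Assuming this follows the usual lexibank convention, where the part before the slash is the grapheme as written in the source and the part after it is the sound it is interpreted as, a minimal Python sketch for splitting such entries could look like the following. The function names `split_segment` and `target_segments` are hypothetical and do not come from any library.

```python
from typing import List, Tuple


def split_segment(segment: str) -> Tuple[str, str]:
    """Split a 'source/target' segment into (source, target).

    Assumption: a slash separates the source grapheme from its
    interpretation. A segment without a slash maps to itself.
    """
    source, sep, target = segment.partition("/")
    return (source, target if sep else source)


def target_segments(segments: List[str]) -> List[str]:
    """Return only the interpreted (target) side of each segment."""
    return [split_segment(s)[1] for s in segments]


# Example: a segmented form whose last segment uses the slash notation.
print(target_segments(["l", "aⁱ/a"]))
```

Keeping the source side next to the target preserves what the original transcription contained, while the target side is what would be checked against a sound inventory.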

