Code and data for Temperature shapes language sonority: Revalidation from a large dataset.
The following 5 steps can be run separately, because the output of each step is already provided in this repository (see Data below). Steps 1 and 2 require local copies of the ASJP and FLDAS datasets; you can skip these two steps if you do not want to download the full datasets.
Run `python get_sonority.py [raw_path]`, where `[raw_path]` is the path to the `raw` folder of the local ASJP dataset (e.g. `python get_sonority.py C:/ASJP/raw/`). Results will be saved as `sonorities.csv`, `phones.csv`, `word_structures.csv`, and `word_lengths.csv` in the `data` folder.
Run `python get_temperature.py [FLDAS_path]` to extract monthly temperature data for all doculects in `sonorities.csv`, where `[FLDAS_path]` is the path to the `FLDAS_NOAH01_C_GL_M.001` folder of the local FLDAS dataset (e.g. `python get_temperature.py C:/FLDAS/FLDAS_NOAH01_C_GL_M.001/`). Results will be saved as `data/temperatures.csv`.
Run `python get_temperature_global.py [FLDAS_path]` to extract global monthly mean temperature data. The result will be saved as `temperature_global.csv`.
Run `python plot_global.py`. The plot will be saved as `figure/global.png`.
Run `python process.py`. Results will be saved as `data.csv`, `data_genus.csv`, `data_family.csv`, and `data_macroarea.csv` in the `data` folder.
Run the corresponding code blocks of `process.r` in R.
Run `python test_vowel_length_solutions.py [raw_path]`. Results will be saved as `data/vowel_length_solutions.csv`. Then run the code block “Plot correlations between vowel length solutions” in `process.r` to plot the correlations.
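As a rough illustration of the kind of quantity `get_sonority.py` produces, the sketch below computes a mean sonority index with the vowel-index scoring (consonant = 1; semivowel = 2; vowel = 3) and reduces words to `C`/`V` skeletons. It is not the repository's implementation. The ASJPcode symbol classes, the choice to average over all symbols of all words, the treatment of semivowels as `C`, and the function names are assumptions made for this example, and it expects transcriptions with ASJP modifiers already removed.

```python
# Assumed ASJPcode symbol classes (not taken from get_sonority.py).
VOWELS = set("ieE3auo")
SEMIVOWELS = set("wy")


def vowel_index(symbol):
    """Score one symbol: consonant = 1, semivowel = 2, vowel = 3."""
    if symbol in VOWELS:
        return 3
    if symbol in SEMIVOWELS:
        return 2
    return 1


def mean_sonority(words, score=vowel_index):
    """Average the per-symbol scores over every symbol in every word.

    Averaging over symbols (rather than over words) is an assumption.
    """
    scores = [score(symbol) for word in words for symbol in word]
    if not scores:
        raise ValueError("no symbols to score")
    return sum(scores) / len(scores)


def word_structure(word):
    """Reduce a word to a C/V skeleton; semivowels count as C here (assumption)."""
    return "".join("V" if symbol in VOWELS else "C" for symbol in word)


if __name__ == "__main__":
    toy_words = ["mama", "wat3r"]  # toy transcriptions, for illustration only
    print(mean_sonority(toy_words))
    print([word_structure(word) for word in toy_words])
```

Passing a different `score` function (for example one returning 1 for obstruents and 2 for sonorants) would give the sonorant-index variant under the same averaging assumption.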
All extracted data files are in the `data` folder.
- `temperatures.csv`: Monthly temperature (1982–2022) for each filtered doculect
- `temperature_global.csv`: Global mean annual temperature over 41 years (180° W–180° E, 60° S–90° N)
- `sonorities.csv`: Mean sonority index (MSI) of each filtered doculect. We adapted 7 methods to calculate MSI from ASJP codes:
  - `index0`: Parker’s scale, from “Sonority” in The Blackwell Companion to Phonology
  - `index1`: Fought’s scale, from “Sonority and climate in a world sample of languages: Findings and prospects”
  - `index2`: List’s scale, from Sequence Comparison in Historical Linguistics
  - `index3`: Clements’s scale, from “The role of the sonority cycle in core syllabification” in Papers in Laboratory Phonology
  - `index4`: Sonorant index (here obstruent = 1; sonorant = 2)
  - `index5`: Vowel index (here consonant = 1; semivowel = 2; vowel = 3)
  - `index6`: List’s scale, calculated using LingPy `tokens2class()`
- `phones.csv`: Extracted phones from all doculects
- `word_structures.csv`: Word structure statistics of all doculects, characterized by `C` (= consonant) and `V` (= vowel) symbols
- `word_structures_grouped.csv`: Word length statistics of all doculects
- `vowel_length_solutions.csv`: MSI results under three vowel length solutions
- `data.csv`: Data for each filtered doculect, combining temperature and linguistic data
  - `WL`: Mean word length
  - `Index0` to `Index6`: MSIs from the 7 methods
  - `T`: Mean annual temperature
  - `T_max`: Maximum of the 41-year mean monthly temperatures
  - `T_min`: Minimum of the 41-year mean monthly temperatures
  - `T_sd`: Standard deviation of monthly temperatures over 41 years
  - `T_diff`: Mean annual range of temperature
  - `Index0_trans`, etc.: Transformed versions of the variables above
- `data_genus.csv`: Data for each language “genus” as classified by WALS
- `data_family.csv`: Data for each language family as classified by WALS
- `data_macroarea.csv`: Data for each macroarea (North America, South America, Eurasia, Africa, Greater New Guinea, and Australia)
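The temperature columns of `data.csv` can be read as summaries of a doculect's monthly series. The sketch below shows one plausible way to derive them from complete years of monthly values. It is not the code in `process.py`. The function name, the use of the sample standard deviation, the unweighted averaging of months, and the reading of `T_diff` as the mean over years of (warmest month minus coldest month) are assumptions.

```python
from statistics import mean, stdev


def summarize_temperature(monthly):
    """Summarize a series given as a list of years, each a list of 12 monthly means."""
    if len(monthly) < 1 or any(len(year) != 12 for year in monthly):
        raise ValueError("expected complete years of 12 monthly values")
    # Long-term mean for each calendar month (January ... December).
    month_means = [mean(year[m] for year in monthly) for m in range(12)]
    all_values = [value for year in monthly for value in year]
    return {
        "T": mean(all_values),          # mean annual temperature (months unweighted)
        "T_max": max(month_means),      # warmest long-term monthly mean
        "T_min": min(month_means),      # coldest long-term monthly mean
        "T_sd": stdev(all_values),      # sample SD of all monthly values (assumption)
        "T_diff": mean(max(year) - min(year) for year in monthly),  # mean annual range
    }


if __name__ == "__main__":
    # Two toy years, for illustration only (not FLDAS values).
    toy = [list(range(0, 12)), list(range(2, 14))]
    print(summarize_temperature(toy))
```

With the 41 years in `temperatures.csv`, `monthly` would hold 41 inner lists, one per year.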
All saved figure files are in the `figures` folder.
- `global.png` (also converted into `global.pdf`): Global distribution of mean annual temperatures (MATs) and MSIs
- `distribution.pdf`: Distribution of MATs and MSIs grouped by macroarea
- `correlation.pdf`: Relationship between MSI and MAT
- `correlation_by_family.pdf`: Relationship between MSI and MAT in the 25 largest families
- `word_length.pdf`: Relationship between mean word length and MSI or MAT
- `word_length_by_family.pdf`: Relationship between MSI and mean word length in the 25 largest families
- `range.pdf`: Relationship between mean annual range and MAT
- `vowel_length_solutions.pdf`: Relationship between vowel length solutions
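For readers who want to explore the per-family figures outside R, the sketch below selects the largest families by doculect count and computes a Pearson correlation between temperature and MSI within each. It is only an illustration: the `family` key, the function names, and the choice of Pearson's r are assumptions, and `process.r` may use a different statistic or model.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def correlation_by_family(rows, top_n=25, x="T", y="Index0"):
    """Correlate columns x and y within each of the top_n largest families.

    rows is a list of dicts; the "family" key is a hypothetical column name.
    """
    counts = Counter(row["family"] for row in rows)
    result = {}
    for family, _ in counts.most_common(top_n):
        members = [row for row in rows if row["family"] == family]
        result[family] = pearson([r[x] for r in members], [r[y] for r in members])
    return result


if __name__ == "__main__":
    # Toy rows, for illustration only (not values from data.csv).
    toy_rows = [
        {"family": "A", "T": 10, "Index0": 1.0},
        {"family": "A", "T": 20, "Index0": 2.0},
        {"family": "A", "T": 30, "Index0": 3.0},
        {"family": "B", "T": 10, "Index0": 3.0},
        {"family": "B", "T": 20, "Index0": 2.0},
    ]
    print(correlation_by_family(toy_rows, top_n=2))
```

Rows read with `csv.DictReader` would need their numeric fields converted to `float` before being passed in.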