brandonlim-hs/minpair

Minpair

Generate minimal pairs (and minimal sets) for US English words.

In phonology, minimal pairs are pairs of words or phrases in a particular language, spoken or signed, that differ in only one phonological element

-- https://en.wikipedia.org/wiki/Minimal_pair

>>> import minpair
>>> minpair.vowel_minpair(['AE', 'EH'])[:4]
[{'AE': 'al', 'EH': 'l'}, {'AE': 'axe', 'EH': 'x'}, {'AE': 'bad', 'EH': 'bed'}, {'AE': 'bag', 'EH': 'beg'}]

Installation

pip install -U minpair
>>> import minpair

Usage

Vowel minimal pairs

Vowel minimal pairs are words that differ in exactly one vowel phoneme, for example: bad, bed.

>>> minpair.vowel_minpair(['AE', 'EH'])[:4]
[{'AE': 'al', 'EH': 'l'}, {'AE': 'axe', 'EH': 'x'}, {'AE': 'bad', 'EH': 'bed'}, {'AE': 'bag', 'EH': 'beg'}]
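
To make the idea concrete, here is a minimal, self-contained sketch of how a vowel minimal pair can be detected from phoneme sequences. It is not the package's implementation: the helper names (`is_vowel_minimal_pair`, `toy_vowel_minpair`) and the hand-written `TOY_PRONUNCIATIONS` table are assumptions made for illustration, with ARPAbet symbols written without stress markers.

```python
# Toy ARPAbet pronunciations (assumed, hand-written; stress markers omitted).
TOY_PRONUNCIATIONS = {
    'bad': ['B', 'AE', 'D'],
    'bed': ['B', 'EH', 'D'],
    'bag': ['B', 'AE', 'G'],
    'beg': ['B', 'EH', 'G'],
    'bat': ['B', 'AE', 'T'],
}


def is_vowel_minimal_pair(pron_a, pron_b, vowel_a, vowel_b):
    """Return True if the sequences differ at exactly one position,
    where pron_a has vowel_a and pron_b has vowel_b."""
    if len(pron_a) != len(pron_b):
        return False
    diffs = [(a, b) for a, b in zip(pron_a, pron_b) if a != b]
    return diffs == [(vowel_a, vowel_b)]


def toy_vowel_minpair(vowels, pronunciations=TOY_PRONUNCIATIONS):
    """Hypothetical stand-in for vowel_minpair, limited to two vowels."""
    vowel_a, vowel_b = vowels
    pairs = []
    # Compare every word against every other word, in alphabetical order.
    for word_a, pron_a in sorted(pronunciations.items()):
        for word_b, pron_b in sorted(pronunciations.items()):
            if is_vowel_minimal_pair(pron_a, pron_b, vowel_a, vowel_b):
                pairs.append({vowel_a: word_a, vowel_b: word_b})
    return pairs


print(toy_vowel_minpair(['AE', 'EH']))
```

In this toy table, 'bat' has no partner because 'bet' is not listed, so only the bad/bed and bag/beg pairs are found.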

Config

Corpus data

This package depends on several NLTK corpora: brown, cmudict, universal_tagset, and words. By default, it downloads any of these corpora that are not already available into the NLTK data directory.

To disable the automatic download of corpus data:

>>> minpair.generator(download_corpus=False).vowel_minpair(['AE', 'EH'])[:4]
[{'AE': 'al', 'EH': 'l'}, {'AE': 'axe', 'EH': 'x'}, {'AE': 'bad', 'EH': 'bed'}, {'AE': 'bag', 'EH': 'beg'}]
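
If the automatic download is disabled, the corpora have to be present already. The sketch below shows one way to fetch them ahead of time. The helper names (`download_corpora`, `download_with_nltk`) are hypothetical and not part of minpair. The sketch assumes that `nltk.download` returns a true value on success.

```python
# Corpus names taken from the list above.
REQUIRED_CORPORA = ('brown', 'cmudict', 'universal_tagset', 'words')


def download_corpora(download, names=REQUIRED_CORPORA):
    """Call download(name) for each corpus and return the names that failed.

    `download` is any callable that returns a true value on success.
    """
    return [name for name in names if not download(name)]


def download_with_nltk():
    """Fetch the corpora with NLTK's downloader (requires nltk and network access)."""
    import nltk  # imported lazily so the helper above works without nltk
    return download_corpora(lambda name: nltk.download(name, quiet=True))
```

After `download_with_nltk()` returns an empty list, `minpair.generator(download_corpus=False)` should find the corpora in the NLTK data directory.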

POS

This package uses a part-of-speech (POS) tagger to keep only words from meaningful lexical categories. A list of possible POS tags is found here. By default, it returns only words tagged as 'ADJ', 'NOUN', or 'VERB'.

To use different POS tags:

>>> minpair.generator(pos=['VERB']).vowel_minpair(['AE', 'EH'])[:4]
[{'AE': 'bag', 'EH': 'beg'}, {'AE': 'bat', 'EH': 'bet'}, {'AE': 'blast', 'EH': 'blest'}, {'AE': 'kept', 'EH': 'kept'}]

Alternatively, using method chaining:

>>> minpair.generator().pos(['VERB']).vowel_minpair(['AE', 'EH'])[:4]
[{'AE': 'bag', 'EH': 'beg'}, {'AE': 'bat', 'EH': 'bet'}, {'AE': 'blast', 'EH': 'blest'}, {'AE': 'kept', 'EH': 'kept'}]
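
As a rough illustration of what POS filtering means, the sketch below keeps only words that carry at least one allowed tag. It is an assumption about how such a filter could work, not the package's actual code. `words_with_pos` and the `TOY_TAGGED_WORDS` data are hypothetical.

```python
# Default tags as stated above.
DEFAULT_POS = ('ADJ', 'NOUN', 'VERB')

# Hand-written (word, tag) pairs for illustration only.
TOY_TAGGED_WORDS = [
    ('bad', 'ADJ'),
    ('bed', 'NOUN'),
    ('beg', 'VERB'),
    ('bag', 'NOUN'),
    ('bag', 'VERB'),
    ('and', 'CONJ'),
]


def words_with_pos(tagged_words, pos=DEFAULT_POS):
    """Return the sorted, de-duplicated words tagged with any tag in `pos`."""
    allowed = set(pos)
    return sorted({word for word, tag in tagged_words if tag in allowed})


print(words_with_pos(TOY_TAGGED_WORDS))            # default tags
print(words_with_pos(TOY_TAGGED_WORDS, ['VERB']))  # verbs only
```

A word with several tags, such as 'bag' here, is kept as long as one of its tags is allowed, which is why narrowing the filter to 'VERB' still returns it.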
