This repository provides a Kindle-compatible version of the English-Ukrainian dictionary (Англо-український словник М.І. Балла), a comprehensive resource originally sourced from bakustarver/ukr-dictionaries-list-opensource.
The dictionary has been optimized for use on Kindle devices, enabling seamless access to translations and definitions directly within your e-reader.
- Download the latest version of the dictionary file in MOBI format.
- Install Calibre on your computer.
- Connect your Kindle to your computer using a USB cable.
- In Calibre, go to Device > Add Books from a single folder, and select the downloaded MOBI file.
- Once the transfer is complete, go to Device > Eject to safely remove your Kindle.
- On your Kindle, set the newly added dictionary as the default dictionary.
The scripts have been tested with Python 3.10.16, and the following dependencies are required:
pip install pyinflect
pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm
Note: On macOS ARM (e.g., M1/M2), use spacy[apple] instead of spacy for compatibility.
For more details, refer to the spaCy installation guide.
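Before running the scripts, it can help to confirm that the dependencies listed above are importable. The following is a minimal sketch, not part of the repository: the function name `missing_modules` is hypothetical, and it assumes the spaCy model is installed as a regular Python package named `en_core_web_sm`, which is how `python -m spacy download` normally installs it.

```python
import importlib.util
import sys

# Module names taken from the install commands above.
REQUIRED = ("spacy", "pyinflect", "en_core_web_sm")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

An empty result only means the packages can be located; it does not check their versions.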
The following scripts are executed sequentially to process the original source dictionary:
Script Name | Purpose |
---|---|
00.sanitize.py | Removes metadata entries from the dictionary. |
01.crosslinks.py | Merges dictionary articles with simple links into a single consolidated entry. |
02.varcon-csv.py | Converts the Variant Conversion (VarCon) dataset into a CSV of British and American variants. |
03.variants.py | Applies the generated CSV to add synonyms and variants to the dictionary. |
04.irregular-nouns.py | Extracts irregular noun inflections from the dictionary and saves them in a CSV. |
05.filter-irregular-nouns.py | Merges dictionary articles for irregular nouns into their main entries. |
06.regular-nouns.py | Processes regular noun inflections, such as plural forms. |
07.adjectives.py | Generates comparative and superlative forms of adjectives. |
08.filter-irregular-verbs.py | Merges dictionary articles for irregular verbs into their main entries. |
09.irregular-verbs.py | Extracts irregular verb inflections and saves them in a CSV. |
10.regular-verbs.py | Processes regular verb inflections (e.g., past tense, participles, -ing form, singular forms). |
11.all-inflections.py | Applies all previously extracted and processed inflections back into the dictionary. |
12.clean-markup.py | Cleans up redundant markup from the original source file. |
13.convert-to-xhtml.py | Converts the processed dictionary data into XHTML format for final output. |
Run the scripts in numeric order, since later steps rely on the output of earlier ones (for example, 03.variants.py applies the CSV generated by 02.varcon-csv.py).
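Running the steps one after another can be automated. The sketch below is illustrative only and is not shipped with the repository: `ordered_steps` and `run_pipeline` are hypothetical names, and it assumes that all scripts sit in one directory, take no command-line arguments, and read and write their files relative to that directory.

```python
import re
import subprocess
import sys
from pathlib import Path

# Matches file names such as "00.sanitize.py" or "13.convert-to-xhtml.py".
STEP_PATTERN = re.compile(r"^(\d{2})\..+\.py$")


def ordered_steps(filenames):
    """Return the names shaped like 'NN.name.py', sorted by their numeric prefix."""
    steps = [name for name in filenames if STEP_PATTERN.match(name)]
    return sorted(steps, key=lambda name: int(STEP_PATTERN.match(name).group(1)))


def run_pipeline(script_dir="."):
    """Run every step in order; check=True stops the run at the first failure."""
    names = [p.name for p in Path(script_dir).iterdir() if p.is_file()]
    for name in ordered_steps(names):
        print(f"Running {name}")
        # Assumption: each script runs without arguments from script_dir.
        subprocess.run([sys.executable, name], cwd=script_dir, check=True)


# Example (not executed here): run_pipeline("path/to/scripts")
```

Using `sys.executable` keeps every step on the same interpreter, and therefore the same installed dependencies, as the runner itself.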
Thanks to these great resources that helped in preparing this dictionary:
- Ukrainian offline dictionaries in open formats for providing the source of this dictionary.
- Jake McCrary's article Creating a custom Kindle dictionary for explaining the basics of the Kindle dictionary format.
- Hossein Yazdani's open-source English-Persian Dictionary and Kindle Custom Dictionary Scripts for providing basic scripts that actually work.
- Kevin Atkinson and Benjamin Titze for the VarCon dataset (Variant Conversion Info), which provides information to convert between American, British, Canadian, and Australian spellings and vocabulary.
Some parts of this project, including code and images, were generated with the assistance of OpenAI's ChatGPT. OpenAI asserts no copyright over the outputs you generate with ChatGPT, and you are free to use them in accordance with the terms of the OpenAI Usage Policies.
The VarCon dataset is Copyright 2000-2020 by Kevin Atkinson and Benjamin Titze and is used under the terms of its license, which permits use, modification, and redistribution with proper attribution.
The VarCon dataset was derived from numerous sources, including the Ispell distribution, and is provided "as is" without warranty.
For more details, visit the official VarCon page: http://wordlist.aspell.net/.