UzTranslit | State-of-the-art machine transliteration tool for the Uzbek language, Cyrillic<>Latin<>NewLatin
The main goal of this paper is to present a state-of-the-art machine transliteration tool for the low-resource Uzbek language. It converts between three common scripts: the old Cyrillic alphabet, the currently official Latin alphabet, and the newly announced New-Latin alphabet. The tool was built using a combination of rule-based and statistical approaches. It is available as an open-source Python package, as well as a web-based application that includes a public API.
Feel free to use the tool presented in this project, and if you find it useful, please make sure to cite the paper here (coming soon...). A demo of the web-based transliteration tool can be seen here.
In this paper, we present Python code, a web tool, and an API for the Uzbek language that perform machine transliteration between the two widely used Cyrillic and Latin alphabets, as well as a newly reformed version of the Latin alphabet, to which, according to a governmental decree, all legal texts are to be fully adapted by 2023.
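To give a feel for the rule-based side of such a tool, here is a minimal sketch of Cyrillic-to-Latin conversion. It is not the UzTranslit implementation or its API: the function name `cyrillic_to_latin`, the `CYR_TO_LAT` table, and the single context rule are assumptions made for illustration. The table covers only part of the alphabet, and only the word-initial "е" → "ye" rule is handled. The statistical component and the New-Latin alphabet are left out entirely.

```python
# Illustrative sketch only: a partial Uzbek Cyrillic -> Latin table,
# not the mapping or API shipped with the actual package.
OKINA = "\u02bb"  # modifier letter turned comma, used in "oʻ" and "gʻ"

CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ж": "j",
    "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m", "н": "n",
    "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f",
    "х": "x", "ч": "ch", "ш": "sh", "ё": "yo", "ю": "yu", "я": "ya",
    "э": "e", "қ": "q", "ҳ": "h",
    "ў": "o" + OKINA, "ғ": "g" + OKINA,
}


def cyrillic_to_latin(text: str) -> str:
    """Transliterate text character by character using CYR_TO_LAT."""
    out = []
    for i, ch in enumerate(text):
        lower = ch.lower()
        if lower not in CYR_TO_LAT:
            # Leave spaces, punctuation, digits, and unmapped letters as-is.
            out.append(ch)
            continue
        # Context rule (simplified): a word-initial "е" is written "ye".
        at_word_start = i == 0 or not text[i - 1].isalpha()
        if lower == "е" and at_word_start:
            latin = "ye"
        else:
            latin = CYR_TO_LAT[lower]
        # Preserve capitalization by upper-casing the first Latin letter.
        if ch.isupper():
            latin = latin[0].upper() + latin[1:]
        out.append(latin)
    return "".join(out)


if __name__ == "__main__":
    print(cyrillic_to_latin("Ўзбек тили"))
```

A plain lookup table handles most letters, but cases like "е" depend on position in the word, which is one reason a real tool needs context rules and exception handling beyond a one-to-one mapping.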
Programming language used:
These are the major libraries used inside Python:
Distributed under the MIT License. See LICENSE.txt for more information.
We are grateful for these resources and tutorials for making this repository possible: