App for transliterating text from Roman script to native script for Hindi, Telugu, Tamil, and Kannada
What is Transliteration?
Transliteration is the conversion of text from one script to another.
Using this app, words written in Roman script can be transliterated into the native script of each of these languages.
Check the demo below
The transliteration model is an encoder-decoder network with Luong-style attention.
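To make the attention step concrete, here is a minimal sketch of one Luong-style attention computation in plain Python. It assumes the "dot" score, which is only one of the Luong variants; which variant this app uses is not stated here, and the function name and list-based vectors are illustrative only.

```python
import math

def luong_dot_attention(decoder_state, encoder_states):
    """One attention step using the Luong 'dot' score (assumed variant)."""
    # score(h_t, h_s) = h_t . h_s for every encoder state h_s
    scores = [sum(d * e for d, e in zip(decoder_state, h_s))
              for h_s in encoder_states]
    # Softmax over the scores, shifted by the max for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: attention-weighted sum of the encoder states
    dim = len(encoder_states[0])
    context = [sum(w * h_s[i] for w, h_s in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy vectors, not real model activations
weights, context = luong_dot_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In a full decoder, the context vector would be combined with the decoder state to predict the next output character.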
- The sentence is broken down into continuous sequences of characters.
- Each sequence of characters is further broken down into alphabetic and non-alphabetic sequences.
- The alphabetic sequences are transliterated, while the non-alphabetic sequences remain unchanged.
- When the romanized words (i.e., the alphabetic sequences) are fed to the model, it decodes them step by step.
- Beam search (with a beam size of 5) is performed over the outputs at each time step, and the word with the highest log-probability score is selected.
- Once the beam search results are obtained, the transliterated words are rejoined so that the punctuation is preserved, and the final sentence is returned.
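The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: the sentence is assumed to be split on whitespace, "alphabets" are assumed to mean ASCII letters, and `make_toy_step_fn` is a hypothetical stand-in for the trained decoder (it simply prefers upper-case letters so the example can run without a model).

```python
import math
import re

ALPHA = re.compile(r"[A-Za-z]+")
START, END = "<s>", "</s>"

def split_sentence(sentence):
    """Split into whitespace / non-whitespace runs, then into letter / non-letter runs."""
    segments = []
    for chunk in re.findall(r"\s+|\S+", sentence):
        segments.extend(re.findall(r"[A-Za-z]+|[^A-Za-z]+", chunk))
    return segments

def beam_search(step_fn, beam_size=5, max_len=30):
    """step_fn(prefix) returns {next_symbol: log_prob}; returns the best-scoring sequence."""
    beams, finished = [(0.0, [START])], []
    for _ in range(max_len):
        candidates = [(score + logp, seq + [sym])
                      for score, seq in beams
                      for sym, logp in step_fn(seq).items()]
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_size]:
            # Hypotheses that emitted END are set aside; the rest keep growing
            (finished if seq[-1] == END else beams).append((score, seq))
        if not beams:
            break
    _, best_seq = max(finished or beams, key=lambda c: c[0])
    return "".join(s for s in best_seq if s not in (START, END))

def make_toy_step_fn(word):
    """Hypothetical stand-in for the decoder: favours the upper-case letter."""
    def step_fn(prefix):
        i = len(prefix) - 1  # number of characters emitted so far
        if i >= len(word):
            return {END: 0.0}
        return {word[i].upper(): math.log(0.9), word[i].lower(): math.log(0.1)}
    return step_fn

def toy_transliterate(word):
    return beam_search(make_toy_step_fn(word), beam_size=5)

def transliterate_sentence(sentence, transliterate_word):
    """Transliterate letter runs only, then rejoin everything in the original order."""
    return "".join(transliterate_word(seg) if ALPHA.fullmatch(seg) else seg
                   for seg in split_sentence(sentence))

result = transliterate_sentence("hello, world!", toy_transliterate)
```

Because punctuation and spaces pass through untouched and the segments are rejoined in order, the output keeps the layout of the input sentence. With a real model, `transliterate_word` would call the trained network instead of the toy step function.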
Dataset: Dakshina dataset
For queries, contact Chitreddy Sairam