Gradio app for transliterating text from Roman script to native script for Hindi, Telugu, Tamil, and Kannada

chittiman/Transliteration-App

Transliteration-App

App for transliterating text from Roman script to native script for Hindi, Telugu, Tamil, and Kannada.

What is Transliteration?

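Transliteration is the conversion of text from one script to another. For example, the Hindi word "namaste" written in Roman script becomes "नमस्ते" in Devanagari.

Using this app, words written in Roman script can be transliterated into the native script of each supported language.

See the demo below.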

Working

The transliteration model is an encoder-decoder network with Luong-style attention.
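To make the attention step concrete, here is a minimal sketch of one Luong-style attention computation in plain Python. It assumes the dot-product score variant; this README does not say which Luong score function the model uses, and the function name `luong_dot_attention` and the toy vectors are illustrative, not taken from the repository.

```python
import math

def luong_dot_attention(decoder_state, encoder_states):
    """Return (context, weights) for one decoder step.

    Assumption: dot-product scoring, score(h_t, h_s) = h_t . h_s.
    The actual model may use a different Luong variant.
    """
    # Score every encoder state against the current decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    # Softmax over source positions, shifted by the max for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return context, weights

# Toy example: three source positions, hidden size 2 (made-up numbers).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
context, weights = luong_dot_attention(decoder_state, encoder_states)
```

In a full decoder the context vector would be combined with the decoder state before predicting the next character; that part is omitted here.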

Steps

  1. The sentence is broken down into continuous sequences of characters.

  2. Each sequence of characters is further broken down into alphabetic and non-alphabetic runs.

  3. The alphabetic runs are transliterated, while the non-alphabetic runs remain unchanged.

  4. When the romanized words (i.e., the alphabetic runs) are fed to the model, it decodes them step by step.

  5. Beam search (with a beam size of 5) is performed over the outputs at each time step, and the word with the highest log-probability score is selected.

  6. Once the results from beam search are obtained, the transliterated words are rejoined so that punctuation is preserved, and the final sentence is returned.
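The steps above can be sketched roughly as follows. This is not the app's actual code: it assumes sentences are split on whitespace and that "alphabetic" means ASCII letters, and it replaces the trained model with a hypothetical two-word lookup table (`TOY_TABLE`) so the beam search has something to score. All function names are illustrative.

```python
import math
import re

EOS = "</s>"  # hypothetical end-of-sequence marker

def split_segments(sentence):
    """Steps 1-2: split on whitespace, then into alphabetic / non-alphabetic runs."""
    return [re.findall(r"[A-Za-z]+|[^A-Za-z]+", chunk) for chunk in sentence.split()]

def beam_search(step_fn, beam_size=5, max_len=20):
    """Steps 4-5: step_fn(prefix) returns {token: probability} for the next token."""
    beams = [(0.0, [])]  # (cumulative log probability, tokens so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, prefix in beams:
            for token, prob in step_fn(prefix).items():
                if prob > 0:
                    candidates.append((score + math.log(prob), prefix + [token]))
        # Keep only the beam_size best extensions at this time step.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_size]:
            if seq[-1] == EOS:
                finished.append((score, seq[:-1]))
            else:
                beams.append((score, seq))
        if not beams:
            break
    pool = finished or beams
    if not pool:
        return ""
    # Select the hypothesis with the highest log-probability score.
    return "".join(max(pool, key=lambda c: c[0])[1])

# Hypothetical stand-in for the trained model: a tiny Hindi lookup table.
TOY_TABLE = {"namaste": "नमस्ते", "dost": "दोस्त"}

def transliterate_word(word):
    target = TOY_TABLE.get(word.lower())
    if target is None:
        return word  # toy limitation; a real model would still decode the word

    def step_fn(prefix):
        # Favour the next target character, with a small chance of stopping early.
        if len(prefix) < len(target):
            return {target[len(prefix)]: 0.9, EOS: 0.1}
        return {EOS: 1.0}

    return beam_search(step_fn)

def transliterate_sentence(sentence):
    """Steps 3 and 6: transliterate alphabetic runs, keep the rest, rejoin."""
    chunks = []
    for runs in split_segments(sentence):
        chunks.append("".join(
            transliterate_word(r) if re.fullmatch(r"[A-Za-z]+", r) else r
            for r in runs
        ))
    # Note: rejoining with single spaces normalises the original whitespace.
    return " ".join(chunks)

print(transliterate_sentence("namaste, dost!"))  # नमस्ते, दोस्त!
```

Because punctuation and digits never reach the model, they come back exactly as typed, which is what lets the final sentence keep its original punctuation.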

Dataset: Dakshina dataset

For queries, contact Chitreddy Sairam.

Demo
