- Designed a system to efficiently preprocess impure code-mixed text collected from social media.
- Performed data cleaning and preprocessing of text.
- Identified net lingo (e.g., abbreviations, slang, and intentionally misspelt words) and converted it to standard forms using a dictionary-based approach and regular expressions.
- Designed a syllabification-based algorithm to transliterate Romanized Hindi words into Devanagari script.
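
The dictionary-and-regex step for net lingo could look roughly like the sketch below. It is only an illustration, not the project's actual code: the `NET_LINGO` table, the `ELONGATION` pattern, and the `normalize` function are hypothetical names, and the five dictionary entries stand in for what would be a much larger lexicon.

```python
import re

# Hypothetical lookup table (assumption): a real system would load a far
# larger lexicon of abbreviations, slang, and common misspellings.
NET_LINGO = {
    "u": "you",
    "r": "are",
    "gr8": "great",
    "plz": "please",
    "thx": "thanks",
}

# A letter repeated three or more times, e.g. the "ooooo" in "sooooon".
ELONGATION = re.compile(r"([a-z])\1{2,}")

# Runs of ASCII letters/digits, or runs of any other non-space characters.
TOKEN = re.compile(r"[a-z0-9]+|[^a-z0-9\s]+")


def normalize(text):
    """Lowercase, collapse elongated letters to two, then apply the dictionary."""
    tokens = TOKEN.findall(text.lower())
    cleaned = []
    for tok in tokens:
        tok = ELONGATION.sub(r"\1\1", tok)  # "sooooon" becomes "soon"
        cleaned.append(NET_LINGO.get(tok, tok))
    return " ".join(cleaned)
```

For example, `normalize("Plz come sooooon, u r gr8!!!")` returns `"please come soon , you are great !!!"`. Collapsing repeats to two letters is a simplification: a word such as "heyyy" becomes "heyy", so a fuller pipeline would likely follow this with a dictionary or spell-check pass.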