SimplifyUR: Unsupervised Lexical Text Simplification for Urdu

harisbinzia/SimplifyUR

SimplifyUR

This repository contains the code, dataset, and models for Urdu text simplification described in the paper SimplifyUR: Unsupervised Lexical Text Simplification for Urdu.

Requirement(s)

The source is provided as a Jupyter notebook that runs on a Python 3 kernel. See requirements.txt for the dependencies.

Model(s)

Pre-trained models, including Word2Vec, a part-of-speech (PoS) tagger, and a language model (LM), are available for download. Download and extract them into the repository root directory, SimplifyUR.

Dataset

A parallel corpus of complex-simplified Urdu sentence pairs is in the Data folder.
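
The README does not describe the file layout of the corpus, so the following is only a minimal sketch of how complex-simplified sentence pairs could be read in Python. It assumes a UTF-8 text file with one pair per line, the complex sentence and the simplified sentence separated by a tab. The function name `load_sentence_pairs`, the file name in the usage note, and the tab-separated layout are all hypothetical; adjust them to the actual files in the Data folder.

```python
import csv
from pathlib import Path
from typing import List, Tuple


def load_sentence_pairs(path: Path, delimiter: str = "\t") -> List[Tuple[str, str]]:
    """Read (complex, simplified) sentence pairs from a delimited text file.

    Assumption: one pair per line, complex sentence first, simplified second.
    This layout is hypothetical and may not match the repository's data files.
    """
    pairs: List[Tuple[str, str]] = []
    # Open as UTF-8 so that Urdu text is decoded correctly.
    with path.open(encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip blank lines and lines that do not contain two columns.
            if len(row) < 2:
                continue
            complex_sentence = row[0].strip()
            simple_sentence = row[1].strip()
            if complex_sentence and simple_sentence:
                pairs.append((complex_sentence, simple_sentence))
    return pairs
```

A call such as `load_sentence_pairs(Path("Data") / "pairs.tsv")` would then return a list of tuples; `pairs.tsv` is a placeholder, not a file name taken from the repository.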

Reference(s)

If you use this tool in your work, please cite the paper below.

SimplifyUR: Unsupervised Lexical Text Simplification for Urdu

@InProceedings{qasmi-EtAl:2020:LREC,
  author    = {Qasmi, Namoos Hayat  and  Zia, Haris Bin  and  Athar, Awais  and  Raza, Agha Ali},
  title     = {SimplifyUR: Unsupervised Lexical Text Simplification for Urdu},
  booktitle      = {Proceedings of The 12th Language Resources and Evaluation Conference},
  month          = {May},
  year           = {2020},
  address        = {Marseille, France},
  publisher      = {European Language Resources Association},
  pages     = {3484--3489},
  url       = {https://www.aclweb.org/anthology/2020.lrec-1.428}
}

License(s)

Copyright (c) 2020 CSaLT, ITU

Code licensed under the MIT License: http://opensource.org/licenses/MIT. Data licensed under CC-BY 4.0: https://creativecommons.org/licenses/by/4.0/
