Repository for all code, data, and resources on the Awadhi language being developed at the Institute. Currently, it contains all the data generated as part of the M.Phil. dissertation of Mr. Abdul Basit.

kmi-linguistics/awadhi

Awadhi

Repository for all code, data, and resources on the Awadhi language being developed at the Institute. Currently, it contains the data generated as part of the M.Phil. dissertation of Mr. Abdul Basit: a raw corpus of approximately 70,000 tokens and a POS-annotated corpus of approximately 20,000 tokens.

The raw corpus is in the 'source' directory, and the annotated corpus is in the 'annotation' directory. The annotations are in CoNLL-2000 format.
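The annotated files can be read with a few lines of Python. The sketch below is a minimal example, not part of the repository: it assumes the usual CoNLL-2000 layout of one token per line with whitespace-separated columns (token first, POS tag second) and a blank line between sentences. The function names `read_conll` and `tag_counts`, the sample tokens, and the tags `TAG_A` and `TAG_B` are placeholders for illustration; the actual tagset is documented in 'publications'.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield sentences as lists of (token, pos_tag) pairs.

    Assumes whitespace-separated columns with the token in column 1 and
    the POS tag in column 2; any further columns are ignored.
    """
    sentence: Sentence = []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        columns = line.split()
        if len(columns) < 2:
            continue  # skip lines without a tag column
        sentence.append((columns[0], columns[1]))
    if sentence:
        # Last sentence when the input does not end with a blank line.
        yield sentence


def tag_counts(sentences: Iterable[Sentence]) -> Counter:
    """Count how often each POS tag occurs across all sentences."""
    return Counter(tag for sentence in sentences for _, tag in sentence)


# Placeholder data: tokens and tags are invented, not taken from the corpus.
sample = [
    "w1 TAG_A",
    "w2 TAG_B",
    "",
    "w3 TAG_A",
]
sentences = list(read_conll(sample))
counts = tag_counts(sentences)
```

To read a real file, pass an open file handle instead of the sample list, for example `open(path, encoding="utf-8")`, where `path` points to a file in the 'annotation' directory. UTF-8 encoding is an assumption here.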

The original dissertation and the complete tagset are in the 'publications' directory.

Please use the following to cite the corpus:

Basit, Abdul. 2017. A POS Annotated Corpus of Awadhi Language. Unpublished M.Phil. Dissertation, Dr. Bhim Rao Ambedkar University, Agra.

For queries, offers of collaboration, or other matters, please send an email to linguistics[dot]kmi[at]gmail[dot]com
