    Repositories list

    • Other
      Updated Jan 10, 2024
    • Repository of the data and models generated by Mr. Shyam Ratan as part of his MPhil dissertation titled 'Automatic Detection Of Propaganda In Hindi On Social Media'
      Jupyter Notebook
      Other
      Updated Jan 10, 2024
    • ComMA

      Public
      Other
      Updated Jan 10, 2024
    • SpeeD-IA

      Public
      Repository for different Speech Datasets and Models for Indo-Aryan languages prepared by the Department under different projects
      Other
      Updated Nov 27, 2023
    • Repository of data and scripts of the UGC-UKIERI Project on "Automatic Detection of Verbal Threat in Hindi and English Aggressive Speech"
      Shell
      Apache License 2.0
      Updated Apr 4, 2023
    • crawlers

      Public
      Python
      Apache License 2.0
      Updated Apr 4, 2023
    • mscrabble

      Public
      Repository for Multilingual Scrabble Generator and Games - especially aimed towards endangered languages
      JavaScript
      MIT License
      Updated Dec 16, 2021
    • bhojpuri

      Public
      Resources and Technologies for Bhojpuri
      Python
      Updated Oct 10, 2021
    • This repository contains code and details of the KMI-Panlingua-IITKGP system submitted to the SigTyp 2020 Shared Task on Prediction of Linguistic Features. It can be used for training and prediction on any new dataset in the same format with similar information.
      Apache License 2.0
      Updated Oct 27, 2020
    • This is the repository of the Aggression Project, carried out at the Microsoft Research India Summer Workshop on Artificial Social Intelligence in June 2017. The repository contains all code and datasets generated during the school.
      Python
      Apache License 2.0
      Updated Sep 7, 2020
    • bodo

      Public
      This repository contains all the resources (corpora) of Bodo and tools that were developed for creating and managing these resources
      Apache License 2.0
      Updated Jul 31, 2020
    • braj

      Public
      Repository for all code, data and resources on Braj Bhasha that are being developed at the Institute.
      Python
      Apache License 2.0
      Updated Jul 30, 2020
    • magahi

      Public
      This repository contains all the data, tools, applications and publications related to Magahi, an Indo-Aryan language
      Java
      Apache License 2.0
      Updated Jul 30, 2020
    • trac-2

      Public
      Repository hosting the dataset for the Shared Task on Aggression and Misogyny Identification at the Second Workshop on Trolling, Aggression and Cyberbullying (TRAC-2) at LREC 2020. For more details, visit the workshop website: https://sites.google.com/view/trac2/shared-task
      Apache License 2.0
      Updated Jul 30, 2020
    • Repository for all data and resources on Western Hindi that are being developed at the Institute. Currently, it contains all the data generated as part of the M.Phil. dissertation of Ms. Saba Parween.
      Apache License 2.0
      Updated Jul 30, 2020
    • taluitew

      Public
      Repository for all data and resources on Taluitew, a Tibeto-Burman language of the Naga group spoken in parts of Manipur, that are being developed at the Institute. Currently, it contains all the data generated as part of the M.Phil. dissertation of Mr. Chingrimung Lungleng.
      Apache License 2.0
      Updated Jul 30, 2020
    • This repository contains the dataset used for the Indo-Aryan Language Identification Shared Task, part of the Evaluation Campaign at the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial) at COLING 2018. It has 15k sentences each in Awadhi, Bhojpuri, Braj, Magahi and Hindi.
      Apache License 2.0
      Updated Mar 22, 2019
    • indianlr

      Public
      A repository of language resources and technologies for non-scheduled and endangered Indian languages
      Updated Feb 16, 2019
    • A repository for listing resources and technologies for non-scheduled and endangered Indian languages.
      HTML
      Updated Feb 16, 2019
    • trac-1

      Public
      Repository hosting the dataset for the Shared Task on Aggression Identification at the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-1) at COLING 2018. For more details, visit the workshop website: https://sites.google.com/view/trac1/home
      Apache License 2.0
      Updated Mar 12, 2018
    • awadhi

      Public
      Repository for all code, data and resources on the Awadhi language that are being developed at the Institute. Currently, it contains all the data generated as part of the M.Phil. dissertation of Mr. Abdul Basit.
      Apache License 2.0
      Updated Sep 8, 2017
    • NLP

      Public
      Natural Language Processing R&D @K.M. Institute of Hindi and Linguistics
      Apache License 2.0
      Updated Jul 15, 2017
    • Research and Development at the Department of Linguistics in K.M. Institute of Hindi and Linguistics at Dr. Bhim Rao Ambedkar University, Agra
      HTML
      Apache License 2.0
      Updated Jul 15, 2017