This is the code for the paper "Hope Speech detection in under-resourced Kannada language"
- Download the corresponding files from Zenodo: https://zenodo.org/record/5006517/
- Set the path to `path_to_repo/KanHope/Dual Channel models/`.
- For the models that follow the BERT architecture, open `classifier.py` and find the string `read_csv`. Add the paths to the train, test, and validation dataframes, pointing them to wherever the files were stored after downloading from Zenodo.
- Run `test.py` for inference.
- Under the same directory, run `get_predictions.py` to view the classification reports and confusion matrix.
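For reference, reports of this kind are typically built with scikit-learn; a minimal, self-contained illustration of the same two outputs (with made-up labels, not the repository's data) is:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Made-up gold labels and predictions (0 = not hope speech, 1 = hope speech).
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

# Per-class precision/recall/F1 table.
print(classification_report(y_true, y_pred, target_names=["not-hope", "hope"]))
# Rows are true classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred))
```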
- Download the English translations of the code-mixed Kannada-English dataset, along with the splits: https://zenodo.org/record/4904729/
- Run `dc_classifier.py` to train the dual-channel BERT model.
- For the model names (`model1`, `model2`), follow the naming conventions listed in Hugging Face Transformers' pretrained models: a) `model1` is a monolingual English language model (for the translated texts); b) `model2` is a multilingual language model (for the Kannada-English code-mixed text).
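For instance, the two channels might be named as follows (these identifiers are only illustrative choices, not the ones used in the paper; any valid Hugging Face Hub model IDs with the right language coverage work):

```python
# Hypothetical choices for the two channels; substitute any pretrained
# checkpoints from the Hugging Face Hub with matching language coverage.
model1 = "bert-base-uncased"             # monolingual English (translated texts)
model2 = "bert-base-multilingual-cased"  # multilingual (Kannada-English code-mixed text)
```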
- Under the same directory, run `get_predictions.py` to view the classification reports and confusion matrix.
- The architecture of the dual-channel model is as follows:
This approach can be used for any multilingual dataset. The weights of the fine-tuned models are available on my Hugging Face account, [AdWeeb](https://huggingface.co/AdWeeb).
We have provided the notebooks for reference: the code and explanations for all the experiments are in the Jupyter notebooks. Interesting findings, results, discussions, and qualitative analysis are documented in the manuscript.
If you use our dataset and/or find our code useful, please cite our paper:
```bibtex
@misc{hande2021hope,
  title={Hope Speech detection in under-resourced Kannada language},
  author={Adeep Hande and Ruba Priyadharshini and Anbukkarasi Sampath and Kingston Pal Thamburaj and Prabakaran Chandran and Bharathi Raja Chakravarthi},
  year={2021},
  eprint={2108.04616},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```