This is a Racism Detection System built to detect racism in Indonesian tweet text. The project uses a Long Short-Term Memory (LSTM) network, implemented in Python.
The dataset was collected from Twitter and contains three columns: number, tweets, label. It consists of 686 tweets, each labelled manually: 511 tweets in class Non_R (Non Racism) and 175 tweets in class R (Racism). Because the dataset is imbalanced, you can under-sample the majority class to create a balanced dataset.
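A minimal sketch of random under-sampling is shown here, in case it helps to see the idea concretely. It is not taken from the project's notebook: the helper name `undersample` is hypothetical, the rows are placeholders that only reproduce the class sizes reported above, and the sketch uses plain Python rows so it does not depend on how the CSV is loaded.

```python
import random
from collections import defaultdict


def undersample(rows, label_key="label", seed=42):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Size of the smallest class (175 for the R class in this dataset).
    n_min = min(len(group) for group in by_label.values())

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)  # avoid all rows of one class coming first
    return balanced


# Placeholder rows with the reported class sizes (511 Non_R, 175 R).
rows = [{"number": i, "tweets": f"tweet {i}", "label": "Non_R"} for i in range(511)]
rows += [{"number": 511 + i, "tweets": f"tweet {511 + i}", "label": "R"} for i in range(175)]

balanced = undersample(rows)
```

Under-sampling discards majority-class tweets, so with the sizes above the balanced set keeps 175 tweets per class. Whether that trade-off is acceptable depends on how the model performs on the smaller set.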
- Python
The dataset is in the file "dataset_racism.csv". You can run either the Python notebook file "Racism Detection.ipynb" or "Racism"
You can run the Python notebook file "Racism Detection.ipynb" locally in Jupyter Notebook or in Google Colaboratory. You can run the Python script file "Racism"
- Clone the repo
git clone
- Enter the path to your copy of the dataset in the Python script file or Racism Detection.ipynb, in this line of code:
df = pd.read_csv(<FILE PATH LOCATION>, sep=',')
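After setting the path, it can be useful to confirm that the file loaded with the expected columns and to look at the class balance. The following is a sketch under assumptions: an in-memory CSV with placeholder rows stands in for dataset_racism.csv so the example runs on its own, and the column check is an illustration, not part of the project's code.

```python
import io

import pandas as pd

# Stand-in for dataset_racism.csv; these rows are placeholders, not real data.
sample_csv = io.StringIO(
    "number,tweets,label\n"
    "1,contoh tweet pertama,Non_R\n"
    "2,contoh tweet kedua,R\n"
    "3,contoh tweet ketiga,Non_R\n"
)

# With the real file, pass your own path here instead of sample_csv.
df = pd.read_csv(sample_csv, sep=",")

# Fail early if the file does not have the three documented columns.
expected = {"number", "tweets", "label"}
missing = expected - set(df.columns)
if missing:
    raise ValueError(f"missing columns: {sorted(missing)}")

# Count tweets per class to see how imbalanced the data is.
label_counts = df["label"].value_counts().to_dict()
print(label_counts)
```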
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b <add-your-new-branch-name>)
- Commit your Changes (git commit -m 'commit message')
- Push to the Branch (git push origin <add-your-branch-name>)
- Open a Pull Request
Angela Marpaung -
Project Link: