Three datasets with different levels of classification granularity, covering both English and Chinese:
- MR: An English movie review dataset with a one-sentence review per movie. It has two classes: positive and negative.
- COVID_Chinese: A Chinese dataset consisting of Weibo posts published during COVID-19, from January 1st to February 20th. It is a multimodal dataset with text, pictures and videos, but only the text was used in this project. It has three classes: positive, neutral and negative.
- SST-5: An English fine-grained sentiment classification dataset from the Stanford Sentiment Treebank. The data is provided at the phrase level, so the sentences obtained after data transformation were used for training and testing. It has five classes: very negative, negative, neutral, positive and very positive.
- TextCNN: Convolutional Neural Networks for Sentence Classification
- RNN
- CharCNN: Character-level Convolutional Networks for Text Classification
- Very Deep CNN (VDCNN): Very Deep Convolutional Networks for Text Classification
- Bi-LSTM
- Attention-Based Bi-LSTM: Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification
- RCNN: Recurrent Convolutional Neural Networks for Text Classification
- BERT
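
The SST-5 entry above mentions a data transformation step without describing it. As a rough illustration only, the sketch below shows the binning conventionally used with the Stanford Sentiment Treebank, where a sentiment score in [0, 1] is mapped to one of five classes using cutoffs at 0.2, 0.4, 0.6 and 0.8. Whether this project used exactly this mapping is an assumption, and the names `CUTOFFS`, `LABELS` and `score_to_label` are hypothetical.

```python
from bisect import bisect_left

# Conventional SST cutoffs: [0, 0.2], (0.2, 0.4], (0.4, 0.6], (0.6, 0.8], (0.8, 1.0].
# Assumed here for illustration; not confirmed as this project's preprocessing.
CUTOFFS = [0.2, 0.4, 0.6, 0.8]
LABELS = ["very negative", "negative", "neutral", "positive", "very positive"]


def score_to_label(score: float) -> str:
    """Map a sentiment score in [0, 1] to one of the five SST-5 class names."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    # bisect_left counts the cutoffs strictly below the score,
    # so a score equal to a cutoff falls into the lower class.
    return LABELS[bisect_left(CUTOFFS, score)]


if __name__ == "__main__":
    for s in (0.05, 0.35, 0.5, 0.75, 0.95):
        print(s, score_to_label(s))
```

Using `bisect_left` keeps the interval boundaries in a single list, so the same function could be adapted to a different number of classes by changing `CUTOFFS` and `LABELS` together.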