A Python script for sentiment analysis on Turkish text data using multiple pre-trained transformer models, together with a list of Turkish sentiment analysis datasets published from 2012 to 2022.

sevvalckc/Turkish-SAD

Turkish-SAD

This repository provides a collection of Turkish Sentiment Analysis Datasets from 2012 to 2022, covering various domains. It includes access links for publicly available datasets and contact information for non-public datasets. It also includes a Python script for sentiment analysis using pre-trained transformer models.

Turkish Sentiment Analysis Datasets

Research papers indexed in Scopus between 2012 and 2022 that match the terms 'sentiment analysis' and 'Turkish dataset' were reviewed. In total, 23 unique datasets were collected from publicly available sources and through email requests.

Search Details:

  • Search Query: 'sentiment analysis' AND 'Turkish dataset'
  • Fields: Article Title, Abstract, Keywords
  • Date Range: 2012–2022
  • Database: Scopus

The repository provides:

  • Links to publicly available datasets.
  • Contact Information for datasets not openly accessible.

Contents

  1. List of Datasets
  2. Usage
  3. Requirements
  4. Pre-trained Models
  5. Using Google Colab

List of Datasets

| Article | Dataset | Status | Contact Info |
|---|---|---|---|
| Cross-lingual Polarity Detection with Machine Translation | Turkish Movie Reviews & Turkish Multidomain Products Reviews | Publicly Available | e.demirtas@student.tue.nl, m.pechenizkiy@tue.nl |
| Sentiment Analysis in Turkish Media | Twitter Dataset & Movie Dataset | Not Available | turkmenogluc@itu.edu.tr, tantug@itu.edu.tr |
| Sentiment Analysis for Turkish Twitter Feeds | Twt | Not Available | onder.coban@atauni.edu.tr, baris.ozyer@atauni.edu.tr, gulsah.ozyer@atauni.edu.tr |
| SentiWordNet for New Language: Automatic Translation Approach | Turkish Sentiment Analysis Dataset | Publicly Available | aucan@hacettepe.edu.tr, n.behzad@hacettepe.edu.tr, ebru@hacettepe.edu.tr, sever@hacettepe.edu.tr |
| Sentiment Analysis on Microblog Data based on Word Embedding and Fusion Techniques | Turkish Sentiment Dataset | Publicly Available | ahayran@baskent.edu.tr, msert@baskent.edu.tr |
| A Real-time Social Network-based Knowledge Discovery System for Decision Making | Foursquare Venue and Venue Comments Data | Publicly Available | asimyuksel@sdu.edu.tr |
| Words, Meanings, Characters in Sentiment Analysis | Dataset | Publicly Available | mfatih@ce.yildiz.edu.tr, hakantaskopru77@gmail.com, kubra.clskn94@gmail.com |
| Sentiment Analysis Through Transfer Learning for Turkish Language | Turkish Product Reviews Dataset | Publicly Available | emre.akin02@bilgiedu.net, tugba.yildiz@bilgi.edu.tr |
| Sentiment Analysis in Turkish Text with Machine Learning Algorithms | Dataset | Not Available | merve.rumelli@ceng.deu.edu.tr, deniz.akkus@ceng.deu.edu.tr, ozge.kart@deu.edu.tr, zerrin@cs.deu.edu.tr |
| Sentiment Analysis of Turkish Twitter Data | Dataset | Not Available | harisushehu@gmail.com, stokat@pau.edu.tr, md.sharif@uoh.edu.sa, uyaver@tau.edu.tr |
| Comparison of N-stage Latent Dirichlet Allocation versus Other Topic Modeling Methods for Emotion Analysis | Dataset | Not Available | zekeriya.anil.guven@ege.edu.tr, diri@yildiz.edu.tr, txcakaloglu@ualr.edu |
| An Annotated Turkish Aspect Based Sentiment Analysis Corpus for Smart Tourism | Turkish Tourism ABSA Dataset | Publicly Available | mehmetumut.salur@gibtu.edu.tr, iaydin@firat.edu.tr |
| Twitter Dataset and Evaluation of Transformers for Turkish Sentiment Analysis | BounTİ Turkish Sentiment Analysis | Publicly Available | abdullatif.koksal@boun.edu.tr, arzucan.ozgur@boun.edu.tr |
| Aspect Based Twitter Sentiment Analysis on Vaccination and Vaccine Types in COVID-19 Pandemic With Deep Learning | Covid 19 Vaccination Dataset | Publicly Available | irfan.aygun@cbu.edu.tr, bkaya@firat.edu.tr, kaya@firat.edu.tr |
| TRSAv1: A New Benchmark Dataset for Classifying User Reviews on Turkish E-commerce Websites | TRSAv1 | Publicly Available | m.aydogan@firat.edu.tr |
| Multi-Label Classification of E-Commerce Customer Reviews via Machine Learning | Turkish E-Commerce Reviews Dataset | Publicly Available | mustafacosar@hitit.edu.tr |
| Sentimental Analysis of Twitter Users from Turkish Content with Natural Language Processing | The Public Dataset & SentimentSet | Publicly Available | alok.mishra@himolde.no |
| A Dataset and BERT-based Models for Targeted Sentiment Analysis on Turkish Texts | Dataset | Not Available | melih.mutlu@boun.edu.tr, arzucan.ozgur@boun.edu.tr |
| A Novel COVID-19 Sentiment Analysis in Turkish based on the Combination of Convolutional Neural Network and Bidirectional Long-short Term Memory on Twitter | COVID-19 Dataset | Not Available | talhakabakus@duzce.edu.tr |
| Opinion Mining Using LSTM Networks Ensemble for Multi-class Sentiment Analysis in E-commerce | Dataset | Not Available | dalnahas@infina.com.tr, fasik@infina.com.tr, akanturvardar@infina.com.tr, mulkgun@infina.com.tr |
| BERT-based Transfer Learning Model for COVID-19 Sentiment Analysis on Turkish Instagram Comments | Dataset1 & Dataset2 | Not Available | d2014242@mersin.edu.tr, akdagli@mersin.edu.tr, caci@mersin.edu.tr |

Usage

Steps to Use:

  1. Clone this repository:
    git clone https://github.com/sevvalckc/Turkish-SAD.git
    cd Turkish-SAD
  2. Install required libraries: pip install -r requirements.txt
  3. Ensure your datasets (e.g., data1.csv, data2.csv) are placed in the same directory as the script.
  4. Run the script: python sentiment_analysis.py
  5. The script will output sentiment analysis results to CSV files for each model.
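
For orientation, the read-classify-write flow described in the steps above could look roughly like the sketch below. This is not the contents of sentiment_analysis.py, which is not reproduced in this README. The function names, the `text` column name, and the `label`/`score` output columns are assumptions made for illustration, and the standard `csv` module is used here only to keep the sketch free of extra dependencies.

```python
import csv


def load_classifier(model_id):
    """Build a Hugging Face text-classification pipeline (downloads the model)."""
    # Imported lazily so the CSV helper below works without transformers installed.
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)


def classify_csv(input_path, output_path, classifier, text_column="text"):
    """Read texts from a CSV, classify them, and write label/score columns.

    `classifier` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts, which is what a text-classification
    pipeline returns. The column name "text" is an assumption.
    """
    with open(input_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    texts = [row[text_column] for row in rows]
    predictions = classifier(texts) if texts else []

    # Attach each prediction to the row it came from.
    for row, pred in zip(rows, predictions):
        row["label"] = pred["label"]
        row["score"] = pred["score"]

    fieldnames = list(rows[0].keys()) if rows else [text_column, "label", "score"]
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Example (downloads the model on first use; the output file name is hypothetical):
# clf = load_classifier("cardiffnlp/twitter-xlm-roberta-base-sentiment")
# classify_csv("data1.csv", "data1_xlmt.csv", clf)
```

Passing the classifier in as an argument keeps the file handling separate from model loading, so the same helper can be reused for each model.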

Requirements

The script requires the following Python libraries and versions:

  • Pandas version: 2.2.2
  • PyTorch version: 2.5.1+cu121
  • Transformers version: 4.46.2
  • Scipy version: 1.13.1
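
To confirm that an environment matches these versions, a small check along the following lines may help. It is a sketch, not part of the repository. The `check_versions` helper is hypothetical, and the distribution names (`torch` for PyTorch, for instance) are assumed to be the usual PyPI names.

```python
from importlib import metadata

# Versions copied from the list above; keys are assumed PyPI distribution names.
PINNED = {
    "pandas": "2.2.2",
    "torch": "2.5.1+cu121",
    "transformers": "4.46.2",
    "scipy": "1.13.1",
}


def check_versions(pinned=PINNED, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Example: an empty dict means every pinned package is installed at the listed version.
# print(check_versions())
```

A mismatch does not necessarily mean the script will fail. Other builds of PyTorch (CPU-only, for example) report a different local version suffix than `+cu121`.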

Install Requirements

To install all required libraries, run: pip install -r requirements.txt

Pre-trained Models Used

  • TurkishBERTweet: VRLLab/TurkishBERTweet-Lora-SA
  • TSAM: emre/turkish-sentiment-analysis
  • BERTurk: akoksal/bounti
  • XLM-T: cardiffnlp/twitter-xlm-roberta-base-sentiment
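
One way to organize these models in code is a small registry keyed by short name, plus a helper that turns raw model logits into a label. The sketch below is illustrative only: the `MODELS` dict, the `top_label` helper, and the per-model output naming scheme are assumptions, not taken from the script. Each model ships its own label mapping, so the `id2label` argument should come from that model's configuration, and some checkpoints may need extra loading steps beyond a plain pipeline.

```python
import math
from pathlib import Path

# Hub IDs as listed above; the short keys are reused in output file names.
MODELS = {
    "TurkishBERTweet": "VRLLab/TurkishBERTweet-Lora-SA",
    "TSAM": "emre/turkish-sentiment-analysis",
    "BERTurk": "akoksal/bounti",
    "XLM-T": "cardiffnlp/twitter-xlm-roberta-base-sentiment",
}


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def output_path(dataset_path, model_key):
    """Hypothetical naming scheme: data1.csv + XLM-T -> data1_XLM-T.csv."""
    p = Path(dataset_path)
    return str(p.with_name(f"{p.stem}_{model_key}.csv"))


# Example with an illustrative three-class mapping:
# top_label([0.1, 2.0, -1.0], {0: "negative", 1: "neutral", 2: "positive"})
```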

Using Google Colab

Enabling TPU and High RAM

To use this script on Google Colab with TPU and high RAM, follow these steps:

  • Open Google Colab: Go to Google Colab.
  • Upload the script: Upload sentiment_analysis.py and your datasets (data1.csv, data2.csv) to Colab.

Enable TPU:

  • Go to Runtime > Change runtime type.
  • Select TPU from the Hardware accelerator dropdown menu.

Enable High RAM:

  • Go to Runtime > Manage sessions.
  • Click on the current session.
  • Select High-RAM from the options available.
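
After changing the runtime, it can be useful to confirm what the session actually provides. The following is a rough sketch and not part of the script. Both helper names are hypothetical, and treating an importable `torch_xla` package as a sign that a TPU runtime is selected is an assumption, not a guarantee.

```python
import importlib.util
import os


def pick_device():
    """Return "xla" (TPU), "cuda" (GPU), or "cpu" based on what is available.

    Assumption: an importable torch_xla package indicates a TPU runtime.
    """
    if importlib.util.find_spec("torch_xla") is not None:
        return "xla"
    if importlib.util.find_spec("torch") is not None:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    return "cpu"


def total_ram_gb():
    """Total physical memory in GiB, or None where os.sysconf is unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30
    except (AttributeError, ValueError, OSError):
        return None


# Example:
# print(pick_device(), total_ram_gb())
```

On a High-RAM session, `total_ram_gb()` should report noticeably more memory than on a standard one.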
