GitHub - javaidiqbal11/Arabic-Tweets-Sentiment-Analysis-using-Spark: This repo performs sentiment analysis on an Arabic Twitter dataset using Apache Spark.

Installation Process

Spark runs on the Java Virtual Machine, so you need Java installed on your system to run it.

Download Apache Spark, uncompress it in a directory of your choice, and rename the extracted folder to spark.

If you are using Linux or another Unix-based OS, add the Spark environment variables to your .bashrc file. For example, with Spark placed in the home directory of the user irfan, add these lines:

export SPARK_HOME=/home/irfan/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export PYSPARK_PYTHON=python3

Open a new terminal (or run source ~/.bashrc) so the variables take effect, then test your Spark installation by running this command:

pyspark

This starts the Spark Python shell.
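If pyspark does not start, checking the environment variables from Python can narrow down the cause. The following is a minimal sketch, not part of this repo: `check_spark_env` is a hypothetical helper, and it assumes a standard Spark download where the launcher lives at `$SPARK_HOME/bin/pyspark`.

```python
import os


def check_spark_env(env=None):
    """Return a list of problems found with the Spark environment variables.

    Hypothetical helper for illustration; it only inspects the variables
    set in the .bashrc lines above and does not start Spark.
    """
    env = os.environ if env is None else env
    problems = []

    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        return ["SPARK_HOME is not set"]

    # Assumption: a standard Spark download ships its launcher at bin/pyspark
    bin_dir = os.path.join(spark_home, "bin")
    if not os.path.isfile(os.path.join(bin_dir, "pyspark")):
        problems.append("no pyspark launcher in " + bin_dir)

    # Normalise PATH entries so a trailing slash does not cause a mismatch
    path_dirs = [os.path.normpath(p)
                 for p in env.get("PATH", "").split(os.pathsep) if p]
    if os.path.normpath(bin_dir) not in path_dirs:
        problems.append(bin_dir + " is not on PATH")

    if not env.get("PYSPARK_PYTHON"):
        problems.append("PYSPARK_PYTHON is not set "
                        "(Spark will fall back to its default interpreter)")
    return problems


if __name__ == "__main__":
    for problem in check_spark_env():
        print("Problem:", problem)
```

An empty result means the three variables look consistent; it does not prove that Java or Spark itself works.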

Install requirements.

You need to install the required Python packages to run the training:

pip3 install -r requirements.txt

Now you are good to go.

Start Spark server

Run start_server.sh to start Spark. On startup it automatically trains the model and then exposes an API that returns the sentiment of a given tweet.

Test server

# json request
curl -X POST -d '{"sentence":"التعلم الرقمي من خلال التسجيل الرقمي افتتاح الموقع قري"}' -H "Content-Type: application/json" http://0.0.0.0:5432/analyse

To test it from PyCharm, run the api_test.http file.
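The same request can also be sent from Python, which helps when integrating the API into an app. This is a minimal sketch using only the standard library; it reuses the URL and the "sentence" field from the curl command, while `build_request` and `analyse` are hypothetical names. The response format is not documented here, so the sketch returns the raw body and assumes it is UTF-8 text.

```python
import json
import urllib.request

API_URL = "http://0.0.0.0:5432/analyse"  # endpoint from the curl example


def build_request(sentence, url=API_URL):
    """Build the same POST request as the curl command: a UTF-8 JSON body."""
    # ensure_ascii=False keeps the Arabic text as raw UTF-8, like curl sends it
    body = json.dumps({"sentence": sentence}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyse(sentence, url=API_URL, timeout=10):
    """Send the sentence to the running server and return the raw response text.

    Assumption: the server replies with UTF-8 encoded text.
    """
    request = build_request(sentence, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")

# Usage, with the server already started by start_server.sh:
#   print(analyse("التعلم الرقمي من خلال التسجيل الرقمي افتتاح الموقع قري"))
```

Keeping the request construction separate from the network call makes it possible to inspect the encoded body without a running server.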

Note

Postman has encoding issues with Arabic-script languages such as Arabic and Urdu, so use curl or PyCharm to test the API. Alternatively, integrate it directly with your app or website.
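One way to sidestep client-side encoding problems is to send the text as JSON \u escapes, so the request body contains only ASCII characters. The sketch below shows that such a payload decodes back to the original sentence. `ascii_safe_payload` is a hypothetical helper; it assumes the server parses the body with a standard JSON parser, and it has not been tested against Postman or this server.

```python
import json


def ascii_safe_payload(sentence):
    """Return a JSON request body in which non-ASCII characters are \\u-escaped."""
    # ensure_ascii=True is the default for json.dumps; it is spelled out here
    return json.dumps({"sentence": sentence}, ensure_ascii=True)


sentence = "قراءة المزيد"
payload = ascii_safe_payload(sentence)

# Every character in the payload is plain ASCII
only_ascii = all(ord(ch) < 128 for ch in payload)

# A standard JSON parser recovers the original Arabic text
round_trip = json.loads(payload)["sentence"]

print(payload)
print(only_ascii, round_trip == sentence)
```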

Create database to store user tweets

import findspark
findspark.init("/home/irfan/spark")  # path to your Spark installation (SPARK_HOME)
import pyspark as ps
from pyspark.sql import SQLContext

# Start a local Spark context with two worker threads
sc = ps.SparkContext('local[2]')
sqlContext = SQLContext(sc)
csv_file = "./user_tweets.csv"

# Create the database if it does not exist yet, then switch to it
sqlContext.sql("CREATE DATABASE IF NOT EXISTS sentiment")
sqlContext.sql("USE sentiment")

# Load the CSV of user tweets, taking column names from the header row
df = (sqlContext.read.format("csv")
  .option("inferSchema", "true")
  .option("header", "true")
  .load(csv_file))

schema = "tweet varchar(512)"

# Save the DataFrame as a table in the sentiment database
df.write.saveAsTable("user_tweets", schema=schema)
# df.write.format("csv").saveAsTable("user_tweets", schema=schema)

# Read the table back, either from the warehouse files or through SQL
df = sqlContext.read.load("spark-warehouse/sentiment.db/user_tweets")
df_sfo = sqlContext.sql("SELECT * FROM user_tweets")
tbl = sqlContext.read.format("parquet").load("spark-warehouse/sentiment.db/user_tweets")

# Insert a new tweet into the table
tbl.sql_ctx.sql("INSERT INTO user_tweets  VALUES ('{}')".format('قراءة المزيد')).show()
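
Before loading user_tweets.csv into Spark, it can help to check that the file matches what the script expects. The following is a minimal sketch using only the standard library. `check_tweets` is a hypothetical helper; the column name `tweet` and the 512-character limit are inferred from the schema string above, and whether Spark enforces that limit depends on the Spark version.

```python
import csv
import io

MAX_TWEET_LENGTH = 512  # taken from the schema string "tweet varchar(512)"


def check_tweets(csv_text, column="tweet", max_length=MAX_TWEET_LENGTH):
    """Return (record_number, problem) pairs for records that do not fit the schema.

    Hypothetical helper: the column name is an assumption based on the schema string.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        return [(1, "missing column '{}'".format(column))]

    problems = []
    # The header is record 1, so data records are counted from 2
    for record_number, row in enumerate(reader, start=2):
        tweet = row[column] or ""
        if not tweet.strip():
            problems.append((record_number, "empty tweet"))
        elif len(tweet) > max_length:
            problems.append(
                (record_number, "longer than {} characters".format(max_length)))
    return problems


# Usage with the file from this repo:
#   with open("user_tweets.csv", encoding="utf-8", newline="") as f:
#       print(check_tweets(f.read()))
```

An empty list means every record has a non-empty tweet within the length limit.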

