Real Time Twitter Sentiment Analysis using Kafka and ELK Stack
The goal of this project is to connect to the Twitter API, continuously stream tweets matching a particular search string, and analyze their sentiment using the Hugging Face Transformers pipeline. The sentiment analysis output is sent to Kafka and indexed into Elasticsearch through the pipeline Kafka -> Logstash -> Elasticsearch. Finally, the results can be viewed in Kibana to analyze real-time sentiment for the search string.
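The flow from tweet text to Kafka message could look roughly like the sketch below. It is not the code in Producer.py: the function names, the JSON field names, and the use of the `kafka-python` client are assumptions made for illustration. The classifier is passed in as a plain callable so the message-building step can be tried without downloading a model.

```python
import json
from typing import Callable, Dict, List

# A classifier takes a list of texts and returns one {"label", "score"} dict
# per text, which is the shape a Hugging Face sentiment pipeline returns.
Classifier = Callable[[List[str]], List[Dict[str, object]]]


def load_hf_classifier() -> Classifier:
    """Load the default Hugging Face sentiment pipeline (needs `transformers`)."""
    from transformers import pipeline  # imported lazily: heavy dependency
    return pipeline("sentiment-analysis")


def build_message(tweet_text: str, classify: Classifier) -> bytes:
    """Classify one tweet and encode the result as UTF-8 JSON for Kafka."""
    result = classify([tweet_text])[0]
    payload = {
        "text": tweet_text,                 # hypothetical field names
        "sentiment": result["label"],
        "score": float(result["score"]),
    }
    return json.dumps(payload).encode("utf-8")


def send_to_kafka(message: bytes, topic: str = "assignment3",
                  servers: str = "localhost:9092") -> None:
    """Publish one message (assumes the `kafka-python` client is installed)."""
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=servers)
    producer.send(topic, message)
    producer.flush()  # block until the message has actually been sent
```

With a broker running, `send_to_kafka(build_message(text, load_hf_classifier()))` would publish one classified tweet to the `assignment3` topic.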
Technologies and Resources Employed:
- Twitter API
- Transformers - Hugging Face
- Kafka
- Logstash
- Elasticsearch
- Kibana
Steps to run:
// Start the Kafka environment (from the Kafka directory):
- Zookeeper: bin/zookeeper-server-start.sh config/zookeeper.properties
- Kafka: bin/kafka-server-start.sh config/server.properties
- Create Topic: bin/kafka-topics.sh --create --topic assignment3 --bootstrap-server localhost:9092
- Producer: bin/kafka-console-producer.sh --topic assignment3 --bootstrap-server localhost:9092
- Consumer: bin/kafka-console-consumer.sh --topic assignment3 --from-beginning --bootstrap-server localhost:9092
// Start the ELK Stack:
- Elasticsearch (from the Elasticsearch directory): ./bin/elasticsearch
- Kibana (from the Kibana directory): ./bin/kibana
- Logstash (from the Logstash directory): ./bin/logstash or ./bin/logstash -f /path/to/conf/logstash-sample.conf, depending on how Logstash was installed.
This application uses Tweepy, a Python library for the Twitter API.
- Arguments are configured in "config.ini", which is in the same directory as "Producer.py".
- All the necessary libraries are listed in "requirements.txt".
- Run the Python file with the command: python Producer.py
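As a rough illustration of how arguments can be read from "config.ini", here is a minimal sketch using Python's standard `configparser`. The section names (`twitter`, `kafka`) and keys (`bearer_token`, `search_term`, `bootstrap_servers`, `topic`) are hypothetical; the actual file shipped with Producer.py may be laid out differently.

```python
import configparser
from pathlib import Path

# Hypothetical layout for config.ini; adjust names to match the real file.
SAMPLE = """\
[twitter]
bearer_token = YOUR_TOKEN
search_term = python

[kafka]
bootstrap_servers = localhost:9092
topic = assignment3
"""


def parse_settings(ini_text: str) -> dict:
    """Parse INI text into a flat dict, with defaults for the Kafka values."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        "bearer_token": parser.get("twitter", "bearer_token"),
        "search_term": parser.get("twitter", "search_term"),
        "bootstrap_servers": parser.get(
            "kafka", "bootstrap_servers", fallback="localhost:9092"),
        "topic": parser.get("kafka", "topic", fallback="assignment3"),
    }


def load_settings(path: str = "config.ini") -> dict:
    """Read the file next to the script and parse it."""
    return parse_settings(Path(path).read_text(encoding="utf-8"))
```

Keeping the parsing separate from the file read makes it easy to check the expected keys against `SAMPLE` before pointing the script at real credentials.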
// Note:
- Sentiment analysis is performed with Hugging Face Transformers.
- Logstash requires a .conf file that defines its input and output. For example:
input {
  kafka {
    bootstrap_servers => ["localhost:9092"]
    topics => ["assignment3"]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "assignment3"
    #user => "elastic"
    #password => "changeme"
  }
}
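Once Logstash is running with a configuration like the one above, one way to confirm that documents are arriving is to ask Elasticsearch for the document count of the index. The sketch below uses only the standard library and the Elasticsearch `_count` endpoint. It assumes Elasticsearch is reachable over plain HTTP without authentication (as the commented-out user and password suggest); the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def count_url(host: str = "http://localhost:9200",
              index: str = "assignment3") -> str:
    """Build the URL of the _count endpoint for an index."""
    return f"{host.rstrip('/')}/{index}/_count"


def parse_count(body: bytes) -> int:
    """Extract the document count from a _count response body."""
    return int(json.loads(body)["count"])


def indexed_documents(host: str = "http://localhost:9200",
                      index: str = "assignment3") -> int:
    """Query a running Elasticsearch node (assumes no authentication)."""
    with urlopen(count_url(host, index), timeout=5) as response:
        return parse_count(response.read())
```

Calling `indexed_documents()` a few times while Producer.py is running should show the count increasing if the Kafka -> Logstash -> Elasticsearch path is working.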