#️⃣Tweet Sentiment Analysis

Introduction

This project aims to analyze tweets and predict their sentiment using Natural Language Processing (NLP) techniques and machine learning models. The goal is to classify tweets into three categories: Neutral, Negative, and Positive.

Objectives

- Perform data preprocessing and text cleaning.
- Use NLP techniques to prepare the text data for model training.
- Build and train an LSTM model for sentiment analysis.
- Evaluate the model and make predictions on sample text.

Dataset Description

The dataset contains the following columns:

| Column Name | Description |
| --- | --- |
| Tweets | Data in the form of sentences written by individuals |
| category | Sentiment category: 0 (Neutral), -1 (Negative), 1 (Positive) |

Methodology

Data Preparation

Import Libraries and Load Dataset:
- Import necessary Python libraries such as pandas, seaborn, matplotlib, sklearn, nltk, and tensorflow.
- Load the dataset from the provided Excel file into a pandas DataFrame.
Change Dependent Variable to Categorical:
- Convert the numerical categories (0, -1, 1) to categorical labels ("Neutral," "Negative," "Positive").
Missing Value Analysis:
- Check for missing values and drop any null/missing values from the dataset.
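
The label conversion and missing-value steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper name `prepare` and the in-memory sample frame are assumptions, and in the project the frame would come from the Excel file via `pd.read_excel`.

```python
import pandas as pd

# Mapping taken from the dataset description: 0 Neutral, -1 Negative, 1 Positive.
LABELS = {0: "Neutral", -1: "Negative", 1: "Positive"}


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Map numeric categories to text labels, then drop rows with missing values."""
    df = df.copy()
    df["category"] = df["category"].map(LABELS)
    df = df.dropna(subset=["Tweets", "category"])
    return df.reset_index(drop=True)


# Hypothetical sample standing in for the real dataset.
sample = pd.DataFrame({
    "Tweets": ["great day", None, "awful service", "it is a phone"],
    "category": [1, 0, -1, 0],
})
prepared = prepare(sample)
```

The row with the missing tweet is dropped, and the remaining rows carry the text labels.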

Text Preprocessing

Clean Text Data:
- Expand contractions (for example, "can't" becomes "cannot") before stripping punctuation.
- Transform all words to lowercase.
- Remove punctuation, symbols, and numbers so that only alphabetic characters remain.
- Tokenize the text, remove stopwords, and lemmatize the remaining tokens.
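
A minimal sketch of such a cleaning function is shown below. To stay self-contained it uses tiny hand-written contraction and stopword tables (both hypothetical) in place of NLTK's resources, and it omits lemmatization, which in the project would presumably be done with NLTK.

```python
import re
from typing import List

# Hypothetical, deliberately small lookup tables for illustration only.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is", "i'm": "i am"}
STOPWORDS = {"a", "an", "the", "is", "it", "i", "am", "to", "and", "this", "will"}


def clean_tweet(text: str) -> List[str]:
    """Lowercase, expand contractions, strip non-letters, tokenize, drop stopwords."""
    text = text.lower()
    # Expand contractions first; otherwise "can't" would degrade to "cant".
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Keep letters and whitespace only (removes punctuation, symbols, digits).
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS]


tokens = clean_tweet("I can't believe it's 2024!!! #Amazing")
# -> ['cannot', 'believe', 'amazing']
```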

Data Splitting

Split Data into Dependent and Independent DataFrames:
- Separate the tweets (X) from the sentiment categories (y).

Text Data Operations

One-Hot Encoding and Padding:
- Perform one-hot encoding of the words in each sentence using TensorFlow.
- Pre-pad the encoded sequences (padding added at the front) using TensorFlow so that every sequence has the same length.
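
To make the two steps concrete, here is a plain-Python sketch of the idea: each word is hashed to an integer index within a fixed vocabulary size, and shorter sequences are filled with zeros at the front. It is an illustration of the concept rather than the TensorFlow calls the notebook uses; `VOCAB_SIZE`, `MAX_LEN`, and both function names are assumptions.

```python
import zlib
from typing import List

VOCAB_SIZE = 10_000  # hypothetical vocabulary size
MAX_LEN = 8          # hypothetical fixed sequence length


def encode(tokens: List[str], vocab_size: int = VOCAB_SIZE) -> List[int]:
    """Hash each word to an index in 1..vocab_size-1 (0 is reserved for padding)."""
    return [zlib.crc32(t.encode("utf-8")) % (vocab_size - 1) + 1 for t in tokens]


def pad_pre(seq: List[int], max_len: int = MAX_LEN) -> List[int]:
    """Pad with zeros at the front; keep only the last max_len items if too long."""
    seq = seq[-max_len:]
    return [0] * (max_len - len(seq)) + seq


padded = pad_pre(encode(["cannot", "believe", "amazing"]))
```

Because the indices come from a hash, two different words can collide on the same index; a larger vocabulary size makes that less likely.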

Model Building and Training

Build and Compile LSTM Model:
- Define the model architecture including input length, vocabulary size, dropout layer, and activation function.
- Compile the LSTM model.
Dummy Variable Creation:
- Create dummy variables for the dependent variable categories.
Split Data into Training and Test Sets:
- Split the data into training and testing sets.
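
The three steps above might look roughly like this in Keras. Every hyperparameter here (vocabulary size, sequence length, embedding size, LSTM units, dropout rate, test fraction) is an assumed placeholder rather than the value used in the notebook, and the random `X` merely stands in for the padded sequences.

```python
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

VOCAB_SIZE = 10_000  # hypothetical
MAX_LEN = 40         # hypothetical input length
EMBED_DIM = 64       # hypothetical embedding size


def build_model() -> tf.keras.Model:
    """Embedding -> LSTM -> Dropout -> softmax over the three sentiment classes."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(MAX_LEN,)),
        tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model


# Dummy variables: one column per class, ordered alphabetically by pandas.
labels = pd.Series(["Positive", "Negative", "Neutral", "Positive",
                    "Neutral", "Negative", "Positive", "Neutral"])
y = pd.get_dummies(labels).to_numpy(dtype="float32")

# Placeholder for the encoded, padded tweets.
X = np.random.randint(1, VOCAB_SIZE, size=(len(labels), MAX_LEN))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = build_model()
```

A softmax output with categorical cross-entropy pairs naturally with the dummy-variable targets, since each row of `y` has exactly one 1.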

Model Training

Train the Model:
- Train the LSTM model on the training data.

Model Evaluation

Normalize Predictions:
- Convert the predicted class scores back to the original categories: for each tweet, the class whose score is highest (nearest to 1) is set to 1 and the other classes to 0.
Measure Performance Metrics:
- Calculate accuracy, print the classification report, and plot the confusion matrix.
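
A small sketch of the evaluation steps, using made-up softmax outputs for four tweets. The class order in `CLASSES` is an assumption (alphabetical, as pandas would order dummy columns), and the plotting step is left out; the confusion matrix array could be passed to a seaborn heatmap.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

CLASSES = ["Negative", "Neutral", "Positive"]  # assumed column order

# Hypothetical model outputs (one row of class probabilities per tweet).
probs = np.array([
    [0.10, 0.20, 0.70],
    [0.80, 0.15, 0.05],
    [0.30, 0.45, 0.25],
    [0.20, 0.30, 0.50],
])
y_true = np.array([2, 0, 1, 1])  # hypothetical true class indices

# "Normalize" predictions: highest score becomes 1, the rest 0.
y_pred = probs.argmax(axis=1)
one_hot_pred = np.eye(len(CLASSES), dtype=int)[y_pred]

accuracy = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
report = classification_report(y_true, y_pred, labels=[0, 1, 2],
                               target_names=CLASSES, zero_division=0)
print(f"accuracy = {accuracy:.2f}")
print(report)
```

In this toy example three of the four predictions are correct; the last Neutral tweet is misclassified as Positive.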

Sample Text Inferences

Make Inferences:
- Pass sample text through the model and make predictions.
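
One way to wire inference together is a small helper that takes the preprocessing and prediction steps as callables. The helper name, the class order, and both stand-in functions below are hypothetical; with a trained Keras model, `predict_proba` would wrap something like `model.predict(np.array([seq]))[0]`.

```python
from typing import Callable, List, Sequence

CLASSES = ["Negative", "Neutral", "Positive"]  # assumed output order


def predict_sentiment(
    text: str,
    preprocess: Callable[[str], List[int]],
    predict_proba: Callable[[List[int]], Sequence[float]],
) -> str:
    """Run one raw tweet through preprocessing and the model; return its label."""
    probs = predict_proba(preprocess(text))
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASSES[best]


# Stand-ins so the sketch runs without a trained model.
def fake_preprocess(text: str) -> List[int]:
    return [len(word) for word in text.lower().split()]


def fake_model(seq: List[int]) -> Sequence[float]:
    return [0.1, 0.2, 0.7]  # always favours the last class


label = predict_sentiment("what a lovely day", fake_preprocess, fake_model)
```

The same preprocessing used for training has to be applied to sample text, otherwise the word indices will not match what the model saw.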

Folder Structure

- data/: Contains the dataset files.
- notebooks/: Jupyter notebooks for data analysis, text preprocessing, and model training.
- src/: Python scripts for data processing, text preprocessing, and model training.
- api/: API code (if any).
- models/: Trained models and saved results.
- results/: Output files including visualizations, model evaluation metrics, and plots.

Installation and Usage

Clone the repository:
```
git clone https://github.com/MariahFerns/Tweet-Sentiment-Analysis.git
cd Tweet-Sentiment-Analysis
```
Install the required libraries:
```
pip install -r requirements.txt
```
Run Jupyter Notebooks: Navigate to the notebooks/ folder and open the notebooks to explore data analysis and model development.

Results and Findings

EDA Insights:

- Identified the distribution of tweet sentiments.
- Found patterns in text data that contribute to sentiment classification.

Model Evaluation:

- Evaluated the LSTM model and found it to be effective in classifying tweet sentiments.
- Plotted ROC curves and confusion matrices to compare model performance.

Conclusion

This project successfully analyzed and classified tweet sentiments using NLP techniques and an LSTM model. The findings can help in understanding public opinion and sentiment on various topics.
