An end-to-end Natural Language Processing (NLP) project demonstrating sentiment analysis using Python, NLTK, and scikit-learn on social media data

pandakitty/sentiment_analysis_ipynb

📈 Sentiment Analysis of Social Media Data (Python / NLP)

Built with Python Status License: MIT

🎯 Project Overview

This project develops an end-to-end Natural Language Processing (NLP) pipeline to classify social media text (e.g., tweets, comments) as having Positive, Negative, or Neutral sentiment. The goal is to demonstrate core data science and machine learning skills from data ingestion to model deployment readiness.

✨ Key Features & Technical Details

  • Data Preprocessing: Utilized Python and the NLTK library for tokenization, stop-word removal, and stemming/lemmatization to clean raw text data.
  • Feature Engineering: Applied TF-IDF (Term Frequency-Inverse Document Frequency) vectorization to transform text into numerical features suitable for machine learning.
  • Model Training: Trained a classification model (e.g., Logistic Regression or Support Vector Machine (SVM)) to predict sentiment labels.
  • Performance Evaluation: Assessed model performance using key metrics like Accuracy, Precision, Recall, and F1-Score, alongside a Confusion Matrix.
  • Data Visualization: Generated visualizations (e.g., word clouds, sentiment distribution charts) to explore data patterns.
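
The feature-engineering and training steps listed above can be sketched roughly as follows. This is an illustrative sketch, not the notebook's actual code: the toy sentences and labels are invented for the example, Logistic Regression is picked as one of the two candidate models named above, and scikit-learn's built-in `stop_words="english"` list stands in for the NLTK preprocessing step so that the snippet needs no corpus downloads.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, invented for illustration (not the project's dataset).
texts = [
    "I love this phone, the camera is great",
    "What a wonderful and happy day",
    "Great service, I love it",
    "I hate waiting, this is terrible",
    "Worst experience ever, awful support",
    "Terrible update, I hate the new layout",
    "The store opens at nine tomorrow",
    "The package arrives on Monday",
    "The meeting is scheduled for Tuesday",
]
labels = ["Positive"] * 3 + ["Negative"] * 3 + ["Neutral"] * 3

# TF-IDF turns each text into a weighted bag-of-words vector;
# the classifier then learns one set of weights per sentiment class.
# Assumption: built-in English stop words replace the NLTK cleaning step here.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Predict sentiment for two unseen (also invented) texts.
predictions = model.predict(
    ["I love this great camera", "awful, terrible support"]
)
print(list(predictions))
```

Wrapping the vectorizer and classifier in a single pipeline means the TF-IDF vocabulary is fitted only on the training texts, which avoids leaking test-set vocabulary into the features when the data is later split.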

🚀 Results

The final optimized model achieved the following performance on the held-out test set:

| Metric               | Score |
|----------------------|-------|
| Accuracy             | 85.2% |
| F1-Score (Macro Avg) | 0.84  |

Conclusion: An accuracy of 85.2% and a macro-averaged F1-Score of 0.84 on held-out data indicate that the model generalizes beyond the training set rather than memorizing it.
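
For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and macro-averaged F1 are defined, using only the standard library. The function name `accuracy_and_macro_f1` and the six example labels are hypothetical and chosen for illustration; they are not taken from the project and do not reproduce the scores above.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))

    # Accuracy: fraction of predictions that match the true label.
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    # Macro F1: compute F1 per class, then take the unweighted mean,
    # so every class counts equally regardless of how frequent it is.
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical labels, invented for illustration.
y_true = ["Positive", "Positive", "Negative", "Negative", "Neutral", "Neutral"]
y_pred = ["Positive", "Negative", "Negative", "Negative", "Neutral", "Positive"]

acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, macro F1={macro_f1:.3f}")
```

Because the macro average weights each class equally, it drops noticeably when a minority class (often Neutral in sentiment data) is predicted poorly, even if overall accuracy stays high.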


βš™οΈ Technologies & Libraries

This project was built using the following core tools:

  • Language: Python 3.x
  • Data Manipulation: Pandas, NumPy
  • NLP & Preprocessing: NLTK
  • Machine Learning: Scikit-learn
  • Visualization: Matplotlib, Seaborn

📦 Setup and Installation

Follow these steps to set up and run the analysis notebook on your local machine.

1. Clone the Repository

git clone https://github.com/pandakitty/sentiment_analysis_ipynb.git
cd sentiment_analysis_ipynb
