Release Sprint 02 Update · tylersupersad/star-wars-youtube-comments-pipeline

Collected relevant tweets using the Python library Twint
Cleaned and preprocessed the data by removing irrelevant information, standardizing text, and reducing dimensionality
Labeled the sentiment of each tweet using the TextBlob sentiment analysis library
Evaluated and refined the dataset by checking for mislabeled tweets, class imbalance, and recurring patterns or trends
Stored the dataset in a PostgreSQL database for easy access and analysis
Integrated the pipeline with Apache Airflow to automate the entire process and schedule it to run at regular intervals or in response to specific events
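
The cleaning, labeling, and class-balance steps listed above could look roughly like the sketch below. It is not the repository's code: the function names, the cleaning rules, the 0.1 polarity threshold, and the toy word list are all assumptions made for illustration. The scoring function is passed in as a parameter so the sketch runs with the standard library alone; in the real pipeline it would presumably wrap TextBlob.

```python
import re
from collections import Counter
from typing import Callable, Iterable

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^a-z0-9\s']")


def clean_tweet(text: str) -> str:
    """Lowercase, drop URLs and @mentions, strip punctuation, collapse whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    text = NON_TEXT_RE.sub(" ", text)
    return " ".join(text.split())


def label_from_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1, 1] to a label (threshold is an assumption)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Hypothetical stand-in scorer so the sketch has no third-party dependency.
_TOY_LEXICON = {"love": 1.0, "great": 0.8, "boring": -0.6, "hate": -1.0}


def toy_polarity(text: str) -> float:
    """Average the scores of known words; 0.0 when no word is known."""
    scores = [_TOY_LEXICON[w] for w in text.split() if w in _TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def label_tweets(
    tweets: Iterable[str],
    polarity_fn: Callable[[str], float] = toy_polarity,
) -> list[tuple[str, str]]:
    """Clean each tweet, skip those left empty, and attach a sentiment label."""
    rows = []
    for raw in tweets:
        cleaned = clean_tweet(raw)
        if not cleaned:
            continue
        rows.append((cleaned, label_from_polarity(polarity_fn(cleaned))))
    return rows


def class_balance(rows: Iterable[tuple[str, str]]) -> Counter:
    """Count rows per label to expose class imbalance."""
    return Counter(label for _, label in rows)
```

To score with TextBlob instead of the toy word list, pass `polarity_fn=lambda t: TextBlob(t).sentiment.polarity`, which returns a float between -1 and 1. A heavily skewed result from `class_balance` would be one signal that the dataset needs rebalancing or that the threshold needs tuning.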
