Data Science Project: crawling data, cleaning data, visualization and ML model
IDEs: PyCharm Community, Jupyter Notebook.
Modules used: pandas, selenium, instascrape, colorgram, PIL (Python Imaging Library), sklearn, numpy, time, matplotlib and seaborn
Purpose: Analyzing Instagram post data: likes, followers, comments and the most dominant colors in the photo.
Questions for ML:
- Do colors have an effect on Instagram post behaviour (likes, engagement, etc.)?
- Can we predict the number of likes on the next post by knowing the dominant colors and the number of followers?
Acquisition:
- Crawling data from Instagram posts using Selenium & Instascrape.
- Clustering RGB tuples, stored as "(r,g,b)" strings, into cluster labels 0-9 with the K-means algorithm.
- Handling duplicates and null values.
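
The cleaning and clustering steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`likes`, `followers`, `dominant_color`) and the sample rows are made up, and the order of the steps is an assumption. The project uses 10 clusters (labels 0-9); the toy sample is too small for that, so the cluster count is capped here.

```python
import ast

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical sample rows; column names and values are assumptions.
posts = pd.DataFrame({
    "likes": [120, 120, 45, 300, None, 80],
    "followers": [1000, 1000, 500, 4000, 700, 900],
    "dominant_color": ["(250, 10, 10)", "(250, 10, 10)", "(10, 240, 20)",
                       "(15, 20, 245)", "(200, 200, 200)", None],
})

# Drop exact duplicate rows, then rows with any missing value.
posts = posts.drop_duplicates().dropna().reset_index(drop=True)

# Parse each "(r, g, b)" string into numbers, one row per post.
rgb = np.array([ast.literal_eval(s) for s in posts["dominant_color"]],
               dtype=float)

# Target is 10 clusters (labels 0-9); cap it for this tiny sample because
# K-means cannot use more clusters than there are distinct points.
n_clusters = min(10, len(np.unique(rgb, axis=0)))
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
posts["color_cluster"] = kmeans.fit_predict(rgb)

print(posts)
```

On the full dataset `n_clusters` would simply be 10, and each post's color string is replaced by a single integer label that later steps can use as a feature.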
Visualization and ML:
- Visualizing data for deeper understanding.
- Correlation examination.
- Fitting a linear model and predicting the likes.
- Understanding the numbers and examining the errors in more depth.
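
As a rough sketch of the correlation, fitting and error-examination steps above, the snippet below uses synthetic data only; the numbers it prints are not results from this project. The column names, the one-hot encoding of the color cluster, the train/test split and the choice of MAE and RMSE as error measures are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): likes grow with followers plus noise.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "followers": rng.integers(100, 10_000, size=n),
    "color_cluster": rng.integers(0, 10, size=n),
})
df["likes"] = 0.05 * df["followers"] + rng.normal(0, 20, size=n)

# Correlation examination (Pearson by default).
corr = df[["followers", "likes"]].corr()
print(corr)

# Cluster labels are categories, not quantities, so one-hot encode them.
X = pd.get_dummies(df[["followers", "color_cluster"]],
                   columns=["color_cluster"], drop_first=True, dtype=float)
y = df["likes"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit the linear model and predict likes on held-out posts.
model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Error examination: average miss (MAE), RMSE, and the raw residuals.
mae = mean_absolute_error(y_test, pred)
rmse = np.sqrt(mean_squared_error(y_test, pred))
residuals = y_test - pred
print(f"MAE={mae:.1f} RMSE={rmse:.1f}")
print(residuals.describe())
```

Comparing MAE with RMSE and looking at the residual spread is one way to see whether a few posts with very large misses dominate the error.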