This project explores various factors associated with YouTube videos, such as engagement counts, publish time, and duration, to gain useful insights.
The dataset was uploaded to Kaggle by the user Syahrul Hamdani around June 2022. The original data was extracted using the YouTube API. The dataset is heavily inspired by the work of @rsrishav and @datasnaek on similar datasets.
This project covers the items listed below.
- Exploratory data analysis
- How does the comment count correlate with the like and dislike counts?
- How do the like, dislike, and comment counts differ from each other?
- How does the publish time of a video affect its views and comments?
- What is the average video duration?
- On average, how long after uploading will the video be marked as trending?
The main focus of this project is items 3 and 4, which use Pearson correlation analysis and the Mann-Whitney U test, respectively.
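
As a rough illustration of the two statistical tools named above, the sketch below runs a Pearson correlation and a Mann-Whitney U test with SciPy. It uses synthetic data, and the column names `like`, `dislike`, and `comment` are assumptions for illustration only; they may not match the actual columns in the Kaggle dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data. The column names are hypothetical and
# are not taken from the real dataset's schema.
rng = np.random.default_rng(0)
likes = rng.poisson(lam=500, size=200)
df = pd.DataFrame({
    "like": likes,
    "dislike": rng.poisson(lam=20, size=200),
    # Comments loosely follow likes so that a correlation exists.
    "comment": likes // 10 + rng.poisson(lam=5, size=200),
})

# Pearson correlation: strength of the linear relationship
# between the comment count and each of the other two counts.
r_like, p_like = stats.pearsonr(df["comment"], df["like"])
r_dislike, p_dislike = stats.pearsonr(df["comment"], df["dislike"])

# Mann-Whitney U: rank-based test of whether two samples
# come from the same distribution.
u_stat, u_p = stats.mannwhitneyu(
    df["like"], df["dislike"], alternative="two-sided"
)

print(f"comment vs like:    r={r_like:.3f}, p={p_like:.3g}")
print(f"comment vs dislike: r={r_dislike:.3f}, p={p_dislike:.3g}")
print(f"like vs dislike:    U={u_stat:.0f}, p={u_p:.3g}")
```

Pearson correlation only captures linear relationships, and the Mann-Whitney U test treats its two samples as independent, so both results are best read alongside the exploratory plots.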