We have a dataset for a particular show from a media company (such as Hotstar), and we need to predict the number of viewers (a regression problem). The dataset has 7 features (columns, including views) plus 1 garbage column, and 80 data points (rows).
Date : used to derive two new features, Days and Weekends. Days : as the number of days increases, views rise up to a certain point and then decline towards the end. Weekends : we found that Saturday and Sunday have the highest views.
Ad_impression : we found that advertising has a large impact on views.
Cricket_match_india : the dataset is imbalanced for this feature (non_cricket_match_day vs. cricket_match_day), so the effect was hard to identify, but views appear to decline on match days.
Character_A : the impact of a particular character on the show's views (a twist in the story).
Visitors : how many people visited the website, whether or not they watched the show.
Views_platform : how many people have the platform (such as the app).
Views_show : the target variable, which we want to predict.
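
The Days and Weekends features derived from Date could be built roughly as follows. This is a minimal sketch using pandas, not the original code: the column names `Date`, `Days`, and `Weekend`, the helper `add_date_features`, and the sample dates are all assumptions made for illustration, and the real date format in the dataset may need an explicit `format=` argument.

```python
import pandas as pd


def add_date_features(df, date_col="Date"):
    """Return a copy of df with Days and Weekend columns derived from date_col.

    Column names are hypothetical; adjust them to match the actual dataset.
    """
    out = df.copy()
    dates = pd.to_datetime(out[date_col])
    # Days: 1 for the earliest date in the data, increasing by one per calendar day.
    out["Days"] = (dates - dates.min()).dt.days + 1
    # Weekend: 1 for Saturday (5) and Sunday (6), 0 otherwise (Monday is 0).
    out["Weekend"] = (dates.dt.dayofweek >= 5).astype(int)
    return out


# Made-up sample dates, only to show the shape of the result (Friday to Monday).
sample = pd.DataFrame({"Date": ["2017-03-03", "2017-03-04", "2017-03-05", "2017-03-06"]})
features = add_date_features(sample)
print(features)
```

Encoding the weekend as a single 0/1 flag matches the observation that Saturday and Sunday behave alike; a full day-of-week encoding would be an alternative if the other days also differ from each other.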