Exploratory data analysis (EDA) and modelling of advertising data, carried out as part of the STAT-S 670 Exploratory Data Analysis coursework at Indiana University Bloomington.
The project addresses the following problem statements:
- How does daily time spent on advertisements help predict whether a consumer will click on an ad?
- What is the relationship between the daily time spent on site and daily internet usage?
- Which parameters, such as age and gender, affect the probability of clicking on ads?
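
The last question can be framed as a binary classification problem. The sketch below is a minimal illustration only: this README does not name the model used in the project, so logistic regression is an assumption, and the function names (`fit_logistic`, `predict_proba`) and the standardized toy values are made up for the example rather than taken from Advertising.csv.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Probability of a click for feature vector x; w[0] is the bias."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Hypothetical standardized features: [daily time on site, age].
# These rows are invented for illustration and are NOT from the dataset.
X = [[-1.2, -0.8], [-0.9, -1.1], [-0.5, -0.3], [-0.3, 0.2],
     [0.4, -0.1], [0.7, 0.6], [1.0, 0.9], [1.3, 1.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = clicked on ad, 0 = did not click

weights = fit_logistic(X, y)
print("bias, time-on-site weight, age weight:", weights)
print("P(click) for first toy user:", predict_proba(weights, X[0]))
```

The sign of each fitted weight indicates whether the corresponding variable raises or lowers the estimated click probability, which is one way to read "parameters that affect the probability of clicking" from a fitted model.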
The data was obtained from Kaggle (https://www.kaggle.com/datasets/rizdelhi/my-datasets). This project uses the Advertising.csv dataset, which contains records for 1000 users and indicates whether each internet user clicked on an advertisement on a company website. The dataset includes the variables daily time spent on site, age, area income, daily internet usage, ad topic line, city, male (a gender indicator), country, timestamp, and clicked on ad (0 = no, 1 = yes). The target variable in this exploratory data analysis project is Clicked on Ad; the other variables considered are daily time spent on site, daily internet usage, age, and gender.
The exploratory data analysis examines how the target variable, Clicked on Ad, relates to Daily Time Spent on Site and Daily Internet Usage, using scatter plots, density plots, and distribution plots. It helps identify factors that affect the click-through rate (CTR) of ads, such as age, gender, and daily internet usage.
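
As a rough illustration of how CTR can be compared across ranges of a numeric variable, the sketch below bins users by daily time spent on site and computes the share of clicks in each bin. The column headers, the helper name `ctr_by_bin`, the bin width, and the sample rows are assumptions made for the example; the rows are not taken from Advertising.csv.

```python
import csv
import io
from collections import defaultdict

# Toy rows for illustration only; NOT taken from Advertising.csv.
# The column headers are assumed spellings of the variables described above.
SAMPLE_CSV = """Daily Time Spent on Site,Daily Internet Usage,Clicked on Ad
45.0,130.0,1
50.5,140.2,1
58.0,150.0,1
62.3,190.5,0
70.1,210.0,0
78.4,230.7,0
81.0,245.3,0
55.2,160.8,1
"""

def ctr_by_bin(rows, column, bin_width):
    """Return {bin lower edge: click-through rate} for a numeric column."""
    clicks = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        lower = (float(row[column]) // bin_width) * bin_width
        totals[lower] += 1
        clicks[lower] += int(row["Clicked on Ad"])
    return {edge: clicks[edge] / totals[edge] for edge in sorted(totals)}

BIN_WIDTH = 20.0
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates = ctr_by_bin(rows, "Daily Time Spent on Site", BIN_WIDTH)
for edge, rate in rates.items():
    print(f"[{edge:.0f}, {edge + BIN_WIDTH:.0f}): CTR = {rate:.2f}")
```

To try this on the real file, replace the `io.StringIO(SAMPLE_CSV)` source with an opened Advertising.csv and adjust the column names to match its header row.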
The repository contains the R Markdown (RMD) files for data exploration and modelling, along with the final project report. The report summarizes the key points and trends found during the exploratory analysis and the variable selection used for modelling.
The project concludes that users who spend relatively little time on the site each day are more likely to click on ads, while those who spend more time on the site and use the internet for longer tend not to click. Age and gender also play a significant role in determining the CTR of ads: younger consumers are more likely to click on ads than middle-aged or older consumers. These findings can help advertisers better target their audiences and create more effective ad campaigns.