This repository contains an analysis of an insurance dataset, exploring the factors that affect insurance premiums, including smoking habits, BMI, and regional trends. The project combines exploratory data analysis with linear and polynomial regression models to derive insights.
- Identified a direct correlation between insurance premiums and smoking habits.
- Analyzed the impact of BMI, especially in conjunction with smoking, on insurance costs.
- Explored the significance of obesity as a health measure affecting insurance pricing.
- Highlighted regional insights to potentially tailor insurance policies and health campaigns.
- Proposed leveraging technology for real-time data collection to encourage healthy habits.
- The problem at hand involves analyzing a dataset related to insurance premiums. The goal is to uncover patterns and factors influencing insurance costs. Understanding these factors is crucial for insurance companies to make informed decisions regarding premium setting and policy structuring.
- The primary business goal is to enhance the efficiency of insurance premium determination. By gaining insights into influential factors such as smoking habits, BMI, and regional variations, the aim is to optimize premium pricing. Additionally, promoting healthier lifestyles could be a secondary goal, achieved through tailored policies and incentives.
- The key question driving this analysis is: "What factors significantly affect insurance premiums, and how can insurance companies adjust their policies and pricing to reflect these factors accurately?"
- Acquire the insurance dataset and perform data cleaning to ensure a reliable foundation for analysis.
- Conduct EDA to understand the dataset's structure, variables, and initial insights into the relationships between features and insurance premiums.
- Investigate the correlation between smoking habits, BMI, and insurance premiums. Analyze how smoking status and BMI values impact insurance costs.
- Examine regional data to identify areas with higher smoking prevalence. This information can guide the design of region-specific insurance policies.
- Utilize predictive models, including linear and polynomial regressions, to gain deeper insights into the relationships between various factors and insurance premiums.
- Summarize the findings and propose actionable recommendations for insurance companies based on the analysis.
- Direct correlation between insurance premiums and smoking habits.
- Smoking is associated with higher cancer risk and mortality because tobacco is carcinogenic.
- Insurance companies like New York Life, Cathay Life, and Nan Shan Life adjust premiums based on smoking habits, exemplified by policies like the "Healthy Body Policy."
- Higher BMI values, especially in conjunction with smoking, result in higher insurance premiums.
- Smokers tend to have a higher BMI; a BMI above 30 (indicative of obesity) increases insurance costs.
- BMI does not significantly impact premiums for non-smokers.
- Obesity is a chronic condition that contributes to health issues such as diabetes and cardiovascular disease.
- BMI is a crucial health measure factored into insurance pricing.
- Regional data analysis reveals areas with higher smoking prevalence.
- Opportunity for designing region-specific insurance policies and bolstering health awareness campaigns.
- Utilize real-time data collection through smart devices and wearables to incentivize healthy habits.
- Insurance companies can offer premium reductions, encouraging healthy lifestyles.
- Linear and polynomial regressions quantify how strongly each factor influences insurance premiums.
- These findings are valuable for shaping future insurance practices and policy formulations.
- Data analytics plays a pivotal role in the insurance domain, contributing to better policy design and decision-making.
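
The comparison between linear and polynomial regression described above can be sketched as follows. This is a minimal illustration, not the code used in this repository: it runs on synthetic stand-in data, the variable names (`age`, `bmi`, `smoker`, `charges`) are assumptions, and the data-generating rule (obesity raising charges only for smokers) is hard-coded to mirror the findings listed above rather than derived from the real dataset. It uses plain NumPy least squares rather than any particular modeling library.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data (NOT the real dataset); names and values are assumptions.
age = rng.integers(18, 65, size=n).astype(float)
bmi = rng.normal(30.0, 6.0, size=n)
smoker = (rng.random(n) < 0.2).astype(float)
obese = bmi > 30  # BMI above 30 treated as obesity

# Assumed rule: obesity raises charges only for smokers, plus random noise.
charges = (2000 + 250 * age + 15000 * smoker
           + 18000 * smoker * obese + rng.normal(0, 2000, size=n))


def fit_r2(X, y):
    """Fit ordinary least squares with an intercept; return training R^2."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


def poly2_features(X):
    """Degree-2 expansion: original columns, squares, and pairwise products."""
    cols = [X]
    k = X.shape[1]
    for i in range(k):
        for j in range(i, k):
            cols.append((X[:, i] * X[:, j])[:, None])
    return np.hstack(cols)


X = np.column_stack([age, bmi, smoker])
r2_linear = fit_r2(X, charges)
r2_poly = fit_r2(poly2_features(X), charges)

# Mean charges per group, to compare the BMI effect for smokers and non-smokers.
is_smoker = smoker == 1
smoker_gap = charges[is_smoker & obese].mean() - charges[is_smoker & ~obese].mean()
nonsmoker_gap = charges[~is_smoker & obese].mean() - charges[~is_smoker & ~obese].mean()

print(f"R^2 linear:     {r2_linear:.3f}")
print(f"R^2 polynomial: {r2_poly:.3f}")
print(f"Obesity gap in mean charges, smokers:     {smoker_gap:,.0f}")
print(f"Obesity gap in mean charges, non-smokers: {nonsmoker_gap:,.0f}")
```

The degree-2 expansion includes the `bmi * smoker` product, which is what lets the polynomial model pick up a BMI effect that depends on smoking status. Both R² values are computed on the training data, so a held-out split would be needed before drawing conclusions about predictive accuracy.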