Exploratory Data Analysis and application of Regression techniques.

NOTE - Please see the Jupyter notebook(.ipynb file) for complete explanation and Exploratory Data Analysis (EDA)

Project goal - Prediction of insurance charges for customers

Problem type - Regression

Project description - This work predicts insurance charges for customers based on their several physical characteristics and some other features. Exploratory Data analysis (EDA) is performed for data overview, visualization and study of statistical details. Linear regression, ADABoost Regressor and Random Forest Regressor are implemented using Python's Scikit-learn package. All three algorithms are appied on both train and test data. Ordinary Least Square(OLS) from Python's Statsmodel is used for prediction and to generate full statistical summary including p-values, correlation coefficients and F-statistics for the model. Two approaches are used for creating train and test data from the dataset.

Cross-vaidation - Using entire dataset and perform cross-validation.
Train-test splitting - Splitting the dataset, 80% for training and rest for testing. Performance metrics - R-squared values for each model on both training and testing data.

About dataset-

Source - Kaggle (Medical Insurance Personal dataset) Dependent variable - charges Independent variables - Age - Age of customer (numeric) Sex - Gender of customer (categorical, String type) Region - Areas they live in (categorical, String type) Children - Number of children they have (numeric) BMI - Body-mass index (continous, float) Smoker - Whether smokes or not (categorical, String type)

To compile and execute - python Insurance.py

Output type - Continous, float values Output parameter - Cost charges projected for each customer by predictive models.

*Note - Update input file path with local file pathname.

***************************************************** End of file ***************************************************

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
Insurance Forecast - A Regression problem.ipynb		Insurance Forecast - A Regression problem.ipynb
Insurance.csv		Insurance.csv
Insurance_Forecast.py		Insurance_Forecast.py
README.md		README.md
_config.yml		_config.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Exploratory Data Analysis and application of Regression techniques.

About

Releases

Packages

Languages

Shubha23/Insurance-Forecast-A-Regression-Problem

Folders and files

Latest commit

History

Repository files navigation

Exploratory Data Analysis and application of Regression techniques.

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages