This repository contains Data Science projects written in Python.
- Simple Linear Regression Project (Introductory): In this project, I have explained how to implement a Simple Linear Regression model and make predictions on a weather dataset.
- Simple Linear Regression Project (Advanced): In this project, I have explained different types of regression models for predicting life expectancy in a given country, based on features such as the country's GDP, fertility rate and population. I have downloaded the dataset from the Gapminder website. I have discussed feature engineering techniques along with cross-validation, regularized regression models and pipelines.
- Multiple Linear Regression Project: In this project, I have explained the implementation of a Multiple Linear Regression model. I have used the E-Commerce Customers dataset from Kaggle. I have also discussed how to evaluate the performance of the model.
- Random Forest Regression Project: In this project, I have built a model that uses a Random Forest Regressor to predict the medical insurance cost charged to an individual by a health insurance company.
- Classification Models (Project-1 - Introductory): In this project, I have explained the basics of different types of classification models and their implementations.
- Classification Models (Project-2 - Advanced): In this project, I have built a classifier to predict diabetes. I have downloaded the dataset from the UCI Machine Learning Repository. I have discussed hyperparameter tuning along with cross-validation and pipelines.
- Classification Algorithms Project - Diabetes Disease Prediction: This project consists of different classifiers that predict whether a person is diabetic or not. I have also evaluated the performance of each classification model.
- Classification Algorithms:
- K Nearest Neighbors Project: K Nearest Neighbors (kNN) is one of the simplest machine learning algorithms. In this project, I have built a kNN classifier to classify the party affiliation of United States Congressmen based on their voting records. I have used the Congressional Voting Records dataset from the UCI Machine Learning Repository.
- XGBoost Algorithm Introduction: This Jupyter notebook contains an introduction to the XGBoost model.
- XGBoost Algorithm Project: In this project, I have built an XGBoost classifier to predict Parkinson's disease.
- Logistic Regression Project - 1: In this project, I have built a classifier to predict whether a patient has a 10-year risk of future coronary heart disease (CHD). The dataset is available on the Kaggle website, and it comes from an ongoing cardiovascular study of residents of the town of Framingham, Massachusetts.
- Logistic Regression Project - 2: In this project, I have trained a Logistic Regression classifier to predict whether an individual's application for a credit card will be accepted or not. I have used the Credit Card Approval dataset from the UCI Machine Learning Repository.
- Support Vector Machines Project: In this project, I have built a Support Vector Machines classifier to classify pulsar stars. I have used the Predicting a Pulsar Star dataset, which I downloaded from the Kaggle website.
- Random Forest Classification Project: In this project, I have built a Random Forest Classifier model (with 10 decision trees) to predict the safety of a car. Accuracy generally improves as the number of decision trees increases. I have also demonstrated the feature selection process using the Random Forest model.
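Several of the projects above fit a regression model and then judge it with cross-validation. As a rough illustration of that idea only, here is a minimal, dependency-free sketch of simple linear regression scored with k-fold cross-validation. The function names, the synthetic data and the choice of R² as the score are all assumptions made for this example; they are not taken from the notebooks, which may well use a library such as scikit-learn instead.

```python
import random
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, xs):
    """Apply a fitted (slope, intercept) pair to new x values."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def r_squared(ys_true, ys_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_bar = mean(ys_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(ys_true, ys_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in ys_true)
    return 1 - ss_res / ss_tot


def k_fold_scores(xs, ys, k=5, seed=0):
    """Shuffle the row indices, split them into k folds, and score each held-out fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit_simple_linear_regression(
            [xs[i] for i in train], [ys[i] for i in train]
        )
        preds = predict(model, [xs[i] for i in held_out])
        scores.append(r_squared([ys[i] for i in held_out], preds))
    return scores


# Synthetic data (hypothetical, for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]

model = fit_simple_linear_regression(xs, ys)
scores = k_fold_scores(xs, ys, k=5)
print("slope, intercept:", model)
print("mean R^2 over 5 folds:", mean(scores))
```

Scoring on held-out folds rather than on the training rows is what makes the estimate a fair guide to how the model behaves on unseen data.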