yassmin1/machine_Learning_projects


Projects

8. Housing Price

Clustering Time Series: Analysis of Housing Prices in Bexar County

Objective: cluster housing regions in Bexar County, TX, based on their house prices and gain insight into the county's housing market.

Skills & Tools Covered

EDA, Time Series Clustering, Cluster Profiling, K-means
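As a rough illustration of the idea behind this project, the sketch below clusters price series by their shape using K-means. It is not the project's actual code: the region names and prices are made up, z-normalizing each series before clustering is an assumption, and the farthest-first centroid initialization is a simplification chosen to keep the example deterministic.

```python
import math


def z_normalize(series):
    """Rescale one price series to zero mean and unit variance."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        return [0.0 for _ in series]
    return [(x - mean) / std for x in series]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialization: start at the first point, then
    repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm: assign points, recompute means, repeat."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical monthly median prices (in $1000s) for four made-up regions.
prices = {
    "region_a": [200, 210, 220, 230, 240, 250],
    "region_b": [300, 315, 330, 345, 360, 375],
    "region_c": [250, 245, 240, 235, 230, 225],
    "region_d": [400, 392, 384, 376, 368, 360],
}
names = list(prices)
normalized = [z_normalize(prices[name]) for name in names]
labels, _ = kmeans(normalized, k=2)
clusters = dict(zip(names, labels))
print(clusters)
```

Because each series is normalized first, regions group by trend (rising versus falling) rather than by absolute price level. Clustering on raw prices would instead separate expensive regions from cheap ones.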

7. Trading Stocks

Analyze the stock data, group the stocks based on the attributes provided, and share insights about the characteristics of each group.

Skills & Tools Covered

EDA, K-means Clustering, Hierarchical Clustering, Cluster Profiling

6. Renewable Energy Wind Power

"ReneWind" is a company working on improving the machinery and processes involved in wind energy production using machine learning. It has collected sensor data on generator failures in wind turbines. The objective is to build several classification models, tune them, and find the one that best identifies failures, so that generators can be repaired before they break and the overall maintenance cost can be brought down.

Skills & Tools Covered

Upsampling and Downsampling, Regularization, Hyperparameter Tuning
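Failure data of this kind is usually imbalanced, which is where upsampling and downsampling come in. The sketch below shows the simplest random variants of both on a made-up label vector. The function names and toy data are hypothetical, and the project itself may well use a library or a different technique such as SMOTE.

```python
import random


def group_by_class(rows, labels):
    """Collect rows under their class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class matches the smallest one."""
    rng = random.Random(seed)
    by_class = group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy data: 8 healthy readings (label 0) and 2 failures (label 1).
rows = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print(over_labels.count(0), over_labels.count(1))
print(under_labels.count(0), under_labels.count(1))
```

Resampling should be applied to the training split only. Resampling before the train/test split leaks duplicated rows into the evaluation data and inflates the scores.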

5. Visa Approval

Analyze the data of visa applicants, build a predictive model to facilitate the visa approval process, and, based on the factors that significantly influence visa status, recommend profiles of applicants whose visas should be certified or denied.

Skills & Tools Covered

EDA, Data Preprocessing, Customer Profiling, Bagging Classifiers (Bagging and Random Forest), Boosting Classifiers (AdaBoost, Gradient Boosting, XGBoost), Stacking Classifier, Hyperparameter Tuning using GridSearchCV, Business Insights

4. Hotel Booking

Analyze the data of INN Hotels to find which factors strongly influence booking cancellations, build a model that predicts in advance which bookings will be canceled, and help formulate profitable policies for cancellations and refunds.

Skills & Tools Covered

EDA, Data Preprocessing, Logistic Regression, Multicollinearity, Finding the Optimal Threshold using the AUC-ROC Curve, Decision Trees, Pruning
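One common way to pick a classification threshold from a ROC curve is to choose the point that maximizes the gap between the true positive rate and the false positive rate (Youden's J statistic). The sketch below does this on made-up scores. Whether the project uses this exact criterion is an assumption, and the labels and scores are purely illustrative.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) for every distinct score used as a cut-off.

    A sample is predicted positive when its score is >= the threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def best_threshold(y_true, scores):
    """Pick the threshold that maximizes TPR - FPR (Youden's J)."""
    threshold, _, _ = max(roc_points(y_true, scores), key=lambda p: p[2] - p[1])
    return threshold


# Hypothetical labels (1 = canceled) and predicted cancellation probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.30, 0.15, 0.40, 0.60, 0.70, 0.20, 0.90]

threshold = best_threshold(y_true, scores)
print(threshold)
```

On this toy data the default cut-off of 0.5 and the selected threshold happen to give the same predictions. On real, imbalanced data the two often differ noticeably, which is the reason to tune the threshold at all.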

3. Pricing Preowned Devices

Analyze the used devices dataset, build a model which will help develop a dynamic pricing strategy for used and refurbished devices, and identify factors that significantly influence the price.

Skills & Tools Covered

EDA, Linear Regression, Linear Regression Assumptions, Business Insights and Recommendations

2. E-news Express

This project uses statistical analysis, A/B testing, and visualization to decide whether the new landing page of an online news portal (E-news Express) is effective at gathering new subscribers. The simulated dataset includes metrics such as conversion status and time spent on the page, which are used to assess the effectiveness of the new landing page. The project also analyzes whether conversion depends on the user's preferred language.

Skills & Tools Covered

Hypothesis Testing, A/B Testing, Data Visualization, Statistical Inference
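A conversion comparison like this one is often framed as a one-sided two-proportion z-test. The sketch below shows that test with the standard library only. The conversion counts are hypothetical and not taken from the project's dataset, and the choice of a z-test and a 5% significance level is an assumption.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_new, n_new, conv_old, n_old):
    """One-sided z-test of H0: p_new <= p_old against H1: p_new > p_old.

    Returns the z statistic and the p-value.
    """
    p_new = conv_new / n_new
    p_old = conv_old / n_old
    # Pooled conversion rate under the null hypothesis of equal proportions.
    pooled = (conv_new + conv_old) / (n_new + n_old)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_new + 1 / n_old))
    z = (p_new - p_old) / std_err
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts: conversions out of visitors for each landing page.
z, p_value = two_proportion_ztest(conv_new=33, n_new=50, conv_old=21, n_old=50)

alpha = 0.05  # assumed significance level
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The normal approximation behind this test is reasonable only when each group has enough conversions and non-conversions. For very small samples an exact test is the safer choice.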

1. FoodHub Order Analysis using Python

The food aggregator company has stored the data of the different orders made by the registered customers in their online portal. They want to analyze the data to draw some actionable insights for the business. Suppose you are hired as a Data Scientist in this company and the Data Science team has shared some of the key questions that need to be answered. Perform the data analysis to find answers to these questions that will help the company to improve the business.

Skills & Tools Covered

Exploratory Data Analysis (Variable Identification, Univariate Analysis, Bivariate Analysis), Python