GitHub - KAFSALAH/IBM_MachineLearning: This repository encompasses various techniques of Regression, Classification, Clustering, Dimensionality Reduction, Deep Learning, and Recommendation Systems.

Welcome to my IBM—ML Repository 😄

This repository aims to build highly interpretable and accurate machine learning models that balance variance, bias, and time complexity. The machine learning models are built with Scikit-Learn, and the deep learning models with Keras 💡

Courses

The repository also contains the hands-on labs of six machine learning courses created by IBM, which cover numerous ML concepts in both depth and breadth.

01 - Exploratory Data Analysis

Hands-on Labs: SQL, Hypothesis Testing, Features Transformation, Scaling, Skewness & Importance.

02 - Supervised Machine Learning [Regression]

Hands-on Labs: Cross-Validation, Ridge, Lasso, ElasticNet, Pipelines.

03 - Supervised Machine Learning [Classification]

Hands-on Labs: Logistic Regression, K-Nearest Neighbor, Support Vector Machine, Decision Tree, Random Forest, Extra Trees, Ensemble, Bagging, Boosting, Stacking, Model-Agnostic, Resampling Techniques.

04 - Unsupervised Machine Learning

Hands-on Labs: Principal Component Analysis, Distance Metrics, Inertia & Distortion, K-means, Hierarchical, DBSCAN, Mean Shift Clustering.

05 - Deep Learning and Reinforcement Learning

Hands-on Labs: Gradient Descent, Backpropagation, Artificial NN, Convolutional NN, Recurrent NN.

06 - IBM ML Capstone Project — Online Courses Recommender System

Hands-on Labs: Bag of Words, User-Profile Recommendation, Similarity-Index Recommendation.

Capstone Projects

You are welcome to explore my findings in the personal capstone projects I created during my learning journey.

A - Treatment Costs per Person - Exploratory & Predictive Analysis

• Aim: predict the cost of medical treatments based on six features, namely, age, sex, BMI, children, smoking status, and region.

• Procedure: In-depth EDA via pair, bar, box, violin, and regression plots to see the effect of smoking on charges. Hypothesis testing on the relationship between treatment costs and smoking status.

• Findings: The test indicates that a person charged $35K or more is likely a smoker (p-value = 0.023, reported confidence level = 0.977).
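The hypothesis-testing step above could be sketched as follows. This is only an illustration under stated assumptions: the README does not say which test was used, so a one-sided Welch t-test is swapped in, and the charges are synthetic stand-ins rather than the project's dataset.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the "charges" column, split by smoking status.
# The means and spreads below are assumptions, not values from the project.
rng = np.random.default_rng(0)
smoker_charges = rng.normal(loc=32000, scale=11000, size=200)
nonsmoker_charges = rng.normal(loc=8500, scale=6000, size=800)

# H0: smokers and non-smokers have the same mean charge.
# H1: smokers have a higher mean charge (one-sided Welch t-test).
t_stat, p_value = stats.ttest_ind(
    smoker_charges,
    nonsmoker_charges,
    equal_var=False,        # Welch's variant: do not assume equal variances
    alternative="greater",  # one-sided alternative
)

alpha = 0.05
reject_null = p_value < alpha
print(f"t = {t_stat:.2f}, p = {p_value:.3g}, reject H0: {reject_null}")
```

A small p-value here only says that the two group means differ; it does not by itself give the probability that a given high-charge person is a smoker.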

B - Forecasting Photovoltaic Generated Power - Regression Analysis

• Aim: create a regression model that predicts the generated power by PV panels to facilitate energy management in power plants.

• Procedure: Build a pipeline encompassing polynomial transformation, standard scaling, and a regressor model. Then, apply GridSearchCV for hyperparameter tuning and benchmark the ordinary Linear, Lasso, Ridge, Elastic Net, and Gradient Boosting regressors.

• Findings: The winner is the Gradient Boosting Regressor model with an R² score of ~0.79.
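A minimal sketch of the pipeline-plus-grid-search procedure described above, assuming synthetic data and a Ridge regressor as the example model. The feature meanings, parameter grid, and step names (`poly`, `scale`, `model`) are illustrative assumptions, not the project's actual configuration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for PV data: two features (hypothetically irradiance
# and temperature) and a nonlinear target with a little noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(300, 2))
y = 3.0 * X[:, 0] ** 2 + 1.5 * X[:, 0] * X[:, 1] + rng.normal(0, 0.05, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Polynomial transformation -> standard scaling -> regressor.
pipeline = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("scale", StandardScaler()),
    ("model", Ridge()),
])

# Grid over the polynomial degree and the regularization strength.
param_grid = {
    "poly__degree": [1, 2, 3],
    "model__alpha": [0.01, 0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

test_r2 = search.score(X_test, y_test)
print(search.best_params_, round(test_r2, 3))
```

Benchmarking other regressors amounts to swapping the `model` step (for example `Lasso`, `ElasticNet`, or `GradientBoostingRegressor`) and adjusting the grid keys to that estimator's parameters.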

C - Fault Classification in Photovoltaic Plants - Multi-Class Classification Analysis

• Aim: Classify the faults that might occur in photovoltaic panels, namely, Short-Circuit, Open-Circuit, Degradation, and Shadowing.

• Procedure: Stratified data split, feature scaling, and re-weighting of the imbalanced classes. Then, apply GridSearchCV for hyperparameter tuning and benchmark Logistic Regression, Decision Tree, and Random Forest.

• Findings: The winner is the Decision Tree algorithm with an accuracy and a weighted F1-score of ~ 97%.
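The procedure above (stratified split, scaling, class re-weighting, grid search) might look roughly like this. The data is synthetic, the four classes merely stand in for the four fault types, and the grid values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced 4-class problem (a stand-in for the fault data).
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5, n_redundant=0,
    n_classes=4, n_clusters_per_class=1,
    weights=[0.55, 0.25, 0.15, 0.05], random_state=0,
)

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    # class_weight="balanced" re-weights classes inversely to their frequency.
    ("model", DecisionTreeClassifier(class_weight="balanced", random_state=0)),
])
param_grid = {
    "model__max_depth": [3, 5, 8, None],
    "model__min_samples_leaf": [1, 5],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1_weighted")
search.fit(X_train, y_train)

weighted_f1 = f1_score(y_test, search.predict(X_test), average="weighted")
print(search.best_params_, round(weighted_f1, 3))
```

Tree models do not need feature scaling, but keeping the scaler in the pipeline lets the same structure be reused when the `model` step is swapped for Logistic Regression.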

D - Date Fruit Segmentation & Dimensionality Reduction via PCA - Unsupervised Analysis

• Aim: Cluster date fruits based on their physical features.

• Procedure: Check multicollinearity, scale data, and reduce the number of features via PCA. Then, apply a comparative analysis between K-means, Agglomerative, Mean Shift & DBSCAN clustering.

• Findings: The winner is the k-means++ technique. Also, an accuracy of 76% was achieved with only two principal components.
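As a rough sketch of the scale, reduce, and cluster steps above, assuming synthetic blob data in place of the date fruit features and three clusters chosen arbitrarily for the example:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 300 samples with 10 features in 3 groups.
X, _ = make_blobs(
    n_samples=300, n_features=10, centers=3, cluster_std=1.0, random_state=0
)

# Scale first so that no single feature dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)

# Keep only the first two principal components.
pca = PCA(n_components=2, random_state=0)
X_2d = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# K-means with k-means++ initialization on the reduced data.
kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(X_2d)

sil = silhouette_score(X_2d, labels)
print(f"explained variance: {explained:.2f}, silhouette: {sil:.2f}")
```

The silhouette score is used here only as a label-free quality check; it is not the accuracy figure reported in the findings.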

E - MRI Brain Tumor Classification via CNN - Deep Learning Analysis

• Aim: Detect whether a patient has a brain tumor or not.

• Procedure: Convert the images to NumPy arrays and scale them. Build a convolutional network and train the CNN model to classify brain tumors. Then, deploy the deep learning model using a Flask app.

• Findings: The CNN model accuracy is 97%.
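The image preprocessing step mentioned in the procedure can be sketched with NumPy alone. The batch size, the 64x64 resolution, and the single grayscale channel are assumptions made for this example; the README does not state the actual image format.

```python
import numpy as np

# Stand-in for a batch of 8 grayscale scans stored as 8-bit pixels.
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(8, 64, 64), dtype=np.uint8)

# Scale pixel intensities from [0, 255] to [0, 1] as 32-bit floats.
images = raw_images.astype(np.float32) / 255.0

# Add a trailing channel axis, the layout a 2D convolutional layer
# typically expects: (batch, height, width, channels).
images = images[..., np.newaxis]

print(images.shape, images.dtype, images.min(), images.max())
```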

F - Personalized Course Recommendation System for Data Science Learners.

• Aim: To build a recommendation system that recommends the most suitable courses for learners on educational platforms.

• Procedure: As listed in the findings, several techniques are used to build the recommendation system.

• Findings: The recommender system is built with eight approaches. The first three are content-based:

Approach 1 - Content-Based Recommender Using User Profile and Course Genres

Approach 2 - Content-Based Recommender Using Course Similarities

Approach 3 - Content-Based Recommender Using PCA Clustering
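To illustrate the idea behind Approach 1, here is a minimal sketch of scoring courses from a user profile and course genres. The course IDs, genre names, and ratings are hypothetical, and the dot-product scoring is one common formulation rather than a confirmed detail of this project.

```python
import numpy as np

# Hypothetical catalogue: rows are courses, columns are genres.
course_ids = ["ML0101", "DL0101", "DB0101", "PY0101"]
genres = ["MachineLearning", "Python", "Database"]
course_genres = np.array([
    [1, 1, 0],  # ML0101
    [1, 1, 0],  # DL0101
    [0, 0, 1],  # DB0101
    [0, 1, 0],  # PY0101
])

# Ratings the user gave to courses already taken (0 = not taken).
user_ratings = np.array([3, 0, 0, 2])

# User profile: rating-weighted sum of the genres of the taken courses.
user_profile = user_ratings @ course_genres

# Score every course by how well its genres match the profile.
scores = course_genres @ user_profile

# Recommend only courses the user has not taken, best score first.
unseen = np.flatnonzero(user_ratings == 0)
ranked = sorted(
    ((course_ids[i], int(scores[i])) for i in unseen),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)  # [('DL0101', 8), ('DB0101', 0)]
```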

• Findings: The remaining five approaches use collaborative filtering and are compared by RMSE:

Approach 4 - Collaborative-Filtering Recommender Using K Nearest Neighbor

Approach 5 - Collaborative-Filtering Recommender Using Non-negative Matrix Factorization

Approach 6 - Collaborative-Filtering Recommender Using Neural Networks

Approach 7 - Collaborative-Filtering Recommender Using Embedding Features Regression

Approach 8 - Collaborative-Filtering Recommender using Embedding Features Classification
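As an illustration of how a collaborative-filtering approach can be scored by RMSE, the sketch below factorizes a tiny made-up rating matrix with scikit-learn's `NMF`. Two simplifications are assumed: the ratings are invented, and `NMF` treats the zeros as real values rather than as missing entries, which a production recommender would handle differently.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical user-by-course rating matrix (0 = no rating).
ratings = np.array([
    [3, 2, 0, 3],
    [3, 3, 0, 2],
    [0, 0, 3, 0],
    [2, 0, 3, 0],
    [0, 3, 2, 3],
], dtype=float)

# Factorize ratings ~= user_factors @ item_factors with non-negative factors.
model = NMF(n_components=2, init="nndsvda", max_iter=1000, random_state=0)
user_factors = model.fit_transform(ratings)  # shape (5, 2)
item_factors = model.components_             # shape (2, 4)
predicted = user_factors @ item_factors

# RMSE over the entries that actually carry a rating.
observed = ratings > 0
rmse = float(np.sqrt(np.mean((ratings[observed] - predicted[observed]) ** 2)))
print(f"RMSE on observed ratings: {rmse:.3f}")
```

A fair comparison between approaches would compute the RMSE on held-out ratings rather than on the entries used for fitting.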

Acknowledgment

My friend Mohamad Osman's ML-Repo has been a great source of inspiration. I encourage you to have a look at his remarkable work.
