Vishal-sys-code/machine-learning-complete-guide


CS:ML101: The Complete Machine Learning Guide

This repository provides a comprehensive guide to Machine Learning (ML), covering everything from foundational concepts to advanced techniques and their practical applications. It is designed to help learners understand how to approach machine learning problems, implement solutions, and optimize performance using industry-standard tools and techniques.


Table of Contents

  • Python Classes
    • Advanced Python: Decorators, Namespaces, Iterators, Iterables, and Iteration
    • Object-Oriented Programming: classes, objects, encapsulation, static methods, inheritance, polymorphism, and abstraction
  • ML Libraries
    • NumPy: [Fundamentals, Advanced, NumPy Tricks, and Crash Course]
    • Pandas: [Series (I & II), DataFrames (I & II), and Crash Course]
    • Matplotlib: [Fundamental Plotting, Advanced Plotting, and Crash Course]
    • Seaborn
  • Play With Data
    • Working with CSV
    • Understanding your data
  • Exploratory Data Analysis (EDA)
    • EDA with Univariate Analysis
    • EDA with Bivariate and Multivariate Analysis
  • Feature Engineering
    • Feature Scaling and Encoding: Standardization, Normalization, Ordinal and Label Encoding, Nominal Encoding (One-Hot Encoding)
    • Feature Transformation: Column Transformer, Function Transformer, Power Transformer (Box-Cox, Yeo-Johnson), Binning and Binarization.
    • Machine Learning Pipelines: Titanic dataset with and without Pipeline
    • Handling Missing Values: Mixed Variable, Date and Time Variable, Complete Case Analysis, Numerical Data, Categorical Data, Random Sample Imputer, Missing Indicator, Auto Select Imputer, KNN Imputer, Iterative Imputation.
    • Outliers: Outliers Removal using Z Score, IQR Range, Winsorization
  • Feature Construction and Feature Splitting
    • Feature Construction
    • Feature Splitting
  • Regression and Gradient Descent
    • Simple Linear Regression
    • Custom Simple Linear Regression
    • Regression Metrics: Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, Adjusted R2 Score
    • Multiple Linear Regression
    • Custom Multiple Linear Regression
    • Gradient Descent Step by Step
    • Custom Function for Gradient Descent
    • Gradient Descent Calculation of Slopes and Intercepts
    • Batch Gradient Descent
    • Stochastic Gradient Descent
    • Mini Batch Gradient Descent
    • Polynomial Regression
    • Ridge Regression
    • Lasso Regression
    • Elastic Net Regression
    • Logistic Regression
    • Classification Metrics: Accuracy, Confusion Matrix, Precision, Recall, F1 Score
    • Softmax Regression
    • Polynomial Logistic Regression
    • Logistic Regression Hyperparameters Tuning
  • Decision Trees and Regression Trees
    • Decision Trees
    • Regression Trees
  • Ensemble Learning
    • Voting Ensemble: Classification and Regression
    • Bagging Ensemble: Implementation, Classification and Regression
  • Random Forest
    • Random Forest Implementation
    • Bias Variance Trade-Off in Random Forest
    • Bagging vs Random Forest
    • Random Forest Hyper Parameter Tuning
    • Out-of-Bag Score [OOB Score]
    • Feature Importance Using Random Forest
  • AdaBoost Classifier [Type of Boosting Algorithm]
    • AdaBoost Algorithm Implementation
    • AdaBoost Hyperparameters [Tuning & Implementing GridSearchCV]
  • K-Means Clustering
    • K-Means Clustering using scikit-learn (sklearn) and the Elbow Method
    • Implementation of K-Means from Scratch
    • Hierarchical Clustering (Implementation of Agglomerative Clustering)
    • Implementation of DBSCAN Clustering (for any shape of clusters)
  • Gradient Boosting [Type of Boosting Algorithm]
    • Implementing Gradient Boosting [Step by Step Approach]
  • K-Nearest Neighbors [a Lazy Learning Algorithm]
    • Implementation of K-Nearest Neighbors
  • Imbalanced Dataset
    • Implementation of an Imbalanced Dataset
    • Random Resampling: Oversampling and Undersampling
    • Implementation of SMOTE for an Imbalanced Dataset
  • Naive Bayes
    • Implementation of Naive Bayes Algorithm
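
Many of the regression topics above revolve around the same update loop. As a rough sketch of the "Gradient Descent Step by Step" and "Custom Simple Linear Regression" ideas (the toy data and hyperparameters below are illustrative, not taken from the course notebooks):

```python
import numpy as np

# Toy data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)
y = 3 * X + 2 + rng.normal(0, 0.5, size=50)

# Batch gradient descent on the MSE loss
m, b = 0.0, 0.0   # slope and intercept
lr = 0.01         # learning rate
for _ in range(5000):
    error = (m * X + b) - y
    m -= lr * 2 * np.mean(error * X)  # dMSE/dm
    b -= lr * 2 * np.mean(error)      # dMSE/db

print(m, b)  # should land near 3 and 2
```

Swapping the full-batch mean for one random sample (or a small batch) per step turns this into the stochastic and mini-batch variants listed above.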

All general code snippets: machine-learning.py

Machine Learning Templates [ReadMe]: Machine_Learning_Template
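
A recurring theme in the Feature Engineering section is chaining imputation, scaling, and encoding together with the final model, as in the "Machine Learning Pipelines: Titanic dataset" notebooks. A minimal scikit-learn sketch (the tiny Titanic-like frame below is made up for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up Titanic-like data, just to exercise the pipeline
df = pd.DataFrame({
    "age":  [22, 38, np.nan, 35, 28, np.nan, 54, 2],
    "fare": [7.25, 71.28, 7.92, 53.1, 8.05, 8.46, 51.86, 21.08],
    "sex":  ["male", "female", "female", "female", "male", "male", "male", "female"],
    "survived": [0, 1, 1, 1, 0, 0, 0, 1],
})
X, y = df.drop(columns="survived"), df["survived"]

# Impute + scale the numeric columns, one-hot encode the categorical one
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), ["age", "fare"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
])

# One estimator end to end: fit/predict runs every step in order
clf = Pipeline([("prep", preprocess), ("model", LogisticRegression())])
clf.fit(X, y)
print(clf.predict(X))
```

Because the preprocessing lives inside the pipeline, the same object can be cross-validated or grid-searched without leaking statistics from the validation folds.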

TODO

  • Publishing notes on each algorithm step by step.
  • Assignments and class recitations
  • Implementing Bagging for Classification Problems
  • Implementing Gradient Boosting for Classification Problems

Contact Information

If you have any questions or feedback regarding this course or the repository, feel free to reach out.

If you encounter problems or have suggestions for improvements, please open an issue on GitHub!