Lecturer: Hossein Hajiabolhassan
The Webpage of the Course: Applied Machine Learning 2019
Data Science Center, Shahid Beheshti University
-
- Lecture 1: Toolkit Lab (Part 1)
- Lecture 2: Introduction
- Lecture 3: Empirical Risk Minimization
- Lecture 4: PAC Learning
- Lecture 5: The Bias-Complexity Tradeoff
- Lecture 6: The VC-Dimension
- Lecture 7: Toolkit Lab (Part 2)
- Lecture 8: Linear Predictors
- Lecture 9: Decision Trees
- Lecture 10: Nearest Neighbor
- Lecture 11: Ensemble Methods
- Lecture 12: Model Selection and Validation
- Lecture 13: Neural Networks
- Lecture 14: Convex Learning Problems
- Lecture 15: Regularization and Stability
- Lecture 16: Support Vector Machines
-
Course Overview
Machine learning is an area of artificial intelligence that gives systems the ability to learn automatically. It allows machines to handle new situations through analysis, self-training, observation, and experience. The remarkable success of machine learning has made it the method of choice for artificial intelligence experts. In this course, we review the fundamentals and algorithms of machine learning.
Main Textbooks:
- Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David
- An Introduction to Statistical Learning: with Applications in R by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
Additional Textbooks:
- Machine Learning Mastery With Python by Jason Brownlee
- Introduction to Machine Learning with Python: A Guide for Data Scientists by Andreas Mueller and Sarah Guido
- Pattern Recognition and Machine Learning by Christopher Bishop
Recommended Slides & Papers:
Lecture 1: Toolkit Lab (Part 1)
Required Reading:
Additional Reading:
Lecture 2: Introduction
Required Reading:
- Slide: Machine Learning: Types of Machine Learning I by Javier Bejar
- Slide: Machine Learning: Types of Machine Learning II by Javier Bejar
Lecture 3: Empirical Risk Minimization
Required Reading:
- Chapter 2 of Understanding Machine Learning: From Theory to Algorithms (A Formal Model – The Statistical Learning Framework & Empirical Risk Minimization)
- Exercises: 2.1, 2.2, and 2.3
- Slide: Machine Learning by Roland Kwitt
- Slide: Lecture 1 by Shai Shalev-Shwartz
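To make the ERM rule concrete, here is a minimal NumPy sketch that picks the threshold classifier minimizing the empirical (0-1) risk over a tiny finite hypothesis class; the toy data and the class of thresholds are illustrative assumptions, not part of the textbook.

```python
import numpy as np

# Toy 1-D training sample (illustrative only).
X = np.array([-2.0, -1.0, 0.5, 1.5, 2.5])
y = np.array([0, 0, 1, 1, 1])

# A tiny finite hypothesis class: threshold predictors h_t(x) = 1[x >= t].
thresholds = np.linspace(-3, 3, 61)

def empirical_risk(t):
    """Average 0-1 loss of the threshold predictor h_t on the sample."""
    return np.mean((X >= t).astype(int) != y)

# ERM: return a hypothesis minimizing the empirical risk over the class.
best_t = min(thresholds, key=empirical_risk)
print(f"ERM threshold: {best_t:.2f}, empirical risk: {empirical_risk(best_t):.2f}")
```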
Lecture 4: PAC Learning
Required Reading:
- Chapter 3 of Understanding Machine Learning: From Theory to Algorithms
- Exercises: 3.2, 3.3, 3.4, 3.5, 3.6, 3.7
- Slide: Machine Learning by Roland Kwitt
- Slide: Lecture 2 by Shai Shalev-Shwartz
Lecture 5: The Bias-Complexity Tradeoff
Required Reading:
- Chapter 5 of Understanding Machine Learning: From Theory to Algorithms
- Exercise: 5.2
- Slide: Machine Learning by Roland Kwitt
- Slide: Lecture 3 by Shai Shalev-Shwartz
- Paper: The Bias-Variance Dilemma by Raul Rojas
Additional Reading:
- NoteBook: Exploring the Bias-Variance Tradeoff by Kevin Markham
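As a complement to the readings, here is a minimal scikit-learn sketch of the bias-complexity tradeoff, fitting polynomials of increasing degree to synthetic data; the data-generating process and the chosen degrees are assumptions for illustration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic data: a sine curve plus noise (an assumption for illustration).
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=80)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Low degree -> high bias (underfits); high degree -> high variance (overfits).
for degree in [1, 3, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```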
Lecture 6: The VC-Dimension
Required Reading:
- Chapter 6 of Understanding Machine Learning: From Theory to Algorithms
- Exercises: 6.2, 6.4, 6.6, 6.9, 6.10, and 6.11
- Slide: Machine Learning by Roland Kwitt
Lecture 7: Toolkit Lab (Part 2)
Required Reading:
- Machine Learning Mastery With Python by Jason Brownlee
- Data Exploration:
- NoteBook: Titanic 1 – Data Exploration by John Stamford
- NoteBook: Kaggle Titanic Supervised Learning Tutorial
- NoteBook: An Example Machine Learning Notebook by Randal S. Olson
- Homework: Take Kaggle's 7-Day Machine Learning Challenge; this track will get you started with hands-on machine learning quickly.
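In the spirit of the Titanic notebooks listed above, here is a minimal pandas data-exploration sketch; the file name titanic.csv and its columns are assumptions, so substitute your own dataset.

```python
import pandas as pd

# The file name is hypothetical; point this at your own copy of the data.
df = pd.read_csv("titanic.csv")

print(df.shape)           # rows and columns
print(df.head())          # first few records
print(df.describe())      # summary statistics for numeric columns
print(df.isnull().sum())  # missing values per column
```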
Lecture 8: Linear Predictors
Required Reading:
- Chapter 9 of Understanding Machine Learning: From Theory to Algorithms
- Exercises: 9.1, 9.3, 9.4, and 9.6
- Slide: Machine Learning by Roland Kwitt
- Slide: Tutorial 3: Consistent linear predictors and Linear regression by Nir Ailon
- NoteBook: Perceptron in Scikit by Chris Albon
- Paper: Perceptron for Imbalanced Classes and Multiclass Classification by Piyush Rai
Additional Reading:
- NoteBook: Linear Regression by Kevin Markham
- Paper: Matrix Differentiation by Randal J. Barnes
- Lecture: Logistic Regression by Cosma Shalizi
- NoteBook: Logistic Regression-Analysis by Nitin Borwankar
- NoteBook: Logistic Regression by Kevin Markham
- Infographic and Code: Simple Linear Regression (100 Days Of ML Code) by Avik Jain
- Infographic and Code: Multiple Linear Regression (100 Days Of ML Code) by Avik Jain
- Infographic and Code: Logistic Regression (100 Days Of ML Code) by Avik Jain
R (Programming Language):
- Book: Machine Learning Mastery With R by Jason Brownlee
- Blog: Linear Regression by UC Business Analytics R Programming Guide
- Blog: Linear Regression with lm() by Nathaniel D. Phillips
- Blog: Logistic Regression by UC Business Analytics R Programming Guide
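A minimal sketch of fitting a linear predictor with scikit-learn's Perceptron, in the spirit of the notebook listed above; the bundled Iris data is just a convenient stand-in.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardizing the features usually helps the perceptron converge.
scaler = StandardScaler().fit(X_train)
clf = Perceptron(max_iter=1000, tol=1e-3, random_state=0)
clf.fit(scaler.transform(X_train), y_train)
print("Test accuracy:", clf.score(scaler.transform(X_test), y_test))
```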
Lecture 9: Decision Trees
Required Reading:
- Chapter 18 of Understanding Machine Learning: From Theory to Algorithms
- Exercise: 18.2
- Slide: Decision Trees by Nicholas Ruozzi
- Slide: Representation of Boolean Functions by Troels Bjerre Sørensen
- Slide: Overfitting in Decision Trees by Reid Johnson
- NoteBook: Decision Trees
Additional Reading:
- Paper: Do We Need Hundreds of Classifiers to Solve Real World Classification Problems? by Manuel Fernandez-Delgado, Eva Cernadas, Senen Barro, and Dinani Amorim
- Blog: Random Forest Classifier Example by Chris Albon. This tutorial is based on Yhat’s 2013 tutorial on Random Forests in Python.
- NoteBook: Titanic Competition with Random Forest by Chris Albon
- Infographic and Code: Decision Trees (100 Days Of ML Code) by Avik Jain
R (Programming Language):
- Book: Machine Learning Mastery With R by Jason Brownlee
- Blog: Decision Tree Classifier Implementation in R by Rahul Saxena
- Blog: Regression Trees by UC Business Analytics R Programming Guide
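A minimal scikit-learn decision-tree sketch; the Iris data and the depth limit (a simple guard against the overfitting discussed in the slides) are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Limiting the depth is a simple guard against overfitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))
```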
Lecture 10: Nearest Neighbor
Required Reading:
- Chapter 19 (Section 1) of Understanding Machine Learning: From Theory to Algorithms
- Slide: Nearest Neighbor Classification by Vivek Srikumar
- NoteBook: k-Nearest Neighbors
Additional Reading:
- NoteBook: Training a Machine Learning Model with Scikit-Learn by Kevin Markham
- NoteBook: Comparing Machine Learning Models in Scikit-Learn by Kevin Markham
- Infographic: K-Nearest Neighbours (100 Days Of ML Code) by Avik Jain
R (Programming Language):
- Book: Machine Learning Mastery With R by Jason Brownlee
- Blog: Knn Classifier Implementation in R with Caret Package by Rahul Saxena
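A minimal scikit-learn k-nearest-neighbors sketch; the Iris data and k = 5 are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each point is labeled by a majority vote among its k nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("Test accuracy:", knn.score(X_test, y_test))
```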
Lecture 11: Ensemble Methods
Required Reading:
- Chapter 10 of Understanding Machine Learning: From Theory to Algorithms and Chapter 8 of An Introduction to Statistical Learning: with Applications in R
- Exercises: 10.1, 10.3, 10.4, and 10.5 from Understanding Machine Learning: From Theory to Algorithms
- Slide: Bagging and Random Forests by David Rosenberg
- Slide: Ensemble Learning through Diversity Management: Theory, Algorithms, and Applications by Huanhuan Chen and Xin Yao
- Slide: Machine Learning by Roland Kwitt
- Slide: Introduction to Machine Learning (Boosting) by Shai Shalev-Shwartz
- Paper: Ensemble Methods in Machine Learning by Thomas G. Dietterich
- NoteBook: AdaBoost
Additional Reading:
- Blog: Ensemble Methods by Rai Kapil
- Blog: Boosting, Bagging, and Stacking — Ensemble Methods with sklearn and mlens by Robert R.F. DeFilippi
- NoteBook: Introduction to Python Ensembles by Sebastian Flennerhag
- Library (ML-Ensemble): Graph handles for deep computational graphs and ready-made ensemble classes for ensemble networks by Sebastian Flennerhag
- NoteBook: Ensemble Methods by Vadim Smolyakov
- Paper: On Agnostic Boosting and Parity Learning by A. T. Kalai, Y. Mansour, and E. Verbin
R (Programming Language):
- Book: Machine Learning Mastery With R by Jason Brownlee
- Blog: Random Forests by UC Business Analytics R Programming Guide
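A minimal scikit-learn AdaBoost sketch to accompany the boosting readings; the breast-cancer data and the number of boosting rounds are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost combines many weak learners (by default, depth-1 decision stumps),
# reweighting the training examples after each round.
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
ada.fit(X_train, y_train)
print("Test accuracy:", ada.score(X_test, y_test))
```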
Lecture 12: Model Selection and Validation
Required Reading:
- Chapter 11 of Understanding Machine Learning: From Theory to Algorithms
- Exercises: 11.1 and 11.2 from Understanding Machine Learning: From Theory to Algorithms
- Tutorial: Learning Curves for Machine Learning in Python by Alex Olteanu
- Blog: K-Fold and Other Cross-Validation Techniques by Renu Khandelwal
- NoteBook: Split the Dataset Using Stratified K-Folds Cross-Validator
- Blog: Hyperparameter Tuning the Random Forest in Python by Will Koehrsen
- Blog: Hyperparameter Optimization: Explanation of Automatized Algorithms by Dawid Kopczyk
Additional Reading:
- NoteBook: Cross Validation by Ritchie Ng
- NoteBook: Cross Validation With Parameter Tuning Using Grid Search by Chris Albon
- Blog: Random Test/Train Split is not Always Enough by Win-Vector
- Slide: Cross-Validation: What, How and Which? by Pradeep Reddy Raamana
- Paper: Algorithms for Hyper-Parameter Optimization (NIPS 2011) by J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl
- Library: Yellowbrick (Machine Learning Visualization)
R (Programming Language):
- Book: Machine Learning Mastery With R by Jason Brownlee
- Blog: Resampling Methods by UC Business Analytics R Programming Guide
- Blog: Linear Model Selection by UC Business Analytics R Programming Guide
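A minimal sketch combining stratified k-fold cross-validation with grid search for hyperparameter tuning, as covered in the readings above; the model and parameter grid are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Stratified folds preserve the class proportions in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
    cv=cv,
)
grid.fit(X, y)
print("Best parameters:", grid.best_params_)
print("Cross-validated accuracy:", grid.best_score_)
```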
Lecture 13: Neural Networks
Required Reading:
- Chapter 20 of Understanding Machine Learning: From Theory to Algorithms
- Slide: Neural Networks by Shai Shalev-Shwartz
- Blog: 7 Types of Neural Network Activation Functions: How to Choose?
- Blog: Activation Functions
- Blog: Back-Propagation, an Introduction by Sanjeev Arora and Tengyu Ma
Additional Reading:
- Blog: The Gradient by Khanacademy
- Blog: Activation Functions by Dhaval Dholakia
- Paper: Why Does Deep & Cheap Learning Work So Well? by Henry W. Lin, Max Tegmark, and David Rolnick
- Slide: Basics of Neural Networks by Connelly Barnes
R (Programming Language):
- Blog: Classification Artificial Neural Network by UC Business Analytics R Programming Guide
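A minimal NumPy sketch of the activation functions discussed in the blogs above; the sample inputs are arbitrary.

```python
import numpy as np

# Three common activation functions; their shapes drive the choice in practice.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])  # arbitrary sample inputs
print("sigmoid:", sigmoid(z))
print("tanh:   ", np.tanh(z))
print("relu:   ", relu(z))
```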
Lecture 14: Convex Learning Problems
Required Reading:
- Chapter 12 of Understanding Machine Learning: From Theory to Algorithms
- Slide: Machine Learning by Roland Kwitt
Additional Reading:
- Blog: Escaping from Saddle Points by Rong Ge
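A minimal NumPy sketch of gradient descent on a convex objective (least squares); the synthetic data and step size are assumptions chosen so the iteration converges.

```python
import numpy as np

# Synthetic linear data (an assumption for illustration).
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.randn(100)

# Gradient descent on the convex least-squares objective
# f(w) = (1/2m) * ||Xw - y||^2, whose gradient is X^T (Xw - y) / m.
w = np.zeros(3)
eta = 0.1  # step size, assumed small enough for convergence
for _ in range(500):
    w -= eta * (X.T @ (X @ w - y)) / len(y)
print("Recovered weights:", np.round(w, 3))
```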
Lecture 15: Regularization and Stability
Required Reading:
- Chapter 13 of Understanding Machine Learning: From Theory to Algorithms
- Slide: Machine Learning by Roland Kwitt
- Blog: L1 and L2 Regularization by Renu Khandelwal
- Blog: L1 Norm Regularization and Sparsity Explained for Dummies by Shi Yan
Additional Reading:
- NoteBook: Regularization by Ethen
R (Programming Language):
- Book: Machine Learning Mastery With R by Jason Brownlee
- Blog: Regularized Regression by UC Business Analytics R Programming Guide
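A minimal scikit-learn sketch contrasting L2 (ridge) and L1 (lasso) regularization on synthetic data where only two features matter; note how the L1 penalty drives the irrelevant coefficients to exactly zero, the sparsity effect explained in the readings above.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data in which only the first two of ten features are relevant.
rng = np.random.RandomState(0)
X = rng.randn(100, 10)
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(100)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty: shrinks all coefficients
lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty: zeroes out irrelevant ones
print("Ridge coefficients:", np.round(ridge.coef_, 2))
print("Lasso coefficients:", np.round(lasso.coef_, 2))
```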
Lecture 16: Support Vector Machines
Required Reading:
- Chapter 15 of Understanding Machine Learning: From Theory to Algorithms
- Slide: Support Vector Machines and Kernel Methods by Shai Shalev-Shwartz
- Blog: Support Vector Machine (SVM) by Ajay Yadav
- Blog: Support Vector Machine vs Logistic Regression by Georgios Drakos
Additional Reading:
- Infographic: Support Vector Machines (100 Days Of ML Code) by Avik Jain
- Markdown (NoteBook)
R (Programming Language):
- Book: Machine Learning Mastery With R by Jason Brownlee
- Blog: Support Vector Machine Classifier Implementation in R with Caret Package by Rahul Saxena
- Blog: Support Vector Machine by UC Business Analytics R Programming Guide
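A minimal scikit-learn soft-margin SVM sketch; the breast-cancer data, RBF kernel, and C = 1.0 are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C trades margin width against training errors (soft-margin SVM).
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)
print("Test accuracy:", svm.score(X_test, y_test))
```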
-
- Course: Foundations of Machine Learning by David S. Rosenberg
- Python Machine Learning Book Code Repository
- Dive into Machine Learning
- Python code for "An Introduction to Statistical Learning with Applications in R" by Jordi Warmenhoven
- iPython-NoteBooks by John Wittenauer
- Scikit-Learn Tutorial by Jake Vanderplas
- Data Science Roadmap by Javier Estraviz
Class Time and Location: Saturday and Monday, 08:00-09:30 AM (Spring 2019), Room 204/1.
Projects are programming assignments that cover the topics of this course. Each project is written in a Jupyter Notebook. Projects will require the use of Python 3.7, as well as the additional Python libraries listed below.
- Python 3.7: An interactive, object-oriented, extensible programming language.
- NumPy: A Python package for scientific computing.
- Pandas: A Python package for high-performance, easy-to-use data structures and data analysis tools.
- Scikit-Learn: A Python package for machine learning.
- Matplotlib: A Python package for 2D plotting.
- SciPy: A Python package for mathematics, science, and engineering.
- IPython: An architecture for interactive computing with Python.
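A minimal sketch for verifying that the required libraries are installed (the printed version numbers will vary with your environment):

```python
import sys

import IPython
import matplotlib
import numpy
import pandas
import scipy
import sklearn

# Print interpreter and library versions to verify the environment.
print("Python:", sys.version.split()[0])
for module in (numpy, pandas, sklearn, matplotlib, scipy, IPython):
    print(f"{module.__name__}: {module.__version__}")
```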
- Slide: Practical Advice for Building Machine Learning Applications by Vivek Srikumar
- Blog: Comparison of Machine Learning Models by Kevin Markham
- Technical Notes On Using Data Science & Artificial Intelligence: To Fight For Something That Matters by Chris Albon
Google Colab is a free cloud service that provides free GPU support!
- How to Use Google Colab by Souvik Mandal
- Primer for Learning Google Colab
- Deep Learning Development with Google Colab, TensorFlow, Keras & PyTorch
Students can include mathematical notation within markdown cells of their Jupyter Notebooks using LaTeX.
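For example, a markdown cell like the following renders the empirical risk formula from Chapter 2 of the main textbook:

```markdown
The ERM rule returns $h_S \in \arg\min_{h \in \mathcal{H}} L_S(h)$, where the
empirical risk is $L_S(h) = \frac{1}{m} \sum_{i=1}^{m} \mathbb{1}[h(x_i) \neq y_i]$.
```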
- Preparing and Cleaning Data for Machine Learning by Josh Devlin
- Getting Started with Kaggle: House Prices Competition by Adam Massachi
- Scikit-learn Tutorial: Machine Learning in Python by Satyabrata Pal
Grading:
- Projects and Midterm – 50%
- Endterm – 50%
Final Examination: Saturday 1398/03/25, 08:30-10:30
Prerequisites: General mathematical sophistication and a solid understanding of Algorithms, Linear Algebra, and Probability Theory at the advanced undergraduate or beginning graduate level, or equivalent.
- Video: Professor Gilbert Strang's Video Lectures on linear algebra.
- Learn Probability and Statistics Through Interactive Visualizations: Seeing Theory was created by Daniel Kunin while an undergraduate at Brown University. The goal of this website is to make statistics more accessible through interactive visualizations (designed using Mike Bostock’s JavaScript library D3.js).
- Statistics and Probability: This website provides training and tools to help you solve statistics problems quickly, easily, and accurately - without having to ask anyone for help.
- Jupyter NoteBooks: Introduction to Statistics by Bargava
- Video: Professor John Tsitsiklis's Video Lectures on Applied Probability.
- Video: Professor Krishna Jagannathan's Video Lectures on Probability Theory.
Course (Videos, Lectures, Assignments): MIT OpenCourseWare (Discrete Mathematics)
Have a look at some reports by Kaggle or Stanford students (CS224N, CS224D) for general inspiration.
It is necessary to have a GitHub account to share your projects. GitHub offers free accounts as well as plans with private repositories. GitHub is like the hammer in your toolbox; you need to have it!
Honesty and integrity are vital elements of academic work. All your submitted assignments must be entirely your own (or your own group's).
We will follow the standard approach of the Department of Mathematical Sciences:
- You can get help, but you MUST acknowledge the help on the work you hand in
- Failure to acknowledge your sources is a violation of the Honor Code
- You can talk to others about the algorithm(s) to be used to solve a homework problem; as long as you then mention their name(s) on the work you submit
- You should not use or look at others' code when you write your own: you can talk to people, but you must write your own solution/code
I will hold office hours for this course on Mondays (09:30 AM–12:00 PM). If this is not convenient, email me at hhaji@sbu.ac.ir or talk to me after class.