
Introduction-to-Machine-Learning

Repository of notebooks and conceptual insights from my "Introduction to Machine Learning" course. Each section contains the corresponding PDF solutions and Python scripts.

Exercise 1 (Warm-up)

1.1 Linear Algebra
1.2 Calculus and Probability
1.3 Optimal Classifiers and Decision Rules
1.4 Multivariate Normal (Gaussian) Distribution

Visualizing the Hoeffding bound; implementing the k-NN algorithm.
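A minimal sketch of the Hoeffding visualization under the usual setup (repeated runs of n fair-coin tosses; the parameter values and names are illustrative, not the course skeleton):

```python
import numpy as np
import matplotlib.pyplot as plt

# N independent experiments, each tossing a fair coin n times.
N, n, p = 100_000, 20, 0.5
tosses = np.random.binomial(1, p, size=(N, n))
empirical_means = tosses.mean(axis=1)

eps = np.linspace(0.0, 1.0, 50)
# Empirical probability that the sample mean deviates from p by more than eps.
deviation_prob = np.array([np.mean(np.abs(empirical_means - p) > e) for e in eps])
# Hoeffding: P(|mean - p| > eps) <= 2 * exp(-2 * n * eps^2).
bound = 2 * np.exp(-2 * n * eps**2)

plt.plot(eps, deviation_prob, label="empirical deviation probability")
plt.plot(eps, bound, label="Hoeffding bound")
plt.xlabel("epsilon")
plt.ylabel("probability")
plt.legend()
plt.show()
```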

Exercise 2
2.1 PAC learnability of ℓ2-balls around the origin
2.2 PAC in Expectation
2.3 Union of Intervals
2.4 Prediction by polynomials
2.5 Structural Risk Minimization


Union of Intervals. Study the hypothesis class of a finite union of disjoint intervals, and the properties of the ERM algorithm for this class. To review, let the sample space be X = [0, 1] and assume we study a binary classification problem, i.e. Y = {0, 1}. We will try to learn using a hypothesis class that consists of k disjoint intervals. Define the corresponding hypothesis as

h_I(x) = \begin{cases} 1 & \text{if } x \in \bigcup_{j=1}^{k} [l_j, u_j] \\ 0 & \text{otherwise} \end{cases}
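As a sketch of how such a hypothesis can be evaluated in code (the interval representation and function names are assumptions for illustration, not the course's skeleton):

```python
import numpy as np

def predict(intervals, x):
    """h_I(x): 1 if x lies in any of the k disjoint intervals, 0 otherwise."""
    return int(any(l <= x <= u for (l, u) in intervals))

def empirical_error(intervals, xs, ys):
    """Fraction of labelled samples that h_I misclassifies --
    the quantity the ERM algorithm minimizes over choices of k intervals."""
    predictions = np.array([predict(intervals, x) for x in xs])
    return float(np.mean(predictions != np.asarray(ys)))

# Example: a 2-interval hypothesis evaluated at two points in [0, 1].
h = [(0.1, 0.3), (0.6, 0.8)]
print(predict(h, 0.25), predict(h, 0.5))  # 1, 0
```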

Exercise 3
3.1 Step-size Perceptron
3.2 Convex functions
3.3 GD with projection
3.4 Gradient Descent on Smooth Functions


SGD for hinge loss. In the provided skeleton file sgd.py there is a helper function. The function reads the examples labelled 0 and 8 and returns them with the labels −1/+1. In case you are unable to read the MNIST data with the provided script, you can download the file here.
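A minimal sketch of the SGD update for the regularized hinge loss, assuming the data is already loaded as numpy arrays with labels in {−1, +1}; the regularization constant C, the 1/t step-size schedule, and all names are illustrative assumptions:

```python
import numpy as np

def sgd_hinge(X, y, C, eta0, T):
    """SGD minimizing ||w||^2 / 2 + C * max(0, 1 - y<w, x>).
    X: (n, d) data matrix, y: labels in {-1, +1}, eta0: initial step size, T: iterations."""
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, T + 1):
        eta = eta0 / t                   # decaying step size eta_t = eta0 / t
        i = np.random.randint(n)         # sample one example uniformly
        if y[i] * np.dot(w, X[i]) < 1:   # margin violated: subgradient is w - C * y_i * x_i
            w = (1 - eta) * w + eta * C * y[i] * X[i]
        else:                            # margin satisfied: subgradient is just w
            w = (1 - eta) * w
    return w
```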


SGD for log-loss. In this exercise we will optimize the log loss defined as follows:

\ell_{\log}(w; x, y) = \log\left(1 + e^{-y \langle w, x \rangle}\right)
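A sketch of the corresponding SGD step, using the fact that the gradient of log(1 + e^{−y⟨w,x⟩}) with respect to w is −y·x / (1 + e^{y⟨w,x⟩}); the names and the step-size schedule are illustrative:

```python
import numpy as np

def sgd_log_loss(X, y, eta0, T):
    """SGD minimizing the log-loss log(1 + exp(-y <w, x>)); labels in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, T + 1):
        eta = eta0 / t
        i = np.random.randint(n)
        margin = y[i] * np.dot(w, X[i])
        # Gradient of log(1 + e^{-margin}) w.r.t. w: -y_i * x_i / (1 + e^{margin})
        grad = -y[i] * X[i] / (1.0 + np.exp(margin))
        w -= eta * grad
    return w
```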

Exercise 4
4.1 SVM with multiple classes
4.2 Soft-SVM bound using hard-SVM
4.3 Separability using polynomial kernel
4.4 Expressivity of ReLU networks
4.5 Implementing Boolean functions using ReLU networks


SVM. Exploring different polynomial kernel degrees for SVM. We will use an existing implementation of SVM, the SVC class from sklearn.svm.
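A minimal sketch of the kernel-degree comparison (the toy data and parameter values are placeholders, not the exercise's data set):

```python
import numpy as np
from sklearn.svm import SVC

# Placeholder data: points labelled by whether they fall outside the unit circle.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)

# Fit SVMs with polynomial kernels of increasing degree and compare training accuracy.
for degree in (1, 2, 3, 5):
    clf = SVC(kernel="poly", degree=degree, C=10.0, coef0=1.0)
    clf.fit(X, y)
    print(f"degree {degree}: train accuracy {clf.score(X, y):.3f}")
```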

Neural Networks. We will implement the back-propagation algorithm for training a neural network. We will work with the MNIST data set, which consists of 60,000 28×28 grayscale images with values from 0 to 1. Define the log-loss on a single example:

\ell(W; x, y) = -\log p_y(x; W), where p(x; W) is the distribution over the 10 digit classes produced by the network's softmax output layer.

And the loss we want to minimize is

L(W) = \frac{1}{n} \sum_{i=1}^{n} \ell(W; x_i, y_i)
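A sketch of the softmax log-loss and its gradient at the output layer, which is the starting point of back-propagation (a hedged illustration; the course skeleton may organize this differently):

```python
import numpy as np

def softmax(z):
    z = z - z.max()                # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def log_loss(z, y_onehot):
    """Log-loss of the softmax outputs against a one-hot label vector."""
    p = softmax(z)
    return -float(np.sum(y_onehot * np.log(p)))

def output_delta(z, y_onehot):
    """Gradient of the log-loss w.r.t. the pre-softmax activations z: p - y.
    Back-propagation starts from this delta and chains it through the layers."""
    return softmax(z) - y_onehot
```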