MachineLearningNotes

My study and review notes on machine learning.

Code environment:

Python 3.6.5 | TensorFlow 1.13.1 | PyTorch 1.2.0 | scikit-learn 0.22.1 | Keras 2.2.4

20-02-29 updates:

Logistic Regression Algorithm & Implementing with numpy.
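
As a rough illustration of what a numpy implementation of logistic regression can look like, here is a minimal sketch using batch gradient descent on the mean cross-entropy loss. The function names, learning rate, epoch count and toy data are assumptions made for this example and are not taken from the repository code.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping real values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent (hypothetical helper)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)              # predicted probabilities
        grad_w = X.T @ (p - y) / n_samples  # gradient of mean cross-entropy w.r.t. w
        grad_b = np.mean(p - y)             # gradient w.r.t. b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Label a sample 1 when its predicted probability is at least 0.5."""
    return (sigmoid(X @ w + b) >= 0.5).astype(int)

# Toy 1-D data, made up for illustration: negatives on the left, positives on the right.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 0, 1, 1])
w, b = fit_logistic_regression(X, y)
print(predict(X, w, b))
```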

20-03-02 updates:

Linear Discriminant Analysis & Implementing with numpy.

Principal Component Analysis & Implementing with numpy.
- Both PCA and LDA are methods of reducing feature dimensions.
- LDA is a supervised method while PCA is unsupervised.
- LDA can also be used as a classification method.
- PCA preserves the directions of greatest variance in the data, while LDA looks for directions that best separate the categories.
- Both eigenvalue decomposition and singular value decomposition can be used in PCA or LDA.
- Center the data (subtract the per-feature mean) before applying PCA.
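
The points about centering and SVD can be sketched as follows. This is a minimal PCA example under the assumption that rows are samples and columns are features; the function name `pca` and the toy matrix are made up for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components using SVD (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)  # center each feature first
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Singular values relate to the sample variance along each direction.
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return X_centered @ components.T, components, explained_variance

# Toy data lying exactly on the line y = 2x, so one component captures everything.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
Z, components, variance = pca(X, 1)
print(Z.shape, components, variance)
```

The sign of each component is arbitrary, so only its direction is meaningful.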

20-03-04 updates:

Decision Tree & Implementing with numpy.
- Implemented ID3.
- Information entropy, conditional entropy, information gain, information gain ratio.
- Recursively building decision tree and pruning.
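
The quantities listed above can be computed in a few lines. The sketch below assumes discrete features and labels stored in numpy arrays; the function names and toy arrays are illustrative assumptions, not the repository's API.

```python
import numpy as np

def entropy(values):
    """Information entropy H(Y) in bits."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def conditional_entropy(feature, labels):
    """H(Y | X): entropy of the labels within each feature value, weighted by frequency."""
    total = len(labels)
    h = 0.0
    for value in np.unique(feature):
        mask = feature == value
        h += mask.sum() / total * entropy(labels[mask])
    return h

def information_gain(feature, labels):
    """ID3 splitting criterion: H(Y) - H(Y | X)."""
    return entropy(labels) - conditional_entropy(feature, labels)

def information_gain_ratio(feature, labels):
    """Information gain divided by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

# Toy example: the feature separates the two classes perfectly.
feature = np.array(["a", "a", "b", "b"])
labels = np.array([0, 0, 1, 1])
print(information_gain(feature, labels), information_gain_ratio(feature, labels))
```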

20-03-05 updates:

Neural Network & Implementing with numpy.
- Implemented basic fully connected neural network with numpy.
- Sigmoid activation function only at first; updated in a subsequent version.
  - newly updated model architecture: tanh -> tanh -> ... -> sigmoid
- Basic back propagation algorithm only; to be updated in a subsequent version.
- Choose the number of hidden layers, hidden units, epochs and batch size manually before training starts.
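
A minimal sketch of the tanh -> ... -> sigmoid architecture with basic back propagation might look like this. It assumes a binary cross-entropy loss and full-batch updates for brevity (no mini-batches); the function names, layer sizes, learning rate and toy OR data are all assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(layer_sizes, seed=0):
    """One (W, b) pair per layer; small random weights, zero biases."""
    rng = np.random.RandomState(seed)
    return [(rng.randn(n_in, n_out) * 0.5, np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(X, params):
    """Return the activations of every layer: tanh for hidden layers, sigmoid at the output."""
    activations = [X]
    for i, (W, b) in enumerate(params):
        z = activations[-1] @ W + b
        is_last = i == len(params) - 1
        activations.append(sigmoid(z) if is_last else np.tanh(z))
    return activations

def backward(activations, y, params):
    """Back propagation for mean binary cross-entropy; returns (dW, db) per layer."""
    grads = []
    delta = (activations[-1] - y) / y.shape[0]  # dL/dz at the sigmoid output
    for i in reversed(range(len(params))):
        W, _ = params[i]
        grads.append((activations[i].T @ delta, delta.sum(axis=0)))
        if i > 0:
            # Propagate through the tanh layer: tanh'(z) = 1 - tanh(z)^2.
            delta = (delta @ W.T) * (1.0 - activations[i] ** 2)
    return grads[::-1]

def cross_entropy(p, y):
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Toy OR problem; hidden layer count, units and epochs are chosen by hand.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [1.0]])
params = init_params([2, 4, 1])
loss_before = cross_entropy(forward(X, params)[-1], y)
for _ in range(1000):
    grads = backward(forward(X, params), y, params)
    params = [(W - 0.5 * dW, b - 0.5 * db) for (W, b), (dW, db) in zip(params, grads)]
loss_after = cross_entropy(forward(X, params)[-1], y)
print(loss_before, loss_after)
```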

20-03-08 updates:

Support Vector Machine & Implementing with numpy.
- Implemented an SVC with soft margin and kernel functions (linear kernel, RBF kernel).
- The implementation of the SMO algorithm in this project can be further optimized.
- The mathematical principles of SVM and the derivation of its formulas.
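
To make the kernel part concrete, the sketch below computes linear and RBF kernel matrices and evaluates the dual decision function f(x) = Σ αᵢ yᵢ K(xᵢ, x) + b. The multipliers `alphas` and the bias `b` would normally come out of SMO; here they are hypothetical hand-picked values, and all function names and the `gamma` default are assumptions for illustration.

```python
import numpy as np

def linear_kernel(A, B):
    """K[i, j] = <A[i], B[j]>."""
    return A @ B.T

def rbf_kernel(A, B, gamma=0.5):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dists = (np.sum(A ** 2, axis=1)[:, None]
                + np.sum(B ** 2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))  # clip tiny negative round-off

def decision_function(X_train, y_train, alphas, b, X_new, kernel):
    """Dual-form SVM score; the predicted label is its sign."""
    return kernel(X_new, X_train) @ (alphas * y_train) + b

X_train = np.array([[0.0, 0.0], [1.0, 1.0]])
y_train = np.array([-1.0, 1.0])
alphas = np.array([1.0, 1.0])  # hypothetical values, not the result of running SMO
b = 0.0                        # hypothetical bias
K = rbf_kernel(X_train, X_train)
scores = decision_function(X_train, y_train, alphas, b,
                           np.array([[1.0, 1.0]]), linear_kernel)
print(K, scores)
```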

Naive Bayes Classifier & Implementing with numpy.
- Estimate probabilities of discrete features from frequency counts.
- Estimate likelihoods of continuous features with a Gaussian distribution.
- Predict a test sample by computing argmax_c p(c) ∏_i p(x_i|c).
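
For the continuous-feature case, the prediction rule can be sketched as below. The example works in log space (argmax of log p(c) + Σ log p(x_i|c)) to avoid underflow from multiplying many small probabilities; the function names, the small variance floor and the toy data are assumptions for illustration.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-class, per-feature Gaussian parameters."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small floor keeps the variance strictly positive (an assumption of this sketch).
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_gaussian_nb(X, model):
    """Return argmax_c [log p(c) + sum_i log N(x_i | mean_ci, var_ci)] per sample."""
    classes, priors, means, variances = model
    diff = X[:, None, :] - means[None, :, :]  # shape (n_samples, n_classes, n_features)
    log_likelihood = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances)[None] + diff ** 2 / variances[None], axis=2)
    log_posterior = np.log(priors)[None, :] + log_likelihood
    return classes[np.argmax(log_posterior, axis=1)]

# Toy 1-D data with two well separated classes.
X = np.array([[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]])
y = np.array([0, 0, 0, 1, 1, 1])
model = fit_gaussian_nb(X, y)
predictions = predict_gaussian_nb(np.array([[1.1], [4.9]]), model)
print(predictions)
```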

20-03-09 updates:

Clustering algorithm K-means & Implementing with numpy.
- Distance between two vectors: p-norm or cosine similarity.
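
A small sketch of the two distance measures, followed by a plain K-means loop that uses Euclidean distance. The initial centroids are passed in explicitly so the example is deterministic; that choice, the function names and the toy data are assumptions for illustration.

```python
import numpy as np

def minkowski_distance(a, b, p=2):
    """p-norm distance; p=1 is Manhattan, p=2 is Euclidean."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def kmeans(X, init_centroids, n_iters=100):
    """Alternate between assigning points and recomputing centroids until stable."""
    centroids = init_centroids.astype(float).copy()
    for _ in range(n_iters):
        # Pairwise Euclidean distances, shape (n_samples, n_clusters).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centroids = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(len(centroids))])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = kmeans(X, X[[0, 2]])
print(labels, centroids)
```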

K-Nearest Neighbors & Implementing with numpy.
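
A brute-force K-nearest-neighbors classifier fits in a few lines of numpy. This sketch assumes Euclidean distance and majority voting; the function name and toy data are made up for illustration.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, X_new, k=3):
    """Classify each new sample by majority vote among its k nearest training samples."""
    predictions = []
    for x in X_new:
        distances = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(distances)[:k]
        votes = Counter(y_train[nearest].tolist())
        predictions.append(votes.most_common(1)[0][0])
    return np.array(predictions)

X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
knn_result = knn_predict(X_train, y_train, np.array([[0.5, 0.5], [5.5, 5.5]]))
print(knn_result)
```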

20-03-12 updates:

EM Algorithm
- Gaussian Mixture Model (in EMAlgorithm.py)
- The mathematical principles are as follows:

- updated EMAlgorithm.py
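
As a rough illustration of EM applied to a Gaussian mixture, here is a 1-D sketch. It is not the contents of EMAlgorithm.py; the function names, the initialization (means spread between the data minimum and maximum), the variance floor and the synthetic data are all assumptions for this example.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def fit_gmm_1d(x, n_components=2, n_iters=50):
    """EM for a 1-D Gaussian mixture (hypothetical helper)."""
    weights = np.full(n_components, 1.0 / n_components)
    means = np.linspace(x.min(), x.max(), n_components)
    variances = np.full(n_components, x.var())
    for _ in range(n_iters):
        # E-step: responsibility of each component for each point, shape (n, k).
        dens = weights * gaussian_pdf(x[:, None], means, variances)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from responsibility-weighted statistics.
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances

# Synthetic data drawn from two Gaussians centered at 0 and 10.
rng = np.random.RandomState(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(10.0, 1.0, 200)])
weights, means, variances = fit_gmm_1d(x)
print(weights, means, variances)
```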

20-03-15 updates:

Hidden Markov Model
- Viterbi algorithm.
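
The Viterbi algorithm can be sketched as a dynamic program over log probabilities. The function signature and the small two-state example (states 0 and 1, three observation symbols) are assumptions made for illustration.

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden state sequence for a list of observation indices."""
    n_states, T = len(start_p), len(obs)
    with np.errstate(divide="ignore"):  # log(0) = -inf is acceptable here
        log_start, log_trans, log_emit = np.log(start_p), np.log(trans_p), np.log(emit_p)
    log_delta = np.zeros((T, n_states))       # best log probability ending in each state
    backptr = np.zeros((T, n_states), dtype=int)
    log_delta[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        scores = log_delta[t - 1][:, None] + log_trans  # rows: from-state, columns: to-state
        backptr[t] = np.argmax(scores, axis=0)
        log_delta[t] = np.max(scores, axis=0) + log_emit[:, obs[t]]
    # Trace the best path backwards from the best final state.
    path = [int(np.argmax(log_delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

start_p = np.array([0.6, 0.4])
trans_p = np.array([[0.7, 0.3], [0.4, 0.6]])
emit_p = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
best_path = viterbi([0, 1, 2], start_p, trans_p, emit_p)
print(best_path)  # [0, 0, 1]
```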

20-03-16 updates:

Chinese part-of-speech tagging with HMM

20-03-18 updates:

word2vec skip-gram & CBOW models
- skip-gram: given a center word, predict its context words;
- CBOW: given a set of context words, predict the center word;
- dataset: "词性标注@人民日报199801.txt" (a part-of-speech-tagged People's Daily corpus, January 1998);
- the resulting word vectors do not seem very accurate, likely due to a lack of high-quality training data or insufficient training;
- the quality of word vectors depends on the dataset, pre-processing and training setup.
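
The difference between the two models is easiest to see in how training examples are built from a token sequence. The sketch below only generates the examples (it does not train any embeddings); the function names, window size and toy sentence are assumptions for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) pairs: the model predicts each context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """(context words, center) examples: the model predicts the center word from its context."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, center))
    return examples

tokens = ["a", "b", "c"]
sg = skipgram_pairs(tokens, window=1)
cb = cbow_examples(tokens, window=1)
print(sg)
print(cb)
```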
