MachineLearningNotes

My study & review notes on machine learning.

Code environment:

Python 3.6.5 | TensorFlow 1.13.1 | PyTorch 1.2.0 | scikit-learn 0.22.1 | Keras 2.2.4

20-02-29 updates:

  • Logistic Regression algorithm & implementation with numpy.
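A minimal sketch of what a numpy logistic-regression implementation can look like (batch gradient descent on the mean log-loss; the toy data and hyper-parameters below are made up for illustration and are not the repo's actual code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns weights and bias."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)        # predicted probabilities
        grad_w = X.T @ (p - y) / n    # gradient of mean log-loss w.r.t. w
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# toy 1-D linearly separable data
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
w, b = fit_logistic(X, y)
preds = (sigmoid(X @ w + b) >= 0.5).astype(int)
```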

20-03-02 updates:

  • Linear Discriminant Analysis & implementation with numpy.

  • Principal Component Analysis & implementation with numpy.
    • Both PCA and LDA are dimensionality-reduction methods.
    • LDA is supervised, while PCA is unsupervised.
    • LDA can also be used as a classification method.
    • PCA keeps the directions of greatest variance in the data, while LDA seeks directions that best separate the classes.
    • Both eigenvalue decomposition and singular value decomposition can be used for PCA or LDA.
    • Center the data before applying PCA.
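The points above can be sketched for PCA in a few lines of numpy (SVD on the centered data; the synthetic dataset is invented for illustration, with most variance along the direction [3, 1]):

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components via SVD of the centered data."""
    Xc = X - X.mean(axis=0)                          # centering matters for PCA
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]                     # scores and components

rng = np.random.default_rng(0)
# data stretched along the direction [3, 1], plus small isotropic noise
X = rng.normal(size=(200, 1)) @ np.array([[3.0, 1.0]]) + 0.1 * rng.normal(size=(200, 2))
scores, components = pca(X, 1)
```

The first component should come out roughly proportional to [3, 1], the direction of greatest variance.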

20-03-04 updates:

  • Decision Tree & implementation with numpy.
    • Implemented ID3.
    • Information entropy, conditional entropy, information gain, information gain ratio.
    • Recursively building the decision tree and pruning it.
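The entropy quantities that drive ID3's splits can be sketched like this (the two toy features are made up: one perfectly splits the labels, the other is uninformative):

```python
import numpy as np

def entropy(y):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(x, y):
    """Entropy of y minus the conditional entropy of y given discrete feature x."""
    gain = entropy(y)
    for v in np.unique(x):
        mask = x == v
        gain -= mask.mean() * entropy(y[mask])   # weighted entropy of each branch
    return gain

y = np.array([0, 0, 1, 1])
x_good = np.array(['a', 'a', 'b', 'b'])  # splits the labels perfectly
x_bad = np.array(['a', 'b', 'a', 'b'])   # carries no label information
```

ID3 picks the feature with the largest gain at each node; here `x_good` gains a full bit while `x_bad` gains nothing.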

20-03-05 updates:

  • Neural Network & implementation with numpy.
    • Implemented a basic fully connected neural network with numpy.
    • Sigmoid activation function only; updates in a subsequent version.
      • Newly updated model architecture: tanh -> tanh -> ... -> sigmoid
    • Basic backpropagation algorithm only; updates in a subsequent version.
    • Hidden layers, hidden units, epochs, and batch size are chosen manually before training starts.
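A compact sketch of the tanh -> sigmoid architecture with plain backpropagation, trained on XOR (all names and hyper-parameters are illustrative, not the repo's code):

```python
import numpy as np

def train_xor(hidden=8, lr=0.1, epochs=20000, seed=42):
    """Two-layer net (tanh hidden, sigmoid output) trained with plain backprop."""
    rng = np.random.default_rng(seed)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])
    W1 = rng.normal(size=(2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(size=(hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                     # forward: hidden layer
        out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # forward: sigmoid output
        d_out = out - y                              # BCE gradient w.r.t. pre-sigmoid
        d_h = (d_out @ W2.T) * (1.0 - h ** 2)        # backprop through tanh
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
    return (out >= 0.5).astype(int).ravel()

preds = train_xor()
```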

20-03-08 updates:

  • Support Vector Machine & implementation with numpy.
    • Implemented an SVC with soft margin and kernel functions (linear kernel, RBF kernel).
    • The implementation of the SMO algorithm in this project can be further optimized.
    • The mathematical principles of SVM and the formula derivations.
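The RBF kernel at the heart of the kernelized SVC can be computed for whole matrices at once; a possible vectorized sketch (the helper name and gamma value are illustrative):

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x1_i - x2_j||^2), computed without loops."""
    sq = (np.sum(X1 ** 2, axis=1)[:, None]
          + np.sum(X2 ** 2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)                     # squared pairwise distances
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clamp tiny negatives from round-off

X = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf_kernel(X, X, gamma=0.5)
```

A point's kernel value with itself is always 1, and it decays with squared distance, which is what lets the soft-margin SVC draw nonlinear boundaries.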

  • Naive Bayes Classifier & implementation with numpy.
    • Discrete features are estimated by frequency counts.
    • Continuous features are modeled with a Gaussian distribution.
    • A test sample is predicted by computing argmax{c} p(c)∏p(x|c).
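For the continuous case, the argmax{c} p(c)∏p(x|c) rule can be sketched in log space with per-class Gaussians (function names and the toy data are made up for illustration):

```python
import numpy as np

def fit_gnb(X, y):
    """Per-class priors, feature means, and variances for Gaussian naive Bayes."""
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])  # avoid /0
    return classes, priors, means, vars_

def predict_gnb(model, X):
    """argmax_c  log p(c) + sum_j log N(x_j | mu_cj, var_cj)."""
    classes, priors, means, vars_ = model
    log_lik = -0.5 * (np.log(2 * np.pi * vars_)[None]
                      + (X[:, None, :] - means[None]) ** 2 / vars_[None]).sum(axis=-1)
    return classes[np.argmax(np.log(priors) + log_lik, axis=1)]

X = np.array([[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]])
y = np.array([0, 0, 0, 1, 1, 1])
model = fit_gnb(X, y)
preds = predict_gnb(model, np.array([[1.1], [5.1]]))
```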

20-03-09 updates:

  • Clustering algorithm K-means & implementation with numpy.
    • Distance between two vectors: p-norm or cosine similarity.
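A minimal Lloyd's-algorithm sketch using the Euclidean (2-norm) distance; the initial centers and toy clusters are chosen by hand for illustration:

```python
import numpy as np

def kmeans(X, centers, iters=50):
    """Alternate assignment and center updates from the given initial centers."""
    centers = centers.astype(float).copy()
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)  # point-to-center distances
        labels = d.argmin(axis=1)                               # assign to nearest center
        for j in range(len(centers)):
            if np.any(labels == j):                             # skip empty clusters
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# two well-separated toy clusters
X = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
labels, centers = kmeans(X, centers=np.array([[1.0, 1.0], [9.0, 9.0]]))
```

Swapping the distance (e.g. for cosine similarity) only changes the assignment step.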

  • K-Nearest Neighbors & implementation with numpy.
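KNN needs no training step at all; a possible numpy sketch of majority voting among the k nearest points (the data here is invented for illustration):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d = np.linalg.norm(X_test[:, None] - X_train[None], axis=2)  # pairwise distances
    nearest = np.argsort(d, axis=1)[:, :k]                       # indices of k nearest
    preds = []
    for idx in nearest:
        vals, counts = np.unique(y_train[idx], return_counts=True)
        preds.append(vals[counts.argmax()])                      # most common label wins
    return np.array(preds)

X_train = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
preds = knn_predict(X_train, y_train, np.array([[0.05], [5.05]]), k=3)
```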

20-03-12 updates:

  • EM Algorithm
    • Gaussian Mixture Model (in EMAlgorithm.py)
    • The mathematical principle derivations are included in the notes.

- Updated EMAlgorithm.py

20-03-15 updates:

  • Hidden Markov Model
    • Viterbi algorithm.
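The Viterbi algorithm is a short dynamic program in log space; a possible sketch (the two-state transition/emission numbers below are a toy example, not from the repo):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden state path for observation sequence obs.

    pi: initial state probs, A: transition matrix, B: emission matrix.
    """
    T, n = len(obs), len(pi)
    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    delta = logpi + logB[:, obs[0]]              # best log-prob ending in each state
    back = np.zeros((T, n), dtype=int)           # backpointers
    for t in range(1, T):
        scores = delta[:, None] + logA           # scores[i, j]: reach j via i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]                 # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# toy 2-state HMM with 3 possible observations; all numbers are made up
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])
path = viterbi([0, 1, 2], pi, A, B)
```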

20-03-16 updates:

  • Chinese part-of-speech tagging by HMM

20-03-18 updates:

  • word2vec skip-gram & CBOW models
    • skip-gram: given a center word, predict its context words;
    • CBOW: given a set of context words, predict the center word;
    • dataset: "词性标注@人民日报199801.txt";
    • the results below seem inaccurate, likely due to a lack of high-quality training data or an insufficient training process;
    • the quality of the word vectors depends on the dataset, pre-processing, and training setup.
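The skip-gram/CBOW distinction comes down to which (center, context) pairs the model trains on; a small sketch of pair generation within a sliding window (function name and toy tokens are illustrative):

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) training pairs for skip-gram within a symmetric window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                      # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat"], window=1)
```

CBOW would group the same windows the other way round: all context words together predicting the one center word.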
