Theoretical Study

While continuing my studies in the field of AI, I realized that my knowledge needed to be strengthened to better understand the material I am dealing with.
Therefore, I decided to study from the very beginning.
Here is the list of topics I will study.
The list may be updated and refined as I progress through the studies.

Research Paper Implementation

In this directory, research paper implementations are done.
Here is the order in which I completed them; it does not necessarily follow the chronological order of model development.
I believe the topics below are the must-learns for getting started with NLP.
If anyone has any suggestions, please feel free to tell me! :)

  • After going through the Single Layer Perceptron paper, I realized that reading a research paper line by line is quite ineffective.
    Therefore, I have decided to go through only the equations, their explanations, and the conclusions.
  1. Attention Is All You Need (Transformer) ☑️
  2. Single Layer Perceptron ☑️
  3. Back-Propagation & Multilayer Perceptron ☑️
  4. Recurrent Neural Network (RNN) & Long Short-term Memory (LSTM) ☑️
  5. Convolutional Neural Network (CNN) ☑️
  6. Gated Recurrent Unit (GRU) <<<< Ongoing >>>>
  7. Batch Normalization
  8. The graph neural network model (GNN)
  9. Transformers (All publicly available models)

(Image retrieved from ResearchGate)

| Model | Characteristics | Main Weaknesses |
| --- | --- | --- |
| Single-Layer Perceptron | The first neural network model, imitating biological neurons (brain cells). Able to solve simple linear classification problems. | Unable to solve the XOR problem, which requires more than one decision boundary, i.e. non-linear classification (see the XOR sketch below this table). |
| Multi-Layer Perceptron | Can solve the XOR problem by using one or more hidden layers. The model from which the term "Deep Learning" originated. | Vanishing and exploding gradients. High computational cost for large inputs. |
| RNN | Uses hidden states as memory, so it can process time-series (sequential) data. | Struggles with long-term dependencies: the longer the input sequence, the smaller the gradient contribution from early time steps becomes, due to repeated multiplication of the recurrent weights (see the gradient sketch below this table). |
| LSTM | Addresses the long-term dependency problem with a cell state (long-term memory) that is updated additively rather than through repeated multiplication. Designed to mimic the human brain: forgetting unimportant things and remembering important ones. | Many gates, so a high computational cost. Needs more storage (memory) to train on longer sequences, since it retains more of the past than other RNN-family models. |
| CNN | - | - |
| GRU | - | - |
| Transformer | - | - |
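
As a minimal sketch of the first two rows (assuming NumPy; the dataset, layer sizes, learning rates, and epoch counts are illustrative choices, not the repository's code), the snippet below trains a single-layer perceptron and a small MLP on XOR: the perceptron cannot fit all four points, while the MLP typically can.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR dataset: the four points are not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# --- Single-layer perceptron (step activation, perceptron learning rule) ---
w, b = np.zeros(2), 0.0
for _ in range(100):                        # more epochs would not help:
    for xi, yi in zip(X, y):                # no separating line exists
        pred = 1.0 if xi @ w + b > 0 else 0.0
        w += 0.1 * (yi - pred) * xi
        b += 0.1 * (yi - pred)
print("Perceptron:", (X @ w + b > 0).astype(int))    # never matches [0 1 1 0]

# --- Multi-layer perceptron (one hidden layer, trained with backprop) ---
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.5
for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                # forward pass, hidden layer
    out = sigmoid(h @ W2 + b2).ravel()      # forward pass, output layer
    d_out = (out - y) * out * (1 - out)     # backprop of squared error
    d_h = (d_out[:, None] @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out[:, None]
    b2 -= lr * d_out.sum()
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)
print("MLP:       ", (out > 0.5).astype(int))    # typically learns [0 1 1 0]
```

The hidden layer gives the MLP a non-linear decision boundary, which is exactly what the single-layer perceptron lacks; the result for the MLP depends on the random initialization, but with these settings it usually reaches the correct labels.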
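
The RNN and LSTM rows can also be illustrated numerically. This sketch is a simplification under assumed values (not a full backpropagation-through-time derivation): the gradient flowing back through a plain RNN shrinks as a product of recurrent weight terms, while the LSTM cell-state path is governed by forget-gate values that the network can keep close to 1.

```python
import numpy as np

T = 100                  # sequence length
w_rec = 0.5              # assumed recurrent weight magnitude below 1
forget_gate = 0.95       # assumed LSTM forget-gate value close to 1

# In a plain RNN, the gradient w.r.t. an early hidden state is (roughly)
# a product of T Jacobian terms; with |w_rec| < 1 it shrinks exponentially.
rnn_gradient = w_rec ** T

# Along the LSTM cell state, the update is additive
# (c_t = f_t * c_{t-1} + i_t * g_t), so the gradient path is a product of
# forget-gate values, which can stay close to 1 and preserve information.
lstm_gradient = forget_gate ** T

print(f"RNN-style gradient after {T} steps:  {rnn_gradient:.3e}")   # ~7.9e-31
print(f"LSTM-style gradient after {T} steps: {lstm_gradient:.3e}")  # ~5.9e-03
```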
