A brief history of NLP

Bag of words

A long, long time ago the bag-of-words model was used for NLP. It relied only on word frequencies, with no notion of sequence or order, and predicted the most obvious next word.

Sequence models

Next came sequence models, cue RNNs and LSTMs. Here the effort went into understanding a sentence by 'seeing' its words in sequence.

Attention is all you need

Based on the idea of word association: a weighted memory of seen words in association with the other words in the sentence. Bigger and better NLP models developed from here.

Big transformers

Self-attention is used to predict masked words in the middle of a sentence, so the model learns which words affect the masked word and builds up word context. Multiple layers of encoders and decoders allow parallel processing. Transformers like BERT and GPT are used for transfer learning, i.e., they are pre-trained on a large corpus of general data and then fine-tuned on domain-specific data.
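
A quick way to see masked-word prediction in action is the transformers fill-mask pipeline (a minimal sketch; the checkpoint name and example sentence below are just placeholders):

    # Minimal masked-word prediction with a pre-trained BERT.
    # Requires: pip install transformers torch
    from transformers import pipeline

    # Load a fill-mask pipeline backed by a pre-trained BERT checkpoint.
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # Self-attention over the whole sentence scores candidates for the [MASK]
    # token from its surrounding context.
    for prediction in fill_mask("The wizard cast a powerful [MASK] on the dragon."):
        print(prediction["token_str"], round(prediction["score"], 3))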

Preparing dataset

  1. download the book summary dataset from here
  2. run python dataprep.py (a sketch of what this step might look like follows)
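
dataprep.py is not reproduced here; below is a minimal sketch of the kind of preprocessing such a step might do, assuming the CMU Book Summary Dataset layout (booksummaries.txt, tab-separated, with the plot summary in the last field). The actual script in this repo may differ.

    # Hypothetical dataprep sketch: extract plain-text plot summaries from the
    # (assumed) tab-separated booksummaries.txt into a single training file.
    with open("booksummaries.txt", encoding="utf-8") as src, \
            open("train.txt", "w", encoding="utf-8") as dst:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            summary = fields[-1].strip()  # plot summary is the last field
            if summary:                   # skip rows without a summary
                dst.write(summary + "\n")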

Train model

Refer to this blog

  1. clone the transformers repository
    git clone git@github.com:huggingface/transformers.git
  2. install tensorflow, transformers, and pytorch
  3. run sh run_model.sh (a sketch of the equivalent fine-tuning step follows)
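
run_model.sh is not reproduced here; per the blog it presumably wraps one of the transformers example language-modelling scripts. The sketch below is an equivalent-in-spirit fine-tuning of GPT-2 on the prepared train.txt via the Trainer API; file names, hyperparameters, and the output directory are assumptions, and exact class availability varies across transformers versions.

    # Hedged sketch: fine-tune GPT-2 on the prepared summaries with the Trainer API.
    from transformers import (GPT2LMHeadModel, GPT2Tokenizer, TextDataset,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Chunk the plain-text corpus into fixed-size blocks of token ids.
    train_dataset = TextDataset(tokenizer=tokenizer, file_path="train.txt", block_size=128)

    # Causal LM objective (mlm=False): predict the next token, no masking.
    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    training_args = TrainingArguments(
        output_dir="gpt2-book-summaries",  # assumed output directory
        num_train_epochs=3,
        per_device_train_batch_size=2,
    )

    trainer = Trainer(model=model, args=training_args,
                      data_collator=data_collator, train_dataset=train_dataset)
    trainer.train()
    trainer.save_model("gpt2-book-summaries")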

Plug and play

Using Uber's Plug and Play Language Model (PPLM)

  1. clone the PPLM repository
    git clone git@github.com:uber-research/PPLM.git
  2. run sh pplmrun.sh [genre] [output file] (a sketch of the underlying call follows)
    genre options are sci-fi, dystopian, fantasy, and romance
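
pplmrun.sh is not reproduced here; based on the PPLM repository's README it presumably wraps a call to run_pplm.py with a bag-of-words attribute model. The sketch below shows such a call from Python; the genre-to-word-list mapping, prompt, and flag values are assumptions, and the actual script may differ.

    # Hedged sketch of what pplmrun.sh might do: steer GPT-2 generation with a
    # PPLM bag-of-words attribute model and write the output to a file.
    import subprocess
    import sys

    genre = sys.argv[1] if len(sys.argv) > 1 else "sci-fi"
    out_file = sys.argv[2] if len(sys.argv) > 2 else "story.txt"

    # Hypothetical mapping from this repo's genre names to bag-of-words topics;
    # the real script may ship its own word lists for fantasy, romance, etc.
    bow_topic = {"sci-fi": "science", "dystopian": "politics",
                 "fantasy": "fantasy", "romance": "romance"}.get(genre, "science")

    cmd = [
        "python", "PPLM/run_pplm.py",
        "-B", bow_topic,                    # bag-of-words attribute model
        "--cond_text", "Once upon a time",  # conditioning prompt (assumed)
        "--length", "100",
        "--sample",
    ]
    with open(out_file, "w", encoding="utf-8") as f:
        subprocess.run(cmd, stdout=f, check=True)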