A long, long time ago, the bag-of-words model was the workhorse of NLP. It relied purely on word frequencies, with no notion of sequence or order, and could only predict the most obvious next word.
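A quick illustration of what 'no sequence or order' means in practice: a bag of words is just a token count, so different word orders collapse to the same representation.

```python
# Minimal bag-of-words illustration: word counts only, no notion of order.
from collections import Counter

doc = "the ship left the harbour and the crew cheered"
print(Counter(doc.split()))   # Counter({'the': 3, 'ship': 1, ...})

# Different word orders produce exactly the same bag:
print(Counter("the crew left".split()) == Counter("left the crew".split()))  # True
```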
Next came sequence models: cue RNNs and LSTMs. Here the effort went into understanding a sentence by 'seeing' its words in sequence.
They build on the idea of word association: a weighted memory of the words seen so far, in association with the other words in the sentence. Bigger and better NLP models developed from here.
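A toy sketch of that 'weighted memory' idea, assuming PyTorch (which the setup steps below install anyway); the sizes and class name are arbitrary, not taken from any particular model.

```python
# Toy next-word LSTM: the hidden state carries a weighted memory of the
# words seen so far, and a linear head scores the next word.
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128

class NextWordLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        x = self.embed(token_ids)       # (batch, seq_len, embed_dim)
        out, _ = self.lstm(x)           # hidden state at every position
        return self.head(out[:, -1])    # scores for the next word

model = NextWordLSTM()
fake_sentence = torch.randint(0, vocab_size, (1, 5))  # five made-up word ids
print(model(fake_sentence).shape)                     # torch.Size([1, 1000])
```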
Then came transformers and self-attention. Self-attention is used to predict words masked out in the middle of a sentence, so the model learns which words influence the masked word and builds up word context. Multiple layers of encoders and decoders offer parallelism. Transformers like BERT and GPT are used for transfer learning, i.e., they are pre-trained on a large corpus of general data and then fine-tuned with domain-specific data.
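Here is a small example of that masked-word prediction with a pre-trained BERT through the transformers fill-mask pipeline; the sentence is just for illustration.

```python
# Ask BERT to fill in a masked word; self-attention over the whole sentence
# decides which candidates fit the context best.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for guess in fill("The detective opened the [MASK] and stepped inside."):
    print(f"{guess['token_str']:>10}  {guess['score']:.3f}")
```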
- download the book summary dataset from here
- run
python dataprep.py
Refer to this blog (a sketch of what this prep step might do follows the steps below)
- clone the transformers repository
git clone git@github.com:huggingface/transformers.git
- install TensorFlow, PyTorch, and transformers
- run the fine-tuning script (a rough sketch of what it likely wraps follows the steps below)
sh run_model.sh
- clone the PPLM repository
git clone git@github.com:uber-research/PPLM.git
- run PPLM generation (the bag-of-words steering signal it uses is sketched after these steps)
sh pplmrun.sh [genre] [output file]
genre options are sci-fi, dystopian, fantasy, romance
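For reference, here is a rough sketch of the kind of preprocessing dataprep.py might do, assuming the CMU Book Summary Dataset layout (a tab-separated booksummaries.txt whose last two fields are a genre JSON and the plot summary) and genre-wise plain-text outputs. The field order, genre labels, and file names are assumptions; check dataprep.py and the blog for the real steps.

```python
# Hypothetical data prep: split book summaries into one text file per genre.
import csv
import json

GENRES = {
    "Science Fiction": "sci-fi",
    "Dystopia": "dystopian",
    "Fantasy": "fantasy",
    "Romance novel": "romance",
}

outputs = {short: open(f"{short}.txt", "w", encoding="utf-8")
           for short in GENRES.values()}

with open("booksummaries.txt", encoding="utf-8") as f:
    for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 7 or not row[5]:
            continue                      # skip rows with no genre tags
        genres = set(json.loads(row[5]).values())
        summary = row[6].strip()
        for full_name, short in GENRES.items():
            if full_name in genres:
                outputs[short].write(summary + "\n")

for handle in outputs.values():
    handle.close()
```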
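Similarly, run_model.sh presumably wraps a GPT-2 fine-tuning run on one genre's summaries. Below is a minimal sketch with Hugging Face's Trainer, assuming a sci-fi.txt produced by the prep step; the file name, block size, and epoch count are illustrative, not the script's actual settings.

```python
# Sketch: fine-tune a pre-trained GPT-2 on genre-specific summaries.
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, TextDataset,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

train_dataset = TextDataset(tokenizer=tokenizer,
                            file_path="sci-fi.txt",   # assumed dataprep output
                            block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(output_dir="gpt2-sci-fi",
                         num_train_epochs=3,
                         per_device_train_batch_size=2,
                         save_steps=500)

trainer = Trainer(model=model, args=args,
                  data_collator=collator, train_dataset=train_dataset)
trainer.train()
trainer.save_model("gpt2-sci-fi")
```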
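Finally, pplmrun.sh drives PPLM, which steers generation toward a genre with a bag-of-words attribute model. The sketch below shows only that scoring signal, the probability mass the language model puts on genre words, not the gradient-based hidden-state updates PPLM actually performs; the genre word lists here are made up, since PPLM ships its own bag-of-words files.

```python
# Conceptual sketch of PPLM's bag-of-words attribute signal p(a|x).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Hypothetical genre word lists, for illustration only.
GENRE_WORDS = {
    "sci-fi": ["spaceship", "alien", "galaxy", "robot"],
    "romance": ["love", "heart", "kiss", "wedding"],
}

def attribute_score(prompt: str, genre: str) -> float:
    """Probability mass the LM assigns to the genre's words as the next
    token -- the signal PPLM pushes up by nudging the hidden states."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    word_ids = [tokenizer.encode(" " + w)[0] for w in GENRE_WORDS[genre]]
    return probs[word_ids].sum().item()

print(attribute_score("The ship drifted silently toward the", "sci-fi"))
print(attribute_score("The ship drifted silently toward the", "romance"))
```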