I. Original source for the code base.

Dataset
republic.txt from project gutenberg. The text file should be in respective folders.


HMM
https://www.youtube.com/watch?v=s3kKlUBa3b0 

Back-off Model
 https://medium.com/@davidmasse8/predicting-the-next-word-back-off-language-modeling-8db607444ba9   

LSTM Model
https://medium.com/coinmonks/word-level-lstm-text-generator-creating-automatic-song-lyrics-with-neural-networks-b8a1617104fb

II. HMM_Model.ipynb : Implemented methods to train and evaluate the model with the baeline model.

backoff_model.ipynb : Code to train the model was referred from the medium blog. Implemented own methods to preprocess the data and evaluate the predictions made by the
					  model with our baseline model.

word_level_lstm_config_1.ipynb : Referred model architecture from the medium Blog. Implemented methods to preprocess data, train the model and evaluate the model with the baseline model.
								 To train the model under different network configurations run word_level_lstm_config_2.ipynb and word_level_lstm_config_3.ipynb.

III. Software Requirements:

Trained and tested on google collab.

Python 3.6	
Keras 2.1.5  
tensorflow-gpu 1.14
flake 8 3.5.0
numpy 2.0.0
sklearn 0.0
nltk 3.2.5
pandas 0.25.3
matplotlib 3.1.2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Files

README.md

Latest commit

History

README.md

File metadata and controls