Skip to content

Latest commit

 

History

History
42 lines (26 loc) · 1.31 KB

README.md

File metadata and controls

42 lines (26 loc) · 1.31 KB

I. Original source for the code base.

Dataset
republic.txt from project gutenberg. The text file should be in respective folders.


HMM
https://www.youtube.com/watch?v=s3kKlUBa3b0 

Back-off Model
 https://medium.com/@davidmasse8/predicting-the-next-word-back-off-language-modeling-8db607444ba9   

LSTM Model
https://medium.com/coinmonks/word-level-lstm-text-generator-creating-automatic-song-lyrics-with-neural-networks-b8a1617104fb  

II. HMM_Model.ipynb : Implemented methods to train and evaluate the model with the baeline model.

backoff_model.ipynb : Code to train the model was referred from the medium blog. Implemented own methods to preprocess the data and evaluate the predictions made by the
					  model with our baseline model.

word_level_lstm_config_1.ipynb : Referred model architecture from the medium Blog. Implemented methods to preprocess data, train the model and evaluate the model with the baseline model.
								 To train the model under different network configurations run word_level_lstm_config_2.ipynb and word_level_lstm_config_3.ipynb.

III. Software Requirements:

Trained and tested on google collab.

Python 3.6	
Keras 2.1.5  
tensorflow-gpu 1.14
flake 8 3.5.0
numpy 2.0.0
sklearn 0.0
nltk 3.2.5
pandas 0.25.3
matplotlib 3.1.2