Kaustubh-Sable/EfficientN-grams

Efficient n-gram predictions

I. Original sources for the code base.

Dataset
republic.txt from Project Gutenberg. The text file should be placed in the respective notebook folders.
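
The file can be fetched with a short script like the sketch below. The Gutenberg ebook ID (1497, Plato's "The Republic") and the output filename are assumptions, not part of this repository:

    # A minimal download sketch; the ebook URL and filename are assumptions.
    import urllib.request

    URL = "https://www.gutenberg.org/cache/epub/1497/pg1497.txt"  # assumed ebook ID

    with urllib.request.urlopen(URL) as response:
        text = response.read().decode("utf-8")
    with open("republic.txt", "w", encoding="utf-8") as f:
        f.write(text)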


HMM
https://www.youtube.com/watch?v=s3kKlUBa3b0

Back-off Model
https://medium.com/@davidmasse8/predicting-the-next-word-back-off-language-modeling-8db607444ba9

LSTM Model
https://medium.com/coinmonks/word-level-lstm-text-generator-creating-automatic-song-lyrics-with-neural-networks-b8a1617104fb

II. HMM_Model.ipynb : Implemented methods to train the model and evaluate it against the baseline model.
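
The notebook's exact formulation is not reproduced here; as a rough illustration of Markov-style next-word prediction, a transition-count model can look like the sketch below (the notebook's HMM may additionally use hidden states):

    # A minimal sketch, not the notebook's code: next-word prediction from
    # word-to-word transition counts (a visible Markov model).
    from collections import Counter, defaultdict

    def train_transitions(tokens):
        transitions = defaultdict(Counter)
        for prev, curr in zip(tokens, tokens[1:]):
            transitions[prev][curr] += 1
        return transitions

    def predict_next(transitions, word):
        # Return the most frequent successor of `word`, or None if unseen.
        if word not in transitions:
            return None
        return transitions[word].most_common(1)[0][0]

    tokens = "the republic is a dialogue and the republic is long".split()
    model = train_transitions(tokens)
    print(predict_next(model, "republic"))  # -> "is"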

backoff_model.ipynb : Code to train the model was adapted from the Medium blog above. Implemented our own methods to preprocess the data and evaluate the model's predictions against our baseline model.
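
For readers unfamiliar with back-off, the sketch below shows the general idea using "stupid backoff" with a fixed 0.4 discount; the notebook follows the linked blog, so its exact scoring may differ:

    # A minimal "stupid backoff" sketch (trigram -> bigram -> unigram, with a
    # 0.4 discount per back-off step); the notebook's scoring may differ.
    from collections import Counter

    def ngram_counts(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    def build_model(tokens):
        return {n: ngram_counts(tokens, n) for n in (1, 2, 3)}

    def score(model, context, word, alpha=0.4):
        # Try the longest available context first, backing off with a discount.
        for n in (3, 2):
            hist = tuple(context[-(n - 1):])
            if model[n - 1][hist] > 0 and model[n][hist + (word,)] > 0:
                return alpha ** (3 - n) * model[n][hist + (word,)] / model[n - 1][hist]
        return alpha ** 2 * model[1][(word,)] / sum(model[1].values())

    def predict_next(model, context):
        vocab = [w for (w,) in model[1]]
        return max(vocab, key=lambda w: score(model, context, w))

    tokens = "to be or not to be that is the question".split()
    model = build_model(tokens)
    print(predict_next(model, ["or", "not"]))  # -> "to"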

word_level_lstm_config_1.ipynb : Model architecture was adapted from the Medium blog above. Implemented methods to preprocess the data, train the model, and evaluate it against the baseline model. To train the model under different network configurations, run word_level_lstm_config_2.ipynb and word_level_lstm_config_3.ipynb.
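
The general shape of a word-level LSTM language model in Keras is sketched below; the sequence length, embedding size, and layer widths here are illustrative assumptions, not the configurations used in the notebooks:

    # A minimal word-level LSTM sketch; all hyperparameters are assumptions.
    import numpy as np
    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, Dense
    from keras.preprocessing.text import Tokenizer
    from keras.utils import to_categorical

    text = open("republic.txt", encoding="utf-8").read().lower()
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts([text])
    encoded = tokenizer.texts_to_sequences([text])[0]
    vocab_size = len(tokenizer.word_index) + 1

    seq_len = 10  # assumed context window
    windows = np.array([encoded[i - seq_len:i + 1]
                        for i in range(seq_len, len(encoded))])
    X = windows[:, :-1]                                      # input word IDs
    y = to_categorical(windows[:, -1], num_classes=vocab_size)  # next word

    model = Sequential([
        Embedding(vocab_size, 50, input_length=seq_len),
        LSTM(100),
        Dense(vocab_size, activation="softmax"),
    ])
    model.compile(loss="categorical_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    model.fit(X, y, batch_size=128, epochs=10)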

III. Software Requirements:

Trained and tested on Google Colab.

Python 3.6
Keras 2.1.5
tensorflow-gpu 1.14
flake8 3.5.0
numpy 2.0.0
sklearn 0.0
nltk 3.2.5
pandas 0.25.3
matplotlib 3.1.2
