Predict Next Word

This is an N-gram language model that predicts the next word from precalculated conditional probabilities. The model is trained on bigrams from the COCA corpus, which can be downloaded from this link.
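
As a rough illustration of the idea, the conditional probability of a next word given the current word can be estimated as count(word, next) / count(word, anything). The sketch below is not the code from this repository: the function names and the toy counts are made up for the example, and it only shows how such probabilities could be derived from bigram counts.

```python
from collections import defaultdict


def conditional_probabilities(bigram_counts):
    """Map each first word to {next_word: P(next_word | first_word)}."""
    # Total number of bigram occurrences that start with each first word.
    totals = defaultdict(int)
    for (first, _second), count in bigram_counts.items():
        totals[first] += count

    # Normalise each bigram count by the total for its first word.
    probs = defaultdict(dict)
    for (first, second), count in bigram_counts.items():
        probs[first][second] = count / totals[first]
    return probs


def predict_next(probs, word, k=3):
    """Return up to k most probable next words for the given word."""
    candidates = probs.get(word.lower(), {})
    return sorted(candidates, key=candidates.get, reverse=True)[:k]


# Toy counts, invented for illustration only (not COCA data).
toy_counts = {("the", "cat"): 3, ("the", "dog"): 1, ("cat", "sat"): 2}
probs = conditional_probabilities(toy_counts)
print(predict_next(probs, "the"))  # ['cat', 'dog']
```

With the toy counts, "cat" follows "the" in 3 of 4 cases, so it ranks first with probability 0.75.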

Steps to run this script

Download the bigram dataset from the link provided above. You will need to register with an email address to do so. Three bigram files are available:

Case-insensitive - w2_.zip
Case-sensitive - w2.zip
Case-sensitive with POS tagging - w2c.zip

Only the case-insensitive zip (w2_.zip) is needed.

Extract the case-insensitive zip (w2_.zip). You will get the file w2_.txt. Put it in the same folder as the scripts 'getTopBigram.py' and 'bigramConditionalProbability.py'.
Make a new directory named pickleDumps.
Run python bigramConditionalProbability.py. The script uses the pickle module, which ships with Python's standard library, so no separate installation is needed. Using a virtual environment is still recommended. This command creates pickle dumps in the pickleDumps directory.
Run python getTopBigram.py and enjoy :p.
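
The parse-and-pickle step above could look roughly like the following sketch. It is not the code in bigramConditionalProbability.py. It assumes each line of w2_.txt has the layout count, first word, second word separated by whitespace, which should be checked against the downloaded file. The file name bigramCounts.pkl and all function names are hypothetical, and the demo writes to a temporary directory instead of reading the real dataset.

```python
import os
import pickle
import tempfile


def load_bigram_counts(path):
    """Parse lines assumed to look like 'count<TAB>word1<TAB>word2'."""
    counts = {}
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            parts = line.split()
            if len(parts) != 3 or not parts[0].isdigit():
                continue  # skip lines that do not match the assumed layout
            count, first, second = parts
            key = (first.lower(), second.lower())
            counts[key] = counts.get(key, 0) + int(count)
    return counts


def save_pickle(obj, directory, name):
    """Serialise obj into directory/name, creating the directory if needed."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)
    return path


def load_pickle(path):
    """Read back an object written by save_pickle."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Demo on a two-line sample file (invented data, assumed layout).
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "w2_sample.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("3\tthe\tcat\n1\tthe\tdog\n")
    dump_dir = os.path.join(workdir, "pickleDumps")
    dump_path = save_pickle(load_bigram_counts(sample), dump_dir, "bigramCounts.pkl")
    restored = load_pickle(dump_path)
    print(restored)
```

Pickling the parsed counts once means the prediction script can load them directly instead of re-reading the full text file on every run.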
