The goal of this work is to take one or more Bengali words as input and predict the next most likely word, as well as suggest the full possible sentence as output. A Recurrent Neural Network (RNN) with Gated Recurrent Units (GRU) was used to train and create the model. A dataset of about 200,000 words was used. The dataset has been…

OmorFarukRakib/Bangla-Word-Prediction-System

About the work

Textual information exchange, typing information and sending it to the other end, is one of the most prominent mediums of communication throughout the world. People spend a lot of time sending emails or posting on social networking sites, where typing out the whole message is redundant and time-consuming in this advanced era. To make textual information exchange faster and easier, word-predictive systems have been introduced that predict the next most likely word, so that people do not have to type it but can select it from the suggested words. In this study, we propose a method that predicts the next most appropriate and suitable word in the Bangla language and can also suggest the corresponding sentence, contributing to this technology of word prediction systems. The proposed approach uses a GRU (Gated Recurrent Unit) based RNN (Recurrent Neural Network) on an n-gram dataset to create language models that can predict the word(s) following the input sequence. We used a corpus dataset collected from different Bangla-language sources to run the experiments. Compared with other methods that have been used, such as an LSTM (Long Short-Term Memory) based RNN on an n-gram dataset and Naïve Bayes with Latent Semantic Analysis, our proposed approach gives better performance. It achieves an average accuracy of 99.70% for the 5-gram model, 99.24% for the 4-gram model, 95.84% for the tri-gram model, and 78.15% and 32.17% for the bi-gram and uni-gram models, respectively.
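The repository's actual model code is not shown in this README. As an illustrative sketch only, the forward pass of a GRU-based next-word predictor can be written in plain NumPy as below; all class and variable names, weight shapes, and the random initialization are assumptions for illustration, and the weights are untrained (a real model would be trained on the n-gram examples described later):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyGRUPredictor:
    """Minimal GRU cell rolled over a token sequence (forward pass only)."""

    def __init__(self, vocab_size, embed_dim=8, hidden_dim=16):
        self.E = rng.normal(scale=0.1, size=(vocab_size, embed_dim))   # embeddings
        # GRU weights: update gate z, reset gate r, candidate state h~
        self.Wz = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))
        self.Uz = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        self.Wr = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))
        self.Ur = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        self.Wh = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))
        self.Uh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        self.Wo = rng.normal(scale=0.1, size=(hidden_dim, vocab_size))  # output layer

    def predict_next(self, token_ids):
        """Return a probability distribution over the vocabulary for the next word."""
        h = np.zeros(self.Uz.shape[0])
        for t in token_ids:
            x = self.E[t]
            z = sigmoid(x @ self.Wz + h @ self.Uz)          # update gate
            r = sigmoid(x @ self.Wr + h @ self.Ur)          # reset gate
            h_cand = np.tanh(x @ self.Wh + (r * h) @ self.Uh)
            h = (1 - z) * h + z * h_cand                    # interpolate old/new state
        logits = h @ self.Wo
        p = np.exp(logits - logits.max())                   # stable softmax
        return p / p.sum()
```

In practice one would of course use a framework layer such as Keras's `GRU` rather than hand-rolling the cell; the sketch only makes the gating equations concrete.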

Dataset

We have collected about 200,000 words of data from various trusted sources such as the Daily Prothom Alo, Bangla academic books, BBC News Bangla, and others.
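The abstract mentions training n-gram models (uni-gram through 5-gram) on this corpus. The authors' preprocessing code is not included here; as an illustration under that assumption, turning a tokenized corpus into (context, next-word) training pairs might look like:

```python
def make_ngram_examples(tokens, n):
    """Slide an n-word window over the corpus:
    the first n-1 words form the context, the last word is the prediction target."""
    examples = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        examples.append((tuple(window[:-1]), window[-1]))
    return examples

# toy whitespace-tokenized Bangla sentence (illustrative, not from the actual corpus)
tokens = ["আমি", "ভাত", "খাই", "।"]
trigram_pairs = make_ngram_examples(tokens, n=3)
# each pair: ((w1, w2), w3)
```

For a 5-gram model the same function is called with `n=5`; the context tuples are then mapped to integer ids and fed to the network.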

See our work in action

We have created a GUI using Python, and this interface shows our work in action.
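Behind such an interface, the sentence-completion feature described in the abstract can be driven by repeatedly feeding the model its own most likely prediction. A greedy-completion sketch follows; the toy lookup table merely stands in for the trained GRU model, and all names here are illustrative assumptions rather than the repository's actual code:

```python
def complete_sentence(context, predict_next, max_words=10, stop="।"):
    """Greedily append the most likely next word until the danda (।) or a length cap."""
    words = list(context)
    for _ in range(max_words):
        probs = predict_next(words)          # dict: candidate word -> probability
        next_word = max(probs, key=probs.get)
        words.append(next_word)
        if next_word == stop:
            break
    return " ".join(words)

# toy bigram-style table standing in for the trained model's output distribution
toy_model = {
    "আমি": {"ভাত": 0.7, "যাই": 0.3},
    "ভাত": {"খাই": 0.9, "।": 0.1},
    "খাই": {"।": 1.0},
}

def toy_predict(words):
    return toy_model.get(words[-1], {"।": 1.0})

print(complete_sentence(["আমি"], toy_predict))  # → আমি ভাত খাই ।
```

A GUI would call `complete_sentence` with whatever the user has typed so far and display the suggestion for one-click acceptance.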

Example 1: [interface screenshot]

Example 2: [interface screenshot]

Paper

We also published a paper based on our work. To understand the work in depth, please see the paper.

Title: Bangla Word Prediction and Sentence Completion Using GRU: An Extended Version of RNN on N-gram Language Model

Find our paper through the links below:
Click to find our paper on IEEE Xplore OR Click to find our paper on ResearchGate
