Skip to content

evalvarez12/Spellchecker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Spell checker

Build Status

Simple spell correction and word prediction using a trie structure.

Algorithm

Word completion is accomplished by using what is called a trie, that is an organization of words using the letters as the nodes of a tree, which by following the one way edges one recovers the words1. Tries provide a simple yet effective way to do word completion by storing how many times each word is used in the tree itself, thus allowing for a prediction of the incomplete input based on the user history. To illustrate the main idea lets think of an example, imagine we have a word history input of these commands.

chmod  = 100
chown  = 55
cd     = 142
cat    = 32
cal    = 16

image

The image above represents the trie formed using that input history. The nodes with numbers on them represent the number of times the input of the word formed by following the tree from root to that node has been used by the user. Imagine the user enters the character c the results of auto completion are obtained by traversing the trie from the node c to all the child-nodes that have completed commands, then it returns all of them and their used number value to make a hierarchical suggestion of these based on that number. In these case that would be cd chmod chown cat cal.

The problem of spell checking or “fuzzy search” is for example when the user inputs the following chdwm we want our program to include chown as the first suggestion. To accomplish this we want to traverse in a smart way our tree by using a measure called the Levenshtein Distance2, which simply represents the number of characters we need to exchange on one word to get to another.

References

1: R. Sedgewick and K. Wayne,Tries,(http://algs4.cs.princeton.edu/lectures/52Tries.pdf)

2: N. Johnson,BK-Trees,(http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK-Trees)

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages