This algorithm could be used to build a list of word string combinations as a basic approach to language recognition.
word -> apple
string combos -> [app, ap, le, ple, ... ]
word -> house
string combos -> [ho, se, ous, ... ]
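As a rough illustration of the examples above, the string combinations of a word can be generated as its contiguous substrings. This is only a sketch under assumptions: the function name substrings and the min_len parameter are hypothetical, and the minimum length of 2 is inferred from the examples (ap, le, ho, se) rather than stated by the project.

```python
def substrings(word, min_len=2):
    """Return every contiguous substring of `word` with at least `min_len` characters.

    Assumption: combos are contiguous slices of length >= 2, as the
    examples suggest. The whole word itself is included as the longest slice.
    """
    n = len(word)
    return [word[i:j] for i in range(n) for j in range(i + min_len, n + 1)]


combos = substrings("apple")
```

Here combos contains app, ap, le and ple, along with the other slices of "apple".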
The data.txt file is used to build a pattern of word behaviour. It's very simple: if a string combination is present in a word, it moves up the list.
Pretty basic machine learning.
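The "moves up the list" idea could be sketched as a frequency count: each string combination gains one point for every word it appears in, and the list is sorted by that score. This is a hedged sketch, not the project's actual code. The names load_words and rank_combos are hypothetical, data.txt is assumed to hold whitespace-separated words, and counting a combination once per word is one possible reading of the rule.

```python
from collections import Counter


def load_words(path):
    """Read whitespace-separated words from a text file (assumed format of data.txt)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().lower().split()


def rank_combos(words, min_len=2):
    """Rank string combinations by the number of words that contain them."""
    counts = Counter()
    for word in words:
        n = len(word)
        # A set, so a combo that repeats inside one word is counted once for that word.
        combos = {word[i:j] for i in range(n) for j in range(i + min_len, n + 1)}
        counts.update(combos)
    # most_common() sorts from highest count to lowest.
    return counts.most_common()


ranked = rank_combos(["apple", "apply", "house"])
```

With a real word list, rank_combos(load_words("data.txt")) would give the ranking; combinations shared by many words, such as ap and app in the sample above, end up near the top.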