Approach:
- mined 337 topics
- for each topic, mined 100 documents
- counted the words in each document
- kept a running total of all words across the corpus
- kept an individual log of per-document word counts
- calculated probabilities in a Markov fashion
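The counting and probability steps above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function names are hypothetical, whitespace tokenization is assumed, and "Markov fashion" is read as bigram transition probabilities P(next word | current word).

```python
from collections import Counter, defaultdict

def build_counts(documents):
    """Count words per document while keeping a running corpus total.

    `documents` is a list of raw text strings; the notes' corpus layout
    (337 topics x 100 documents each) would feed into this the same way.
    """
    total = Counter()    # running total of all words
    per_doc = []         # individual log of per-document word counts
    for doc in documents:
        counts = Counter(doc.lower().split())
        per_doc.append(counts)
        total.update(counts)
    return total, per_doc

def markov_probabilities(documents):
    """Estimate P(next | current) from adjacent word pairs (bigrams)."""
    transitions = defaultdict(Counter)
    for doc in documents:
        words = doc.lower().split()
        for cur, nxt in zip(words, words[1:]):
            transitions[cur][nxt] += 1
    return {
        cur: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for cur, nexts in transitions.items()
    }

# Toy corpus, purely for illustration.
docs = ["the cat sat", "the cat ran", "a dog ran"]
total, per_doc = build_counts(docs)
probs = markov_probabilities(docs)
print(total["the"])         # 2
print(probs["cat"]["sat"])  # 0.5
```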
Current Results:
- currently running tests (ostensibly working)
Usage:
- the main use case is disambiguating text computationally at large scale
- ideally, this module will serve as a hook for larger-scale purposes