Mining of massive datasets

A Python implementation of the Apriori, PCY, Multistage and Multihash algorithms for finding frequent itemsets.

To run a particular algorithm, cd into that algorithm's directory and run 'python index.py'. Each index.py runs all the passes of its algorithm and prints the result of each pass (the item index table, the frequent k-sets, and so on). The given sample dataset needs no more than 3 passes, so the run stops after checking the candidate tripletons (itemsets of size 3).
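As an illustration of the pass structure described above, here is a minimal sketch of Apriori-style passes in Python. It is not the code in index.py: the function name `frequent_itemsets`, the `support` and `max_k` parameters, and the sample baskets are all hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(baskets, support, max_k=3):
    """Return {k: {itemset: count}} for k = 1..max_k (hypothetical helper)."""
    baskets = [frozenset(b) for b in baskets]
    result = {}

    # Pass 1: count single items and keep those meeting the support threshold.
    counts = Counter(item for basket in baskets for item in basket)
    frequent = {frozenset([i]): c for i, c in counts.items() if c >= support}
    result[1] = frequent

    for k in range(2, max_k + 1):
        if not frequent:
            break  # no frequent (k-1)-sets, so no candidates of size k
        prev = set(frequent)
        items = sorted({i for s in prev for i in s})
        # Candidate k-sets: every (k-1)-subset must itself be frequent.
        candidates = [
            frozenset(c)
            for c in combinations(items, k)
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        ]
        # Pass k: count each candidate across all baskets.
        counts = Counter()
        for basket in baskets:
            for cand in candidates:
                if cand <= basket:
                    counts[cand] += 1
        frequent = {s: c for s, c in counts.items() if c >= support}
        result[k] = frequent
    return result


# Made-up sample data, not the dataset shipped with the repository.
baskets = [
    {"bread", "milk", "beer"},
    {"bread", "milk"},
    {"bread", "beer"},
    {"milk", "beer"},
    {"bread", "milk", "beer"},
]
result = frequent_itemsets(baskets, support=3)
for k, sets in result.items():
    print(k, sorted((sorted(s), c) for s, c in sets.items()))
```

With a support threshold of 3, every single item and every pair in this toy data is frequent, while the one candidate tripleton appears in only two baskets and is dropped in the third pass. PCY, Multistage and Multihash follow the same pass structure but use hash-bucket counts during earlier passes to prune the candidate pairs.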

Reference: Mining of Massive Datasets by Anand Rajaraman and Jeffrey D. Ullman
