Mining of massive datasets

A Python implementation of the Apriori, PCY, Multistage, and Multihash algorithms for finding frequent itemsets.

To run a particular algorithm, cd into its directory and run 'python index.py'. index.py contains all the passes of the algorithm and prints the result of each pass (e.g., the item index table and the frequent k-itemsets). The given sample dataset needs no more than 3 passes, so the run stops after checking the candidate tripletons (itemsets of size 3).
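To illustrate what the passes compute, here is a minimal sketch of Apriori that stops after the third pass. It is not the code in index.py: the function name `apriori`, its parameters `baskets`, `support`, and `max_k`, and the sample baskets are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def apriori(baskets, support, max_k=3):
    """Return {k: {itemset: count}} for the frequent itemsets of size 1..max_k.

    Hypothetical helper for illustration; not taken from index.py.
    """
    # Pass 1: count single items and keep those that meet the support threshold.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent = {
        1: {frozenset([item]): n for item, n in item_counts.items() if n >= support}
    }

    for k in range(2, max_k + 1):
        prev = frequent[k - 1]
        if not prev:
            break  # nothing frequent at size k-1, so nothing can be frequent at size k
        # Only items that occur in some frequent (k-1)-itemset can appear in a candidate.
        useful_items = set().union(*prev)
        counts = Counter()
        for basket in baskets:
            items = sorted(set(basket) & useful_items)
            for candidate in combinations(items, k):
                # Monotonicity: every (k-1)-subset of a candidate must be frequent.
                if all(frozenset(sub) in prev for sub in combinations(candidate, k - 1)):
                    counts[frozenset(candidate)] += 1
        # Pass k: keep only the candidates that meet the support threshold.
        frequent[k] = {itemset: n for itemset, n in counts.items() if n >= support}
    return frequent


# Made-up sample data, not the dataset shipped with the repository.
sample_baskets = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
]
result = apriori(sample_baskets, support=3)
for k, itemsets in result.items():
    print(k, {tuple(sorted(s)): n for s, n in itemsets.items()})
```

With these baskets and a support threshold of 3, every pair is frequent but the single candidate tripleton occurs only twice, so the third pass finds nothing. PCY, Multistage, and Multihash follow the same pass structure and differ in how they use hash buckets to cut down the candidate pairs counted in the second pass.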

Reference: Mining of Massive Datasets by Anand Rajaraman and Jeffrey D. Ullman