This project classifies malware. The Spark version may need to be run on Google Dataproc, using a Spark kernel with Python 3.
This Spark program implements the Naive Bayes algorithm as presented in Dr. Quinn's lecture slides. It assumes that all features of an item are independent of one another given the item's class.
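
To illustrate the independence assumption, here is a minimal sketch of a Naive Bayes classifier in plain Python (no Spark). It is not the project's actual code: the function names `train` and `predict`, the use of hex-byte tokens as features, the toy data, and the add-one smoothing are all assumptions made for illustration. Because the features are treated as independent given the class, the score for a class is simply the log prior plus the sum of per-token log likelihoods.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels):
    """Count documents per class and tokens per class (hypothetical helper)."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def predict(tokens, class_counts, token_counts, vocab):
    """Return the class with the highest log posterior score."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: fraction of training documents in this class.
        score = math.log(n_docs / total_docs)
        total_tokens = sum(token_counts[label].values())
        for tok in tokens:
            # Independence assumption: each token contributes its own
            # likelihood term. Add-one smoothing (an assumption here)
            # keeps unseen tokens from producing log(0).
            p = (token_counts[label][tok] + 1) / (total_tokens + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (made up): each document is a list of hex-byte tokens.
docs = [["00", "ff", "ff"], ["00", "00", "a1"], ["ff", "ff", "b2"]]
labels = [1, 2, 1]
model = train(docs, labels)
print(predict(["ff", "ff"], *model))
print(predict(["00", "a1"], *model))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied. In a Spark version, the counting in `train` would typically become a distributed aggregation, but that mapping is not shown here.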
Yulong, Md Redwan Islam, Ruili