This repository contains the material used in the experiments for the paper "Benchmarking Apache Spark and Hadoop MapReduce on Big Data Classification", which was submitted to the 5th International Conference on Cloud and Big Data Computing (ICCBDC'21) in Liverpool, UK. The paper is available via its DOI link.
Script that implements a Naive Bayes classifier using Spark's MLlib library.
Shell script that runs Mahout's Naive Bayes implementations.
Script that converts datasets from libsvm format to Hadoop's SequenceFile format.
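
For readers unfamiliar with the libsvm layout that the conversion script takes as input, the following is a minimal sketch of how a single line can be parsed. It is not the repository's conversion code: the function name `parse_libsvm_line` and the sample line are hypothetical, and the sketch assumes the common layout of a numeric label followed by space-separated `index:value` pairs. Writing the parsed records out as a SequenceFile is not shown.

```python
def parse_libsvm_line(line):
    """Parse one libsvm-style line into (label, {feature_index: value}).

    Assumes the layout "<label> <index>:<value> <index>:<value> ...".
    Returns None for blank lines.
    """
    line = line.strip()
    if not line:
        return None

    tokens = line.split()
    # The first token is the class label.
    label = float(tokens[0])

    # Remaining tokens are sparse features; absent indices are implicitly zero.
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)

    return label, features


if __name__ == "__main__":
    # Hypothetical sample line, for illustration only.
    sample = "1 3:0.5 7:1.25 10:2"
    print(parse_libsvm_line(sample))
```

Keeping the features in a dictionary preserves the sparsity of the input, which is usually what a downstream key/value representation needs.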