MPI-ML

MPI-ML is a parallel implementation of a classification task.
MPI-ML runs several different classifiers on a given dataset (loaded from a CSV file).
The classifiers are trained in parallel on the same dataset.

The testing dataset is divided equally and distributed among the processes.
Each process runs the classification task on the part of the testing data it receives.
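
The split and distribution described above could be sketched roughly as follows. This is a minimal illustration, not code taken from main.py: `split_evenly` and `distribute` are hypothetical helper names, the placeholder list stands in for rows loaded from the CSV file, and the sketch falls back to a single process when mpi4py is not importable.

```python
def split_evenly(rows, n_parts):
    """Split rows into n_parts contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(rows), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional row each.
        stop = start + base + (1 if i < extra else 0)
        chunks.append(rows[start:stop])
        start = stop
    return chunks


def distribute(rows, comm=None):
    """Return the chunk of rows assigned to the calling process."""
    if comm is None:
        # No MPI communicator: the single process keeps all rows.
        return rows
    # Only the root process prepares the chunks; scatter sends one to each rank.
    chunks = split_evenly(rows, comm.Get_size()) if comm.Get_rank() == 0 else None
    return comm.scatter(chunks, root=0)


if __name__ == "__main__":
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
    except ImportError:
        comm = None  # assumption: run without MPI for a quick local check

    test_rows = list(range(10))  # placeholder for the real testing data
    part = distribute(test_rows, comm)
    rank = comm.Get_rank() if comm else 0
    print(f"rank {rank} received {len(part)} rows")
```

Launched with mpirun, each rank prints the size of its own chunk; launched directly, the single process keeps every row.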

The project was created to compare the performance and accuracy of different classifiers using the Message Passing Interface (MPI) in Python.
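
One possible way to collect such a comparison is to have every process report how many of its test rows it classified correctly, along with the elapsed time, and to combine the counts on the root process. The sketch below assumes this approach; `count_correct`, `combined_accuracy`, and `timed` are hypothetical helpers, and the label lists are made-up placeholders rather than project data.

```python
import time


def count_correct(predicted, expected):
    """Return (number of matching labels, number of labels) for one chunk."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct, len(expected)


def combined_accuracy(partials):
    """Combine per-process (correct, total) pairs into one accuracy value."""
    correct = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return correct / total if total else 0.0


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
    except ImportError:
        comm = None  # assumption: single-process fallback without MPI

    predicted = [1, 0, 1, 1]  # placeholder predictions for this process
    expected = [1, 1, 1, 0]   # placeholder true labels for this process
    partial, elapsed = timed(count_correct, predicted, expected)

    # gather returns the full list on the root process and None elsewhere.
    partials = comm.gather(partial, root=0) if comm else [partial]
    if partials is not None:
        print(f"accuracy: {combined_accuracy(partials):.2f}, "
              f"root evaluation time: {elapsed:.6f} s")
```

Summing counts before dividing keeps the result correct even when the chunks differ in size by one row.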

Dependencies

MPI
numpy
pandas
scikit-learn (imported as sklearn)
mpi4py

Build & Run

Install MPI (e.g. on Ubuntu)

sudo apt install libmpich-dev

Install Python dependencies

sudo pip3 install scikit-learn pandas numpy mpi4py

Run

mpirun -n 4 python3 main.py

Note:
main.py must be launched with mpirun for the execution to be parallel. Otherwise only one process is created, and as a result only one classifier is run.

The number of processes used for computation (4 in the example) depends on the number of classifiers you want to run in parallel.
The current version contains four classifiers: KNeighborsClassifier, DecisionTreeClassifier, MLPClassifier, and SVC, so 4 processes are used for computation.
If you want to run more classifiers in parallel, you may want to use more processes, depending on your hardware.
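
The relationship between processes and classifiers can be pictured as a simple rank-to-classifier mapping. The sketch below is an assumption about how such a mapping might look, not the code from main.py: `classifier_name_for_rank` and `build_classifier` are hypothetical helpers, and returning None for surplus ranks is an illustrative choice.

```python
CLASSIFIER_NAMES = [
    "KNeighborsClassifier",
    "DecisionTreeClassifier",
    "MLPClassifier",
    "SVC",
]


def classifier_name_for_rank(rank, names=CLASSIFIER_NAMES):
    """Rank i handles the i-th classifier; ranks beyond the list get None."""
    return names[rank] if 0 <= rank < len(names) else None


def build_classifier(name):
    """Instantiate a scikit-learn classifier by class name with default settings."""
    # Imported lazily so the mapping above works without scikit-learn installed.
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    registry = {
        "KNeighborsClassifier": KNeighborsClassifier,
        "DecisionTreeClassifier": DecisionTreeClassifier,
        "MLPClassifier": MLPClassifier,
        "SVC": SVC,
    }
    return registry[name]()


if __name__ == "__main__":
    try:
        from mpi4py import MPI
        rank = MPI.COMM_WORLD.Get_rank()
    except ImportError:
        rank = 0  # assumption: single-process fallback without MPI

    name = classifier_name_for_rank(rank)
    print(f"rank {rank} -> {name if name else 'no classifier assigned'}")
```

Under this mapping, adding a classifier means appending its name to the list and raising the value passed to `mpirun -n` by one.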
