Despite having no time whatsoever, I have to do an entire project to finish my degree. Here it is.

PabloAceG/ComputingProject

Introduction

This repository contains my dissertation. I presented this work as my final year project.

The paper examines how datasets are affected by the values of complexity metrics, and how techniques that try to mitigate the effect of some of those metrics change the evaluation results.

Pre-requisites

Python 3 is required to run the experiments in this repository. Either Anaconda or the Python distribution from the official repositories can be used, as long as it is version 3 or later. On some operating systems the experiments may fail with an error, typically because the default python command points to a different version. If this happens, the default interpreter can be changed (on Linux) with:

sudo update-alternatives --config python
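Before switching interpreters, it can help to confirm which Python the python command actually starts. The following is a minimal sketch, not part of the repository: the helper name is_supported and the constant MIN_VERSION are illustrative, and the (3, 0) threshold only mirrors the "version 3 or later" requirement stated above. It deliberately avoids newer syntax so that it also runs on an old interpreter and can report the problem.

```python
import sys

# Threshold taken from the README's "version 3 or later"; adjust if the
# packages in requirements.txt need something newer (an assumption to verify).
MIN_VERSION = (3, 0)


def is_supported(version_info=None, minimum=MIN_VERSION):
    """Return True when the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Report which interpreter is running this script and whether it qualifies.
print("%s is Python %d.%d: %s" % (
    sys.executable,
    sys.version_info[0],
    sys.version_info[1],
    "supported" if is_supported() else "NOT supported",
))
```

Saving this as a file and running it with the python command shows the interpreter path and version that the experiment commands would use.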

R (the programming language) must be installed before running the project. The R packages that the experiments rely on are also required; the rest of this README refers to Rserve and ECoL.

Once these prerequisites are met, start the R server by running the following commands in an R session:

library(Rserve) # import the library
run.Rserve() # start the server. Or simply Rserve()
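Since the Python experiments depend on the R server being up, a quick reachability check can save a confusing failure later. The sketch below is an illustration rather than code from the repository: the function name rserve_is_reachable is hypothetical, and it assumes the server listens on localhost at Rserve's default port, 6311. It only confirms that something accepts TCP connections on that port, not that the listener is really Rserve.

```python
import socket

RSERVE_DEFAULT_PORT = 6311  # Rserve's default TCP port


def rserve_is_reachable(host="localhost", port=RSERVE_DEFAULT_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection tries every address the host name resolves to.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

Calling rserve_is_reachable() before launching an experiment gives a simple yes/no answer; if it returns False, the R commands above probably still need to be run.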

Next, download the project so that the remaining Python packages can be installed. The project can be downloaded from this repository.

As with R, some Python packages are required to run the project. They are listed in the requirements.txt file. To install them automatically, run the following (execute all commands from the repository root):

pip install -r code/requirements.txt

pip install -r may fail to install some of the packages. If so, install the failing packages manually:

pip install <package_name>
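To find out which packages still need a manual install, the requirements file can be compared against what is already installed. This is a rough sketch under a few assumptions: the helper names requirement_names and missing_packages are made up for illustration, the parser only handles simple lines such as name==version (not URLs or option lines), and importlib.metadata requires Python 3.8 or later.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract distribution names from simple requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, passing the lines of code/requirements.txt through requirement_names and then missing_packages yields the names to hand to pip install one at a time.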

Execution

The experiments should now be reproducible. Their code is in the code folder. To run them, execute:

python code/metrics_comparison.py
python code/metrics_kfold.py
python code/metrics_kfold_undersampling.py
python code/metrics_kfold_oversampling.py

Each of the previous commands executes one experiment.
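The four commands can also be run in sequence from a small driver script. The sketch below is a convenience wrapper that is not part of the repository; the names EXPERIMENTS, build_commands, and run_all are hypothetical, and it assumes it is started from the repository root so that the code folder resolves correctly.

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in the README, in the same order.
EXPERIMENTS = [
    "metrics_comparison.py",
    "metrics_kfold.py",
    "metrics_kfold_undersampling.py",
    "metrics_kfold_oversampling.py",
]


def build_commands(code_dir="code", python=sys.executable):
    """Return one command (as an argument list) per experiment script."""
    return [[python, str(Path(code_dir) / name)] for name in EXPERIMENTS]


def run_all(code_dir="code"):
    """Run every experiment in order, stopping at the first failure."""
    for command in build_commands(code_dir):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
```

Calling run_all() launches the experiments one after another with the same interpreter that runs the driver, which avoids mixing Python versions between runs.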

As a final remark, r_connect.py implements the client that connects to the R server (Rserve). It sends requests to the ECoL package to obtain the complexity metrics.
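The client and server exchange described here can be pictured roughly as follows. This is only a sketch and may differ from what r_connect.py actually does: it assumes the pyRserve client library (the README does not say which client is used), the function names build_complexity_call and request_complexity are hypothetical, and the iris dataset with the overlapping group is just a stand-in example. The R side uses the formula interface of ECoL's complexity function.

```python
def build_complexity_call(formula, data_name, groups="all"):
    """Build the R source text for an ECoL complexity() call."""
    return 'ECoL::complexity({}, {}, groups="{}")'.format(formula, data_name, groups)


def request_complexity(formula="Species ~ .", data_name="iris",
                       groups="overlapping", host="localhost", port=6311):
    """Evaluate the ECoL call on a running Rserve and return the result."""
    # Assumption: pyRserve is the client library. It is imported here so
    # that build_complexity_call stays usable without it installed.
    import pyRserve

    conn = pyRserve.connect(host=host, port=port)
    try:
        # The R expression runs on the server; pyRserve converts the result.
        return conn.eval(build_complexity_call(formula, data_name, groups))
    finally:
        conn.close()
```

With Rserve running and ECoL installed on the R side, request_complexity() would return the measures of the requested group for the example dataset.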

data.py standardizes the dataset input (parsing the data) and provides some additional metrics from the sklearn package.
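As an illustration of what standardizing the dataset input can mean, the sketch below splits CSV text into a feature matrix and a label vector. It is not the code in data.py: the function name parse_dataset is hypothetical, and it assumes a header row, numeric feature columns, and the class label in the last column. It uses only the standard library; the resulting lists have the samples-by-features layout that sklearn functions generally accept.

```python
import csv
import io


def parse_dataset(text, label_column=-1):
    """Split CSV text with a header row into feature names, X, and y."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header, body = rows[0], rows[1:]
    label_idx = label_column % len(header)  # allow negative indices
    feature_names = [h for i, h in enumerate(header) if i != label_idx]
    # Features are converted to float; labels are kept as strings.
    X = [[float(v) for i, v in enumerate(row) if i != label_idx] for row in body]
    y = [row[label_idx] for row in body]
    return feature_names, X, y
```

Keeping the parsing in one place means every experiment receives its data in the same shape, regardless of how the original file was laid out.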
