This repository contains the code for my second assignment in the elective course Programming Languages.
- The project implements Linear Regression, Decision Tree Regressor, Random Forest Regressor and K-Nearest Neighbors Regressor models and evaluates them on the "Bioconcentration Factor Dataset" using the performance metrics MSE, R², MAE and MAPE.
- Three different neural network architectures were developed and trained on the same dataset. The models were evaluated on both the training and test sets, and a detailed hyperparameter tuning process was carried out.
- Dimensionality reduction was performed using PCA and t-SNE, and KMeans clustering was applied to the reduced data.
- A Random Forest surrogate model was developed to approximate the relationship between the input features and the bioconcentration factor (BCF). The surrogate was then used for input optimization, searching for feature values that minimize the predicted BCF.
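
As a rough illustration of the regression comparison, the sketch below fits the four regressors with scikit-learn and reports MSE, R², MAE and MAPE on a held-out split. It is not the assignment code: the data is synthetic (generated with `make_regression`), and the `evaluate` helper, the split ratio and all hyperparameters are assumptions chosen only for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the project itself uses the BCF dataset.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
y = y + 500.0  # shift targets away from zero so MAPE is not dominated by tiny values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are illustrative defaults, not the tuned values.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "K-Nearest Neighbors": KNeighborsRegressor(n_neighbors=5),
}


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit the model on the training split and score it on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "MSE": mean_squared_error(y_test, pred),
        "R2": r2_score(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "MAPE": mean_absolute_percentage_error(y_test, pred),
    }


results = {
    name: evaluate(model, X_train, y_train, X_test, y_test)
    for name, model in models.items()
}

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Note that scikit-learn returns MAPE as a fraction rather than a percentage, so multiply by 100 if percentages are wanted.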
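
The dimensionality reduction and clustering step could look roughly like the following. This is a minimal sketch on synthetic blobs, assuming standardized features, two PCA components and three clusters; none of these choices are taken from the assignment. t-SNE could be substituted for PCA via `sklearn.manifold.TSNE`, which exposes the same `fit_transform` method.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 200 samples, 10 features, 3 groups.
X, _ = make_blobs(n_samples=200, n_features=10, centers=3, random_state=0)

# Standardize so that no single feature dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components.
pca = PCA(n_components=2, random_state=0)
X_2d = pca.fit_transform(X_scaled)

# Cluster the reduced data; n_clusters=3 is an assumption for this example.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_2d)

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Cluster sizes:", [int((labels == k).sum()) for k in range(3)])
```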
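
One way to picture the surrogate-based input optimization is sketched below. The README does not state which optimizer was used, so this example assumes a simple random search within the observed feature bounds; the synthetic target, the number of candidates and the model settings are likewise assumptions. A search-based method is a natural fit here because a Random Forest prediction is piecewise constant, which rules out gradient-based optimizers.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 3 features, target minimized near the origin.
X = rng.uniform(-2.0, 2.0, size=(400, 3))
y = (X ** 2).sum(axis=1) + rng.normal(0.0, 0.1, size=400)

# Surrogate model approximating the feature -> target relationship.
surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
surrogate.fit(X, y)

# Random search (assumed optimizer): sample candidates inside the observed
# feature ranges so the surrogate is never asked to extrapolate.
lower, upper = X.min(axis=0), X.max(axis=0)
candidates = rng.uniform(lower, upper, size=(5000, 3))
predictions = surrogate.predict(candidates)

best_index = int(np.argmin(predictions))
best_input = candidates[best_index]
best_prediction = predictions[best_index]

print("Best input found:", np.round(best_input, 3))
print("Predicted target at best input:", round(float(best_prediction), 3))
```

Keeping the candidates within the training ranges matters because tree ensembles cannot extrapolate beyond the data they were fitted on.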