jhu-statprogramming-fall-2024/project4-team-bean

Bean There, Sorted That: Automating Quality Classification

Final Project for PH.140.777 Statistical Programming Paradigms and Workflows

Authors (Team BEAN): Bella Satpathy-Horton, Emily Potts, Asabere Asante, and Nowell Phelps

This repository contains the code for our machine learning models (models folder), which feed the final dashboard (final dashboard folder), along with supplementary web scraping and data visualization (data visualization folder). The main dataset for bean classification comes from the beans package in R and was originally published by Koklu and Ozkan [1]. The repository is structured as follows.

Data Visualization Folder

  • Visualization.qmd and Webscraping_and_data_visualization.qmd

    • Bean production data scraped from the web, with visualizations of global and regional bean production trends.

Model Folder

  • Emily_ML.qmd

    • This file contains cross-validation for the Random Forest and XGBoost models, construction of the ensemble model, and bundling of the model objects.
  • Bella_ML.qmd

    • This file contains cross-validation for the Classification Tree, Lasso, PCA, and SVM models.
  • Model_metrics.qmd

    • AUC curves, PCA component visualization, and accuracy by bean type and model.
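
For readers who want a feel for the cross-validation step in the model files, here is a minimal sketch. It is not the code in Emily_ML.qmd or Bella_ML.qmd. It assumes the tidymodels, beans, and ranger packages are installed, and the fold count, seed, tree count, and metric choices are illustrative assumptions only.

```r
# Minimal sketch (assumptions: tidymodels, beans, and ranger are installed;
# v = 5, trees = 100, and the seed are illustrative, not the project's settings).
library(tidymodels)
library(beans)

set.seed(123)

# Stratified 5-fold cross-validation on the bean class label
folds <- vfold_cv(beans, v = 5, strata = class)

# Random forest specification using the ranger engine
rf_spec <- rand_forest(trees = 100) |>
  set_engine("ranger") |>
  set_mode("classification")

# Bundle the formula and model into one workflow
rf_wf <- workflow() |>
  add_formula(class ~ .) |>
  add_model(rf_spec)

# Fit on each fold and compute accuracy and multiclass ROC AUC
rf_res <- fit_resamples(
  rf_wf,
  resamples = folds,
  metrics = metric_set(accuracy, roc_auc)
)

cv_metrics <- collect_metrics(rf_res)
print(cv_metrics)
```

The same `folds` object could be reused with other model specifications so that every model is compared on identical splits.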

Final_dashboard Folder

  • This directory contains the code for the dashboard that deploys the results. The file “app.R” runs the dashboard, and the “RData” files contain the exported models it uses.
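
As a rough illustration of how exported models can travel from a modeling script to a dashboard through an RData file, the sketch below uses base R `save()` and `load()`. The stand-in `lm` model, the object name `fit`, and the file name `example_model.RData` are hypothetical; the actual objects and file names in this repository may differ.

```r
# Minimal sketch using base R only. The model, object name, and file name
# are hypothetical stand-ins, not the project's actual exports.

# Modeling side: fit a stand-in model and export it to an RData file
fit <- lm(mpg ~ wt, data = mtcars)
model_path <- file.path(tempdir(), "example_model.RData")
save(fit, file = model_path)

# Dashboard side (e.g. near the top of app.R): load into a dedicated
# environment so loaded objects cannot overwrite existing globals
model_env <- new.env()
loaded_names <- load(model_path, envir = model_env)
restored <- model_env$fit

# The restored object can be used for prediction as usual
pred <- predict(restored, newdata = data.frame(wt = 3))
print(pred)
```

Loading into a separate environment is a design choice that makes it explicit which objects came from the file, since `load()` returns their names.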

[1] Koklu, Murat, and Ilker Ali Ozkan. ‘Multiclass classification of dry beans using computer vision and machine learning techniques.’ Computers and Electronics in Agriculture 174 (2020): 105507.
