Data analysis and implementations of a few basic machine learning algorithms, including Multiple Linear Regression, Logistic Regression, and SVM.

adamreidsmith/Basic-ML-Models

Basic ML Models

This repository contains basic implementations of a few machine learning algorithms as well as the preliminary data analysis involved.

Description of Models

Linear Regression

In this project, we implement a linear regression model to predict the number of crimes in Calgary's communities from the Calgary Crimes dataset, using attributes such as the crime type, number of community residents, and date range. Preliminary data analysis involved imputing missing values, numerically encoding categorical variables, investigating potential new variables formed by combining existing ones, and preparing the data for the regression algorithm. The regression was carried out with the open-source machine learning framework scikit-learn.
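The preprocessing and fitting steps described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's actual code: the column names (`category`, `resident_count`, `month`, `crime_count`) are hypothetical, and one-hot encoding with median imputation is an assumed choice for the encoding and imputation steps.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the crime data; every column name is hypothetical.
df = pd.DataFrame({
    "category": rng.choice(["theft", "assault", "break_in"], size=n),
    "resident_count": rng.integers(500, 20000, size=n).astype(float),
    "month": rng.integers(1, 13, size=n).astype(float),
})

# Target: a noisy linear function of residents plus a per-category offset.
offsets = {"theft": 5.0, "assault": 2.0, "break_in": 3.0}
df["crime_count"] = (
    0.001 * df["resident_count"]
    + df["category"].map(offsets)
    + rng.normal(0, 0.5, size=n)
)

# Blank out a few values to imitate missing data.
df.loc[df.sample(frac=0.05, random_state=0).index, "resident_count"] = np.nan

X = df[["category", "resident_count", "month"]]
y = df["crime_count"]

# Encode the categorical column and impute the numeric ones (assumed strategies).
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
    ("num", SimpleImputer(strategy="median"), ["resident_count", "month"]),
])
model = Pipeline([("prep", preprocess), ("reg", LinearRegression())])
model.fit(X, y)

r2 = model.score(X, y)
print(f"R^2 on the training data: {r2:.3f}")
```

Keeping the imputation and encoding inside the pipeline means the same transformations are applied automatically whenever `model.predict` is called on new rows.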

MLR

A multiple linear regression (MLR) model was used to predict the fuel economy of vehicles from attributes such as weight, engine displacement, and horsepower, taken from the Auto MPG Data Set in the UCI Machine Learning Repository. Preliminary data analysis involved imputing missing values, inspecting relationships between variables, investigating potential new variables formed by combining existing ones, and preparing the data for the MLR algorithm. The MLR was carried out with the open-source machine learning framework PyTorch.
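As a rough illustration of how an MLR can be fit in PyTorch, the sketch below trains a single linear layer with mean squared error on synthetic data. The three standardised features only stand in for weight, displacement, and horsepower; the coefficients, learning rate, and epoch count are assumptions chosen for the example, not values from the project.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic, already-standardised features (stand-ins for weight,
# displacement, horsepower). The "true" coefficients are hypothetical.
n_samples, n_features = 256, 3
X = torch.randn(n_samples, n_features)
true_w = torch.tensor([[-3.0], [-1.5], [-2.0]])
true_b = 23.0
y = X @ true_w + true_b + 0.1 * torch.randn(n_samples, 1)

# A multiple linear regression is a single linear layer with one output.
model = nn.Linear(n_features, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

losses = []
for epoch in range(500):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(X), y)    # mean squared error over the full batch
    loss.backward()                # backpropagate
    optimizer.step()               # gradient descent update
    losses.append(loss.item())

learned_bias = model.bias.item()
print("final MSE:", losses[-1])
print("weights:", model.weight.data.flatten().tolist())
print("bias:", learned_bias)
```

Standardising the inputs beforehand matters here: with features on very different scales, plain gradient descent at a fixed learning rate can converge slowly or diverge.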

Logistic Regression

Income

This binary classification model aims to predict whether an individual's annual income exceeds $50,000, given characteristics such as their education, ethnicity, working class, and a few others found in the Adult Data Set from the UCI Machine Learning Repository. Preliminary data analysis involved investigating the relationships between the given variables and income, adjusting variables to better suit the learning algorithm, and preparing the data types for the model. The logistic regression model was implemented with the open-source machine learning framework PyTorch. Several measures were used to analyze the model's performance, including a confusion matrix, F1 score, and an ROC curve.
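To make the evaluation measures concrete, here is a small pure-Python sketch of how a confusion matrix, F1 score, and ROC curve are computed from labels and predicted scores. The labels and scores are made-up toy values, and the helper names are hypothetical; in practice a library routine would usually be used instead.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tn, fp, fn, tp


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    _, fp, fn, tp = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_points(y_true, scores):
    """(FPR, TPR) pairs from sweeping the threshold over each distinct score.

    Assumes both classes are present in y_true.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        _, fp, _, tp = confusion_counts(y_true, y_pred)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy labels and predicted probabilities (illustrative values only).
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

counts = confusion_counts(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
roc_auc = auc(roc_points(y_true, scores))
print("tn, fp, fn, tp:", counts)   # (3, 1, 1, 3)
print("F1:", f1)                   # 0.75
print("ROC AUC:", roc_auc)         # 0.8125
```

The confusion matrix and F1 score depend on the chosen threshold (0.5 here), whereas the ROC curve summarises behaviour across all thresholds.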

Crimes

In this project, we implement a logistic regression model and a fully-connected neural network to predict the category of crimes in Calgary's communities based on the Calgary Crimes dataset. Preliminary data analysis involved imputing missing values, numerical encoding of categorical variables, investigating potential new variables by combining the existing ones, and preparing the data for the machine learning algorithms. The logistic regression was carried out with the open source machine learning framework scikit-learn, whereas TensorFlow was used to implement the fully-connected neural network.
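The logistic regression half of this project could look something like the sketch below, which fits a multi-class scikit-learn classifier on synthetic data. The three category names and the two numeric features are hypothetical placeholders, and feature scaling is an assumed preprocessing step; the neural network is not covered here.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Three well-separated synthetic clusters, one per (hypothetical) crime category.
X, labels = make_blobs(
    n_samples=300,
    centers=[[0, 0], [5, 5], [-5, 5]],
    cluster_std=1.0,
    random_state=0,
)
categories = np.array(["assault", "break_in", "theft"])
y = categories[labels]  # scikit-learn accepts string class labels directly

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a multi-class logistic regression.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
probs = clf.predict_proba(X_test[:3])  # one probability per category
print("test accuracy:", accuracy)
print("class order:", clf.classes_)
```

Each row of `probs` sums to one, and the columns follow the order given by `clf.classes_`.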

SVM

The data used in this binary classification model is the same as that for the Income logistic regression model. The preliminary data analysis proceeds the same way as well, except that missing values are imputed with a KNN imputer from the machine learning library scikit-learn. The model is created with scikit-learn, and a grid search over the classifier's C and gamma values is performed in an effort to optimize the model. Several measures of the model's performance are computed, and the ROC curve from this model is compared to that of the logistic regression model. Both models are found to perform similarly, with little difference in accuracy and F1 score.
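The imputation and grid search described above might be wired together as in the following sketch. It runs on synthetic data, and several details are assumptions rather than the project's settings: the RBF kernel (suggested by the presence of a gamma parameter), the candidate values for C and gamma, the number of neighbours, and the F1 scoring.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import KNNImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data with some values blanked out.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    class_sep=2.0, random_state=0,
)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Impute, scale, then classify. Putting all steps in one pipeline means the
# imputer and scaler are refit on each cross-validation training fold.
pipe = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),
    ("scale", StandardScaler()),
    ("svc", SVC(kernel="rbf")),
])

# Hypothetical grid; the values searched in the project are not stated.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": [0.01, 0.1, 1]}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

test_f1 = search.score(X_test, y_test)  # uses the F1 scorer chosen above
print("best parameters:", search.best_params_)
print("test F1:", test_f1)
```

After fitting, `search.best_estimator_` holds the pipeline refit on the full training set with the best parameter combination.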

Technologies Used

Installation

$ git clone https://github.com/adamreidsmith/Basic-ML-Models
$ cd Basic-ML-Models
$ sudo pip3 install -r requirements.txt

License

MIT
