Amazon Face Mask Rating Prediction with Neural Network Model

Online retailers saw unprecedented demand for face masks in 2020 with the rise of COVID-19. Analyzing product reviews and ratings using sentiment analysis with the Natural Language Toolkit (NLTK) can help businesses understand consumer behavior and guide product development.

In this machine learning project, I created a Neural Network (NN) model that will predict the star rating on a scale of 1 to 5, based on a written review from real customers on Amazon.

This project demonstrates a complete machine learning workflow:

data scraping using Selenium web automation
data cleaning and EDA with Python, NumPy, and pandas
data visualization with seaborn and matplotlib
text data processing and encoding with NLTK
neural network modeling, training, and testing with scikit-learn, Keras, and TensorFlow
model evaluation using various metrics
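
The text encoding step can be illustrated with a minimal sketch. It uses only the standard library as a stand-in for the NLTK and Keras tooling named above, and the helper names (`tokenize`, `build_vocab`, `encode`), the sample reviews, and the id convention (0 for padding, 1 for unknown words) are assumptions made for illustration, not the notebook's actual code.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a review and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocab(reviews, max_words=1000):
    """Map the most frequent words to integer ids (0 = padding, 1 = unknown)."""
    counts = Counter(tok for review in reviews for tok in tokenize(review))
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}

def encode(review, vocab, max_len=6):
    """Turn a review into a fixed-length id list, truncating or right-padding with 0."""
    ids = [vocab.get(tok, 1) for tok in tokenize(review)][:max_len]
    return ids + [0] * (max_len - len(ids))

# Hypothetical reviews, invented for this example.
reviews = [
    "Great mask, very comfortable",
    "Mask broke after one day",
    "Comfortable and great fit",
]
vocab = build_vocab(reviews)
encoded = [encode(review, vocab) for review in reviews]
```

Fixed-length integer sequences like `encoded` are the usual input to an embedding layer, with the 1 to 5 star rating as the target.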

Summary | Notebooks

Sales Volume Prediction with XGBoost

The goal of this project is to work with time-series data and use XGBoost to forecast sales volume for each store. A unique aspect of the dataset is that the list of stores and products changes every month, and the testing dataset contains new items that are not present in the training dataset.

Project workflow summary:

process outliers
impute missing data
discover data duplication
encode features
time-series analysis
feature engineering using target lags
generate trend features
modeling with XGBoost
model evaluation
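
As a rough sketch of the target-lag and trend features listed above, the pandas snippet below shifts each store's sales volume by one and two months. The column names (`store_id`, `month`, `volume`) and the numbers are hypothetical and do not come from the project's dataset.

```python
import pandas as pd

# Hypothetical monthly sales; column names and values are illustrative only.
sales = pd.DataFrame({
    "store_id": [1, 1, 1, 2, 2, 2],
    "month":    [0, 1, 2, 0, 1, 2],
    "volume":   [10, 12, 15, 7, 6, 9],
})

sales = sales.sort_values(["store_id", "month"]).reset_index(drop=True)
grouped = sales.groupby("store_id")["volume"]

# Target lags: the same store's volume one and two months earlier.
sales["volume_lag_1"] = grouped.shift(1)
sales["volume_lag_2"] = grouped.shift(2)

# A simple trend feature: change between the two most recent known months.
sales["volume_trend"] = sales["volume_lag_1"] - sales["volume_lag_2"]
```

Rows without enough history, such as a store's first month or an item that only appears in the test period, end up with `NaN` lags. XGBoost can handle missing feature values natively, which is one reason lag features pair well with it.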

Summary | Notebook

House Price Prediction with Stacked Regression

The goal of this notebook is to predict house prices using stacked regression models.

Project workflow summary:

process outliers
process missing data, impute values from other features
perform logarithmic transformation on skewed data
shuffle and split the data for training, validation, and testing
produce base models using lasso regression, elastic net regression, kernel ridge regression, gradient boosting regression, XGBoost, and LightGBM
stack models using the meta-model method, where out-of-fold predictions made on the held-out folds are used as training data for a meta-model
evaluate results using root mean squared log error (RMSLE), which is less sensitive to large outliers than plain RMSE
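
The stacking step can be sketched as follows. This is a simplified illustration on synthetic data, assuming scikit-learn is available: it uses only two of the base models (lasso and elastic net), a plain linear regression as the meta-model, and made-up hyperparameters, so it should not be read as the notebook's actual configuration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

# Synthetic stand-in for the housing data: strictly positive, right-skewed prices.
X, z = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
price = np.exp(z / z.std() * 0.5 + 12.0)

X_train, X_test, price_train, price_test = train_test_split(
    X, price, test_size=0.25, shuffle=True, random_state=0
)
y_train = np.log1p(price_train)  # log transform of the skewed target

base_models = [Lasso(alpha=0.001), ElasticNet(alpha=0.001, l1_ratio=0.5)]
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predictions: each training row is predicted by a model that never saw it.
oof = np.column_stack(
    [cross_val_predict(model, X_train, y_train, cv=cv) for model in base_models]
)
meta_model = LinearRegression().fit(oof, y_train)

# Refit the base models on all training rows to build meta-features for the test set.
test_meta = np.column_stack(
    [model.fit(X_train, y_train).predict(X_test) for model in base_models]
)
pred_price = np.expm1(meta_model.predict(test_meta))

def rmsle(y_true, y_pred):
    """Root mean squared log error between two arrays of non-negative values."""
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))

score = rmsle(price_test, pred_price)
```

Training the meta-model on out-of-fold predictions rather than in-sample predictions keeps it from learning how well each base model memorized the training rows.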

Summary | Notebook

Credit Card Fraud Detection with CNN

It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. In this notebook, we will compare the different ways we can handle an imbalanced dataset in machine learning.

Most machine learning algorithms work best when the number of samples in each class is roughly equal. However, in anomaly detection problems, the positive class is typically a small portion of the overall data. For example, in this credit card dataset, only 0.17% of transactions are classified as fraudulent.
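
To see why this imbalance matters, consider a trivial "model" that labels every transaction as legitimate. The short calculation below uses a hypothetical total of 100,000 transactions together with the 0.17% fraud rate quoted above.

```python
# Illustrative counts only: 100,000 transactions with a 0.17% fraud rate.
n_total = 100_000
n_fraud = round(n_total * 0.0017)  # 170 fraudulent transactions
n_legit = n_total - n_fraud

# A "model" that labels every transaction as legitimate.
true_positives = 0
false_negatives = n_fraud
true_negatives = n_legit

accuracy = (true_positives + true_negatives) / n_total
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {accuracy:.4f}, recall = {recall:.1f}")
# prints: accuracy = 0.9983, recall = 0.0
```

The classifier is 99.83% accurate yet catches no fraud at all, which is why the class distribution has to be addressed before training and why accuracy alone can be a misleading metric here.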

The goal of this notebook is to explore uneven data distributions and use a CNN model to detect anomalies.

Project workflow summary:

sample from positive class to balance the data
split data into train and test sets
use StandardScaler() to standardize features
create a model using a convolutional neural network (CNN)
evaluate model accuracy
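
The data preparation steps above might look roughly like the sketch below, which runs on synthetic data with NumPy and scikit-learn. The balancing strategy shown (keep every fraud row and draw an equal number of legitimate rows) is an assumption about how the sampling is done, and the final reshape assumes a 1D convolution over the feature axis.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the transaction table: 10,000 rows, 20 fraud cases.
X = rng.normal(size=(10_000, 5))
y = np.zeros(10_000, dtype=int)
y[rng.choice(10_000, size=20, replace=False)] = 1

# Assumed balancing strategy: keep all fraud rows, sample as many legitimate rows.
fraud_idx = np.flatnonzero(y == 1)
legit_idx = rng.choice(np.flatnonzero(y == 0), size=fraud_idx.size, replace=False)
keep = rng.permutation(np.concatenate([fraud_idx, legit_idx]))
X_bal, y_bal = X[keep], y[keep]

X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.25, stratify=y_bal, random_state=0
)

# Fit the scaler on the training split only, then reuse it for the test split.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# A 1D convolution expects a trailing channel axis: (samples, features, 1).
X_train_cnn = X_train_s[..., np.newaxis]
```

Fitting the scaler on the training split alone avoids leaking test-set statistics into training.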

Notebook


jinysong/data-engineering-projects
