Welcome to my Data Science Projects Repository! This repository contains a collection of my data science projects, showcasing my skills and expertise in the field. Each project demonstrates different aspects of data analysis, machine learning, and visualization.
-
- Description: The project focuses on constructing an end-to-end data analysis workflow based on the Titanic dataset.
- Keywords:
- Descriptive Analysis (pie charts, bar plots, histograms, box plots, data imbalance, data cleaning, data wrangling, feature engineering, etc.)
- Diagnostic Analysis (correlation matrix, scatter plots, pair plots, histogram, etc.)
- Predictive Analysis (Data transformation, Nested cross-validation, hyperparameter tuning, confusion matrix, Logistic Regression, Decision Tree, Random Forest, SVM, KNN, etc.)
- Prescriptive Analysis (SHAP, feature importance, model interpretation, etc.)
- Results: The accuracy of the neural network model on the test dataset is about 78% (Rank 3502/15745, Top 23% on Kaggle)
-
- Description: The project focuses on extracting data from a MySQL database and creating a BI dashboard to track DRM key usage accurately.
- Keywords:
- Data Extraction (MySQL, SQL queries)
- Data Visualization (BI Dashboard)
- Results:
- The data for tracking the DRM key is correctly extracted from the MySQL database.
- The BI Dashboard is created to track the DRM key usage accurately.
-
Stock Volatility Forecasting In Vietnam
- Description: The project focuses on building an end-to-end production pipeline: performing data ETL, constructing a forecasting model, and creating a custom API application to forecast the volatility of the stock market in Vietnam.
- Keywords:
- Data Extraction (https://eodhd.com/ API, Requests)
- Data Transformation (Pandas)
- Data Loading (SQLite)
- Forecasting Model (GARCH)
- API Application (FastAPI)
- Results:
- To be updated
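
To illustrate the GARCH keyword above, here is a minimal sketch of the GARCH(1,1) variance recursion, sigma2[t+1] = omega + alpha * r[t]^2 + beta * sigma2[t], written in plain Python. It is not the project's code: the function name `garch11_forecast` and the parameter values in the usage note are hypothetical, and in practice the parameters would be estimated from the data (for example by maximum likelihood) rather than supplied by hand. The returns are assumed to be demeaned.

```python
def garch11_forecast(returns, omega, alpha, beta):
    """One-step-ahead conditional variance under a GARCH(1,1) model.

    `returns` is a sequence of demeaned returns (oldest first).
    `omega`, `alpha`, `beta` are assumed to be already estimated.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError(
            "need omega > 0, alpha >= 0, beta >= 0 and alpha + beta < 1"
        )

    # Start the recursion from the unconditional (long-run) variance.
    sigma2 = omega / (1.0 - alpha - beta)

    # Each observed return updates the variance for the following period.
    for r in returns:
        sigma2 = omega + alpha * r * r + beta * sigma2

    return sigma2
```

A call such as `garch11_forecast(daily_returns, omega=0.1, alpha=0.1, beta=0.8)` returns the variance forecast for the next period; its square root is the forecast volatility. A large recent return raises the forecast, while a run of small returns lets it decay back toward the long-run level.
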
-
AB Testing Advertising Campaign
- Description: The project focuses on analyzing the results of an A/B test for an advertising campaign.
- Keywords:
- Exploratory Data Analysis (Data Cleaning, Data Wrangling, Data Visualization)
- Hypothesis Testing (A/B Testing, T-Test, Permutation Test)
- Results:
- No significant difference in the # of Purchases between the two groups in the advertising campaign.
- Need more improvements in designing the advertising campaign.
- Need more data to make a more accurate decision.
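
For readers unfamiliar with the permutation test named in the A/B testing keywords, the idea can be sketched with the standard library alone. This is an illustrative sketch, not the analysis code from the project: the function name `permutation_test`, the default of 10,000 permutations, and the fixed seed are all assumptions made for the example.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled data simulates the null hypothesis that
        # group labels are unrelated to the outcome.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # The add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_permutations + 1)
```

Calling it with two lists of per-user purchase counts, one per campaign group, gives a p-value. A value above the chosen significance level (commonly 0.05) is consistent with a "no significant difference" conclusion, although it does not prove the groups are identical.
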
This project is licensed under the GNU License. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
I welcome any feedback, suggestions, or questions you may have about the projects, as well as any kind of sponsorship for the repository. Feel free to reach out to me via email at haison19952013@gmail.com.
Enjoy exploring my data science projects!