Data Science Bootcamp


  1. Overview of the Bootcamp
  2. Bootcamp Plan

1. Overview of the Bootcamp

This repository contains my work from the Himmah Data Science Bootcamp, presented by SDA and Coding Dojo.

2. Bootcamp Plan

The bootcamp is divided into four main stacks:

Business Intelligence

  • Week-1
    • Analyze data using Excel.
  • Week-2
    • Understand SQL and NoSQL.
  • Week-3
    • Business Intelligence overview.
    • Tableau.

The Statistical Programming Language R

  • Week-1
    • Introduction to R.
    • Load data and packages.
    • Conditional statements and data vectorization.
  • Week-2
    • Data visualization using ggplot.
    • Understand Exploratory Data Analysis (EDA).
    • Understand Tidy Data.
    • Data types.
  • Week-3
    • Probability and decision analysis.
    • Understand training, validation, and testing sets.
    • Build a Shiny app (dashboard).

Introduction to Python

  • Week-1
    • Intro to Anaconda and Jupyter Notebook.
    • Intro to NumPy and linear equations.
    • Loops, conditional statements, and functions.
  • Week-2
    • Extract data from an API.
    • Data manipulation and statistical analysis.
    • Data visualization using Matplotlib and Seaborn.
    • EDA with insightful visualization.
  • Week-3
    • Handle missing values with imputation techniques.
    • Build a Dash application.

Introduction to Machine Learning

  • Week-1
    • Intro to Machine Learning.
    • Build a heuristic model.
    • Understand cost functions.
    • Build a linear regression model.
  • Week-2
    • Understand the data pipeline.
    • Understand Scikit-learn API.
    • Build logistic regression.
    • Apply feature engineering techniques.
    • Improve the model using GridSearch.
  • Week-3
    • Overview of ensemble modeling, including boosting and bagging techniques.
    • Understand Principal Component Analysis (PCA).
    • Understand decision trees and random forests.
    • Clustering techniques.

Capstone Project

The aim of this project is to build customer segmentation models for a delivery app using K-means and DBSCAN. For more details about the capstone project, see the project repository.
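
To give a rough idea of the segmentation approach, here is a minimal sketch of K-means (Lloyd's algorithm) using only the Python standard library. It is an illustration, not the capstone code: the `kmeans` helper, the feature choice (orders per month, average basket value), and the sample values are all hypothetical, and DBSCAN is not covered here. The actual project may well rely on a library such as scikit-learn instead.

```python
import random
from math import dist


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster a list of numeric tuples with Lloyd's algorithm.

    Returns (centroids, labels). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Initialise centroids by picking k distinct points at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        converged = new_centroids == centroids
        centroids, labels = new_centroids, new_labels
        if converged:
            break
    return centroids, labels


# Made-up customers: (orders per month, average basket value).
customers = [(2, 15.0), (3, 18.0), (2, 20.0), (20, 55.0), (22, 60.0), (19, 58.0)]
centroids, labels = kmeans(customers, k=2)
print(labels)
```

On this toy data the occasional and frequent customers end up in separate clusters. With real data, features on different scales would normally be standardized first, since K-means relies on Euclidean distance.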