From 1f3200daf4df00c14af9249cb9e21eba3710c77d Mon Sep 17 00:00:00 2001
From: AyseSubasi <86684300+AyseSubasi@users.noreply.github.com>
Date: Tue, 7 Sep 2021 21:50:05 +0200
Subject: [PATCH] toc+doc

---
 Olist-Data cleaning.ipynb | 45 +++++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 21 deletions(-)
diff --git a/Olist-Data cleaning.ipynb b/Olist-Data cleaning.ipynb
index 5442cc1..80a6c98 100644
--- a/Olist-Data cleaning.ipynb	
+++ b/Olist-Data cleaning.ipynb	
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "eb3e9893",
+   "id": "c5229c99",
    "metadata": {},
    "source": [
     "## Table of Contents"
@@ -10,7 +10,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "324573fc",
+   "id": "9a57f8ed",
    "metadata": {},
    "source": [
     "+ 1. [Libaries](#Libaries)\n",
@@ -101462,7 +101462,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "d5385a52",
+   "id": "93a5cf9d",
    "metadata": {},
    "source": [
     "### 4. Data Transformation <a class=\"anchor\" id=\"transform\"></a>"
@@ -101470,7 +101470,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "4ca0e598",
+   "id": "b665c299",
    "metadata": {},
    "source": [
     "#### 4.1 Translating data <a class=\"anchor\" id=\"trans\"></a>"
@@ -101711,14 +101711,17 @@
   },
   {
    "cell_type": "markdown",
-   "id": "acf7df20",
+   "id": "23e67708",
    "metadata": {},
    "source": [
     "Why we decided not using a pipeline:\n",
     "\n",
-    "we dont work with one dataset\n",
-    "lack of time to build pipelines for each table\n",
-    "creating functions-> put it into new dataframes\n"
+    "* we dont work with one dataset\n",
+    "* lack of time to build pipelines for each table\n",
+    "* creating functions-> problem we need to put the function into a new dataframe\n",
+    "\n",
+    "Type of the tables - > checked in Workbench EER Diagramm everything was correct\n",
+    "\n"
    ]
   },
   {
@@ -101930,7 +101933,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "5982ad79",
+   "id": "6c7411f7",
    "metadata": {},
    "source": [
     "#### 5.2  Handling empty values <a class=\"anchor\" id=\"fill\"></a>"
@@ -101938,7 +101941,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "41188677",
+   "id": "51f8b179",
    "metadata": {},
    "source": [
     "Product dataframe - we consider the following columns as columns of interest:\n",
@@ -102072,7 +102075,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7d4326c1",
+   "id": "0d31b614",
    "metadata": {},
    "source": [
     "##### Now we fill in the missing values.. "
@@ -102133,7 +102136,7 @@
   {
    "cell_type": "code",
    "execution_count": 71,
-   "id": "e0da08c2",
+   "id": "90d258bd",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -102143,7 +102146,7 @@
   {
    "cell_type": "code",
    "execution_count": 70,
-   "id": "61dda57f",
+   "id": "d638e8d2",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -102193,7 +102196,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "1352b8c3",
+   "id": "80b80827",
    "metadata": {},
    "source": [
     "##### Checking orders column.. "
@@ -102337,7 +102340,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "d747a410",
+   "id": "6dfe5620",
    "metadata": {},
    "source": [
     "#### 5.3 Checking for duplicates <a class=\"anchor\" id=\"duplicates\"></a>"
@@ -102345,7 +102348,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "071940ba",
+   "id": "a0db1936",
    "metadata": {},
    "source": [
     "Checking for duplicate rows with - DataFrame.duplicated(subset=None, keep='first') - "
@@ -102521,7 +102524,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "aa32732d",
+   "id": "17123296",
    "metadata": {},
    "source": [
     "* geolocation is the only table with duplicates "
@@ -102529,7 +102532,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "30c015fe",
+   "id": "355927d0",
    "metadata": {},
    "source": [
     "#### 5.4  Drop duplicates <a class=\"anchor\" id=\"drop\"></a>"
@@ -102598,7 +102601,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "e7a3304d",
+   "id": "aef2559f",
    "metadata": {},
    "source": [
     "### 6. Applying functions <a class=\"anchor\" id=\"apply\"></a>"
@@ -102627,7 +102630,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "ec66da90",
+   "id": "d6100051",
    "metadata": {},
    "source": [
     "### 7. Save to csv<a class=\"anchor\" id=\"csv\"></a>"
@@ -102636,7 +102639,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "406d2b59",
+   "id": "bc011e83",
    "metadata": {},
    "outputs": [],
    "source": [