Etl #16

Open
wants to merge 3 commits into
base: master
4 changes: 4 additions & 0 deletions README.md
@@ -56,6 +56,10 @@ This repository serves as a collection of things learned by people working in AG

- [Basic SQLite3](sqlite3/basic-sqlite3.md)

### ETL

- [Basic ETL](etl/basic-etl.md)

## Contributing

If you want to share your learnings for today, please check out CONTRIBUTING.md
24 changes: 24 additions & 0 deletions etl/basic-etl.md
@@ -0,0 +1,24 @@
# ETL - Basic

**Date: (June 29, 2016)**

ETL, or *Extract - Transform - Load*, is a process used in data warehousing. It is used when:
- You want to aggregate data from different sources into one collection (the warehouse).
- You want to select and arrange only the pertinent information for a particular kind of analysis, and give each user their own view of the data (a data mart).
- You want to clean the data and make sense of it.

ETL can be broken down into three major steps:

1. Extract
- The part where you gather data from different data sources (CSV files, databases, etc.). Data can also come from other data warehouses.
- It should be designed to avoid negative effects on the source systems, such as degraded performance, slower response times, or data locking.

2. Transform
- The part where you make the extracted data usable.
- This includes mapping the data, matching rows, enhancing data, summarizing data, etc.
- Transformation also includes standardizing data (such as currency and time formats) and handling encoding.

3. Load
- The part where you take the prepared data and store it in the data warehouse, database, or data mart.
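
The three steps can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the two CSV sources, the `orders` table, and every function name (`extract`, `parse_date`, `transform`, `load`, `run_etl`) are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical source data: two CSV exports that use different date formats.
SOURCE_A = "order_id,amount,ordered_on\n1,10.50,2016-06-29\n2,4.00,2016-06-30\n"
SOURCE_B = "order_id,amount,ordered_on\n3,7.25,06/29/2016\n"


def extract(*csv_texts):
    """Extract: read the rows of every source into one list of dicts."""
    rows = []
    for text in csv_texts:
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows


def parse_date(value):
    """Standardize the two assumed date formats to ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")


def transform(rows):
    """Transform: convert types and standardize amounts (cents) and dates."""
    return [
        (
            int(row["order_id"]),
            round(float(row["amount"]) * 100),
            parse_date(row["ordered_on"]),
        )
        for row in rows
    ]


def load(conn, records):
    """Load: store the prepared records in the (assumed) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, amount_cents INTEGER, ordered_on TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run the three steps in order against the given connection."""
    load(conn, transform(extract(SOURCE_A, SOURCE_B)))


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    run_etl(warehouse)
    # Summarize the loaded data: total amount (in cents) per day.
    query = (
        "SELECT ordered_on, SUM(amount_cents) FROM orders "
        "GROUP BY ordered_on ORDER BY ordered_on"
    )
    for row in warehouse.execute(query):
        print(row)
```

Keeping each step in its own function means a source can be added or a format rule changed without touching the other steps. Against a real source system, the extract step would also need to limit its load on that system, which this sketch does not attempt.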

Source: [ETL Tutorial | Extract Transform and Load](https://www.youtube.com/watch?v=WZw0OTgCBOY)