Physical logical design til #18

Open. Wants to merge 4 commits into base: master.

9 changes: 9 additions & 0 deletions README.md
@@ -56,6 +56,15 @@ This repository serves as a collection of things learned by people working in AG

- [Basic SQLite3](sqlite3/basic-sqlite3.md)

### Data Warehouse

- [Logical Design for Data Warehouse](data-warehouse/logical-design.md)
- [Physical Design for Data Warehouse](data-warehouse/physical-design.md)

### ETL

- [Basic ETL](etl/basic-etl.md)

## Contributing

If you want to share your learnings for today, please check out CONTRIBUTING.md
34 changes: 34 additions & 0 deletions data-warehouse/logical-design.md
@@ -0,0 +1,34 @@
# Data Warehouse: Logical Design

**Date: June 29, 2016**

### Logical Design

Logical design identifies the logical relationships between data objects.

#### Factors to consider

- data content
- relationships of data
- data warehouse environment
- data transformation requirement
- frequency of refresh

#### Components

- entity
- attribute
- relationship

#### Output

- should present entities and attributes as fact tables and dimension tables
- should map operational data from the sources into subject-oriented information in the warehouse
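
As a rough illustration of that output, the sketch below builds a tiny star schema in an in-memory SQLite database using Python's `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_date`) and the sample rows are hypothetical and are not taken from the referenced guide.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    day     TEXT,
    month   TEXT
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    quantity   INTEGER,
    amount     REAL
);
""")

conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Pen", "Office"), (2, "Mug", "Kitchen")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, "2016-06-28", "2016-06"), (2, "2016-06-29", "2016-06")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 10, 50.0), (1, 2, 5, 25.0), (2, 2, 2, 30.0)])

# Facts hold the measures; dimensions supply the attributes to group by.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(sales_by_category)
```

The entities become tables, the attributes become columns, and the relationships become the foreign keys from the fact table to each dimension.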

### References
- [Oracle9i Data Warehousing Guide](https://docs.oracle.com/cd/B10501_01/server.920/a96520/toc.htm)
- [Intricity101 videos](https://www.youtube.com/user/Intricity101/videos)

### Author

Almer Mendoza
49 changes: 49 additions & 0 deletions data-warehouse/physical-design.md
@@ -0,0 +1,49 @@
# Data Warehouse: Physical Design

**Date: June 29, 2016**

### Physical Design

Physical design decides how the data is stored and retrieved most effectively.

#### Structures

###### Tablespaces

Tablespaces are containers for the physical design structures.

###### Tables and partitioned tables

Tables and partitioned tables are containers for raw data. They are the basic unit of storage.

###### Data Segment Compression

Compresses table data to save disk space. It can also speed up query execution because less data has to be read, at the cost of some CPU overhead.

###### Views

A view is a tailored presentation of the data in one or more tables (or other views). It stores no data of its own.
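
A minimal sketch of a view, using SQLite through Python's `sqlite3` module rather than Oracle. The `employees` table and the `sales_staff` view are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [("Ana", "sales", 100.0), ("Ben", "it", 120.0)])

# The view is a stored query: it exposes only sales staff and hides salaries.
conn.execute("""
    CREATE VIEW sales_staff AS
    SELECT name FROM employees WHERE department = 'sales'
""")

# A row added to the base table later shows up in the view automatically.
conn.execute("INSERT INTO employees VALUES ('Cy', 'sales', 90.0)")
names_after = [row[0] for row in
               conn.execute("SELECT name FROM sales_staff ORDER BY name")]
print(names_after)
```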

###### Integrity Constraints

Rules that restrict which data can be inserted or changed, so that invalid information does not end up in the tables.
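
A small sketch of how such rules reject bad rows, assuming SQLite via Python's `sqlite3` module. The `fact_sales` table and its `CHECK` rules are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_sales (
        sale_id  INTEGER PRIMARY KEY,
        quantity INTEGER NOT NULL CHECK (quantity > 0),
        amount   REAL    NOT NULL CHECK (amount >= 0)
    )
""")

# A valid row is accepted.
conn.execute("INSERT INTO fact_sales (quantity, amount) VALUES (3, 15.0)")

# A row that breaks the CHECK constraint is refused by the database itself.
try:
    conn.execute("INSERT INTO fact_sales (quantity, amount) VALUES (-1, 15.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(rejected, row_count)
```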

###### Dimensions

A schema object that defines hierarchical relationships between columns or sets of columns (for example, day, month, and year).

###### Materialized Views

Compute results in advance and store them as summaries, so queries can avoid expensive join and aggregate operations.
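
SQLite has no materialized views, so the sketch below only emulates the idea with a summary table that is rebuilt on demand. The `sales` and `sales_summary` tables and the `refresh_sales_summary` helper are hypothetical. In Oracle the database maintains the summary for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 10.0), ("north", 20.0), ("south", 5.0)])


def refresh_sales_summary(conn):
    """Rebuild the precomputed summary from the detail rows (a full refresh)."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute("""
        CREATE TABLE sales_summary AS
        SELECT region, SUM(amount) AS total, COUNT(*) AS n
        FROM sales
        GROUP BY region
    """)


refresh_sales_summary(conn)

# Reports read the small summary table instead of aggregating every detail row.
summary = conn.execute(
    "SELECT region, total, n FROM sales_summary ORDER BY region"
).fetchall()
print(summary)
```

The trade-off is staleness: the summary is only as fresh as its last refresh.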

###### Indexes and Partitioned Indexes

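Indexes are optional structures that speed up access to table rows. Data warehouses commonly use bitmap indexes, which keep one bit per row for each distinct value to mark whether the row belongs to that category. Like tables, indexes can be partitioned.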
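
The one-bit-per-row idea can be sketched in plain Python, using an integer as the bit string for each distinct value. This is a toy model, not how a database stores its indexes, and the helper names and sample columns are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set if row i has it."""
    bitmaps = defaultdict(int)
    for row_number, value in enumerate(values):
        bitmaps[value] |= 1 << row_number
    return dict(bitmaps)


def matching_rows(bitmap):
    """Return the row numbers whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Two low-cardinality columns of the same four-row table.
regions = ["north", "south", "north", "east"]
status = ["open", "open", "closed", "open"]

region_index = build_bitmap_index(regions)
status_index = build_bitmap_index(status)

# Combining conditions is a bitwise AND of the two bitmaps.
north_and_open = region_index["north"] & status_index["open"]
print(matching_rows(north_and_open))
```

Bitmaps suit columns with few distinct values, because each value needs its own bit string.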

### References
- [Oracle9i Data Warehousing Guide](https://docs.oracle.com/cd/B10501_01/server.920/a96520/toc.htm)
- [Intricity101 videos](https://www.youtube.com/user/Intricity101/videos)

### Author

Almer Mendoza
24 changes: 24 additions & 0 deletions etl/basic-etl.md
@@ -0,0 +1,24 @@
# ETL - Basic

**Date: June 29, 2016**

ETL or *Extract - Transform - Load* is a process done in data warehousing. It is used when:
- You want to aggregate data from different sources into one collection (the warehouse).
- You want to select and arrange only the information pertinent to a particular kind of analysis, giving each group of users its own view of the data (a data mart).
- You want to clean the data and make meaningful sense out of it.

ETL can be broken down into three major steps:

1. Extract
- The part where you gather data from different data sources (CSV files, databases, etc.). Data can also come from other data warehouses.
- It should be designed to avoid negative effects on the source systems, such as degraded performance, slower response times, or data locking.

2. Transform
- The part where you make the extracted data usable.
- This includes mapping the data, matching rows, enhancing data, summarizing data, etc.
- Transformation also includes standardizing data (such as currency and time formats) and handling encoding.

3. Load
- The part where you take the prepared data and store it in the data warehouse, a database, or a data mart.
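
The three steps can be sketched end to end with only the Python standard library. Everything here is an assumption made for illustration: the inline CSV stands in for a source system, the `orders` table stands in for the warehouse, and the cleaning rules are arbitrary examples of standardizing names, currency codes, and date formats.

```python
import csv
import io
import sqlite3

# Stand-in for a source system: messy names, mixed-case currency, DD/MM/YYYY dates.
RAW_CSV = """order_id,customer,amount,currency,date
1, alice ,10.50,usd,29/06/2016
2,BOB,20.00,USD,30/06/2016
"""


def extract(text):
    """Extract: read the source rows as dictionaries without modifying them."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: fix types and standardize names, currency codes, and dates."""
    cleaned = []
    for row in rows:
        day, month, year = row["date"].split("/")
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
            row["currency"].strip().upper(),
            f"{year}-{month}-{day}",  # ISO 8601 date
        ))
    return cleaned


def load(conn, rows):
    """Load: store the prepared rows in the target table."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS orders (
            order_id INTEGER PRIMARY KEY,
            customer TEXT,
            amount REAL,
            currency TEXT,
            order_date TEXT
        )
    """)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall()
print(loaded)
```

Keeping each step in its own function means the extract step only reads from the source, which fits the goal of not disturbing the source system.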

Source: [ETL Tutorial | Extract Transform and Load](https://www.youtube.com/watch?v=WZw0OTgCBOY)