README.md: 8 additions & 2 deletions
@@ -6,8 +6,9 @@ This repository is a project consisting of an ELT pipeline using Airflow in a Do
6
6
1. Request the following endpoints to download weather forecast information for Mexico by municipality: https://smn.conagua.gob.mx/tools/GUI/webservices/?method=1 (per day) and https://smn.conagua.gob.mx/tools/GUI/webservices/?method=3 (per hour).
7
7
2. Upload the data into an S3 bucket
8
8
3. Load the raw data into BigQuery and compute the following aggregates:
9
-
4. Generate a table that has the average temperature and precipitation by municipality of the last two hours.
10
-
5. Generate a joined table between the first generated table and the latest pre-computed data (data_municipios).
9
+
4. Generate a sample query.
10
+
5. Generate a table that has the average temperature and precipitation by municipality of the last two hours. (WIP)
11
+
6. Generate a joined table between the first generated table and the latest pre-computed data (data_municipios). (WIP)
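The two aggregation steps above (the two-hour average and the join with data_municipios) can be sketched with Pandas. This is only a minimal illustration, not the repository's implementation: the column names `municipality_id`, `observed_at`, `temperature`, `precipitation` and `name`, the function names, and the sample rows are all hypothetical and would need to be mapped to the actual fields returned by the web service.

```python
import pandas as pd


def average_last_two_hours(hourly: pd.DataFrame, now: pd.Timestamp) -> pd.DataFrame:
    """Average temperature and precipitation per municipality over the two hours before `now`."""
    window_start = now - pd.Timedelta(hours=2)
    # Keep only the observations inside the (window_start, now] window.
    in_window = (hourly["observed_at"] > window_start) & (hourly["observed_at"] <= now)
    recent = hourly[in_window]
    return recent.groupby("municipality_id", as_index=False)[
        ["temperature", "precipitation"]
    ].mean()


def join_with_municipios(averages: pd.DataFrame, municipios: pd.DataFrame) -> pd.DataFrame:
    """Left-join the averages with the static municipality data."""
    return averages.merge(municipios, on="municipality_id", how="left")


# Made-up sample data (hypothetical schema, for illustration only).
hourly = pd.DataFrame(
    {
        "municipality_id": [1, 1, 2, 2],
        "observed_at": pd.to_datetime(
            ["2024-01-01 10:00", "2024-01-01 11:00", "2024-01-01 11:00", "2024-01-01 07:00"]
        ),
        "temperature": [20.0, 22.0, 30.0, 10.0],
        "precipitation": [0.0, 1.0, 2.0, 5.0],
    }
)
municipios = pd.DataFrame({"municipality_id": [1, 2], "name": ["Municipio A", "Municipio B"]})

now = pd.Timestamp("2024-01-01 11:30")
averages = average_last_two_hours(hourly, now)
joined = join_with_municipios(averages, municipios)
print(joined)
```

The 07:00 observation for municipality 2 falls outside the two-hour window, so only its 11:00 reading contributes to the average. A left join keeps every municipality that has recent observations, even if it is missing from the static table.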
11
12
12
13
## How to use it?
13
14
@@ -30,6 +31,11 @@ I used Pandas as data manipulation layer because it offers a complete solution t
30
31
The file directory follows a standard ETL pipeline structure:
31
32
- *airflow/* Includes:
32
33
- dags/ directory, which contains the ETL modules and custom operators used by the pipeline DAGs, keeping the DAG files clean and organized
34
+
- custom_operators: Directory containing all the custom operators
35
+
- daily_etl_modules: Modules used in daily_pipeline.py DAG
36
+
- hourly_etl_modules: Modules used in hourly_pipeline.py DAG
37
+
- daily_pipeline.py and hourly_pipeline.py: The DAG files
38
+
- utils.py: Common helper functions
33
39
-*data/* Includes:
34
40
-*data_municipios/*: Where static data about municipios is stored (it should live elsewhere, such as a database or storage service)