Commit a66cbc2

Merge pull request #27 from axiom-of-choice/develop
UPDATE README
2 parents 5f96f63 + 7a02675 commit a66cbc2

1 file changed: +8 -2 lines changed

README.md

Lines changed: 8 additions & 2 deletions
@@ -6,8 +6,9 @@ This repository is a project consisting of an ELT pipeline using Airflow in a Do
 1. Request the following endpoints to download weather forecast information for Mexico by municipality: https://smn.conagua.gob.mx/tools/GUI/webservices/?method=1 (per day) and https://smn.conagua.gob.mx/tools/GUI/webservices/?method=3 (per hour).
 2. Upload the data into an S3 bucket.
 3. Load the raw data into BigQuery and compute the following aggregates:
-4. Generate a table that has the average temperature and precipitation by municipality of the last two hours.
-5. Generate a joined table between the first generated table and the latest pre-computed data (data_municipios).
+4. Generate a sample query.
+5. Generate a table that has the average temperature and precipitation by municipality over the last two hours. (WIP)
+6. Generate a joined table between the first generated table and the latest pre-computed data (data_municipios). (WIP)

 ## How to use it?

@@ -30,6 +31,11 @@ I used Pandas as data manipulation layer because it offers a complete solution t
 The file directory follows a standard ETL pipeline structure:
 * *airflow/* Includes:
 - dags/: directory containing the ETL modules and custom operators used in the pipeline DAGs, which keeps the DAG files clean and organized
+- custom_operators: Directory containing all the custom operators
+- daily_etl_modules: Modules used in the daily_pipeline.py DAG
+- hourly_etl_modules: Modules used in the hourly_pipeline.py DAG
+- daily_pipeline.py and hourly_pipeline.py: the DAG files
+- utils.py: Common helper functions
 - *data/* Includes:
 - *data_municipios/*: Stores static data about municipios (it should live elsewhere, such as a database or storage service)
 - Airflow config files: *airflow.cfg, airflow.db, webserver_config.py*
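
The two aggregates marked (WIP) in the README list, the two-hour average by municipality and the join with data_municipios, could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain Python from the standard library rather than the Pandas and BigQuery layers the project relies on, and the field names (`municipality_id`, `observed_at`, `temp`, `prec`), the sample records, and the helper functions are all hypothetical, not taken from the repository or the weather API.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical hourly records; field names are assumptions, not the real API schema.
records = [
    {"municipality_id": "001", "observed_at": datetime(2023, 1, 1, 10, 0), "temp": 20.0, "prec": 0.0},
    {"municipality_id": "001", "observed_at": datetime(2023, 1, 1, 11, 0), "temp": 22.0, "prec": 1.0},
    {"municipality_id": "001", "observed_at": datetime(2023, 1, 1, 7, 0), "temp": 10.0, "prec": 5.0},
    {"municipality_id": "002", "observed_at": datetime(2023, 1, 1, 11, 30), "temp": 30.0, "prec": 0.5},
]

# Hypothetical static lookup standing in for the data_municipios table.
municipios = {"001": {"name": "Municipio A"}, "002": {"name": "Municipio B"}}


def average_last_two_hours(rows, now):
    """Average temperature and precipitation per municipality over the last two hours."""
    cutoff = now - timedelta(hours=2)
    grouped = defaultdict(list)
    for row in rows:
        # Keep only observations inside the [now - 2h, now] window.
        if cutoff <= row["observed_at"] <= now:
            grouped[row["municipality_id"]].append(row)
    return {
        key: {
            "avg_temp": mean(r["temp"] for r in group),
            "avg_prec": mean(r["prec"] for r in group),
        }
        for key, group in grouped.items()
    }


def join_with_municipios(averages, lookup):
    """Attach the static municipality attributes to each aggregated row."""
    return [
        {"municipality_id": key, **lookup.get(key, {}), **values}
        for key, values in averages.items()
    ]


now = datetime(2023, 1, 1, 12, 0)
averages = average_last_two_hours(records, now)
joined = join_with_municipios(averages, municipios)
```

In the actual pipeline the same grouping and join would more likely be expressed as a BigQuery query or a Pandas `groupby` followed by a merge; the sketch only shows the shape of the computation.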
