Demonstration of various OSS technologies to construct an ETL/CLI pipeline
Steps to create environment
conda env create --file environment.yml
conda activate etl
conda list
conda info
conda deactivate
1(a). Setting the user credentials using flask fab: run the command below and follow the instructions on the command line
FLASK_APP=airflow.www.app flask fab create-admin
1(b). Creating a user using airflow's users create command
airflow users create \
    --username admin \
    --password your_password \
    --firstname your_first_name \
    --lastname your_last_name \
    --role Admin \
    --email your_email@some.com
- Run the command below to confirm that the user was created
airflow users list
- Initialise airflow database
airflow db init
- Set up a MySQL database and secure it with a password (see the macOS setup instructions)
- Make changes in the airflow.cfg
- Start the airflow webserver and scheduler
airflow webserver
airflow scheduler
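The airflow.cfg changes are not spelled out above. As a rough sketch only, the snippet below generates the kind of fragment that typically points Airflow at a MySQL metadata database. Everything in it is an assumption rather than this project's actual settings: the `build_overrides` helper is hypothetical, the `[database]` section name applies to newer Airflow 2.x releases (older ones keep `sql_alchemy_conn` under `[core]`), the `mysql+mysqldb` driver assumes the mysqlclient package is installed, and `LocalExecutor` is just one possible executor choice.

```python
import configparser
import io


def build_overrides(user, password, host="localhost", port=3306, db="airflow"):
    """Return an airflow.cfg fragment as text (hypothetical helper, values are placeholders)."""
    cfg = configparser.ConfigParser()
    # Assumed connection string format for a MySQL metadata database.
    conn = f"mysql+mysqldb://{user}:{password}@{host}:{port}/{db}"
    # Assumption: newer Airflow 2.x reads this key from [database]; older releases use [core].
    cfg["database"] = {"sql_alchemy_conn": conn}
    # Assumption: LocalExecutor, which needs a non-SQLite database such as MySQL.
    cfg["core"] = {"executor": "LocalExecutor"}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the fragment so it can be compared against the real airflow.cfg by hand.
    print(build_overrides("airflow_user", "your_password"))
```

The output is meant to be compared with the existing airflow.cfg and merged by hand, not to overwrite the file.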
- CLI commands
- Doit looks for dodo.py by default, but you can explicitly add the -f flag for other file names
doit -f <filename.py>
- Use the doc attribute to give the end-user a clearer description of each task
- Defining tasks and dependencies
def task_compile():
    return {'actions': ["cc -c main.c"],
            'file_dep': ["main.c", "defs.h"],
            'targets': ["main.o"],
            'doc': 'nice message'
            }
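To illustrate how one task can depend on another, here is a minimal dodo.py sketch that adds a second task to the compile step above. The `task_link` task, its `cc -o main main.o` action, and the `DOIT_CONFIG` default are illustrative assumptions and not part of this project.

```python
# dodo.py (sketch): a compile task plus a hypothetical link task that depends on it

# Assumption: run the link task when doit is invoked without arguments.
DOIT_CONFIG = {"default_tasks": ["link"]}


def task_compile():
    # Re-runs only when main.c or defs.h change, or when main.o is missing.
    return {
        "actions": ["cc -c main.c"],
        "file_dep": ["main.c", "defs.h"],
        "targets": ["main.o"],
        "doc": "Compile main.c into main.o",
    }


def task_link():
    # Hypothetical follow-up step: task_dep makes doit run compile first.
    return {
        "actions": ["cc -o main main.o"],
        "file_dep": ["main.o"],
        "targets": ["main"],
        "task_dep": ["compile"],
        "doc": "Link main.o into the main executable",
    }
```

With this file in place, `doit list` prints each task name next to its doc text, and `doit link` runs the compile task before linking.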
Command usage - python <filename.py> <args*>