This project illustrates how to use the LocalStack Snowflake and MWAA emulators to run a data transformation pipeline entirely on your local machine.
The code is based on the Snowflake Guide for Data Engineering with Apache Airflow, Snowflake, Snowpark, dbt & Cosmos.
- A valid LocalStack for Snowflake license. Your license provides a `LOCALSTACK_AUTH_TOKEN`.
- `localstack` CLI with the `LOCALSTACK_AUTH_TOKEN` environment variable set
- `awslocal` CLI
- LocalStack Snowflake emulator
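Before starting, it can help to confirm that the auth token is actually visible to your shell. The following sanity check is an illustrative sketch, not part of the official setup:

```shell
# Sanity check (illustrative helper, not part of the official LocalStack setup):
# warn if LOCALSTACK_AUTH_TOKEN is missing or empty.
if [ -z "${LOCALSTACK_AUTH_TOKEN:-}" ]; then
  echo "LOCALSTACK_AUTH_TOKEN is not set - the Snowflake emulator will not activate" >&2
else
  echo "LOCALSTACK_AUTH_TOKEN is set"
fi
```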
Start LocalStack with the custom Snowflake/Airflow networking flags:

```bash
docker network create --attachable --subnet 172.20.0.0/24 localstack

DOCKER_FLAGS='-e SF_LOG=trace --network localstack --name=localhost.localstack.cloud --network-alias=snowflake.localhost.localstack.cloud' \
DEBUG=1 \
localstack start -s snowflake -d
```

The sample application provides Makefile targets to simplify the setup process.
Run the following command to initialize the Airflow environment in LocalStack (this may take a couple of seconds):

```bash
make init
```
After deploying the Airflow environment, you can request its details and extract the webserver URL:

```bash
awslocal mwaa get-environment --name my-mwaa-env
```

```
...
"Status": "AVAILABLE",
"WebserverUrl": "http://localhost.localstack.cloud:4510"
...
```
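Rather than copying the URL by hand, the `WebserverUrl` can be extracted from the JSON response. A minimal sketch, assuming `python3` is available; it is shown here against a canned sample payload so it runs without LocalStack, but in practice you would pipe the `awslocal` output in instead:

```shell
# Extract the WebserverUrl from a get-environment response.
# With LocalStack running, replace the sample payload with:
#   awslocal mwaa get-environment --name my-mwaa-env | python3 -c '...'
SAMPLE='{"Environment": {"Status": "AVAILABLE", "WebserverUrl": "http://localhost.localstack.cloud:4510"}}'
WEBSERVER_URL=$(echo "$SAMPLE" | python3 -c 'import json, sys; print(json.load(sys.stdin)["Environment"]["WebserverUrl"])')
echo "$WEBSERVER_URL"   # http://localhost.localstack.cloud:4510
```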
Now use the following command to deploy the Airflow DAG with our dbt transformation logic locally:

```bash
make deploy
```
Once the Airflow environment has spun up and the DAG has been successfully deployed, you can access the Airflow UI under http://localhost.localstack.cloud:4510/home
(Note that the port number may differ - make sure to copy the `WebserverUrl` from the output above.)
You can use `localstack` / `localstack` as the username/password to log in to the Airflow UI.
You can now trigger a DAG run from the UI. If all goes well, the DAG run should complete successfully, with all tasks marked green.
The code in this project is licensed under the Apache 2.0 License.
