personali/airflow_extended_aws_plugin

 
 

Airflow Extended AWS Plugin

An Airflow plugin that supplies AWS operators with extended features beyond those of Airflow's built-in operators.

Deployment

  1. Copy extended_aws_plugin.py into your Airflow plugins directory.
  2. Create a DAG that uses the operators (see the examples) and place it in the DAGs directory.
  3. Configure the aws_default connection.
  4. Restart the Airflow services.
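
Step 3 can be done in the Airflow UI, or, on Airflow versions that read connections from environment variables, by exporting a connection URI named AIRFLOW_CONN_AWS_DEFAULT. The sketch below only builds that URI. The helper name `aws_conn_uri` and the credentials are hypothetical placeholders, and whether the `region_name` query parameter is honored depends on your Airflow version, so treat this as an illustration and not as part of the plugin.

```python
from urllib.parse import quote, urlencode


def aws_conn_uri(access_key_id, secret_access_key, region_name):
    """Build an Airflow-style connection URI for an AWS connection.

    Hypothetical helper for illustration; it is not part of the plugin.
    """
    # Percent-encode the credentials: secret keys may contain "/" or "+".
    user = quote(access_key_id, safe="")
    password = quote(secret_access_key, safe="")
    # Query parameters are treated by Airflow as connection "extra" fields.
    query = urlencode({"region_name": region_name})
    return f"aws://{user}:{password}@/?{query}"


if __name__ == "__main__":
    # Placeholder credentials, not real ones.
    uri = aws_conn_uri("AKIAEXAMPLE", "abc/def+ghi", "us-east-1")
    print(f"export AIRFLOW_CONN_AWS_DEFAULT='{uri}'")
```

If the Airflow host already has an IAM role or a configured AWS credentials file, an aws_default connection without explicit keys is usually enough, and this step reduces to setting the region.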

Operators

ExtendedEmrCreateJobFlowOperator

The ExtendedEmrCreateJobFlowOperator uses the built-in aws_hook and provides the following enhancements over the built-in EmrCreateJobFlowOperator:

  1. It can optionally keep the cluster running when the job flow is created without any steps, or after all of its steps have finished.
  2. It can create an Airflow connection to the cluster's Livy service. A LivySparkOperator can later use this connection to submit concurrent Spark jobs to the cluster and keep track of them while they run. The Airflow Spark plugin, which adds the ability to run jobs through Livy, is available at: https://github.com/rssanders3/airflow-spark-operator-plugin
  3. The default API parameters can be specified in the operator definition instead of on an "emr_connection".
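
This README does not list the operator's arguments, so the following is only a sketch of the underlying EMR request that the first two enhancements correspond to, not the plugin's own interface. It assumes the field names of the boto3 EMR `run_job_flow` call and Livy's default port 8998. The helper names `build_job_flow_overrides` and `livy_endpoint` are hypothetical, and a real request also needs a release label, instance configuration, and IAM roles, which are omitted here.

```python
def build_job_flow_overrides(name, keep_alive=True, with_livy=True):
    """Return a partial request dict shaped like boto3's EMR run_job_flow.

    Hypothetical helper; the plugin's actual parameters may differ.
    """
    applications = [{"Name": "Spark"}]
    if with_livy:
        # Installing Livy exposes a REST endpoint on the master node.
        applications.append({"Name": "Livy"})
    return {
        "Name": name,
        "Applications": applications,
        "Instances": {
            # True keeps the cluster up when there are no steps,
            # or after the last step has finished.
            "KeepJobFlowAliveWhenNoSteps": keep_alive,
        },
        # An empty step list is valid when the cluster is kept alive.
        "Steps": [],
    }


def livy_endpoint(master_public_dns, port=8998):
    """Build the Livy URL for a cluster master node (8998 is Livy's default)."""
    return f"http://{master_public_dns}:{port}"


if __name__ == "__main__":
    print(build_job_flow_overrides("demo-cluster"))
    print(livy_endpoint("ec2-1-2-3-4.compute-1.amazonaws.com"))
```

An Airflow connection to Livy would then hold the master node's host name and port, which is what an operator such as LivySparkOperator needs in order to reach the cluster.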

AthenaStartQueryOperator

The AthenaStartQueryOperator uses the built-in aws_hook to run queries against AWS Athena.
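
For orientation, starting an Athena query through boto3 looks roughly like the sketch below. It assumes the boto3 Athena client's `start_query_execution` call; the helper names, the database, and the S3 output location are hypothetical placeholders, and the operator's own arguments are not documented in this README.

```python
def build_athena_query_request(query, database, output_location):
    """Assemble keyword arguments for Athena's start_query_execution call."""
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(request, region_name="us-east-1"):
    """Submit the query and return its execution id (needs AWS credentials)."""
    import boto3  # imported lazily so building a request needs no AWS setup

    client = boto3.client("athena", region_name=region_name)
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Placeholder database and bucket names.
    request = build_athena_query_request(
        "SELECT 1", "example_db", "s3://example-bucket/athena-results/"
    )
    print(request)
```

The call returns as soon as the query is accepted, so a caller that needs the results has to poll the query's status separately.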
