Build a simple Airflow project around a dataset you decide on.

OCIPOC/AirflowWeekend

AirflowPipelineProject

Ingest, clean, process, and report.

Data Sources that can be used:

  • Tennis tournament data
  • Spotify API on music
  • Housing data from a few sources (gov data mixed with MLS)
  • Personal history of Amazon purchases
  • Wearable data
  • Fantasy football
  • A dataset from your passion-project domain

Create a pipeline that ingests data from a source and runs it through a process of your choice. Use Airflow to manage the running of the project:

  • Ingest from one source.
  • Clean (fill in missing data, create a feature or two).
  • Process and perform some transformations.
  • Report (charts, graphs, or some other output that informs our understanding of the dataset).
  • Keep it pretty simple; it needs to be done in a day or two.
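The four steps above map naturally onto Airflow tasks. Below is a plain-Python sketch of the step functions, using a toy tennis dataset (the rows, function names, and the derived `serve_ratio` feature are all hypothetical, not from this repo). In a DAG file, each function would become a `PythonOperator` task, chained as `ingest >> clean >> process >> report`.

```python
# Hypothetical sketch of the four pipeline steps: ingest -> clean -> process -> report.
# The tennis rows and serve_ratio feature are illustrative stand-ins.

def ingest():
    # Stand-in for pulling rows from an API, CSV, or database.
    return [
        {"player": "A", "aces": 10, "double_faults": None},
        {"player": "B", "aces": 4, "double_faults": 2},
        {"player": "C", "aces": 7, "double_faults": 5},
    ]

def clean(rows):
    # Fill missing values and add a simple derived feature.
    for row in rows:
        if row["double_faults"] is None:
            row["double_faults"] = 0
        row["serve_ratio"] = row["aces"] / (row["double_faults"] + 1)
    return rows

def process(rows):
    # A simple transformation: rank players by the new feature.
    return sorted(rows, key=lambda r: r["serve_ratio"], reverse=True)

def report(rows):
    # Emit a small text summary in place of charts, to keep the sketch short.
    summary = "\n".join(
        f"{r['player']}: serve_ratio={r['serve_ratio']:.2f}" for r in rows
    )
    print(summary)
    return summary

if __name__ == "__main__":
    report(process(clean(ingest())))
```

Keeping each step a plain function that takes and returns data makes the logic easy to test outside Airflow; the DAG file then only handles scheduling and task wiring.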

Take a look at airflow-proj-src for a super simple, Docker-hosted Airflow project to start from.
