Supplementary material for Naukri Learning's video on ETL report generation using Python and Airflow

pavelchowdhury99/automated_reports_generation

Automated Report Generation Using Python and Airflow

Objective

To create a reporting ETL pipeline using Airflow and Python that regenerates the report at regular intervals.

Tools used

  • Python
    • Web scraping
    • newspaper package
    • PyCharm IDE
  • ETL Pipeline using Airflow
  • Version control - Git using GitHub
  • Docker and docker-compose

Design Architecture

  1. Get the latest news related to a phrase/keyword from the first page of Google search results.
  2. Collect the link, page content, and image links for each result.
  3. Create a summary of each news article.
  4. Put the summaries into an HTML file.
  5. Run this procedure at a given interval using Airflow.
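
Steps 2 to 4 above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `NewsItem`, `fetch_item`, `summarize`, and `render_report` are hypothetical names, the list of article URLs is assumed to come from step 1, and `summarize` is a naive first-sentences stand-in so the sketch runs offline. The newspaper package can also produce a summary through `Article.nlp()`, which may be a better fit in practice.

```python
import html
import re
from dataclasses import dataclass


@dataclass
class NewsItem:
    """One article in the report (hypothetical container, not from the repository)."""
    title: str
    url: str
    text: str
    image_url: str = ""


def fetch_item(url: str) -> NewsItem:
    """Download and parse one article with the newspaper package (needs network access)."""
    # Imported lazily so the helpers below also run without the third-party package.
    from newspaper import Article

    article = Article(url)
    article.download()
    article.parse()
    return NewsItem(article.title, url, article.text, article.top_image or "")


def summarize(text: str, max_sentences: int = 2) -> str:
    """Naive stand-in summarizer: keep only the first few sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def render_report(items: list[NewsItem], keyword: str) -> str:
    """Build a single HTML page with one section per article."""
    sections = []
    for item in items:
        # Only emit an <img> tag when the article actually has an image.
        image = (
            f'<img src="{html.escape(item.image_url, quote=True)}" alt="">'
            if item.image_url
            else ""
        )
        link = html.escape(item.url, quote=True)
        sections.append(
            "<section>"
            f'<h2><a href="{link}">{html.escape(item.title)}</a></h2>'
            f"{image}"
            f"<p>{html.escape(summarize(item.text))}</p>"
            "</section>"
        )
    return (
        "<!DOCTYPE html><html><head>"
        f"<title>News report: {html.escape(keyword)}</title>"
        "</head><body>" + "".join(sections) + "</body></html>"
    )
```

A report file could then be written with something like `Path("report.html").write_text(render_report(items, keyword), encoding="utf-8")`, where `items` is built by calling `fetch_item` on each URL. All text is passed through `html.escape` so that scraped titles and summaries cannot break the page markup.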

Learnings from this exercise

  • Creating a news article summarizer
  • Introduction to Docker and Docker Compose
  • Introduction to Airflow
  • Creating a reporting pipeline using Python
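
For readers new to Airflow, running a Python reporting function at a fixed interval might look like the DAG definition below. It is a sketch under assumptions rather than the repository's DAG: the `dag_id`, `task_id`, six-hour interval, and the `build_report` placeholder are all hypothetical, and the imports assume Airflow 2.x (newer releases use the `schedule` argument in place of `schedule_interval`).

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def build_report() -> None:
    """Placeholder for the scrape, summarize, and write-HTML steps."""
    # Hypothetical body: fetch articles for a keyword and write the HTML report.
    ...


with DAG(
    dag_id="news_report",                   # hypothetical DAG name
    start_date=datetime(2021, 1, 1),        # arbitrary date in the past
    schedule_interval=timedelta(hours=6),   # assumed interval between runs
    catchup=False,                          # do not backfill missed intervals
) as dag:
    # A single task that calls the Python function on every scheduled run.
    generate_report = PythonOperator(
        task_id="generate_report",
        python_callable=build_report,
    )
```

Placing a file like this in the `dags` folder that the Airflow containers mount lets the scheduler pick it up. Setting `catchup=False` keeps Airflow from running the report once for every interval between `start_date` and today.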

References

  1. Airflow docker-compose
  2. Newspaper Package
  3. gnews Package
  4. Download and install Docker
  5. Docker-compose
  6. Docker references
  7. Airflow Python Operator
  8. Airflow Documentation

Walkthrough

How to generate automated reports using Python & Airflow? | Report automation with Python
