✈️ Tunisair airlines flight performance

This repository contains the data pipeline and analysis to generate a daily key performance indicator report on Tunisair flight delays.

Report Overview

The generated report includes:

  • The number of delayed Tunisair departures
  • The minimum, maximum, and average delays for departures and arrivals (in minutes)
  • A bar chart comparing performance between Tunisair, Nouvelair, and Air France (& Transavia)
  • The report is published daily at 9 a.m. (Europe/Paris timezone) on the Twitter account @Tunisairalert

Tunisair alert report Preview



Python PyTest Pandas Matplotlib SQLite Json API


Features

Data Ingestion

  • The tasks are scheduled to run hourly from 7 a.m. to midnight using CRON.
  • An API request is made to Airlabs to gather data in JSON format.
  • The JSON data is cleaned, enriched and saved into a SQLite3 database, tunisair_delay.db.
  • The airport data is enriched using the pyairports module, thanks to NICTA for providing the module.

Hourly ingestion keeps the database current, so the daily analysis runs on the latest available flight data.
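The clean-and-store step above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the sample records, the field names (`flight_iata`, `dep_delayed`, and so on), the cleaning rules, and the `flights` table layout are all assumptions, and the real Airlabs payload and database schema may differ. An in-memory database stands in for tunisair_delay.db.

```python
import sqlite3

# Hypothetical, already-fetched records; real Airlabs field names may differ.
raw_flights = [
    {"flight_iata": "TU123", "dep_iata": "TUN", "arr_iata": "ORY",
     "dep_delayed": 35, "arr_delayed": 20},
    {"flight_iata": "TU456", "dep_iata": "TUN", "arr_iata": "CDG",
     "dep_delayed": None, "arr_delayed": None},
    {"flight_iata": None, "dep_iata": "TUN", "arr_iata": "NCE",
     "dep_delayed": 10, "arr_delayed": 5},
]


def clean(records):
    """Drop records without a flight number; treat missing delays as 0 minutes."""
    cleaned = []
    for rec in records:
        if not rec.get("flight_iata"):
            continue
        cleaned.append((
            rec["flight_iata"],
            rec.get("dep_iata"),
            rec.get("arr_iata"),
            rec.get("dep_delayed") or 0,
            rec.get("arr_delayed") or 0,
        ))
    return cleaned


def save(conn, rows):
    """Create the (assumed) flights table if needed and insert the cleaned rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS flights (
               flight_iata TEXT,
               dep_iata TEXT,
               arr_iata TEXT,
               dep_delay_min INTEGER,
               arr_delay_min INTEGER
           )"""
    )
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


# ":memory:" keeps the sketch self-contained; the project uses a file on disk.
conn = sqlite3.connect(":memory:")
rows = clean(raw_flights)
save(conn, rows)
stored = conn.execute("SELECT COUNT(*) FROM flights").fetchone()[0]
print(stored)
```

Parameterized `executemany` inserts avoid building SQL strings by hand from API data.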

Data Analysis

  • The tasks are scheduled to run every day at 9 a.m. (Europe/Paris timezone) using CRON.
  • A daily query is performed on the SQLite3 database to extract the necessary data for analysis.
  • The Pandas and Matplotlib libraries are used to create visual representations of the data, such as plots and charts.
  • The Pillow package is used to generate a daily report using the visualizations created in the previous step.

Because it is regenerated every day, the report reflects the latest Tunisair flight data, and the accompanying charts make it easy to read.
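As a rough illustration of the daily query, the sketch below computes the departure KPIs (count, minimum, maximum, average) directly in SQLite. The `flights` table, its column names, and the sample rows are hypothetical; the project itself feeds the query results into Pandas and Matplotlib, which this sketch leaves out to stay dependency-free.

```python
import sqlite3

# In-memory stand-in for tunisair_delay.db with an assumed schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (airline TEXT, dep_delay_min INTEGER, arr_delay_min INTEGER)"
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [("TU", 0, 0), ("TU", 30, 20), ("TU", 60, 40), ("BJ", 15, 10)],
)


def daily_kpis(conn, airline):
    """Return count/min/max/avg of departure delays over delayed flights only."""
    count, low, high, mean = conn.execute(
        """SELECT COUNT(*), MIN(dep_delay_min), MAX(dep_delay_min), AVG(dep_delay_min)
           FROM flights
           WHERE airline = ? AND dep_delay_min > 0""",
        (airline,),
    ).fetchone()
    return {"delayed_departures": count, "min": low, "max": high, "avg": mean}


kpis = daily_kpis(conn, "TU")
print(kpis)
```

Restricting the aggregate to rows with a positive delay is a design choice of this sketch; averaging over all flights, including on-time ones, would give a different figure.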

Twitter Posting

  • Once the daily report is generated, a tweet is automatically posted to the @Tunisairalert account, giving followers a daily update on Tunisair's flight performance.
  • The tweet includes a summary of the key performance indicators and a link to the full report for those who want to dive deeper into the data.

This makes the report easy to share with a wider audience and lets stakeholders follow Tunisair's performance from day to day.
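The KPI summary in the tweet could be assembled along these lines. This is only a sketch of the text-building step: the `build_tweet` helper, the dictionary keys, and the sample values are hypothetical, and the actual call to the Twitter API is deliberately left out. The 280-character cap is Twitter's standard limit for a tweet.

```python
TWEET_LIMIT = 280  # standard maximum tweet length in characters


def build_tweet(kpis, report_date):
    """Format the daily KPIs as tweet text, truncating if it exceeds the limit."""
    text = (
        f"Tunisair flight delays for {report_date}: "
        f"{kpis['delayed_departures']} delayed departures, "
        f"min {kpis['min']} min, max {kpis['max']} min, "
        f"avg {kpis['avg']:.0f} min."
    )
    if len(text) > TWEET_LIMIT:
        text = text[: TWEET_LIMIT - 3] + "..."
    return text


# Sample values for illustration only, not real measurements.
sample_kpis = {"delayed_departures": 2, "min": 30, "max": 60, "avg": 45.0}
tweet = build_tweet(sample_kpis, "2023-01-15")
print(tweet)
```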

Server Management

  • Since the script is hosted on a personal FreeBSD server, an FTP script is used to update the local .db data.
  • CRON job for api_job.py: 0 0,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23 * * * root python daily_cron.py
  • CRON job for twitter_job.py: 0 9 * * * root python twitter_job.py
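The hour field of the first job lists every hour explicitly. Cron also accepts ranges in a field, so the same schedule could presumably be written as `0,7-23`. The small helper below (hypothetical, not part of the repository) expands both spellings to check that they describe the same hours.

```python
def expand_cron_field(field):
    """Expand a cron list/range field such as '0,7-23' into a sorted list of ints."""
    values = set()
    for part in field.split(","):
        if "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return sorted(values)


explicit = expand_cron_field("0,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23")
compact = expand_cron_field("0,7-23")
print(explicit == compact, len(explicit))
```

The helper handles only comma lists and simple ranges, which is all this schedule needs; step values such as `*/2` are out of scope.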

Configuration

  • You need to obtain a token from airlabs.co and add it to the .env file located in the root directory of the project.
  • You also need to obtain Twitter API credentials and add them to the .env file. See the tutorial and paste the information into .env.

The .env file will look like this:

consumer_key=
consumer_secret=
access_token=
access_token_secret=
path=
file_name=tunisair_delay.db
ip_adress=
login=
password=
token_airlab=
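Before running the jobs, it can help to check that every key has a value. The sketch below parses `key=value` text and reports the keys that are still empty. It is an assumption-laden illustration: `parse_env` and `missing_keys` are hypothetical helpers, the repository may load the file differently (for example with a dotenv library), and the sample text is made up.

```python
# Keys copied from the .env template above.
REQUIRED = [
    "consumer_key", "consumer_secret", "access_token", "access_token_secret",
    "path", "file_name", "ip_adress", "login", "password", "token_airlab",
]


def parse_env(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in REQUIRED if not settings.get(key)]


# Made-up sample content; a real run would read the .env file from disk.
sample = "consumer_key=abc\nfile_name=tunisair_delay.db\ntoken_airlab=\n"
settings = parse_env(sample)
missing = missing_keys(settings)
print(missing)
```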

Folder Structure

📁|- data-analysis : contains all Pandas and Matplotlib features
📁|- data-pipeline : contains the API requests, SQL queries and the table.db
📁|- src : contains media, utils and consts
📁|- test : contains some function tests
🐍.api_job.py : used hourly for data scraping
🐍.post_to_twitter.py : used daily to post on Twitter

Setup

  • Install the packages in requirements.txt
  • api_job.py is the module that will ingest the data from Airlabs API
  • twitter_job.py is the module that will post the report on Twitter

📫 Contact me

LinkedIn

---

License

You can check out the full license here

This project is open source and has no business intent.