flight-price-prediction

SDAIA Bootcamp project 2 - web scraping/linear regression.

This project predicts ticket prices for upcoming flights to help customers choose the best time to travel and the cheapest flight to their destination. A random forest regression model forecasts the flight prices from data scraped from Kayak.

Project Proposal

The project proposal can be found here.

Project MVP

The project MVP can be found here.

Scraping

The Kayak Scraper Notebook can be found here.

Here's a demo of the scraper in action (played at 2x speed):

scraper (1)

The scraped data can be found here.

image

In total, the data consists of 55,363 rows and 7 columns.

Analysis and Results

The project notebook can be found here.

The selected features, together with the prediction target, are:

  • Source (4 sources were selected for this project)
  • Destination (4 destinations were selected for this project)
  • Total Stops
  • Average Price per Airline
  • Duration
  • Price (target)
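
One way to read the "Average Price per Airline" feature is that each airline name is replaced by the mean price observed for that airline. The sketch below illustrates that idea in plain Python. The keys `airline` and `price`, the helper name `average_price_per_airline`, and the sample rows are all assumptions made for illustration; the project notebook may compute the feature differently.

```python
from collections import defaultdict


def average_price_per_airline(rows):
    """Map each airline to the mean of its observed prices."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["airline"]] += row["price"]
        counts[row["airline"]] += 1
    return {airline: totals[airline] / counts[airline] for airline in totals}


# Hypothetical sample rows, not taken from the scraped data.
rows = [
    {"airline": "A", "price": 300.0},
    {"airline": "A", "price": 500.0},
    {"airline": "B", "price": 200.0},
]

airline_avg = average_price_per_airline(rows)

# Attach the averaged value to every row as a numeric feature.
for row in rows:
    row["avg_price_per_airline"] = airline_avg[row["airline"]]
```

Because this feature is derived from the target, the averages are usually computed on the training split only and then applied to the test split, so that test prices do not leak into the features.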

Correlation of features:

image

Experimenting with different models:

image

The final selected model is the random forest regression model with:

| Metric | Score |
| --- | --- |
| MAE | 61.87 |
| MSE | 40409.87 |
| RMSE | 201.02 |

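The MAE means that, on average, the final model's predicted ticket prices are off by about $61.87. The RMSE is much larger than the MAE, which suggests that a small number of predictions miss by a wide margin.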
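
The three metrics above can be computed as follows. This is a minimal sketch in plain Python with made-up prices and a hypothetical helper name, `regression_metrics`; the project notebook may well use a library such as scikit-learn instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and RMSE for paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Made-up prices for illustration: three close predictions and one large miss.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 300.0, 800.0]

metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

In this toy example a single miss of 400 pulls the RMSE (about 200.12) well above the MAE (105.0), the same pattern as in the reported scores. The reported RMSE is also consistent with the reported MSE, since the square root of 40409.87 is about 201.02.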

The final model can be found here.

image

Presentation

The presentation can be found here.

Mobile App

We've also developed an Android app that shows the average estimated price for a selected route and month, based on our scraped data.
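
The lookup the app performs can be sketched as a group-by over route and month. The example below is an illustration in plain Python, not the app's actual code: the keys `source`, `destination`, `date` and `price`, the helper name `average_price_by_route_and_month`, the placeholder airport codes, and the sample flights are all assumptions.

```python
from collections import defaultdict
from datetime import date


def average_price_by_route_and_month(flights):
    """Average price keyed by (source, destination, month number)."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [price total, flight count]
    for flight in flights:
        key = (flight["source"], flight["destination"], flight["date"].month)
        sums[key][0] += flight["price"]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical flights with placeholder airport codes, not real scraped rows.
flights = [
    {"source": "AAA", "destination": "BBB", "date": date(2022, 3, 1), "price": 400.0},
    {"source": "AAA", "destination": "BBB", "date": date(2022, 3, 15), "price": 600.0},
    {"source": "AAA", "destination": "BBB", "date": date(2022, 4, 2), "price": 900.0},
]

averages = average_price_by_route_and_month(flights)
print(averages[("AAA", "BBB", 3)])  # 500.0
```

Precomputing the averages once keeps each lookup in the app down to a single dictionary access for the selected route and month.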

image image

Here's a demo of the mobile app:

flight-pred-app

Authors