A machine learning model that predicts the prices of domestic flights to and from metro cities based on parameters such as airline, number of stops/layovers, arrival and departure times, origin, destination, ticket class, and the number of days left until the trip.
This project implements a prediction system that estimates flight prices from these parameters. The machine learning model analyses historical data to produce its estimates.
- Input Parameters: airline, number of stops/layovers, arrival and departure times, origin, destination, ticket class, and the number of days left until the trip.
- Model: Random Forest (Regressor)
- Price Estimation: Estimates the cost of domestic flights.
- Streamlit Frontend: Simple frontend built using Streamlit.
The dataset used has been taken from Kaggle.
- airline: The name of the airline company.
- flight: The flight code of the plane.
- source_city: The city from which the flight takes off.
- departure_time: The departure time of the flight.
- stops: The number of stops the flight makes en route.
- arrival_time: The arrival time of the flight.
- destination_city: The city where the flight will land.
- class: The ticket class (Economy or Business).
- duration: The time required to travel between the two cities.
- days_left: The number of days left until the flight/trip.
- price: The ticket price (in INR).
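The following is a minimal sketch, not the project's actual training script, of how a Random Forest regressor could be fitted on these columns and saved under the two file names that the sample script loads (flight_price_model.pkl and flight_price_model_columns.pkl). The rows, flight codes, and hyperparameters (`n_estimators`, `random_state`) are made up for illustration; real training would read the full Kaggle dataset instead.

```python
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Tiny made-up sample using the dataset's column names (values are invented).
df = pd.DataFrame(
    {
        "airline": ["Air_India", "Vistara", "Indigo", "Air_India"],
        "flight": ["AI-101", "UK-202", "6E-303", "AI-404"],
        "source_city": ["Kolkata", "Delhi", "Mumbai", "Delhi"],
        "departure_time": ["Morning", "Evening", "Night", "Morning"],
        "stops": ["zero", "one", "zero", "one"],
        "arrival_time": ["Afternoon", "Night", "Morning", "Evening"],
        "destination_city": ["Mumbai", "Kolkata", "Delhi", "Mumbai"],
        "class": ["Economy", "Business", "Economy", "Business"],
        "duration": [2.5, 6.0, 2.1, 7.3],
        "days_left": [5, 20, 1, 40],
        "price": [8000, 52000, 9500, 47000],
    }
)

# Same feature list as in the sample script; flight and duration are left out.
features = [
    "airline", "stops", "source_city", "departure_time",
    "arrival_time", "destination_city", "class", "days_left",
]

# One-hot encode the categorical columns; days_left stays numeric.
X = pd.get_dummies(df[features])
y = df["price"]

# Assumed hyperparameters, chosen only to keep the sketch small and repeatable.
model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(X, y)

# Save the model and the exact training column order for use at prediction time.
joblib.dump(model, "flight_price_model.pkl")
joblib.dump(list(X.columns), "flight_price_model_columns.pkl")
```

Saving the column list alongside the model lets new inputs be encoded into the same column layout later.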
- Clone the repository:
git clone https://github.com/sayanjit082805/Flight-Price-Predictor.git
cd Flight-Price-Predictor
- Create a virtual environment (recommended):
python -m venv flight_predictor_env
source flight_predictor_env/bin/activate # On Windows: flight_predictor_env\Scripts\activate
- Install required dependencies:
pip install -r requirements.txt

A sample test.py has been provided. Tweak the required parameters and run the file using python3 test.py.
# features = ['airline', 'stops', 'source_city', 'departure_time', 'arrival_time', 'destination_city', 'class', 'days_left']
import joblib
import pandas as pd

# Load the trained model and the list of columns it was trained on.
model = joblib.load("flight_price_model.pkl")
model_columns = joblib.load("flight_price_model_columns.pkl")

# Describe the flight to price; tweak these values as needed.
new_price = pd.DataFrame(
    [
        {
            "airline": "Air_India",
            "stops": "zero",
            "source_city": "Kolkata",
            "departure_time": "Morning",
            "arrival_time": "Afternoon",
            "destination_city": "Mumbai",
            "class": "Economy",
            "days_left": 5,
        }
    ]
)

# One-hot encode the input, then align it with the training columns.
new_price_encoded = pd.get_dummies(new_price)
new_price_encoded = new_price_encoded.reindex(columns=model_columns, fill_value=0)

predicted_price = model.predict(new_price_encoded)
print(f"The price of the ticket is: {round(predicted_price[0]):,} INR")

Output:
The price of the ticket is: 8,191 INR
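The `reindex` step in the script matters because `pd.get_dummies` only creates columns for the categories present in the single input row, while the model expects every column it saw during training. The sketch below illustrates this with a short, hypothetical `model_columns` list; the real list is whatever is stored in flight_price_model_columns.pkl.

```python
import pandas as pd

# Hypothetical subset of training columns, for illustration only.
model_columns = [
    "days_left",
    "airline_Air_India",
    "airline_Vistara",
    "stops_one",
    "stops_zero",
]

new_row = pd.DataFrame([{"airline": "Air_India", "stops": "zero", "days_left": 5}])

# Encoding one row yields dummy columns only for the values in that row.
encoded = pd.get_dummies(new_row)

# Reindexing adds the missing training columns (filled with 0)
# and puts everything in the order the model expects.
aligned = encoded.reindex(columns=model_columns, fill_value=0)

print(list(encoded.columns))
print(list(aligned.columns))
```

Any column in the encoded input that is not in the training list would be dropped by the same call, so an unseen category is silently ignored rather than raising an error.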
Deployed here.
| Data | MAE (INR) | R² Score | MSE (INR²) |
|---|---|---|---|
| Training Data | 2,333 | 0.965 | 18,234,775.38 |
| Test Data | 2,344 | 0.964 | 18,326,984.04 |
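For reference, the three metrics in the table can be computed as in the sketch below. The prices here are made-up values, not the project's data, and the project's own evaluation code may differ; `sklearn.metrics` offers equivalent functions (`mean_absolute_error`, `mean_squared_error`, `r2_score`).

```python
import numpy as np

# Made-up true and predicted prices in INR, for illustration only.
y_true = np.array([8000.0, 52000.0, 9500.0, 47000.0])
y_pred = np.array([8500.0, 50000.0, 9000.0, 48000.0])

errors = y_true - y_pred

# Mean absolute error: average size of the error, in INR.
mae = np.mean(np.abs(errors))

# Mean squared error: average squared error, in INR squared.
mse = np.mean(errors ** 2)

# R² score: 1 minus the ratio of residual to total sum of squares.
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"MAE: {mae:,.0f}  MSE: {mse:,.2f}  R2: {r2:.3f}")
```

Because MSE squares each error, a few large misses raise it far more than they raise MAE, which is why the two columns differ so much in scale.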
The dataset has been taken from Kaggle.
This project is licensed under The Unlicense.