This project analyzes Miami-Dade Transit’s on-time performance using GTFS static data and data collected from the Swiftly API between October 2022 and March 2023, and then predicts bus delay times with machine learning models.
Miami_bus_SQL.sql
: The bus GPS data was first processed with optimized SQL queries in AWS Redshift, generating 30+ fact and dimension tables to calculate KPIs, such as delay frequency and route-level reliability.
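As a rough illustration of the two KPIs named above, the sketch below computes a delay frequency and an on-time rate per route in plain Python. It is not taken from `Miami_bus_SQL.sql`: the record layout, the function name `route_reliability`, and the on-time window (1 minute early to 5 minutes late) are assumptions chosen for the example.

```python
from collections import defaultdict

# Assumed on-time window in seconds (hypothetical thresholds, not from the project):
# a visit counts as on time if it is at most 1 minute early or 5 minutes late.
EARLY_LIMIT_S = -60
LATE_LIMIT_S = 300


def route_reliability(records):
    """Aggregate per-route KPIs.

    records: iterable of (route_id, delay_seconds); positive delay means late.
    """
    counts = defaultdict(lambda: {"total": 0, "late": 0, "on_time": 0})
    for route_id, delay_s in records:
        c = counts[route_id]
        c["total"] += 1
        if delay_s > LATE_LIMIT_S:
            c["late"] += 1
        elif delay_s >= EARLY_LIMIT_S:
            c["on_time"] += 1
    return {
        route: {
            "delay_frequency": c["late"] / c["total"],
            "on_time_rate": c["on_time"] / c["total"],
        }
        for route, c in counts.items()
    }


# Made-up sample records for two routes.
sample = [("11", 30), ("11", 420), ("11", -120), ("11", 0), ("27", 600), ("27", 100)]
kpis = route_reliability(sample)
print(kpis)
```

In a warehouse setting the same aggregation would more likely be a `GROUP BY` over a fact table; the Python version only shows the arithmetic.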
AnalysisReport.pdf
: A detailed report on Miami-Dade Transit on-time performance, which was posted on Transit Alliance Miami.
Visualization.ipynb
: We applied two KPIs to analyze service reliability: arrival time difference and headway difference.
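The two KPIs can be sketched as follows, assuming that arrival time difference means observed minus scheduled arrival at a stop, and that headway difference means observed minus scheduled gap between consecutive vehicles at the same stop. These definitions, the function names, and the sample times are assumptions for illustration; the notebook may define them differently.

```python
from datetime import datetime


def to_dt(hms):
    # Assumes times stay within one calendar day ("HH:MM:SS" below 24:00:00).
    return datetime.strptime(hms, "%H:%M:%S")


def arrival_time_differences(scheduled, observed):
    """Observed minus scheduled arrival, in seconds, for each paired visit."""
    return [(to_dt(o) - to_dt(s)).total_seconds() for s, o in zip(scheduled, observed)]


def headways(times):
    """Gaps in seconds between consecutive arrivals at one stop."""
    ordered = sorted(to_dt(t) for t in times)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def headway_differences(scheduled, observed):
    """Observed headway minus scheduled headway, in seconds."""
    return [o - s for s, o in zip(headways(scheduled), headways(observed))]


# Made-up arrivals of three consecutive buses at one stop.
scheduled = ["08:00:00", "08:15:00", "08:30:00"]
observed = ["08:02:00", "08:14:00", "08:36:00"]

arrival_diff = arrival_time_differences(scheduled, observed)
headway_diff = headway_differences(scheduled, observed)
print(arrival_diff, headway_diff)
```

Sorting the observed arrivals is a simplification: if buses overtake each other, pairing headways by position no longer matches the scheduled trips one to one.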
StopAnalysis.ipynb
: We computed the daily service time for each route and the daily number of transit vehicles serving each transit stop to understand the transit service supply.
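A minimal sketch of these two supply measures, assuming that daily service time is the span between a route's first and last stop event of the day and that vehicles are counted as distinct vehicle IDs per stop per day. The tuple layout, the function name `service_supply`, and the sample events are hypothetical.

```python
from collections import defaultdict


def to_seconds(hms):
    # Parsed by hand because GTFS allows times past 24:00:00 for late-night trips.
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s


def service_supply(stop_events):
    """stop_events: iterable of (service_date, route_id, vehicle_id, stop_id, 'HH:MM:SS')."""
    span = {}                     # (date, route) -> (first_event_s, last_event_s)
    vehicles = defaultdict(set)   # (date, stop)  -> set of vehicle ids
    for date, route_id, vehicle_id, stop_id, hms in stop_events:
        t = to_seconds(hms)
        first, last = span.get((date, route_id), (t, t))
        span[(date, route_id)] = (min(first, t), max(last, t))
        vehicles[(date, stop_id)].add(vehicle_id)
    service_hours = {k: (last - first) / 3600 for k, (first, last) in span.items()}
    vehicle_counts = {k: len(v) for k, v in vehicles.items()}
    return service_hours, vehicle_counts


# Made-up stop events.
events = [
    ("2022-10-03", "11", "bus_a", "S1", "05:30:00"),
    ("2022-10-03", "11", "bus_a", "S2", "05:45:00"),
    ("2022-10-03", "11", "bus_b", "S1", "23:30:00"),
    ("2022-10-03", "27", "bus_c", "S1", "25:10:00"),
]
service_hours, vehicle_counts = service_supply(events)
print(service_hours, vehicle_counts)
```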
We are now building a real-time reliability dashboard with Power BI/JavaScript. This dashboard will be used by the Miami Transit Agency for continuous operational monitoring.
Project_Report.pdf
: This study predicts bus on-time performance using several machine learning models, including decision tree, random forest, support vector machine, and XGBoost. With feature engineering and a randomized search over the hyperparameter grid, the best model achieves a 20% reduction in MAE.
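For readers unfamiliar with the metric, the sketch below shows how a relative MAE reduction is typically computed: the drop in mean absolute error relative to a reference predictor. The choice of reference (here a constant prediction) and all numbers are assumptions for illustration; they do not reproduce the report's baseline or results.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_reduction(y_true, baseline_pred, model_pred):
    """Fraction by which the model lowers MAE relative to the baseline."""
    base = mean_absolute_error(y_true, baseline_pred)
    model = mean_absolute_error(y_true, model_pred)
    return (base - model) / base


# Toy delays in minutes (made up, not project data).
observed_delay = [2.0, 5.0, 1.0, 8.0]
baseline_pred = [4.0, 4.0, 4.0, 4.0]   # hypothetical constant-prediction baseline
model_pred = [3.0, 4.0, 2.0, 6.0]      # hypothetical model output

reduction = mae_reduction(observed_delay, baseline_pred, model_pred)
print(f"MAE reduction: {reduction:.0%}")
```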
TRB.pdf
: We developed a space-time regression model in R to examine the association between service reliability and transit ridership. This paper was presented at the 2024 Transportation Research Conference.