This repository contains the solution to the AI Engineering Challenge presented by Digital Product School.
The application is hosted on PythonAnywhere at https://flaskdeploy.pythonanywhere.com/ and provides accident forecasts for a selected category, type, year, and month. The models used by the web application were developed with Meta's Prophet. There are two distinct models; their source code and development can be found in the notebook "PROPHET&NN MODELS.ipynb".
- Tabs Available:
  - Accident Forecast: Use this tab to predict accident values.
- Parameters for Prediction:
  - Category: Select the category for which you want to forecast accidents.
  - Type: Choose the type of accident prediction (e.g., overall, specific type).
  - Year and Month: Input the desired year and month for the prediction.
- Getting Forecast:
  - After setting the parameters, click on the "Get Forecast" button.
- Predicted Value:
  - The application will display the predicted value based on your selected parameters.
- Category: Indicates the selected accident category.
- Type: Specifies the type of prediction (e.g., overall, specific type).
- Year and Month: Shows the selected year and month for the prediction.
- Predicted Value: Displays the forecasted accident value.
For instance, to predict alcohol-related accidents for November 2021:
- Category: Alcohol Accidents
- Type: Overall
- Year: 2021
- Month: 11
- Click "Get Forecast"
- Predicted Value: 31.31 (This is an example; the actual value may vary)
Feel free to experiment with different parameters to forecast accidents based on your preferences.
In this part of the challenge, the goal was to create an AI model using the "Monatszahlen Verkehrsunfälle" (monthly traffic accident figures) dataset from the München Open Data Portal. The model forecasts the number of accidents per category; the challenge target is the category 'Alkoholunfälle' (alcohol-related accidents) with type 'insgesamt' (overall) for January 2021.
The "Monatszahlen Verkehrsunfälle" dataset, available from the München Open Data Portal, records monthly traffic accident figures broken down by category and type.
The solution includes:
- Data preprocessing to handle the dataset and filter records up to the end of 2020.
- Development of the prediction model in the "PROPHET&NN MODELS.ipynb" notebook
- Visualization of historical accident counts per category
- Forecast of alcohol-related accident values
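The preprocessing step listed above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the column names (`MONATSZAHL`, `AUSPRAEGUNG`, `JAHR`, `MONAT`, `WERT`), the `Summe` yearly-total rows, the `YYYYMM` month format, and every value in the inline sample are assumptions made for the example. The output rows use the keys `ds` and `y`, which are the column names Prophet expects in its input frame.

```python
import csv
import io
from datetime import date

# Tiny inline sample standing in for the downloaded CSV.
# Column names, layout, and all values are assumptions for illustration only.
SAMPLE_CSV = """MONATSZAHL,AUSPRAEGUNG,JAHR,MONAT,WERT
Alkoholunfälle,insgesamt,2021,202101,
Alkoholunfälle,insgesamt,2020,Summe,22
Alkoholunfälle,insgesamt,2020,202012,12
Alkoholunfälle,insgesamt,2020,202011,10
Alkoholunfälle,Verletzte und Getötete,2020,202012,5
Verkehrsunfälle,insgesamt,2020,202012,3000
"""


def prepare_series(csv_text, category, accident_type, last_year=2020):
    """Return monthly rows for one category/type up to the end of last_year."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        if rec["MONATSZAHL"] != category or rec["AUSPRAEGUNG"] != accident_type:
            continue
        # Skip yearly totals and anything after the training cutoff.
        if rec["MONAT"] == "Summe" or int(rec["JAHR"]) > last_year:
            continue
        month = rec["MONAT"]  # assumed format: YYYYMM
        rows.append({
            "ds": date(int(month[:4]), int(month[4:6]), 1),
            "y": float(rec["WERT"]),
        })
    # Time series models generally expect chronological order.
    return sorted(rows, key=lambda r: r["ds"])


series = prepare_series(SAMPLE_CSV, "Alkoholunfälle", "insgesamt")
for row in series:
    print(row["ds"], row["y"])
```

A list of such rows can be passed to `pandas.DataFrame(series)` to obtain a two-column frame for model fitting.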
This part involved:
- The code is organized into detailed commits, each with a descriptive message outlining the change made at that step. Together the commits serve as a log of how the project evolved.
- The model is deployed behind endpoints that accept POST requests with a JSON body and return the prediction in JSON format. I created 3 endpoints, one each for Alcohol accidents, Traffic accidents, and Escape accidents, so that clients can send input data and retrieve the corresponding model prediction through a simple interface.
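The request handling behind such an endpoint can be sketched independently of the web framework. The example below is an illustration under stated assumptions, not the deployed code: the request keys `year` and `month`, the response key `prediction`, the error format, and the `fake_model` stand-in are all hypothetical, since the actual JSON schema is not described here.

```python
import json


def fake_model(year, month):
    """Hypothetical stand-in for a trained model; always returns 0.0."""
    return 0.0


def handle_forecast(body, predict=fake_model):
    """Validate a JSON request body and return (response_dict, http_status)."""
    try:
        payload = json.loads(body)
        year, month = payload["year"], payload["month"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed JSON, a non-object body, or missing keys.
        return {"error": "expected a JSON object with 'year' and 'month'"}, 400
    if not (isinstance(year, int) and isinstance(month, int) and 1 <= month <= 12):
        return {"error": "'year' and 'month' must be integers, month in 1..12"}, 400
    return {"prediction": float(predict(year, month))}, 200


print(handle_forecast('{"year": 2021, "month": 1}'))
print(handle_forecast('{"year": 2021}'))
```

In a Flask application, a route registered for the POST method would pass the request body to a function like this and return the dictionary as the JSON response with the given status code.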
- Mean Absolute Error (MAE): 4.09
- Mean Squared Error (MSE): 16.74
- R-squared: Not a Number (NaN)
- Mean Absolute Error (MAE): 0.83
- Mean Squared Error (MSE): 0.70
- Root Mean Squared Error (RMSE): 0.83
The Neural Network model reported NaN for the R-squared value. MinMaxScaler was used during training to scale features to a range between 0 and 1, which can help neural networks converge faster, but scaling alone does not make R-squared undefined: R-squared is unchanged when actual and predicted values are rescaled together. R-squared is 1 - SS_res / SS_tot, so it is undefined (and typically reported as NaN) when the actual values in the evaluation set have no variance, for example when only a single month is evaluated. The reported errors, where MSE is approximately MAE squared, are consistent with that case. Despite this, the model achieved an MAE of 0.83 and an MSE of 0.70 on the evaluated data.
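The metrics above can be reproduced with a few lines of plain Python. The sketch below uses made-up numbers, not the project's data, and the helper names are hypothetical. It illustrates two points: min-max scaling actual and predicted values with the same bounds leaves R-squared unchanged, and R-squared is undefined when the evaluation set contains a single value. Returning NaN when the total sum of squares is zero is a convention chosen for this example.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R-squared for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R-squared is undefined when the actual values have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else math.nan
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


def min_max_scale(values, lo, hi):
    """Map values linearly so that lo becomes 0 and hi becomes 1."""
    return [(v - lo) / (hi - lo) for v in values]


# Made-up values for illustration only.
y_true = [10.0, 14.0, 12.0, 20.0]
y_pred = [11.0, 13.0, 12.5, 18.0]

raw = regression_metrics(y_true, y_pred)
lo, hi = min(y_true), max(y_true)
scaled = regression_metrics(min_max_scale(y_true, lo, hi),
                            min_max_scale(y_pred, lo, hi))
single = regression_metrics([16.0], [16.83])  # one evaluation point

print("R2 raw vs scaled:", raw["r2"], scaled["r2"])
print("single point:", single)
```

With a single evaluation point, MAE and RMSE coincide and MSE equals MAE squared, which is the pattern visible in the figures listed above.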
- Flask: Web framework used to serve the application and its endpoints
- NumPy: Fundamental package for scientific computing
- pandas: Used for manipulation and analysis of data frames
- matplotlib and seaborn: Libraries used to create graphical outputs
- scikit-learn: Library used to implement machine learning and related methods
- Prophet: Library used to implement the prediction model
- TensorFlow: Used for neural network (NN) models and methods