A logistics management company (public safety vertical) is working with other government agencies to create data analytics to improve road safety. The company also wants to create an application + dashboard which support these agencies (safety managers/analysts) to analyze fatality data to uncover few key insights to identify causes of accidents, and improve their road safety planning. This portfolio project for data analysis demonstrates the overall “end-to-end” lifecycle of a typical data analysis project within any data analysis endeavor, i.e. from business requirements gathering to development to final UAT and delivery.
The solution approach and steps are described in detail below:
- Business problem scenario, objectives and KPIs.
- Skills and applications used.
- Solution approach and process to build this project.
- Data preparation process - Data gathering, Data Cleaning, Data Transformation.
- Data Visualization – Dashboard, Reports, Charts, Key findings, Data Insights.
- Portfolio Project Documentation – Final Conclusion.
Logistics managers and safety managers would like to know about the following business queries and KPIs. This dashboard should satisfy the KPIs to achieve business goals. This application should also be flexible for further enhancements to handle future queries.
- Primary KPIs:
- Total casualties taken place after the accidents.
- Total casualties & percentage of total with respect to accident severity and maximum casualties by type of vehicle.
- Secondary KPIs:
- Total casualties with respect to vehicle type.
- Monthly trend showing comparison for current year and previous year.
- Maximum casualties by road type.
- Distribution of total casualties by road surface.
- Relation between casualties by area/location & day/night conditions.
This dashboard application will be used for the following job roles to increase their productivity and achieve business goals.
- DMV (State Department of Motor Vehicle)
- Local Law Enforcement Agencies such as Local Police Departments, Fire Departments.
- Emergency Service Companies – Ambulance Provider companies
- Local Government/Municipal departments such as Public Works departments.
- Transport Logistics companies – trucking, car/van sharing companies etc.
- Traffic Management Companies
- Public Safety Organizations
- News Media Companies
- Business Analysts
- Microsoft Excel
- Knowledge of advanced formulae and functions to be used in any typical data analytical solutions.
- Advanced data visualization skills in Excel
- SME (Subject Matter Expertise)
- Business requirement gathering skill and techniques.
- Logistics data knowledge and expertise.
To achieve the company’s objective and develop analytical tools to support business requirements, I used the following subject matter knowledge and Microsoft Excel skills:
- Advanced Excel functions/formulas, Visualization skills.
- Pivot Table Analyzer – Calculated Items for SET.
- GETPIVOTDATA function for data manipulation and visualization.
- Advanced ‘Style’ creation for Treemap chart, Donut chart, Timeline, Slicer etc.
- Reports Visualization:
- Trending Chart Report – Current Year v/s Previous Year Casualties – Monthly Trend (Pivot Table/Line Chart)
- Bar Chart Report – Casualties by Road Type (Pivot Table/Bar Chart)
- Treemap Chart Report – casualties by Road Surface (Pivot Table/Treemap Chart)
- Dashboard Style data visualization – Fatal Casualties, Serious Casualties, Slight Casualties, Casualties by Car (Pivot Table/Donut Charts)
- Dashboard Tiles Style data visualization – Total Casualties by Vehicle type (Pivot Table/Tiles)
- Dashboard Visualization:
- Dynamic Global Filter: Accident Date (Timeline Selector)
- Dynamic Global Filter: Urban / Rural (Slicer)
- Dynamic report’s data refresh (Report Connections)
- Navigation Panel for easy navigation in application, sending Email, View referenced website.
- Visual Formatting, Styles as per company’s UI/UX standards.
-
Metadata for source data
- File format: Excel (xlsx) or .csv format.
- Around 3 M rows of raw data.
- 21+ columns of data
- This data is available publicly.
- NOTE: Due to file size limit/restrictions (25 mb) in GitHub.com, I had to delete some of data rows from this Excel data file (data sheet tab) before file upload. This data analytics dashbord's features and functionalities still work perfectly.
-
Data Gathering – The Dataset (Microsoft Excel/ CSV File).
- Data Sheet: All fatality data details given in a single workbook.
-
Data Cleaning
- Identify and fix issues in data such as null values for mandatory columns, wrong datatypes, invalid values, duplicate rows etc.
- Adding a top row filter for better data pre-processing.
- Validate and prepare data to make it useful for calculations.
-
Data Pre-Processing and Transformation.
- Validate and fix cell values – NULL values, Default values, Fix formatting issues,
- Create new calculated columns for further calculations/processing (Month names, Year)
- I created ‘Pivot Tables’ and related charts (Line Charts, Bar Charts, Donut Charts) in the new tabs: ‘KPI’, ‘Monthly Trend’, ‘Road Type’, ‘Road Surface’ and ‘Light Condition’.
- Created a new dynamic dashboard in a new workbook/tab ‘Dashboard’ which contains all required primary KPIs and secondary KPIs.
- 6 Donut Charts.
- 1 Monthly Trend Line Chart.
- 1 Treemap Chart
- 1 Bar Chart
- Multiple Tiles style data points
- Global filters (Timeline and Slicers) to change data dynamically.
- Navigational panel using ‘Link’ to various tabs, Email feature, Internet link etc.
- Please refer to the screenshots taken below for dashboard, charts, filters for this solution.
Documentation:
- ReadMe-Data-Analysis-Fatality-Reporting-System-FARS-Excel.docx Output File:
- Data-Analysis-Fatality-Reporting-System-FARS-Excel.xlsx
- Excel (csv) data file for raw fatality records data.
- I created Excel functions and formulae in various tabs (“KPI”, “Monthly Trend”, “Road Type”, “Road Surface”, “Light Condition”). You can see all of these available in one place for easier reference in the “Data Analysis Sheet” tab.
- I created a final “Dashboard” page to combine all reports (charts) and dashboard filters (Timeline filter and Slicer filters) to analyze the data.
The “Dashboard” page is to combine all reports (charts) and dashboard filters.
- The 'Timeline' filter and 'Slicer' filters can be used to filter and analyze the data.
- Dynamic data changes as user select and unselect the options in these 'Timeline' and 'Slicers'.
(Using ‘Links’, you can navigate to different areas, Send Emails, View fatality reference available on internet website).