Prepared and presented by: Leah Pope (full-time Data Science student)
Presentation: here
Presentation Video: here
Blog: here
The goal of this project is to use data from the Tanzania Ministry of Water to gain insight into the country's waterpoints.
The stakeholders for my project are Tanzania Ministry of Water officials who want a dashboard view of the country's official waterpoints. Additionally, the officials want to know whether a classifier can be built from the current data to predict waterpoint operational status. A data-supported understanding of which waterpoints may be more likely to fail can improve maintenance operations and help ensure that clean, potable water is available to communities across Tanzania.
Data Set Used: Waterpoint data for the Republic of Tanzania:
- tza_waterpoint_train.csv (59,400 records in the original training set)
- tza_waterpoint_test.csv (14,850 records in the original test set)
- Data is from the Pump It Up: Data Mining the Water Table Challenge hosted on DrivenData
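As a quick sanity check on the record counts listed here, a sketch like the following could count the data rows in each downloaded CSV. It uses only Python's standard library. The `count_records` helper is a hypothetical name introduced for illustration, and the path in the usage comment assumes the raw file sits in the current working directory.

```python
import csv


def count_records(csv_path):
    """Return the number of non-empty data rows in a CSV file, excluding the header."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if there is one
        return sum(1 for row in reader if row)


# Hypothetical usage; adjust the path to wherever the raw files were saved:
# count_records("tza_waterpoint_train.csv")
```

If the download is intact, the results should match the training and test set sizes stated in this list.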
Question 2: What are the reported Quality and Quantity of Working (Functioning and Functioning Needs Repair) waterpoints?
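One way to approach this question is to filter to working waterpoints and tally each quality/quantity combination. The sketch below is illustrative only: the column names (`status_group`, `quality_group`, `quantity`) and the status labels are assumptions about the challenge data layout and may differ in the processed files, and the sample rows are made up rather than taken from the project data.

```python
import csv
import io
from collections import Counter

# Assumed labels for "working" waterpoints; verify them against the actual data.
WORKING_STATUSES = {"functional", "functional needs repair"}


def tally_working(rows, status_col="status_group",
                  quality_col="quality_group", quantity_col="quantity"):
    """Count (quality, quantity) pairs among rows whose status is a working one."""
    counts = Counter()
    for row in rows:
        if row[status_col] in WORKING_STATUSES:
            counts[(row[quality_col], row[quantity_col])] += 1
    return counts


# Tiny made-up sample for illustration only (not real project data).
sample = io.StringIO(
    "status_group,quality_group,quantity\n"
    "functional,good,enough\n"
    "functional needs repair,good,insufficient\n"
    "non functional,salty,dry\n"
    "functional,good,enough\n"
)
counts = tally_working(csv.DictReader(sample))

# Print the combinations from most to least common.
for (quality, quantity), n in counts.most_common():
    print(f"{quality:10s} {quantity:15s} {n}")
```

The same function accepts any iterable of dict-like rows, so it could be pointed at `csv.DictReader` over the labeled training file instead of the in-memory sample.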
Further analysis into the following areas could yield additional insights.
- Idea 1: Further analysis into Working waterpoints:
  - What are the water sources?
  - Who are the major installers?
  - Who are the major management groups?
  - Who are the major scheme managers?
- Idea 2: Water availability analysis:
  - How many people use waterpoints? By Country, Region, District, Ward, etc.
  - How many people use lower Quality/Quantity waterpoints? By Country, Region, District, Ward, etc.
  - What do we know about the payment types? By Country, Region, District, Ward, etc.
- Review the non-technical presentation here
- View the non-technical presentation video here
- Read my blog post here
- Contact the author Leah Pope
--notebooks
----data_cleaning.ipynb
----eda.ipynb
----classifer_modeling.ipynb
--data
----train_processed_labeled.csv
----test_processed.csv
----challenge_submission.csv
----original (dir for raw data downloaded from challenge website)
--extras (dir for Project Presentation and other supporting files)