Tanzania Waterpoint Operational Status

Phase 3 Project

Flatiron Online Data Science Bootcamp

Prepared and presented by: Leah Pope (full-time Data Science student)

Presentation: here

Presentation Video: here

Blog: here

Introduction

The goal of this project is to use data from the Tanzania Ministry of Water to gain insight into the country's waterpoints.

The stakeholders for my project are Tanzania Ministry of Water officials who want a dashboard view of the country's official waterpoints. Additionally, the officials want to know whether a classifier can be created from the current data to predict waterpoint operational status. A data-supported understanding of which waterpoints are more likely to fail can improve maintenance operations and help ensure that clean, potable water is available to communities across Tanzania.

Data Description

Data Set Used: Waterpoint data for the Republic of Tanzania:

EDA Questions Explored

Question 1: What is the operational status of waterpoints in Tanzania?

Question 2: What are the reported Quality and Quantity of Working (Functioning and Functioning Needs Repair) waterpoints?

Question 3: Does the average age of waterpoints differ by Operational Status?
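As a rough illustration of how Questions 1 and 3 could be approached with pandas, here is a minimal sketch. It runs on a small made-up DataFrame rather than the project's data, and the column names (`status_group`, `construction_year`, `year_recorded`) are assumptions; the actual columns in `data/train_processed_labeled.csv` may differ.

```python
import pandas as pd

# Toy rows standing in for the real waterpoint data.
# Column names are assumptions, not taken from the project's files.
df = pd.DataFrame({
    "status_group": [
        "functional", "functional", "non functional",
        "functional needs repair", "non functional", "functional",
    ],
    "construction_year": [2005, 2010, 1985, 1998, 1990, 2008],
    "year_recorded": [2013, 2013, 2011, 2012, 2013, 2011],
})

# Question 1: share of waterpoints in each operational status.
status_share = df["status_group"].value_counts(normalize=True)

# Question 3: mean age in years at recording time, per status.
df["age_years"] = df["year_recorded"] - df["construction_year"]
mean_age_by_status = df.groupby("status_group")["age_years"].mean()

print(status_share)
print(mean_age_by_status)
```

Comparing group means this way only describes the sample; deciding whether a difference is meaningful would need a statistical test on the real data.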

Modeling

Can current data on waterpoints be used to create a classifier to predict operational status?
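The sketch below shows one way such a classifier could be set up with scikit-learn. It is not the project's model: the README does not say which algorithm was used, so a random forest stands in here, and the toy data and feature names (`age_years`, `quantity`) are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy data; feature names and values are hypothetical.
toy = pd.DataFrame({
    "age_years": [3, 5, 8, 2, 25, 30, 28, 35, 12, 15, 14, 18],
    "quantity": [
        "enough", "enough", "enough", "enough",
        "dry", "dry", "insufficient", "dry",
        "insufficient", "enough", "insufficient", "insufficient",
    ],
    "status_group": (
        ["functional"] * 4
        + ["non functional"] * 4
        + ["functional needs repair"] * 4
    ),
})

# One-hot encode the categorical feature; numeric columns pass through.
X = pd.get_dummies(toy[["age_years", "quantity"]])
y = toy["status_group"]

# Stratify so each status appears in both the train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Random forest chosen only as an illustrative stand-in model.
model = RandomForestClassifier(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

pred = model.predict(X_test)
acc = accuracy_score(y_test, pred)
print(f"Held-out accuracy on toy data: {acc:.2f}")
```

With three status classes that are likely imbalanced in the real data, accuracy alone can be misleading; per-class precision and recall would give a clearer picture.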

Next Steps/Future Work

Further analysis in the following areas could yield additional insights.

  • Idea 1: Further analysis of Working waterpoints:

    • What are the water sources?
    • Who are the major installers?
    • Who are the major management groups?
    • Who are the major scheme managers?
  • Idea 2: Water availability analysis:

    • How many people use waterpoints? By Country, Region, District, Ward, etc.
    • How many people use lower Quality/Quantity waterpoints? By Country, Region, District, Ward, etc.
    • What do we know about the payment types? By Country, Region, District, Ward, etc.

For More Information

  • Review the non-technical presentation here
  • View the non-technical presentation video here
  • Read my blog post here
  • Contact the author Leah Pope

Repository Structure

--notebooks
----data_cleaning.ipynb
----eda.ipynb
----classifer_modeling.ipynb
--data
----train_processed_labeled.csv
----test_processed.csv
----challenge_submission.csv
----original (dir for raw data downloaded from challenge website)
--extras (dir for Project Presentation and other supporting files)
