Audit OpenStreetMap data for Greater Houston Area

Intro

This project takes the Greater Houston Area as its subject and uses data munging techniques to assess the quality of its OpenStreetMap data for validity, accuracy, completeness, consistency and uniformity. The project accomplished the following steps:

  • Audited an 800+ MB dataset in XML format for the Greater Houston Area
  • Fixed street names and deleted problematic nodes and ways
  • Found the most popular cuisine and religion in Houston using SQLite
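As a rough illustration of the street-name fix, here is a minimal Python sketch. The `MAPPING` table and the `fix_street_name` helper are hypothetical names chosen for this example; the project's actual abbreviation list and cleaning code are not shown in this README, so treat this as one plausible approach rather than the exact implementation.

```python
# Hypothetical abbreviation table (an assumption, not the project's real list).
MAPPING = {
    "St": "Street",
    "St.": "Street",
    "Rd": "Road",
    "Rd.": "Road",
    "Ave": "Avenue",
    "Blvd": "Boulevard",
    "Dr": "Drive",
}


def fix_street_name(name):
    """Expand an abbreviated street type at the end of a street name."""
    parts = name.strip().split()
    # Only the last token is treated as the street type.
    if parts and parts[-1] in MAPPING:
        parts[-1] = MAPPING[parts[-1]]
    return " ".join(parts)
```

For example, `fix_street_name("Westheimer Rd")` would return `"Westheimer Road"`, while a name that already ends in a full street type is returned unchanged.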

In detail, the dataset was cleaned by fixing street names and deleting some problematic nodes and ways. SQL queries then yielded Houston-related insights such as the most popular cuisine, the most popular religion, and the busiest streets with the most merchants.
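The kind of SQL query behind these insights might look like the sketch below. It assumes a `nodes_tags` table with `id`, `key` and `value` columns, which is a guess at the schema and not something this README confirms. The `top_values` helper is a hypothetical name, and the rows inserted here are toy data for demonstration only, not results from the Houston dataset.

```python
import sqlite3


def top_values(conn, key, limit=5):
    """Return the most frequent tag values for a given tag key."""
    query = """
        SELECT value, COUNT(*) AS n
        FROM nodes_tags
        WHERE key = ?
        GROUP BY value
        ORDER BY n DESC, value
        LIMIT ?
    """
    return conn.execute(query, (key, limit)).fetchall()


# Toy in-memory database; the schema and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes_tags (id INTEGER, key TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO nodes_tags VALUES (?, ?, ?)",
    [
        (1, "cuisine", "mexican"),
        (2, "cuisine", "mexican"),
        (3, "cuisine", "pizza"),
        (4, "religion", "christian"),
    ],
)
```

Calling `top_values(conn, "cuisine")` ranks cuisines by how many nodes carry each value; passing `"religion"` as the key does the same for religions.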

Data Source

The data is from OpenStreetMap. You can download the Greater Houston Area data that I used here.

Note

This project is part of the Udacity Data Analyst Nanodegree.