
Project: Bus routes database

answerquest edited this page Aug 4, 2017 · 3 revisions

Aim

To curate the bus stop and route data published by PMPML on the Pune Open Data portal into a human- and machine-readable database that a program can use for routing or GTFS feed generation.

Skills needed

  • Working with Excel or other spreadsheets
  • Sorting, filtering, and comparing lists (which can be done in R or with a simpler online tool like Venny)
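
For anyone who prefers a script to R or Venny, the list comparison can be sketched in a few lines of Python. This is only a minimal illustration: the function names and the sample stop names below are made up for the example and are not taken from the PMPML datasets.

```python
def normalise(name):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def compare_lists(list_a, list_b):
    """Return (only_in_a, in_both, only_in_b) as sorted lists of normalised names."""
    a = {normalise(n) for n in list_a}
    b = {normalise(n) for n in list_b}
    return sorted(a - b), sorted(a & b), sorted(b - a)


# Illustrative sample data only (assumption: not real dataset contents).
stops_db = ["Swargate", "Deccan Gymkhana", "Shivajinagar"]
route_stops = ["swargate ", "Deccan  Gymkhana", "Katraj"]

only_db, both, only_route = compare_lists(stops_db, route_stops)
print("Only in stops list:", only_db)
print("In both lists:", both)
print("Only in route list:", only_route)
```

The three outputs correspond to the three regions of a two-set Venn diagram, which is what Venny shows graphically. Names that appear in only one list are the ones that need manual checking.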

Dependencies

This project can begin once the Bus stops database project is complete, because part of the starting data will be prepared there.

Starting Data

Analysis

Between them, the datasets provide the following:

  • Full names of routes
  • Sequence of stops in each route
  • Timetables

Steps

  1. Template for GTFS conversion: https://docs.google.com/spreadsheets/d/1JL5ClgiB1VFY54hg8KTkQm0T_8xfNUwK1y4RnMWnXcg/edit?usp=sharing Download a copy and open it; once you understand the layout, delete the existing dummy data rows.
  2. Sort out and clean the data from the PMPML datasets, then enter it into the routes-db and sequence-db sheets in the template. Note that this will involve some brainstorming and decision-making; it isn't a straightforward copy-paste job.
  3. Once this and the work on bus stops are done, we'll have a machine-readable database that is ready for creating a static GTFS feed for PMPML.
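
As a rough idea of what "machine-readable" buys us, here is a minimal Python sketch that reads a CSV export of a stop-sequence sheet, orders the stops within each route, and flags missing sequence numbers. The column names (`route_id`, `stop_sequence`, `stop_id`) and the sample rows are assumptions made for illustration; the actual layout of the sequence-db sheet in the template may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for a CSV export of the sequence-db sheet.
# Column names and rows are assumptions, not the template's real contents.
SEQUENCE_CSV = """route_id,stop_sequence,stop_id
R1,2,S20
R1,1,S10
R1,3,S30
R2,1,S10
R2,3,S40
"""


def load_sequences(text):
    """Group rows by route and sort each route's stops by sequence number."""
    routes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        routes[row["route_id"]].append((int(row["stop_sequence"]), row["stop_id"]))
    return {route: sorted(stops) for route, stops in routes.items()}


def find_gaps(stops):
    """Return sequence numbers missing from 1..max in one route's stop list."""
    seen = {seq for seq, _ in stops}
    return [n for n in range(1, max(seen) + 1) if n not in seen]


sequences = load_sequences(SEQUENCE_CSV)
for route, stops in sequences.items():
    ordered_ids = [stop_id for _, stop_id in stops]
    print(route, ordered_ids, "missing sequence numbers:", find_gaps(stops))
```

A check like this can be run after each round of data entry to catch skipped or misnumbered stops before the data is used for feed generation.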

Bigger Picture

This project ties in to a long-term process of improving PMPML through increased transparency and systemization. The global standard data format for public transit is the General Transit Feed Specification (GTFS), which is used by Google Transit and most transit-related apps. A GTFS feed critically needs a stop-centric database and routes laid out in a systematic way. We want to build this critical dataset in an open-source manner, so that there is no private ownership or secrecy and the database is openly and freely available to all people and groups working on improving PMPML.
