Research Request - Fix Certain Speeds #804

Closed
@amandaha8


Complete the sections below when receiving a research request, and continue to add to this issue as you receive additional details and produce deliverables. Be sure to also add the appropriate project-level label to this issue (e.g., gtfs-rt, DLA).

Research Question

Single sentence description: Flag rows with zero in meters_elapsed and sec_elapsed, inspect a few routes for typical errors, and fix the affected speeds.

Detailed description:

  • Start with the speeds_stop_segments_{analysis_date} parquet (produced in B1_speeds_by_segment_trip).
  • Grab stop_segments_{analysis_date} and get each segment's length.
  • Merge with the speeds by segment-trip, which contains the meters_elapsed column.
  • Calculate pct = meters_elapsed / segment_length.
  • Exploratory descriptives: does pct look OK? How many rows do we drop if we set pct >= 30%, 40%, 50%? We want to keep enough rows to get more stable speed values, not ones calculated over very small portions of segments, without throwing away too many rows just to get a narrow distribution.
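
The steps above could look roughly like the following pandas sketch. It uses small in-memory frames in place of the speeds_stop_segments_{analysis_date} and stop_segments_{analysis_date} parquets. The join keys (shape_array_key, stop_sequence), the trip_id column, the segment_length column name, and the helper rows_retained are assumptions for illustration, not names confirmed from the pipeline.

```python
import pandas as pd

# Toy stand-in for speeds_stop_segments_{analysis_date}. Column names other
# than meters_elapsed and sec_elapsed are assumed for illustration.
speeds = pd.DataFrame({
    "trip_id": ["t1", "t1", "t2", "t2", "t3"],
    "shape_array_key": ["a", "a", "a", "a", "a"],
    "stop_sequence": [1, 2, 1, 2, 1],
    "meters_elapsed": [450.0, 0.0, 120.0, 270.0, 175.0],
    "sec_elapsed": [60.0, 0.0, 30.0, 45.0, 40.0],
})

# Toy stand-in for stop_segments_{analysis_date} with the length precomputed.
# On a GeoDataFrame in a projected CRS this would come from geometry.length.
segments = pd.DataFrame({
    "shape_array_key": ["a", "a"],
    "stop_sequence": [1, 2],
    "segment_length": [500.0, 600.0],
})

SEGMENT_KEYS = ["shape_array_key", "stop_sequence"]  # assumed join keys

# Many segment-trip rows map to one segment row.
df = speeds.merge(segments, on=SEGMENT_KEYS, how="inner", validate="m:1")

# Flag rows where no distance or no time elapsed.
df["flag_zero"] = (df["meters_elapsed"] == 0) | (df["sec_elapsed"] == 0)

# Share of the segment actually covered by the vehicle positions.
df["pct"] = df["meters_elapsed"] / df["segment_length"]


def rows_retained(df, thresholds=(0.3, 0.4, 0.5)):
    """Count rows kept and dropped at each pct cutoff (hypothetical helper)."""
    records = []
    for t in thresholds:
        kept = int((df["pct"] >= t).sum())
        records.append({
            "threshold": t,
            "rows_kept": kept,
            "rows_dropped": len(df) - kept,
            "share_kept": kept / len(df),
        })
    return pd.DataFrame(records)


summary = rows_retained(df)
print(df["pct"].describe())
print(summary)
```

Comparing share_kept across thresholds gives a quick read on how much data each cutoff costs before settling on one.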

How will this research be used?

  • Increase the accuracy of the speeds data.

Stakeholders & End-Users

  • Users of the Open Data Portal.

Metrics

  • 1: Vehicle positions that are too far from the segment shapes are not being joined (using sjoin) to the segment shapes. What percentage of vehicle positions are we missing for each route because of this?

  • 2: Some segments are short, and only one vehicle position is captured for them. With a single position there is no way to calculate distance and time elapsed. Find a way to use the timestamp in the next segment as a replacement to get a calculation.

  • 3: Find the length of each segment and see how many rows are retained at each threshold, i.e., how strict the threshold is.

  • 7/26/23 update: Wait on the above approach. Tackle Newmark's spatial accuracy metric in Research Request - Spatial accuracy metric #820.
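
Metrics 1 and 2 above could be approached along these lines. This is a minimal sketch on toy data, not the pipeline's implementation: the vp_idx, route_id, n_vp, min_time_sec, max_time_sec, min_dist_m, and max_dist_m columns and both helper functions are assumptions for illustration. Note that the metric 2 fallback measures a span that crosses into the next segment, so the resulting speed is only an approximation for the short segment.

```python
import pandas as pd

# Metric 1: share of vehicle positions per route that never joined to a segment.
# all_vp holds every vehicle position; joined_vp holds those that survived the
# spatial join (both frames are toy stand-ins with assumed column names).
all_vp = pd.DataFrame({
    "vp_idx": [0, 1, 2, 3, 4, 5],
    "route_id": ["r1", "r1", "r1", "r1", "r2", "r2"],
})
joined_vp = pd.DataFrame({"vp_idx": [0, 1, 2, 4]})


def pct_vp_missing(all_vp, joined_vp):
    """Percent of vehicle positions per route with no segment match."""
    matched = joined_vp.drop_duplicates(subset="vp_idx")
    merged = all_vp.merge(matched, on="vp_idx", how="left", indicator=True)
    merged["missing"] = merged["_merge"] == "left_only"
    return merged.groupby("route_id")["missing"].mean() * 100


missing_by_route = pct_vp_missing(all_vp, joined_vp)

# Metric 2: one row per trip-segment with the first and last vehicle position
# observed. The middle segment has a single position (n_vp == 1).
trip_segments = pd.DataFrame({
    "trip_id": ["t1", "t1", "t1"],
    "stop_sequence": [1, 2, 3],
    "n_vp": [3, 1, 2],
    "min_time_sec": [0, 70, 100],
    "max_time_sec": [60, 70, 130],
    "min_dist_m": [0.0, 520.0, 700.0],
    "max_dist_m": [450.0, 520.0, 900.0],
})


def fill_single_vp_segments(df):
    """For single-position segments, end at the next segment's first position."""
    df = df.sort_values(["trip_id", "stop_sequence"]).copy()
    next_time = df.groupby("trip_id")["min_time_sec"].shift(-1)
    next_dist = df.groupby("trip_id")["min_dist_m"].shift(-1)
    single = df["n_vp"] == 1
    # Keep the segment's own end point unless it only has one position.
    end_time = df["max_time_sec"].where(~single, next_time)
    end_dist = df["max_dist_m"].where(~single, next_dist)
    df["sec_elapsed"] = end_time - df["min_time_sec"]
    df["meters_elapsed"] = end_dist - df["min_dist_m"]
    # A single-position segment at the end of a trip has no next segment,
    # so its speed stays NaN.
    df["speed_mps"] = df["meters_elapsed"] / df["sec_elapsed"]
    return df


filled = fill_single_vp_segments(trip_segments)
print(missing_by_route)
print(filled[["stop_sequence", "sec_elapsed", "meters_elapsed", "speed_mps"]])
```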

Data sources

  • Cal-ITP data sources: GCS folder: rt_segment_speeds
  • vp
  • vp_stop_segment
  • speeds_stop_segment

Deliverables:

Notebook and scripts.

Timeline of deliverables:

Estimated completion date

Metadata

Assignees: No one assigned
Labels: gtfs-rt (work related to GTFS-Realtime); research request (issues that serve as a request for research: summary and handoff)
Status: Done
Participants: @tiffanychu90, @amandaha8