Research Request - Fix Certain Speeds #804

Closed
amandaha8 opened this issue Jul 12, 2023 · 0 comments · Fixed by #854
Labels
gtfs-rt: Work related to GTFS-Realtime
research request: Issues that serve as a request for research (summary and handoff)

amandaha8 commented Jul 12, 2023

Complete the below when receiving a research request, and continue to add to this issue as you receive additional details and produce deliverables. Be sure to also add the appropriate project-level label to this issue (eg gtfs-rt, DLA).

Research Question

Single sentence description: After flagging rows with zero in meters_elapsed and sec_elapsed and inspecting a few routes for typical errors, how should we fix or filter the affected speeds?

Detailed description:

  • Start with the speeds_stop_segments_{analysis_date} parquet (which is produced in B1_speeds_by_segment_trip).
  • Grab stop_segments_{analysis_date} and get each segment's length.
  • Merge with the speeds by segment-trip, which contains the meters_elapsed column.
  • Calculate pct = meters_elapsed / segment_length.
  • Exploratory descriptives: does pct look OK? How many rows are we dropping if we set pct >= 30%, 40%, 50%? We want to keep enough rows to get more stable values for speeds, avoid speeds calculated over very small portions of segments, and not throw away too many rows just to get a narrow distribution.
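
The steps above could be sketched roughly as follows. This is a minimal illustration on made-up data, not the pipeline's actual code: the small DataFrames stand in for the two parquets, and the column names `trip_id`, `segment_id`, and `segment_length` as well as the helper names `add_pct` and `rows_retained` are assumptions. Only `meters_elapsed` and `sec_elapsed` come from the issue.

```python
import pandas as pd

# Hypothetical stand-in for speeds_stop_segments_{analysis_date}.
speeds = pd.DataFrame({
    "trip_id": ["t1", "t1", "t2", "t2", "t3"],
    "segment_id": ["s1", "s2", "s1", "s2", "s1"],
    "meters_elapsed": [90.0, 50.0, 0.0, 160.0, 45.0],
    "sec_elapsed": [30.0, 20.0, 0.0, 40.0, 15.0],
})

# Hypothetical stand-in for stop_segments_{analysis_date}, with the
# segment length already computed (assumed column name).
segments = pd.DataFrame({
    "segment_id": ["s1", "s2"],
    "segment_length": [100.0, 200.0],
})


def add_pct(speeds: pd.DataFrame, segments: pd.DataFrame) -> pd.DataFrame:
    """Attach segment_length and compute pct = meters_elapsed / segment_length."""
    df = speeds.merge(segments, on="segment_id", how="inner", validate="m:1")
    return df.assign(pct=df.meters_elapsed / df.segment_length)


def rows_retained(df: pd.DataFrame, thresholds=(0.3, 0.4, 0.5)) -> pd.DataFrame:
    """Count rows kept, and their share of all rows, for each pct cutoff."""
    out = pd.DataFrame({
        "threshold": list(thresholds),
        "rows_kept": [int((df.pct >= t).sum()) for t in thresholds],
    })
    out["share_kept"] = out.rows_kept / len(df)
    return out


df = add_pct(speeds, segments)
print(df.pct.describe())   # exploratory descriptives for pct
summary = rows_retained(df)
print(summary)             # rows retained at the 30%, 40%, 50% cutoffs
```

Comparing `share_kept` across cutoffs gives a quick read on how strict each threshold is before committing to one.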

How will this research be used?

  • Increase the accuracy of the speeds data.

Stakeholders & End-Users

  • Users of the Open Data Portal.

Metrics

  • 1: Vehicle positions that are too far from the segment shapes are not being joined (using sjoin) to the segment shapes. What percentage of the vehicle positions are we missing for each route, due to this?

  • 2: Some segments are short, and only one vehicle position is captured for them. With a single point there is no way to calculate distance and time elapsed. Find a way to use the timestamp in the next segment as a replacement to get a calculation.

  • 3: Find the length of each segment and see how many rows are retained at each pct threshold, i.e., how strict each threshold is.

  • 7/26/23 update: hold off on the approach above. First tackle Newmark's spatial accuracy metric in Research Request - Spatial accuracy metric #820.
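
If the approach is picked up again, metrics 1 and 2 could be sketched along these lines. This is an illustration on made-up data under stated assumptions: the spatial join itself is not reproduced (`vp_joined` simply lists the vehicle positions that survived it), and the column names `vp_id`, `route_id`, `segment_sequence`, `n_vp`, `first_sec`, and `first_meters`, along with both helper functions, are hypothetical.

```python
import pandas as pd

# Metric 1: share of vehicle positions per route that did not join to a segment.
vp = pd.DataFrame({
    "vp_id": [1, 2, 3, 4, 5, 6],
    "route_id": ["A", "A", "A", "B", "B", "B"],
})
# Assumed output of the spatial join: only the vp that matched a segment.
vp_joined = pd.DataFrame({"vp_id": [1, 2, 4]})


def pct_unmatched_by_route(vp: pd.DataFrame, vp_joined: pd.DataFrame) -> pd.Series:
    """Fraction of each route's vehicle positions missing after the join."""
    matched = vp.vp_id.isin(vp_joined.vp_id)
    return (1 - matched.groupby(vp.route_id).mean()).rename("pct_unmatched")


unmatched = pct_unmatched_by_route(vp, vp_joined)
print(unmatched)

# Metric 2: for segments with a single vp, borrow the first vp of the
# next segment on the same trip as the end point.
seg = pd.DataFrame({
    "trip_id": ["t1", "t1", "t1"],
    "segment_sequence": [1, 2, 3],
    "n_vp": [3, 1, 2],                    # vp captured in the segment
    "first_sec": [0, 60, 90],             # seconds since trip start at first vp
    "first_meters": [0.0, 500.0, 800.0],  # distance along the shape at first vp
})


def fill_from_next_segment(seg: pd.DataFrame) -> pd.DataFrame:
    """Compute elapsed time and distance for single-vp segments only."""
    seg = seg.sort_values(["trip_id", "segment_sequence"]).copy()
    nxt = seg.groupby("trip_id")[["first_sec", "first_meters"]].shift(-1)
    single = seg.n_vp == 1
    # Rows with more than one vp are left as NaN; they keep their usual calculation.
    seg["sec_elapsed"] = (nxt.first_sec - seg.first_sec).where(single)
    seg["meters_elapsed"] = (nxt.first_meters - seg.first_meters).where(single)
    seg["speed_mps"] = seg.meters_elapsed / seg.sec_elapsed
    return seg


filled = fill_from_next_segment(seg)
print(filled)
```

One caveat with the metric 2 sketch: the borrowed point lies in the next segment, so the resulting speed spans a segment boundary and may be best kept flagged separately from speeds measured entirely within one segment.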

Data sources

  • Cal-ITP data sources: GCS folder: rt_segment_speeds
  • vp
  • vp_stop_segment
  • speeds_stop_segment

Deliverables:

Notebook and scripts.

Timeline of deliverables:

Estimated completion date

amandaha8 added the research request label Jul 12, 2023
tiffanychu90 mentioned this issue Aug 4, 2023
amandaha8 mentioned this issue Aug 30, 2023
tiffanychu90 added the gtfs-rt label Nov 15, 2023
Projects
Status: Done

2 participants