Find vp nearest stop #933

Merged: 11 commits, Oct 24, 2023
6 changes: 4 additions & 2 deletions gtfs_funnel/Makefile
@@ -1,4 +1,4 @@
download_gtfs_data_one_day:
download_gtfs_data:
# make sure to update update_vars.py for dates to download
python download_trips.py
python download_stops.py
@@ -10,4 +10,6 @@ download_gtfs_data_one_day:
preprocess:
python stop_times_with_direction.py
python vp_keep_usable.py
python vp_direction.py
python vp_direction.py
python cleanup.py

2 changes: 1 addition & 1 deletion gtfs_funnel/README.md
@@ -4,4 +4,4 @@ Use `update_vars` and input one or several days to download.

1. **Schedule data**: download data for [trips](./download_trips.py), [stops](./download_stops.py), [shapes](./download_shapes.py), and [stop times](./download_stop_times.py) and cache parquets in GCS
1. **Vehicle positions data**: download [RT vehicle positions](./download_vehicle_positions.py)
1. Use the `Makefile` and download schedule and RT data. In terminal: `make download_gtfs_data_one_day`
1. Use the `Makefile` and download schedule and RT data. In terminal: `make download_gtfs_data`
49 changes: 49 additions & 0 deletions gtfs_funnel/logs/find_vp_direction.log
@@ -9,3 +9,52 @@
2023-10-12 11:21:52.344 | INFO | __main__:<module>:176 - export vp direction: 0:05:29.299659
2023-10-12 11:23:14.557 | INFO | __main__:<module>:186 - export usable vp with direction: 0:01:22.212409
2023-10-12 11:23:14.558 | INFO | __main__:<module>:187 - execution time: 0:06:51.512068
2023-10-19 10:58:26.750 | INFO | __main__:<module>:184 - Analysis date: 2023-09-13
2023-10-19 11:01:08.229 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:41.478652
2023-10-19 11:04:05.857 | INFO | __main__:attach_prior_vp_add_direction:125 - np vectorize arrays for direction: 0:02:57.627657
2023-10-19 11:04:25.727 | INFO | __main__:<module>:191 - export vp direction: 0:05:58.976162
2023-10-19 11:05:26.722 | INFO | __main__:<module>:196 - export usable vp with direction: 0:01:00.995301
2023-10-19 11:05:26.723 | INFO | __main__:<module>:197 - execution time: 0:06:59.971463
2023-10-19 11:05:26.724 | INFO | __main__:<module>:184 - Analysis date: 2023-10-11
2023-10-19 11:08:10.013 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:43.288529
2023-10-19 11:10:57.486 | INFO | __main__:attach_prior_vp_add_direction:125 - np vectorize arrays for direction: 0:02:47.473068
2023-10-19 11:11:17.743 | INFO | __main__:<module>:191 - export vp direction: 0:05:51.017843
2023-10-19 11:12:14.833 | INFO | __main__:<module>:196 - export usable vp with direction: 0:00:57.090460
2023-10-19 11:12:14.834 | INFO | __main__:<module>:197 - execution time: 0:06:48.108303
2023-10-19 11:22:48.570 | INFO | __main__:<module>:185 - Analysis date: 2023-03-15
2023-10-19 11:44:08.820 | INFO | __main__:<module>:185 - Analysis date: 2023-03-15
2023-10-19 11:46:41.490 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:32.668923
2023-10-19 11:49:18.408 | INFO | __main__:attach_prior_vp_add_direction:126 - np vectorize arrays for direction: 0:02:36.918447
2023-10-19 11:49:36.829 | INFO | __main__:<module>:192 - export vp direction: 0:05:28.008511
2023-10-19 11:50:34.563 | INFO | __main__:<module>:197 - export usable vp with direction: 0:00:57.733907
2023-10-19 11:50:34.565 | INFO | __main__:<module>:198 - execution time: 0:06:25.742418
2023-10-19 11:50:34.566 | INFO | __main__:<module>:185 - Analysis date: 2023-04-12
2023-10-19 11:53:00.392 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:25.825681
2023-10-19 11:55:43.433 | INFO | __main__:attach_prior_vp_add_direction:126 - np vectorize arrays for direction: 0:02:43.040656
2023-10-19 11:56:02.076 | INFO | __main__:<module>:192 - export vp direction: 0:05:27.509401
2023-10-19 11:56:58.366 | INFO | __main__:<module>:197 - export usable vp with direction: 0:00:56.290053
2023-10-19 11:56:58.368 | INFO | __main__:<module>:198 - execution time: 0:06:23.799454
2023-10-19 11:56:58.368 | INFO | __main__:<module>:185 - Analysis date: 2023-05-17
2023-10-19 11:59:23.853 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:25.485009
2023-10-19 12:02:10.887 | INFO | __main__:attach_prior_vp_add_direction:126 - np vectorize arrays for direction: 0:02:47.034093
2023-10-19 12:02:28.048 | INFO | __main__:<module>:192 - export vp direction: 0:05:29.680081
2023-10-19 12:03:24.619 | INFO | __main__:<module>:197 - export usable vp with direction: 0:00:56.570424
2023-10-19 12:03:24.620 | INFO | __main__:<module>:198 - execution time: 0:06:26.250505
2023-10-19 12:03:24.620 | INFO | __main__:<module>:185 - Analysis date: 2023-06-14
2023-10-19 12:05:48.202 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:23.581493
2023-10-19 12:08:28.397 | INFO | __main__:attach_prior_vp_add_direction:126 - np vectorize arrays for direction: 0:02:40.195186
2023-10-19 12:08:45.600 | INFO | __main__:<module>:192 - export vp direction: 0:05:20.979952
2023-10-19 12:09:41.253 | INFO | __main__:<module>:197 - export usable vp with direction: 0:00:55.653037
2023-10-19 12:09:41.254 | INFO | __main__:<module>:198 - execution time: 0:06:16.632989
2023-10-19 12:09:41.254 | INFO | __main__:<module>:185 - Analysis date: 2023-07-12
2023-10-19 12:12:23.972 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:42.717672
2023-10-19 12:15:14.864 | INFO | __main__:attach_prior_vp_add_direction:126 - np vectorize arrays for direction: 0:02:50.891639
2023-10-19 12:15:32.063 | INFO | __main__:<module>:192 - export vp direction: 0:05:50.808333
2023-10-19 12:16:37.518 | INFO | __main__:<module>:197 - export usable vp with direction: 0:01:05.455225
2023-10-19 12:16:37.519 | INFO | __main__:<module>:198 - execution time: 0:06:56.263558
2023-10-19 12:16:37.519 | INFO | __main__:<module>:185 - Analysis date: 2023-08-15
2023-10-19 12:19:21.523 | INFO | __main__:attach_prior_vp_add_direction:89 - persist vp gddf: 0:02:44.003497
2023-10-19 12:22:02.828 | INFO | __main__:attach_prior_vp_add_direction:126 - np vectorize arrays for direction: 0:02:41.304747
2023-10-19 12:22:21.129 | INFO | __main__:<module>:192 - export vp direction: 0:05:43.609000
2023-10-19 12:23:18.532 | INFO | __main__:<module>:197 - export usable vp with direction: 0:00:57.403234
2023-10-19 12:23:18.533 | INFO | __main__:<module>:198 - execution time: 0:06:41.012234
16 changes: 16 additions & 0 deletions gtfs_funnel/logs/usable_rt_vp.log
@@ -18,3 +18,19 @@
2023-10-13 10:38:23.748 | INFO | __main__:attach_prior_vp_add_direction:88 - persist vp gddf: 0:04:24.538800
2023-10-13 10:39:00.233 | INFO | __main__:attach_prior_vp_add_direction:114 - np vectorize arrays for direction: 0:00:36.484908
2023-10-13 10:39:07.270 | INFO | __main__:<module>:181 - export vp direction: 0:05:08.060546
2023-10-19 10:21:27.377 | INFO | __main__:<module>:161 - Analysis date: 2023-09-13
2023-10-19 10:22:59.586 | INFO | __main__:<module>:171 - pare down vp: 0:01:32.208062
2023-10-19 10:22:59.586 | INFO | __main__:<module>:161 - Analysis date: 2023-10-11
2023-10-19 10:24:21.940 | INFO | __main__:<module>:171 - pare down vp: 0:02:54.562217
2023-10-19 11:34:28.126 | INFO | __main__:<module>:161 - Analysis date: 2023-03-15
2023-10-19 11:36:28.520 | INFO | __main__:<module>:171 - pare down vp: 0:02:00.393163
2023-10-19 11:36:28.521 | INFO | __main__:<module>:161 - Analysis date: 2023-04-12
2023-10-19 11:37:58.174 | INFO | __main__:<module>:171 - pare down vp: 0:03:30.047838
2023-10-19 11:37:58.177 | INFO | __main__:<module>:161 - Analysis date: 2023-05-17
2023-10-19 11:39:35.480 | INFO | __main__:<module>:171 - pare down vp: 0:05:07.353337
2023-10-19 11:39:35.481 | INFO | __main__:<module>:161 - Analysis date: 2023-06-14
2023-10-19 11:41:06.197 | INFO | __main__:<module>:171 - pare down vp: 0:06:38.070240
2023-10-19 11:41:06.197 | INFO | __main__:<module>:161 - Analysis date: 2023-07-12
2023-10-19 11:42:34.062 | INFO | __main__:<module>:171 - pare down vp: 0:08:05.936015
2023-10-19 11:42:34.063 | INFO | __main__:<module>:161 - Analysis date: 2023-08-15
2023-10-19 11:43:55.229 | INFO | __main__:<module>:171 - pare down vp: 0:09:27.102851
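Within each run logged above, the `pare down vp` durations keep growing from one analysis date to the next (for example 0:01:32, then 0:02:54, although the second date took about 0:01:22 by the timestamps). That pattern is what you would see if the start time were captured once before the loop. The sketch below is not the script's code; `time_each` and `work` are hypothetical names used only to show per-date timing.

```python
import datetime


def time_each(analysis_dates, work):
    """Run work(date) for each date and return a per-date duration.

    Hypothetical helper: the start timestamp is reset inside the loop,
    so each duration covers a single date instead of accumulating.
    """
    durations = {}
    for analysis_date in analysis_dates:
        start = datetime.datetime.now()  # reset for every date
        work(analysis_date)
        durations[analysis_date] = datetime.datetime.now() - start
    return durations


durations = time_each(["2023-09-13", "2023-10-11"], lambda d: None)
for analysis_date, elapsed in durations.items():
    print(f"{analysis_date} pare down vp: {elapsed}")
```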
21 changes: 16 additions & 5 deletions gtfs_funnel/stop_times_with_direction.py
@@ -11,7 +11,7 @@

from calitp_data_analysis import utils
from shared_utils import rt_utils
from segment_speed_utils import helpers
from segment_speed_utils import helpers, wrangle_shapes
from segment_speed_utils.project_vars import RT_SCHED_GCS, PROJECT_CRS


@@ -139,12 +139,25 @@ def assemble_stop_times_with_direction(analysis_date: str):

prior_geom = other_stops.prior_geometry.compute()
current_geom = other_stops.geometry.compute()


# Create a column with readable direction like westbound, eastbound, etc
stop_direction = np.vectorize(
rt_utils.primary_cardinal_direction)(prior_geom, current_geom)

# Create columns with the normalized direction vector.
# A bus traveling in a southeasterly direction is categorized as
# southbound or eastbound, depending on whether the south or east
# component is larger.
# Keeping the normalized x/y direction lets us distinguish these cases later.
direction_vector = wrangle_shapes.get_direction_vector(prior_geom, current_geom)
normalized_vector = wrangle_shapes.get_normalized_vector(direction_vector)

other_stops_no_geom = other_stops_no_geom.assign(
stop_primary_direction = stop_direction
stop_primary_direction = stop_direction,
# tuples can't be saved, so store the normalized direction vector's
# x and y components as 2 columns
stop_dir_xnorm = normalized_vector[0],
stop_dir_ynorm = normalized_vector[1]
)

scheduled_stop_times_with_direction = pd.concat(
@@ -165,8 +178,6 @@ def assemble_stop_times_with_direction(analysis_date: str):
f"stop_times_direction_{analysis_date}"
)



end = datetime.datetime.now()
print(f"execution time: {end - start}")
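The comments in `assemble_stop_times_with_direction` above note that a bus heading southeast ends up labeled southbound or eastbound, depending on which component is larger, and that the normalized x/y vector keeps the finer detail. A minimal sketch of that idea follows. `normalize_direction` and `primary_direction` are hypothetical stand-ins, not the actual `wrangle_shapes.get_normalized_vector` or `rt_utils` functions, and the labels and tie-breaking rule are assumptions.

```python
import numpy as np


def normalize_direction(dx, dy):
    """Scale (dx, dy) to unit length; zero-length vectors stay (0, 0)."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    length = np.hypot(dx, dy)
    # Avoid dividing by zero when prior and current points coincide.
    safe_length = np.where(length == 0, 1.0, length)
    return dx / safe_length, dy / safe_length


def primary_direction(x_norm, y_norm):
    """Assumed rule: the axis with the larger magnitude wins."""
    if x_norm == 0 and y_norm == 0:
        return "Unknown"
    if abs(x_norm) > abs(y_norm):
        return "Eastbound" if x_norm > 0 else "Westbound"
    return "Northbound" if y_norm > 0 else "Southbound"


# A southeasterly move: 3 units east, 4 units south.
x_norm, y_norm = normalize_direction([3.0], [-4.0])
print(primary_direction(x_norm[0], y_norm[0]))  # Southbound
```

The label alone hides the eastward component, while the normalized pair (0.6, -0.8) keeps it, which is the reason given in the source for storing both columns.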

4 changes: 3 additions & 1 deletion gtfs_funnel/update_vars.py
@@ -2,7 +2,9 @@
from pathlib import Path
from shared_utils import rt_dates

months = ["sep", "oct"]
months = [
"sep", "oct"
]

analysis_date_list = [
rt_dates.DATES[f"{m}2023"] for m in months
55 changes: 33 additions & 22 deletions gtfs_funnel/vp_direction.py
@@ -18,11 +18,11 @@
from loguru import logger

from calitp_data_analysis.geography_utils import WGS84
from segment_speed_utils import helpers, segment_calcs
from segment_speed_utils import helpers, segment_calcs, wrangle_shapes
from segment_speed_utils.project_vars import SEGMENT_GCS, PROJECT_CRS
from shared_utils import rt_utils

fs = gcsfs.GCSFileSystem()
fs = gcsfs.GCSFileSystem()

def attach_prior_vp_add_direction(
analysis_date: str,
@@ -56,7 +56,7 @@ def attach_prior_vp_add_direction(
# calculated in projected CRS
vp_gddf = dg.from_dask_dataframe(
vp2,
geometry = dg.points_from_xy(vp2, x="x", y="y", crs=WGS84)
geometry = dg.points_from_xy(vp2, x="x", y="y")
).set_crs(WGS84).to_crs(PROJECT_CRS)

vp_ddf = vp_gddf.assign(
@@ -81,34 +81,46 @@ def attach_prior_vp_add_direction(
).query('prior_vp_idx >= min_vp_idx')[
["vp_idx", "prior_x", "prior_y", "x", "y"]
].reset_index(drop=True)

full_df = full_df.persist()

keep_cols = ["vp_idx", "prior_x", "prior_y", "x", "y"]
full_df = full_df[keep_cols].compute()

time1 = datetime.datetime.now()
logger.info(f"persist vp gddf: {time1 - time0}")

def column_into_array(df: dd.DataFrame, col: str) -> np.ndarray:
return df[col].compute().to_numpy()

vp_indices = column_into_array(full_df, "vp_idx")
prior_geom_x = column_into_array(full_df, "prior_x")
prior_geom_y = column_into_array(full_df, "prior_y")
current_geom_x = column_into_array(full_df, "x")
current_geom_y = column_into_array(full_df, "y")
vp_indices = full_df.vp_idx.to_numpy()
distance_east = full_df.x - full_df.prior_x
distance_north = full_df.y - full_df.prior_y

distance_east = current_geom_x - prior_geom_x
distance_north = current_geom_y - prior_geom_y
# Get the normalized direction vector split into x and y columns
normalized_vector = wrangle_shapes.get_normalized_vector(
(distance_east, distance_north)
)

direction_result = np.vectorize(
rt_utils.cardinal_definition_rules)(distance_east, distance_north)

# Stack our results and convert to df
results_array = np.column_stack((vp_indices, direction_result))
results_array = np.column_stack((
vp_indices,
normalized_vector[0],
normalized_vector[1]
))

vp_direction = pd.DataFrame(
results_array,
columns = ["vp_idx", "vp_primary_direction"]
).astype({"vp_idx": "int64"})
columns = ["vp_idx", "vp_dir_xnorm", "vp_dir_ynorm"]
).astype({
"vp_idx": "int64",
"vp_dir_xnorm": "float",
"vp_dir_ynorm": "float"
})

# Get a readable direction (westbound, eastbound)
vp_direction = vp_direction.assign(
vp_primary_direction = vp_direction.apply(
lambda x:
rt_utils.cardinal_definition_rules(x.vp_dir_xnorm, x.vp_dir_ynorm),
axis=1
)
)

time2 = datetime.datetime.now()
logger.info(f"np vectorize arrays for direction: {time2 - time1}")
@@ -168,7 +180,6 @@ def add_direction_to_usable_vp(
format="{time:YYYY-MM-DD at HH:mm:ss} | {level} | {message}",
level="INFO")


for analysis_date in analysis_date_list:

logger.info(f"Analysis date: {analysis_date}")
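The `vp_direction.py` hunk above changes what goes into `np.column_stack`: the earlier version stacked `vp_idx` with string direction labels, while the new version stacks it with the two normalized float columns and derives the label afterwards. The sketch below, using made-up sample values, shows the NumPy behavior that makes this relevant: stacking integers with strings yields an all-string array, while stacking integers with floats stays numeric.

```python
import numpy as np

# Made-up sample values for illustration only.
vp_idx = np.array([10, 11, 12])
labels = np.array(["Northbound", "Eastbound", "Southbound"])
x_norm = np.array([0.0, 1.0, 0.6])
y_norm = np.array([1.0, 0.0, -0.8])

# Integers stacked with strings: every cell becomes a string.
mixed = np.column_stack((vp_idx, labels))
print(mixed.dtype.kind)  # U

# Integers stacked with floats: the array stays numeric (float64).
numeric = np.column_stack((vp_idx, x_norm, y_norm))
print(numeric.dtype)  # float64

# The index column can be cast back to integers afterwards.
restored_idx = numeric[:, 0].astype("int64")
```

In both versions of the script the resulting frame still needs an explicit `astype` for `vp_idx`, since the stacked array carries a single dtype for all columns.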