From 9f5a7fd48fa9311e7a14320af456511525c8c7fd Mon Sep 17 00:00:00 2001
From: Andrew Byrd <andrew@fastmail.net>
Date: Mon, 1 Jan 2024 21:18:51 +0800
Subject: [PATCH] clean up javadoc

---
 .../r5/analyst/cluster/SelectedLink.java      | 233 ++++++++++--------
 .../r5/analyst/scenario/SelectLink.java       |  58 ++---
 .../com/conveyal/r5/transit/TransitLayer.java |   2 +
 3 files changed, 157 insertions(+), 136 deletions(-)

diff --git a/src/main/java/com/conveyal/r5/analyst/cluster/SelectedLink.java b/src/main/java/com/conveyal/r5/analyst/cluster/SelectedLink.java
index 3341db9e9..4cbae63bb 100644
--- a/src/main/java/com/conveyal/r5/analyst/cluster/SelectedLink.java
+++ b/src/main/java/com/conveyal/r5/analyst/cluster/SelectedLink.java
@@ -28,104 +28,133 @@
 import static com.conveyal.r5.common.GeometryUtils.polygonForEnvelope;
 import static com.google.common.base.Preconditions.checkState;
 
+/*
+
+Implementation considerations:
+
+- Shapes are one of the biggeset parts of GTFS feeds.
+- TransitLayer can associate shapes with each TripPattern and extract the sub-shapes between each stop.
+- However this functionality is hard-wired to be disabled (with a constant SAVE_SHAPES) during network build.
+- This means all existing TransportNetworks do not have shapes attached to their TripPatterns.
+- Enabling this across the board is expected to make all TransitLayers significantly larger.
+
+- Every single path generated needs to be subject to filtering.
+- Geographic intersections can be quite slow.
+- The geographic intersections need to be precomputed and looked up quickly during routing.
+
+- The SAVE_SHAPES section of TransitLayer building is relatively complicated.
+- It invovles projecting stops onto shapes and splitting them, and must handle cases where shapes are missing.
+- We don't want to replicate this existing logic elsewhere.
+
+GtfsController accesses GeometryCache in GtfsCache.patternShapes, but this just saves entire exemplar trip geometries,
+not inter-stop segments. TripPattern.getHopGeometries looks relatively simple using LocationIndexedLine, but depends on
+some fairly complicated stop-to-shape snapping logic in the SAVE_SHAPES section of TransitLayer to pre-build fields.
+We could either re-run this code after the fact to inject the shapes into existing network files, or we could enable it
+with a network build time switch. We need to turn on shape storage in the TripPatterns, or otherwise iterate through
+all of them in a streaming fashion to record every one that passes through the bounding box.
+
+TripPattern does make the assumption that all trips on the same pattern have the same geometry (or can be reasonably
+represented with the same geometry drawn from one of the trips).
+
+In existing serialized TransitLayers, TripPattern.getHopGeometries usually returns straight lines because
+TripPattern.shape is null (it is hard-wired to not save shapes in TransitLayers). However, the GTFS MapDBs so still
+contain the shapes for each trip. This is how we show them in GtfsController making VectorMapTiles. We already have
+spatial index capabilities at gtfsCache.patternShapes.queryEnvelope(bundleScopedFeedId, tile.envelope). See L206-209 of
+GtfsController. However, this does not retain enough information about the segments between stops in the patterns, and
+uses a lot of space for all those geometries.
+
+Networks are always built and scenarios always applied on workers. Workers do have access to GTFSFeed files.
+WorkerComponents has a TransportNetworkCache which is injected into the AnalysisWorker constructor. This is the only
+path to access a GtfsCache, and that GtfsCache is private, so we need methods on TransportNetworkCache. The full path
+to this GtfsCache is: AnalysisWorker.networkPreloader.transportNetworkCache.gtfsCache.
+
+The best way to prototype the intended behavior is to create a new modification type. This provides a mechanism for
+attaching things to a network, at a point where we may still have access to the GTFS feeds. It also ensures that the
+network with this extra information is properly cached for similar subsequent requests (as in a regional analysis).
+We can't attach the precomputed selected link information to the raw base TransportNetwork, because then the first
+request for that network would always need to be one with the selected-link behavior specified. Networks are expected
+to be read-only once loaded, and anyway subsequent requests for the same network hit a cache and don't pass through
+the right place to access the GTFSCache and update the in-memory network. We need to be able to apply it later to a
+network that was already loaded without the selected link specified. We could treat the network as mutable and write to
+it, but this does not follow existing design and would require mentally modeling how to manupulate the system to get
+the desired effect.
+
+TransportNetworkCache#getNetworkForScenario is where we always apply scenarios in the worker, and that class has direct
+access to the GtfsCache.
+
+Deciding whether to create SelectedLink via a Modification or a per-request parameter:
+
+The SelectedLink instance (fast checking whether paths pass through an area) needs to be stored/referenced:
+- Somewhere that is reachable from inside PathResult.setTarget or PathResult.summarizeIterations
+- Somewhere that is correctly scoped to where the selected-link filtering is specified (request/task or scenario)
+- Somewhere that is writable in the places where we have access to the gtfsCache
+- Somewhere that is PERSISTENT across requests - this is inherently the case for TransportNetwork but for Task we'd
+  need to introduce another cache. The problem being that the base TransportNetwork's scope is too wide (could be
+  used in requests with or without the SelectedLink), so it needs to be a modification on a specific scenario.
+
+The PathResult constructor is passed a Task and a TransitLayer. It retains only the TransitLayer but could retain both.
+In AnalysisWorker.handleAndSerializeOneSinglePointTask we still have the task context, but deeper on the stack in
+networkPreloader and then transportNetworkCache (which has gtfsCache), we have the scenario but not the task. But
+then once you go deeper into applying the individual scenario modifications, the gtfsCache is no longer visible.
+
+SelectedLink doesn't feel like a modification. It feels like a parameter to the CSV path output in the task.
+The AnalysisWorker could have a Map from SelectionBox to SelectedLink, but then the keys are full of floating-point
+coordinate numbers, which requires fuzzy matching on these keys to look up the precomputed data. This could get very
+ugly.
+
+We also need to tie pre-existin items in the TransportNetwork (TripPatterns) to new items from the GTFS. It feels like
+this should be on a scenario copy of a TransportNetwork. It's ugly, but it would be possible to scan over the incoming
+modifications and inject the GtfsCache (or pre-selected GTFSFeeds) onto a transient field of any SelectedLink
+Modification present.
+
+Getting this information into the network resulting from applying a Scenario makes it auto-retained, gives it a stable
+identity so we don't need to fuzzy-match it in the task to cache. That could also be done by uploading a geometry file
+with an ID, but that's so much indirection for a single small polygon. In the future it would make sense to treat all
+lat/lon as effectively integers (fixed-point) since it simplifies this kind of keying and matching.
+
+Alternatively we could enable on the storage of GTFS route shapes on the network file when it's built. Then the
+modification could be applied normally without injecting a GtfsCache or GtfsFeeds. But again that bloats the size of
+every network just for the odd case where someone wants to do selected link analysis.
+
+Bundle Scoping problem:
+
+The feed IDs expected by gtfsCache (i.e. gtfs file names) are bundle-scoped but the ones in the TripPatterns are not.
+TransportNetworks and TransitLayers apparently do not retain their bundle ID. In any case they can have multiple feeds
+originally uploaded with different bundles. TransitLayer.feedChecksums keys are the same feed IDs prefixing
+TripPattern.routeId, which are the gtfsFeed.feedId, which is not bundle-scoped so can't be used to get a feed from
+gtfsCache.
+
+A network is always based on one bundle with the same ID, but the bundle config can also reference GTFS with a
+different bundle scope (originally uploaded for another bundle). So knowing the network ID is not sufficient to find
+a GTFS feed from its un-scoped UUID.
+
+Based on GtfsController.bundleScopedFeedIdFromRequest, the bundleScopedFeedId is feedId_feedGroupId. They're no longer
+based on the bundle/network ID, but the feed group. It seems like we wouldn't need these scopes at all since all feeds
+now have unique IDs. Removing them could cause a lot of disruption though.
+
+When we make the TransportNetwork from these bundles, it's always on a worker, using information from the bundle's
+TransportNetworkConfig JSON file. This is in TransportNetworkCache.buildNetworkFromConfig(). At first it looks like
+the bundleScopedId is completely lost after we go through the loading process. GtfsCache.get(String id) does store
+that key id in feed.uniqueId, but that field is never read (or written) anywhere else.
+
+This means the bundle ID is available during network creation to be retained in the TransportNetwork, but they aren't
+retained. I think the only place we can get these bundle scoped feed IDs is from the TransportNetworkConfig JSON file.
+Perhaps that should be serialized into the TransportNetwork itself (check the risk of serializing applied Modifications).
+But in the meantime TNCache has a method to load that configuration and get the bundle scopes.
+
+The distinction between stop indexes at the TransitLayer level and stop indexes (positions) within the TripPattern is
+critical and can cause some comparisons to fail silently, while superficially appearing to do something meaningful.
+The lightweight newtype pattern would be really useful here but doesn't exist in Java. It is not practical to change
+the source path information we receive such that it stores stop indexes within the TripPattern and defers resolution
+to TransitLayer indexes and names. Looking at the Path constructor and RaptorState, we don't have this information as
+RaptorStates are stored in array slots indexed on stop indexes from the TransitLayer (not the TripPattern level ones).
+
+ */
+
 /**
- * For Selected Link Analysis.
- * This object caches a collection of every segment of every pattern that passes through a certain polygon.
- * It also provides methods for quickly checking whether any leg of a public transit trip overlaps these selected segments.
- *
- * Implementation considerations follow:
- *
- * Simplifications:
- * Assumes all trips on the same pattern have the same geometry.
- *
- * TripPattern.getHopGeometries usually returns straight lines because TripPattern.shape is null (it is hard-wired to
- * not save shapes in TransitLayers). However, the GTFS MapDBs should have the shapes for each trip. This is how we
- * show them in GtfsController making VectorMapTiles. In fact we already have spatial index capabilities:
- * gtfsCache.patternShapes.queryEnvelope(bundleScopedFeedId, tile.envelope)
- * See L206-209 of GtfsController. However this does not retain enough information about the segments between stops
- * in the patterns, and uses a lot of space for all those geometries.
- *
- * We'll need to turn on shape storage in the TripPatterns, or otherwise iterate through all of them in a streaming
- * fashion to record every one that passes through the bounding box.
- *
- * Do the workers have access to the GTFS files or not?
- * WorkerComponents has a TransportNetworkCache which is injected into the AnalysisWorker constructor. This is the only
- * path to access a GtfsCache which is private, so we need a method on TransportNetworkCache.
- * The full path to the GtfsCache is: AnalysisWorker.networkPreloader.transportNetworkCache.gtfsCache.
- *
- * The best way to easily get the intended behavior is probably to create a new modification type.
- * This provides a mechanism for attaching things to a network, at a point where we may still have access to the gtfs.
- * And ensure that the network with extra information is properly cached for similar requests.
- * We can't attach it to the raw TransportNetwork, because then the first request for that network would always need
- * to be one with the selected-link behavior specified. We need to be able to apply it to a network that was loaded
- * without the selected link. Or we could treat the network as mutable and write to it, which is not clean but would
- * get the job done.
- *
- * Well, TransportNetworkCache#getNetworkForScenario is where we always apply scenarios in the worker, and that class
- * has direct access to the GtfsCache.
- *
- * The SelectedLink instance will need to be stored/referenced:
- * - Somewhere that is reachable from inside PathResult.setTarget or PathResult.summarizeIterations
- * - Somewhere that is correctly scoped to where the selected-link filtering is specified (request/task or scenario)
- * - Somewhere that is writable in the places where we have access to the gtfsCache
- * - Somewhere that is PERSISTENT across requests - this is inherently the case for TransportNetwork but for Task we'd
- *   need to introduce another cache. The problem being that the base TransportNetwork's scope is too wide (could be
- *   used in requests with or without the SelectedLink), so it needs to be a modification on a specific scenario.
- *
- * PathResult is constructed with a Task and a TransitLayer. It retains only the TransitLayer but could retain both.
- * In AnalysisWorker.handleAndSerializeOneSinglePointTask we still have the task context, but deeper on the stack in
- * networkPreloader and then transportNetworkCache (which has gtfsCache), we have the scenario but not the task. But
- * then once you go deeper into applying the scenario modifications, the gtfsCache is no longer visible.
- * Anyway this doesn't feel like a modification. It feels like a parameter to the CSV path output in the task.
- * The AnalysisWorker could have a Map from SelectionBox to SelectedLink (fuzzy matching keys... ugh... linear scan
- * maybe).
- * Also we need to tie items in the TransportNetwork to the GTFS... it feels like this should be on the TransportNetwork
- * of a scenario.
- * What if we scan over the incoming modifications and inject the GtfsCache onto a transient field of any
- * SelectedLinkModification present?
- * Basically: getting this into the Scenario makes it auto-retained, gives it a stable identity so we don't need to
- * fuzzy-match it in the task to cache. That could also be done by uploading a geometry file with an ID, but ugh.
- * In the future it would make sense to treat all lat/lon effectively integers (fixed-point) since it simplifies this
- * kind of keying and matching.
- *
- * Alternatively to all this we could switch on the storage of GTFS route shapes on the network file. Then the
- * modification could be applied normally without injecting a GtfsCache.
- *
- * Additional problem:
- * The gtfsCache feed IDs (gtfs file names) are bundle-scoped but the ones in the TripPatterns are not.
- * TransportNetworks and TransitLayers apparently do not retain their bundle ID. In any case they can have multiple
- * feeds originally uploaded with different bundles.
- * TransitLayer.feedChecksums keys are the same feed IDs prefixing TripPattern.routeId, which are the gtfsFeed.feedId,
- * which is not bundle-scoped so can't be used to get a feed from gtfsCache.
- *
- * A network is always based on one bundle with the same ID, but the bundle config can also reference GTFS with a
- * different bundle scope (originally uploaded for another bundle). So knowing the network ID is not sufficient.
- *
- * Based on GtfsController.bundleScopedFeedIdFromRequest, the bundleScopedFeedId is feedId_feedGroupId. So they're no
- * longer based on the bundle/network ID, but the feed group.
- * It seems like we wouldn't need these scopes at all since all feeds now have unique IDs.
- *
- * When we make the TransportNetwork from these bundles, it's always on a worker, based on the bundle's
- * TransportNetworkConfig JSON file. This is in TransportNetworkCache.buildNetworkFromConfig().
- * At first it looks like the bundleScopedId is completely lost after we go through the loading process.
- * But GtfsCache.get(String id) stores that key id in feed.uniqueId. That field is never read (or written) anywhere else.
- * This means they're available during network creation to be retained in the TransportNetwork... but aren't retained.
- * I think the only place we can get these bundle scoped feed IDs is from the TransportNetworkConfig JSON file.
- * Perhaps that should be serialized into the TransportNetwork itself (check risk of serializing used Modifications).
- * But in the meantime TNCache has a method to load that configuration.
- *
- * Another problem: after realizing that the stop indexes are indexes at the TransitLayer level and not within the
- * TripPattern, the SelectedLink.hopsInTripPattern was updated to use TransitLayer stop indexes. But this invalidates
- * the simple range-based comparison hop >= boardStop && hop < alightStop in SelectedLink.includes(). The comparison
- * doesn't fail hard, but only superficially appears to do anything meaningful.
- * The lightweight newtype pattern would be really useful here but doesn't exist in Java.
- * Solutions are either to scan down the stops in the TripPattern (which involves a lot more comparisons and
- * indirection, and requires access to the TransitLayer) or to change the source data structure so it stores stop
- * indexes within the TripPattern and defers resolution to TransitLayer indexes until stop name lookup is needed.
- * This also draws attention to the fact that specifying board and alight locations with TransitLayer stop indexes is
- * ambiguous because the same stop can appear more than once in a single TripPattern.
- * Looking at the Path constructor and RaptorState though, it looks like we don't have this information as RaptorStates
- * are stored in array slots indexed on stop indexes from the TransitLayer (not the TripPattern level ones).
+ * This class is used in performing "selected link" analysis (R5 issue #913). It retains a precomputed collection
+ * containing every segment of every TripPattern that passes through a certain polygon, and provides methods for quickly
+ * checking whether any leg of a public transit trip overlaps these precomputed segments.
  */
 public class SelectedLink {
 
@@ -136,7 +165,7 @@ public class SelectedLink {
      * Keys are the index of a TripPattern in the TransitLayer, and values are arrays of stop positions within that
      * TripPattern (NOT the stop index within the TransitLayer). A hop from stop position N to stop position N+1 on
      * pattern at index X is recorded as the mapping X -> N. There may be several such hops within a single pattern,
-     * thus an array with more than one element, in the case where one or more transit stops on the pattern fall
+     * thus an array with more than one element, in the case where one or more transit stops along the pattern fall
      * within the SelectedLink search radius.
      */
     private final TIntObjectMap<int[]> hopsInTripPattern;
@@ -156,10 +185,10 @@ public SelectedLink(TransitLayer transitLayer, TIntObjectMap<int[]> hopsInTripPa
     }
 
     /**
-     * For a given transit ride from a boardStop to an alightStop on a TripPattern, return true if that ride
+     * For a given transit ride from a board stop to an alight stop on a TripPattern, return true if that ride
      * passes through any of the hops in this SelectedLink area, or false if it's entirely outside the area.
-     * This is complicated by the fact that the boardStop and alightStop are TransitLayer-wide stop indexes,
-     * not indexes of the stop position within the pattern. It is possible (though unlikely) that a board and alight
+     * This is complicated by the fact that the board and alight stops are TransitLayer-wide stop indexes,
+     * not the position of the stop within the pattern. It is possible (though unlikely) that a board and alight
      * stop pair could ambiguously refer to more than one sub-segment of the same pattern when one of the stops appears
      * more than once in the pattern's stop sequence. We find the earliest matching sub-segment in the sequence.
      */
@@ -229,7 +258,7 @@ private boolean traversedBy (PatternSequence patternSequence) {
      * filtered COPY of the supplied Multimap, with all mappings removed for keys that do not pass through this
      * SelectedLink area. This often yields an empty Multimap, greatly reducing the number of rows in the CSV output.
      */
-    public Multimap<PatternSequence, Iteration> filterPatterns (Multimap<PatternSequence, Iteration> patterns){
+    public Multimap<PatternSequence, Iteration> filterPatterns (Multimap<PatternSequence, Iteration> patterns) {
         Multimap<PatternSequence, Iteration> filteredPatterns = HashMultimap.create();
         for (PatternSequence patternSequence : patterns.keySet()) {
             if (this.traversedBy(patternSequence)) {
diff --git a/src/main/java/com/conveyal/r5/analyst/scenario/SelectLink.java b/src/main/java/com/conveyal/r5/analyst/scenario/SelectLink.java
index 0b29749db..c00b1b696 100644
--- a/src/main/java/com/conveyal/r5/analyst/scenario/SelectLink.java
+++ b/src/main/java/com/conveyal/r5/analyst/scenario/SelectLink.java
@@ -1,19 +1,14 @@
 package com.conveyal.r5.analyst.scenario;
 
-import com.conveyal.gtfs.GTFSCache;
 import com.conveyal.gtfs.GTFSFeed;
 import com.conveyal.r5.analyst.cluster.SelectedLink;
 import com.conveyal.r5.transit.RouteInfo;
 import com.conveyal.r5.transit.TransitLayer;
 import com.conveyal.r5.transit.TransportNetwork;
 import com.conveyal.r5.transit.TripPattern;
-import com.conveyal.r5.util.TIntIntHashMultimap;
-import com.conveyal.r5.util.TIntIntMultimap;
 import gnu.trove.list.array.TIntArrayList;
 import gnu.trove.map.TIntObjectMap;
 import gnu.trove.map.hash.TIntObjectHashMap;
-import gnu.trove.set.TIntSet;
-import gnu.trove.set.hash.TIntHashSet;
 import org.locationtech.jts.geom.LineString;
 import org.locationtech.jts.geom.Polygon;
 import org.slf4j.Logger;
@@ -21,7 +16,6 @@
 
 import java.lang.invoke.MethodHandles;
 import java.util.Arrays;
-import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.stream.Collectors;
@@ -31,15 +25,16 @@
 import static com.google.common.base.Strings.isNullOrEmpty;
 
 /**
- * Designates all transit and/or roads passing through a given rectangle for inclusion in CSV output.
- * This allows cutting down and consolidating results for some network assignment and congestion problems.
+ * This custom Modification restricts CSV path output to only include transit passing through a specified rectangle.
+ * This allows cutting down the size of the output considerably, consolidating results in a way that's useful for some
+ * network assignment and congestion problems. The parameters lon, lat, and radiusMeters define a selection box.
  * @see SelectedLink
  */
 public class SelectLink extends Modification {
 
     private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
 
-    /// Public fields supplied on the custom modification
+    /// Public fields supplied on the custom modification.
 
     public double lon;
 
@@ -47,7 +42,7 @@ public class SelectLink extends Modification {
 
     public double radiusMeters;
 
-    /// Private derived fields used in subsequent calculations
+    /// Private derived fields used in subsequent calculations.
 
     private Polygon boxPolygon;
 
@@ -58,8 +53,9 @@ public boolean resolve(TransportNetwork network) {
         // Convert the incoming description of the selected link area to a Geometry for computing intersections.
         boxPolygon = polygonForEnvelope(envelopeForCircle(lon, lat, radiusMeters));
         // Iterate over all TripPatterns in the TransitLayer and ensure that we can associate a feed with each one.
-        // The feed IDs recorded in the TripPatterns are not bundle-scoped.
-        // Check that a feed with a correctly de-scoped ID was supplied for every TripPattern in this TransitLayer.
+        // These feeds must have been previously supplied with the injectGtfs() method. The feed IDs recorded in the
+        // TripPatterns are not bundle-scoped. Check that a feed with a correctly de-scoped ID was supplied for every
+        // TripPattern in this TransitLayer.
         for (TripPattern tripPattern : network.transitLayer.tripPatterns) {
             String feedId = feedIdForTripPattern(tripPattern);
             if (isNullOrEmpty(feedId)) {
@@ -76,12 +72,12 @@ public boolean resolve(TransportNetwork network) {
 
     @Override
     public boolean apply(TransportNetwork network) {
-        // This method is basically serving as a factory method for a SelectedLink instance.
-        // Those instances are immutable, so need some kind of external factory or builder to construct them.
+        // This method is basically serving as a factory method for a SelectedLink instance. Those instances are
+        // immutable, so need some kind of external factory or builder to construct them incrementally.
         TIntObjectMap<int[]> hopsInTripPattern = new TIntObjectHashMap<>();
 
         // During raptor search, paths are recorded in terms of pattern and stop index numbers rather than
-        // TripPattern and Stop instance references, so iterate over numbers.
+        // TripPattern and Stop instance references, so iterate over the numbers.
         for (int patternIndex = 0; patternIndex < network.transitLayer.tripPatterns.size(); patternIndex++) {
             TripPattern tripPattern = network.transitLayer.tripPatterns.get(patternIndex);
             // Make a sacrificial protective copy to avoid interfering with other requests using this network.
@@ -91,7 +87,7 @@ public boolean apply(TransportNetwork network) {
             String feedId = feedIdForTripPattern(tripPattern);
             GTFSFeed feed = feedForUnscopedId.get(feedId);
             TransitLayer.addShapeToTripPattern(feed, tripPattern);
-            // TransitLayer parameter allows fetching stop locations for straight lines in case shapes are not present.
+            // TransitLayer parameter enables fetching straight lines between stops in case shapes are not present.
             List<LineString> hopGeometries = tripPattern.getHopGeometries(network.transitLayer);
             TIntArrayList intersectedHops = new TIntArrayList();
             for (int hop = 0; hop < hopGeometries.size(); hop++) {
@@ -105,11 +101,12 @@ public boolean apply(TransportNetwork network) {
             }
         }
 
-        // After finding all links (TripPattern hops) in this area, release GTFSFeeds which don't really belong here.
-        // This avoids memory leaks, and protects us from inadvertently relying on or modifying the feeds later.
+        // After finding all links (TripPattern hops) in the SelectedLink area, release the GTFSFeeds which don't really
+        // belong in a Modification. This avoids memory leaks, and protects us from inadvertently relying on or
+        // modifying those feed objects later.
         feedForUnscopedId = null;
 
-        // For informational purposes record all selected links in the Modification.info, and log to console.
+        // To confirm expected behavior, record all selected links in Modification.info for the user, and log to console.
         LOG.info("Selected links for CSV path output:");
         hopsInTripPattern.forEachEntry((int patternIndex, int[] stopPositions) -> {
             TripPattern tripPattern = network.transitLayer.tripPatterns.get(patternIndex);
@@ -130,7 +127,7 @@ public boolean apply(TransportNetwork network) {
         return errors.size() > 0;
     }
 
-    // By returning false for both affects* methods, we make a very shallow copy of the TransitNetwork for efficiency.
+    // By returning false for both affects methods, we make a very shallow copy of the TransitNetwork for efficiency.
 
     @Override
     public boolean affectsStreetLayer() {
@@ -145,28 +142,21 @@ public boolean affectsTransitLayer() {
     @Override
     public int getSortOrder() {
         // This modification needs to be applied after any modifications affecting the transit network.
+        // It appears this method is never called, maybe because sort order from CustomModificationHolder is used.
         return 80;
     }
 
-    // The SAVE_SHAPES section of TransitLayer is quite complicated.
-    // We should not replicate all that logic again in another place.
-    // I had a different impression from looking somewhere else. The GtfsController accesses the GeometryCache in
-    // GtfsCache.patternShapes, but this just saves entire exemplar trips from end to end, not inter-stop segments.
-    // I was probably thrown off by TripPattern.getHopGeometries which uses LocationIndexedLine, but depends on some
-    // fairly complicated stop-to-shape snapping logic in the SAVE_SHAPES section of TransitLayer to pre-build fields.
-    // We could either re-run this code after the fact to inject the shapes into existing network files,
-    // or we could enable it with a network build time switch.
-
     /**
      * Currently we do not include the GTFS shapes in the TransportNetwork, so in order to determine which routes pass
      * through a given geographic area, we need access to the original GTFS data. When resolving and applying
-     * Modifications, the only thing supplied is the TransportNetwork itself. This is a hack, but we need to inject
-     * the GtfsFeed somehow. The reference can be nulled out after the modification is applied to avoid memory leaks.
+     * Modifications, the only thing available is the TransportNetwork itself. This method is used to supply any needed
+     * GtfsFeeds keyed on their non-bundle-scoped, Conveyal-assigned feed UUID.
      * A TransportNetwork may be made from a bundle with multiple feeds, so we can't attach just one GTFSFeed.
      * The TransportNetwork does not directly retain information on which feeds were used to create it, but each
-     * TripPattern retains a feed-scoped routeId in this format: String.format("%s:%s", gtfs.feedId, route.route_id)
-     *
-     * Potential problem: is the gtfs.feedId the bundle-scoped feed ID expected by gtfs-cache?
+     * TripPattern retains a feedId as a prefix to its routeId in the format feedUUID:route_id.
+     * However, those feed IDs lack the bundle scope (feed group ID) needed to get feeds from GtfsCache.
+     * This all deviates significantly from pre-existing design, but does work and reveals some important
+     * considerations for any future revisions of scenario application or network building system.
      */
     public void injectGtfs(Map<String, GTFSFeed> feedForUnscopedId) {
         this.feedForUnscopedId = feedForUnscopedId;
diff --git a/src/main/java/com/conveyal/r5/transit/TransitLayer.java b/src/main/java/com/conveyal/r5/transit/TransitLayer.java
index cab69a63f..47bda5945 100644
--- a/src/main/java/com/conveyal/r5/transit/TransitLayer.java
+++ b/src/main/java/com/conveyal/r5/transit/TransitLayer.java
@@ -431,6 +431,8 @@ public void loadFromGtfs (GTFSFeed gtfs, LoadLevel level) throws DuplicateFeedEx
      * don't run it when creating the TransportNetwork, but instead run it as needed on single TripPatterns to save
      * space. Factoring pieces out of loadFromGtfs also makes that method a bit more readable as it's very long.
      * TODO test how much this actually increases feed size and consider enabling it when creating TransportNetworks.
+     * This could also be changed to return a compound type of the (shape, stopShapeSegment, and stopShapeFraction)
+     * fields of TripPattern, or only an array or list of hop geometries (if we don't ever want to persist them).
      */
     public static void addShapeToTripPattern (
             GTFSFeed gtfsFeed,