Skip to content

Conversation

@yamilbknsu
Copy link
Collaborator

In this PR, I'm implementing a Haversine distance-based island detection algorithm. We discussed potential alternative criteria for this algorithm but I decided this was enough changes for a single PR.

For each each, we check in parallel if the distance from its midpoint to the midpoint of the initial edge is large enough. This way, we remove edges that are part of components that are small enough.

I assumed all edge lists together could make two vertices adjacent, so the processing is made considering all edge lists at the same time and therefore the "now let's remove the edges" part is a little repetitive. I also welcome feedback on the overall style of the implementation.

@yamilbknsu yamilbknsu linked an issue Feb 9, 2026 that may be closed by this pull request
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an (optional) Haversine distance-based “island” detection pass to prune edges that belong to small/compact connected components during OMF graph vectorization.

Changes:

  • Introduces component_algorithm::island_detection_algorithm and unit tests for midpoint + island detection behavior.
  • Wires an optional island-detection configuration from app config → network run → OmfGraphVectorized::new, and applies edge-list cleaning.
  • Adds indexmap dependency for adjacency representation.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
rust/bambam-omf/src/graph/serialize_ops.rs Adds clean_omf_edge_list helper to filter edge-list-aligned vectors by a boolean mask.
rust/bambam-omf/src/graph/omf_graph.rs Wires island detection + edge removal into graph construction.
rust/bambam-omf/src/graph/mod.rs Registers the new component_algorithm module.
rust/bambam-omf/src/graph/component_algorithm.rs Implements island detection algorithm + adjacency builder + tests.
rust/bambam-omf/src/app/omf_app.rs Reads island detection configuration from config file and passes it into run.
rust/bambam-omf/src/app/network.rs Defines IslandDetectionAlgorithmConfiguration and plumbs it into graph creation.
rust/bambam-omf/Cargo.toml Adds indexmap workspace dependency.
rust/Cargo.toml Adds workspace-level indexmap version pin.
configuration/bambam-omf/travel-mode-filter.json Adds example island detection configuration block.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +17 to +47
pub fn island_detection_algorithm(
edge_lists: &[&EdgeList],
vertices: &[Vertex],
distance_threshold: f64,
distance_threshold_unit: DistanceUnit,
) -> Result<Vec<(EdgeListId, EdgeId)>, OvertureMapsCollectionError> {
let forward_adjacency: DenseAdjacencyList = build_adjacency(edge_lists, vertices.len(), true)
.map_err(|s| {
OvertureMapsCollectionError::InternalError(format!(
"failed to compute adjacency matrix for island detection algorithm: {s}"
))
})?;

let island_edges: Result<Vec<_>, _> = edge_lists
.par_iter()
.flat_map(|&el| el.0.par_iter())
.filter_map(|edge| {
match is_component_island_parallel(
edge,
distance_threshold,
distance_threshold_unit,
edge_lists,
vertices,
&forward_adjacency,
) {
Ok(true) => Some(Ok((edge.edge_list_id, edge.edge_id))),
Ok(false) => None,
Err(e) => Some(Err(e)),
}
})
.collect();
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

island_detection_algorithm runs a graph traversal starting from every edge in parallel. On realistic OMF-sized graphs this is likely to be extremely expensive (roughly O(E * explored_edges)) and can dominate runtime/memory. A more scalable approach is to compute connected components once (e.g., union-find / BFS per component), compute each component’s max extent, then mark all edges in components whose extent < threshold.

Copilot uses AI. Check for mistakes.
Comment on lines 82 to 86
"island_algoritm_configuration",
)
.map_err(|e| {
let msg = format!(
"error reading 'edge_lists' key in '{configuration_file}': {e}"
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config key string "island_algoritm_configuration" is misspelled ("algoritm"), which bakes a typo into the user-facing configuration format. Consider renaming to island_algorithm_configuration (and handling backwards compatibility if needed).

Suggested change
"island_algoritm_configuration",
)
.map_err(|e| {
let msg = format!(
"error reading 'edge_lists' key in '{configuration_file}': {e}"
"island_algorithm_configuration",
)
.or_else(|_| {
// Backwards compatibility: support legacy misspelled key.
config.get::<Option<IslandDetectionAlgorithmConfiguration>>(
"island_algoritm_configuration",
)
})
.map_err(|e| {
let msg = format!(
"error reading 'island_algorithm_configuration' (or legacy \
'island_algoritm_configuration') key in '{configuration_file}': {e}"

Copilot uses AI. Check for mistakes.
@yamilbknsu
Copy link
Collaborator Author

yamilbknsu commented Feb 9, 2026

I ran the algorithm with the bbox that we've using and got 36,631 edges out of the initial 37,115, which seems like the right magnitude. On my laptop, run time was a couple of minutes and it seemed to be using 100% CPU all the time so I think we are being efficient.

Copy link
Collaborator

@robfitzgerald robfitzgerald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me! i agree with you that using the radius distance of the expanding frontier is a good option from both a compute cost and domain correctness perspective. i would prefer if we included some doc comments on the public functions added to bambam-omf but not worried enough to stop this from rolling in. thanks!

@yamilbknsu yamilbknsu merged commit b495a41 into main Feb 10, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OMF import - islands checking algorithm

2 participants