-
Notifications
You must be signed in to change notification settings - Fork 7
understory_index: write Grid::visit_rect without a HashSet
#91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Rather than the `HashSet`, this does a check on one of the corners of the intersection between `rect` and `entry.aabb`. This is a speed up, especially so for smaller systems (where the allocation and initializing the `HashSet` are relatively expensive). For larger systems and where there is a lot of overlap, this check appears to still be somewhat cheaper than the hashing and bucket lookup. Timings against main using endoli#90: - `visit_rect_grid`: ``` visit_rect_grid_f64/Grid(10.)/32 time: [258.81 µs 259.31 µs 259.83 µs] thrpt: [3.9410 Melem/s 3.9490 Melem/s 3.9566 Melem/s] change: time: [-42.138% -42.006% -41.869%] (p = 0.00 < 0.05) thrpt: [+72.025% +72.430% +72.824%] Performance has improved. visit_rect_grid_f64/Grid(10.)/64 time: [297.79 µs 298.34 µs 298.95 µs] thrpt: [13.701 Melem/s 13.729 Melem/s 13.755 Melem/s] change: time: [-31.003% -30.837% -30.660%] (p = 0.00 < 0.05) thrpt: [+44.216% +44.587% +44.934%] Performance has improved. visit_rect_grid_f64/Grid(10.)/128 time: [379.81 µs 380.90 µs 382.05 µs] thrpt: [42.884 Melem/s 43.014 Melem/s 43.138 Melem/s] change: time: [-22.038% -21.747% -21.466%] (p = 0.00 < 0.05) thrpt: [+27.334% +27.790% +28.268%] Performance has improved. ``` - `visit_rect_overlap`: I added a `512x512`-sized grid with overlap here to be quite sure this is never slower. Note that in this benchmark, each cell contains multiple AABBs, and the rectangle were using to visit overlaps on average 18 cells. ``` visit_rect_overlap_f64/Grid(10.)/32 time: [731.93 µs 733.00 µs 734.19 µs] thrpt: [1.3947 Melem/s 1.3970 Melem/s 1.3990 Melem/s] change: time: [-21.316% -20.920% -20.538%] (p = 0.00 < 0.05) thrpt: [+25.846% +26.455% +27.091%] Performance has improved. visit_rect_overlap_f64/Grid(10.)/64 time: [861.14 µs 862.88 µs 864.84 µs] thrpt: [4.7361 Melem/s 4.7469 Melem/s 4.7565 Melem/s] change: time: [-23.093% -22.726% -22.278%] (p = 0.00 < 0.05) thrpt: [+28.664% +29.410% +30.028%] Performance has improved. visit_rect_overlap_f64/Grid(10.)/128 time: [1.0383 ms 1.0404 ms 1.0428 ms] thrpt: [15.711 Melem/s 15.748 Melem/s 15.780 Melem/s] change: time: [-16.744% -16.039% -15.540%] (p = 0.00 < 0.05) thrpt: [+18.399% +19.102% +20.112%] Performance has improved. visit_rect_overlap_f64/Grid(10.)/512 time: [1.5065 ms 1.5104 ms 1.5144 ms] thrpt: [173.10 Melem/s 173.56 Melem/s 174.00 Melem/s] change: time: [-11.413% -8.3974% -5.3382%] (p = 0.00 < 0.05) thrpt: [+5.6392% +9.1672% +12.884%] Performance has improved. ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized there's a potential issue with how truncation of higher-order bits and grid cells is handled here. Say we allow geometry to be placed in some cell that's out of range of an i32 and just truncate the higher order bits, then cells don't have a real geometric meaning and we cannot do this trick.
tomcur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized there's a potential issue with how truncation of higher-order bits and grid cells is handled here. Say we allow geometry to be placed in some cell that's out of range of an i32 and just truncate the higher order bits, then cells don't have a real geometric meaning and we cannot do this trick.
Think it should still work, as the lower bits should still identify a cell. But if we allow such overflow of the grid cells in general, we should test that quite well (unrelated to this PR).
|
Are your comments saying it should land or shouldn't? I think it should. I think that overflowing the grid should be an error of some sort. |
|
Under the assumption we want overflow, I first thought this couldn't be made to work, but now I think it can, and so I think it should land (regardless of handling overflow or not). Separately: I only later realized overflow was cared about, in the lint ignore understory/understory_index/src/backends/grid.rs Lines 56 to 75 in 99f24d2
At first glance that's sort of correct, but the word "truncated" should probably be "saturated", and perhaps |
If an `f64`, `f32` or `i64` coordinate is such that it is out of range of cells representable by an `i32`, the grid cell coordinate is saturated to `i32::MIN` or `i32::MAX`). Negative fractional coordinates are still rounded toward negative infinity. This is probably enough for endoli#91, but that requires a bit more thought.
If an `f64`, `f32` or `i64` coordinate is such that it is out of range of cells representable by an `i32`, the grid cell coordinate is saturated to `i32::MIN` or `i32::MAX`). Negative fractional coordinates are still rounded toward negative infinity. This is probably enough for #91, but that requires a bit more thought.
Rather than the
HashSet, this does a check on one of the corners of the intersection betweenrectandentry.aabb.This is a speed up, especially so for smaller systems (where the allocation and initializing the
HashSetare relatively expensive). For larger systems and where there is a lot of overlap, this check appears to still be somewhat cheaper than the hashing and bucket lookup.Timings against main using #90:
visit_rect_grid:visit_rect_overlap:I added a
512x512-sized grid with overlap here to be quite sure this is never slower. Note that in this benchmark, each cell contains multiple AABBs, and the rectangle were using to visit overlaps on average 18 cells.