Interest in addition of spatial index archive? #81
Comments
We would definitely consider a PR. We can also easily release a new version on crates.io.
Did you also manage to reorder nodes/ways based on the index to increase data locality? I would assume that an index for nodes is "for free" / implicit as long as they are sorted by (some) space-filling curve (Z-order, Hilbert, etc.). Ways and relations require extra data (but there are fewer of them to go around).
I wasn't able to reorder nodes/ways due to the lack of sort functionality on ExternalVector. I keep the ZOrderIndexEntries in an in-memory vector and sort them by index at the end. I can process a US state on my laptop, but I know users might want to run it on a planet file. I asked ChatGPT to make an ASCII diagram of how the ZOrderIndexEntry works.
Here's a code snippet of using
So I had a thought. Your idea of ordering the nodes, ways, and relations each in index order is a good one, because it would allow less data to be stored. Getting all the nodes, ways, and relations at one spatial index would then require 3 binary searches, one per element type. The data that would no longer have to be stored is the separate index entry (8 bits for the GeoType, 64 bits for the index of the node/way/relation), ~100GB on a planet file. All that would need to be stored is an optional 64-bit spatial_index on each Node, Way, and Relation. To allow for reordering nodes, ways, and relations I would use RocksDB during the compilation process, since RocksDB allows iterating over all key-value pairs in sorted key order. This way the data would never have to be held in memory, but would still end up sorted.
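The lookup described above could look roughly like the sketch below. It assumes each element type exposes its spatial keys as a slice sorted in ascending order; `SortedKeys`, `equal_range`, and `lookup` are hypothetical names for illustration and are not part of osmflat. Each per-type search is done with two `partition_point` calls to find both ends of the run of equal keys.

```rust
use std::ops::Range;

/// Index range of all entries equal to `key` in a slice sorted ascending.
/// Both bounds are found by binary search (`partition_point`).
fn equal_range(sorted_keys: &[u64], key: u64) -> Range<usize> {
    let start = sorted_keys.partition_point(|&k| k < key);
    let end = sorted_keys.partition_point(|&k| k <= key);
    start..end
}

/// Hypothetical view of the three per-type key arrays (not an osmflat type).
/// Each slice is assumed to be sorted by spatial key.
struct SortedKeys<'a> {
    nodes: &'a [u64],
    ways: &'a [u64],
    relations: &'a [u64],
}

impl SortedKeys<'_> {
    /// One search per element type; the returned ranges index into the
    /// node, way, and relation arrays respectively.
    fn lookup(&self, key: u64) -> (Range<usize>, Range<usize>, Range<usize>) {
        (
            equal_range(self.nodes, key),
            equal_range(self.ways, key),
            equal_range(self.relations, key),
        )
    }
}
```

Because the ranges index directly into the sorted element arrays, no separate entry per element (type tag plus element index) is needed.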
That sounds like a good idea. You might not even need those extra 64 bits for nodes, as the spatial index of a node is (most likely) just the Morton code of its coordinates, and could be computed on the fly?
Yeah, that makes sense. Since we have the latitude and longitude, we can calculate the index on the fly.
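For illustration, computing a Morton (Z-order) code on the fly could be sketched as follows. This assumes node coordinates are available as signed 32-bit fixed-point integers; `spread_bits` and `morton_code` are hypothetical helper names, and this plain Morton code is not the XZ2 curve used in the linked branch.

```rust
/// Spread the 32 bits of `v` so that input bit i ends up at output bit 2*i.
fn spread_bits(v: u32) -> u64 {
    let mut x = v as u64;
    x = (x | (x << 16)) & 0x0000_FFFF_0000_FFFF;
    x = (x | (x << 8)) & 0x00FF_00FF_00FF_00FF;
    x = (x | (x << 4)) & 0x0F0F_0F0F_0F0F_0F0F;
    x = (x | (x << 2)) & 0x3333_3333_3333_3333;
    x = (x | (x << 1)) & 0x5555_5555_5555_5555;
    x
}

/// Hypothetical helper: Morton code of a node from its fixed-point
/// latitude and longitude (assumed to be i32 values).
fn morton_code(lat: i32, lon: i32) -> u64 {
    // Flipping the sign bit maps the signed order onto the unsigned order,
    // so negative coordinates sort before positive ones.
    let y = (lat as u32) ^ 0x8000_0000;
    let x = (lon as u32) ^ 0x8000_0000;
    // Longitude bits go to even positions, latitude bits to odd positions.
    spread_bits(x) | (spread_bits(y) << 1)
}
```

With such a function, nodes sorted by `morton_code(lat, lon)` would need no stored key: a binary search can recompute the code for each probed node.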
I thought to add a spatial index to osmflat-rs, and have it partially implemented here (master...boydjohnson:osmflat-rs:feature/geospatial-archive). It is implemented for Nodes and Ways, but not relations.
The index is this implementation (https://docs.rs/space-time/latest/space_time/xzorder/xz2_sfc/struct.XZ2SFC.html).
It allows for indexing Nodes, Ways, and Relations with the same curve, based on bounding box.
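As a rough sketch of the input such a bounding-box-based index needs, the following computes the bounding box of an element from its points. `BBox` and `bounding_box` are hypothetical names, coordinates are assumed to be `(lon, lat)` pairs in degrees, and the sketch deliberately does not call the space-time crate.

```rust
/// Axis-aligned bounding box in degrees (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
struct BBox {
    min_lon: f64,
    min_lat: f64,
    max_lon: f64,
    max_lat: f64,
}

/// Bounding box of a list of (lon, lat) points, or `None` if the list is empty.
/// A node yields a degenerate box; a way or relation yields the extent of
/// all points it references.
fn bounding_box(points: &[(f64, f64)]) -> Option<BBox> {
    let (&(lon, lat), rest) = points.split_first()?;
    let init = BBox { min_lon: lon, min_lat: lat, max_lon: lon, max_lat: lat };
    Some(rest.iter().fold(init, |b, &(lon, lat)| BBox {
        min_lon: b.min_lon.min(lon),
        min_lat: b.min_lat.min(lat),
        max_lon: b.max_lon.max(lon),
        max_lat: b.max_lat.max(lat),
    }))
}
```

The resulting box would then be passed to the curve to obtain a single index value, which is what lets points and extended geometries share one curve.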
Aside from asking if you would consider a PR, I was wondering if there would be crates.io releases in the future for osmflat and osmflatc?
Best regards.