Spatial Index improvements (index-per-graph + kryo) #3026

@Aklakan

Version

5.4.0-SNAPSHOT

Feature

This proposal enhances the spatial index with support for index-per-graph and improves its serialization by using Kryo, via Apache Sedona's Kryo/JTS implementation.

This is an incremental improvement of the existing JTS-based in-memory implementation. It is not a complete overhaul such as a disk-based, incrementally updated, transaction-aware R-tree (if someone contributed that, then this issue's PR could be discarded 😄).

The impact of this work has been evaluated and presented at last year's GeoLD workshop (proceedings):

Simon Bin, Claus Stadler, Lorenz Bühmann, and Michael Martin
Getting practical with GeoSPARQL and Apache Jena
Slides

The essence is presented on the following slides:

Using an index per graph (unsurprisingly) boosts the performance when multiple graphs have geometries and only a subset is queried (slide 15):

Image
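
To illustrate why a per-graph index helps, here is a minimal, self-contained sketch. It is not the code from the PR: the class `PerGraphIndexSketch` and its `Box` and `Entry` records are hypothetical names, and a plain list per graph stands in for the per-graph JTS tree. The point is only that a query restricted to one graph never touches entries from other graphs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical types, not Jena's GeoSPARQL classes. */
public class PerGraphIndexSketch {

    /** Axis-aligned bounding box of a geometry. */
    public record Box(double minX, double minY, double maxX, double maxY) {
        public boolean intersects(Box o) {
            return minX <= o.maxX && o.minX <= maxX
                && minY <= o.maxY && o.minY <= maxY;
        }
    }

    /** An indexed item: the feature IRI plus its bounding box. */
    public record Entry(String feature, Box box) {}

    // One entry collection per graph name. A real implementation would keep
    // one spatial tree per graph instead of a list (assumption of this sketch).
    private final Map<String, List<Entry>> indexByGraph = new HashMap<>();

    public void add(String graph, String feature, Box box) {
        indexByGraph.computeIfAbsent(graph, g -> new ArrayList<>())
                    .add(new Entry(feature, box));
    }

    /** Searches only the index of the requested graph. */
    public List<String> query(String graph, Box searchBox) {
        List<String> hits = new ArrayList<>();
        for (Entry e : indexByGraph.getOrDefault(graph, List.of())) {
            if (e.box().intersects(searchBox)) {
                hits.add(e.feature());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        PerGraphIndexSketch index = new PerGraphIndexSketch();
        index.add("urn:g1", "urn:a", new Box(0, 0, 1, 1));
        index.add("urn:g2", "urn:b", new Box(0, 0, 1, 1));
        // Only entries of urn:g1 are scanned here.
        System.out.println(index.query("urn:g1", new Box(0.5, 0.5, 2, 2)));
    }
}
```

With a single global index, the same query would have to visit matches from every graph and filter them by graph afterwards, which is the overhead the per-graph layout avoids.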

As for serialization performance (slide 16), index building became a bit slower, but this is outweighed by near-instant loading of the spatial index. The reason for the write overhead is that the index is now serialized as a tree. Before, the items were written out as a flat list, and the tree had to be rebuilt from scratch on restart.

Image
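
The difference between the two persistence strategies can be sketched as follows. This is a simplified illustration under loud assumptions: it uses Java's built-in object serialization rather than Kryo, a tiny binary search tree rather than a JTS spatial tree, and the class `TreeSerializationSketch` with all of its methods is hypothetical. It only shows that a flat list must be re-inserted item by item on load, whereas a serialized tree is usable as soon as it is read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: plain Java serialization stands in for Kryo. */
public class TreeSerializationSketch {

    /** Binary search tree node; a stand-in for a spatial index node. */
    public static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    public static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else root.right = insert(root.right, key);
        return root;
    }

    public static boolean contains(Node root, int key) {
        while (root != null) {
            if (key == root.key) return true;
            root = key < root.key ? root.left : root.right;
        }
        return false;
    }

    public static byte[] write(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static Object read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Flat-list approach: cheap to write, but the tree is rebuilt on load. */
    public static Node loadFromFlatList(byte[] data) {
        @SuppressWarnings("unchecked")
        List<Integer> items = (List<Integer>) read(data);
        Node root = null;
        for (int item : items) root = insert(root, item);
        return root;
    }

    /** Tree approach: more to write, but loading needs no rebuild. */
    public static Node loadTree(byte[] data) {
        return (Node) read(data);
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>(List.of(5, 3, 8));
        Node tree = null;
        for (int item : items) tree = insert(tree, item);

        Node rebuilt = loadFromFlatList(write(items));
        Node loaded = loadTree(write(tree));
        System.out.println(contains(rebuilt, 8) && contains(loaded, 8));
    }
}
```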

A new geosparql:indexPerGraph option (boolean) is added to the geosparql:GeosparqlDataset assembler.

The implementation was mainly done by @LorenzBuehmann, the writing and presentation are the work of @SimonBin, and I supported the evaluation.

As for compatibility, I still need to check whether this is backward compatible, but I think that, due to the change of serializer, existing spatial indexes would have to be rebuilt.

For reference, a bit of related discussion has happened in #2645.

Are you interested in contributing a solution yourself?

Yes

Labels

enhancement (Incrementally add new feature)
