@Aklakan Aklakan commented Oct 10, 2025

GitHub issue resolved #3507

Pull request Description: This is mainly infrastructure work to further assess RDFS reasoning - and any future changes to it.

  • Fixed literals-in-subject inferences due to range declarations (easy fix). Added MapperX.isLiteral so that the check can be made on X directly, possibly bypassing Node materialization.

  • Fixed a bug in IteratorConcat that raised IndexOutOfBoundsException if close() was called without a prior call to hasNext().

  • Added testing framework that compares all combinations of invoking find(). (I hope I didn't overlook an existing system for that). Added commons-math4 as a test-scoped dependency for the Combinations class.

  • There is a disabled test in AbstractTestRDFS_Extra which fails. It uses :directType rdfs:subPropertyOf rdf:type. How to solve this is a separate issue - I don't think it's straightforward, so the contribution here is the test case that reveals it.

  • Added infrastructure to ease wrapping Match implementations. Factored out DatasetGraphWithGraphTransform base class from DatasetGraphRDFS that can transform any Graph with a Match. There is a TDB2 test case that performs a simple RDFS inference on the NodeId level. The schema must be loaded into the graph though for the NodeIds to be present.

  • Added initial benchmark class for the RDFS reasoner. [update] Removed the benchmark due to lack of scope. Can be added with a later PR to benchmark impact of specific changes.

  • Added Iter.distinctCached in preparation to filter out some duplicates of MatchRDFS. However, this PR does not change any existing behavior. The goal is to extend AssemblerRDFS with an option for distinct/reduced behavior.
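
The IteratorConcat failure mode described above can be illustrated with a minimal, self-contained sketch. This is not Jena's IteratorConcat: the class name ConcatIterator and its fields are hypothetical, and it assumes the bug pattern was an index into the list of sub-iterators that only becomes valid once hasNext() has run, so close() must not use it blindly.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a concatenating iterator; not Jena's IteratorConcat. */
class ConcatIterator<T> implements Iterator<T>, AutoCloseable {
    private final List<Iterator<T>> parts = new ArrayList<>();
    private int idx = -1;           // stays -1 until hasNext() has been called once
    private boolean closed = false;

    void add(Iterator<T> it) { parts.add(it); }

    @Override
    public boolean hasNext() {
        if (closed) return false;
        if (idx < 0) idx = 0;
        // Skip over exhausted sub-iterators.
        while (idx < parts.size()) {
            if (parts.get(idx).hasNext()) return true;
            idx++;
        }
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return parts.get(idx).next();
    }

    @Override
    public void close() {
        if (closed) return;
        closed = true;
        // If iteration never started, begin at 0 rather than indexing with -1.
        int start = Math.max(idx, 0);
        for (int i = start; i < parts.size(); i++) {
            Iterator<T> it = parts.get(i);
            if (it instanceof AutoCloseable) {
                try {
                    ((AutoCloseable) it).close();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }
}
```

The regression check is then simply to construct the iterator, add a sub-iterator, and call close() without ever calling hasNext().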

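The idea of comparing all combinations of invoking find() can be sketched as follows. This is an illustration only, not the framework added in this PR: it uses a plain bitmask instead of the commons-math4 Combinations class, strings stand in for Nodes, and the names FindPatterns, ANY and patternsFor are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: enumerates find() patterns for one triple. */
class FindPatterns {
    /** Wildcard marker; stands in for Node.ANY in this sketch. */
    static final String ANY = "ANY";

    /** Returns all 2^3 patterns where each of s, p, o is either kept or replaced by ANY. */
    static List<List<String>> patternsFor(String s, String p, String o) {
        String[] terms = { s, p, o };
        List<List<String>> result = new ArrayList<>();
        for (int mask = 0; mask < (1 << terms.length); mask++) {
            String[] pattern = new String[terms.length];
            for (int i = 0; i < terms.length; i++) {
                // Bit i set: keep the concrete term; bit i clear: use the wildcard.
                pattern[i] = ((mask >> i) & 1) == 1 ? terms[i] : ANY;
            }
            result.add(Arrays.asList(pattern));
        }
        return result;
    }
}
```

One plausible use is to run each generated pattern against the RDFS-wrapped graph and against a reference result, and to compare the two result sets.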

  • Tests are included.
  • [ ] Documentation change and updates are provided for the Apache Jena website
  • Commits have been squashed to remove intermediate development commit messages.
  • Key commit messages start with the issue number (GH-xxxx)

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch from 5aeced7 to 68eb6fa Compare October 10, 2025 20:27
Comment on lines 34 to 52
@TestFactory
@Disabled("Needs investigation!")
public List<DynamicTest> testSubPropertyOfRdfType01() {
    String schemaStr = """
        PREFIX : <http://ex.org/>
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        :directType rdfs:subPropertyOf rdf:type .
        """;

    String dataStr = """
        PREFIX : <http://ex.org/>
        :fido :directType :Dog .
        """;

    return prepareRdfsFindTests(schemaStr, dataStr).build();
}

This produces the failing test cases.
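
For reference, under the standard RDFS subproperty rule (rdfs7), the schema and data in the test above entail :fido rdf:type :Dog. The sketch below applies that rule once over plain string triples; it is not Jena code, the names SubPropertySketch and applyRdfs7 are hypothetical, and it deliberately ignores transitive subPropertyOf chains and any follow-on subclass reasoning.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of RDFS rule rdfs7 over plain string triples. */
class SubPropertySketch {
    /** For each data triple (s, p, o) with p rdfs:subPropertyOf q, also emit (s, q, o). */
    static Set<List<String>> applyRdfs7(Map<String, String> subPropertyOf,
                                        Set<List<String>> data) {
        Set<List<String>> out = new LinkedHashSet<>(data);
        for (List<String> t : data) {
            String superProp = subPropertyOf.get(t.get(1));
            if (superProp != null) {
                out.add(List.of(t.get(0), superProp, t.get(2)));
            }
        }
        return out;
    }
}
```

Called with the mapping :directType to rdf:type and the single data triple from the test, the result contains both the asserted triple and the derived rdf:type triple.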

@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch 5 times, most recently from ea1fcca to bacb1bb Compare October 11, 2025 06:45
@Aklakan Aklakan changed the title RDFS: testing/wrapping/benchmark framework. GH-3507: RDFS testing/wrapping/benchmark framework. Oct 11, 2025
@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch 7 times, most recently from 204dd42 to d7da2a5 Compare October 11, 2025 07:49
@Aklakan Aklakan marked this pull request as draft October 11, 2025 10:58
@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch 2 times, most recently from 5e65298 to 3538860 Compare October 11, 2025 12:51
Aklakan commented Oct 11, 2025

I added a test case that uses the infrastructure to perform a simple RDFS inference on the NodeId level. It's limited because it requires the ontology and the built-in properties to be present in the graph. RDFS and OWL terms would have to be pre-populated in the NodeTable. At least things can now be wired up in the way that seemed to be the intended one by having the generic X, and one can play around with it.

The relevant snippet is this:

// Add wrapping on NodeId level.
// asDatasetGraph() returns a DatasetGraph, so the variable is typed accordingly.
DatasetGraph baseDsg = TDB2Factory.createDataset().asDatasetGraph();
MapperX<NodeId, Tuple3<NodeId>> mapper = MapperXTDB.create(baseDsg);
ConfigRDFS<NodeId> configRDFS = RDFSFactory.setupRDFS(schema, mapper);

DatasetGraph rdfsDsg = new DatasetGraphWithGraphTransform(baseDsg,
    g -> GraphMatch.adapt(g, new MatchRDFSWrapper<>(configRDFS, MatchTDB.wrap(g))));

Note that this only demonstrates a working wiring within the NodeId realm. It does not leverage QueryEngineTDB to run filters and aggregates on the NodeId level; that tighter integration would be future work.

@Aklakan Aklakan marked this pull request as ready for review October 11, 2025 13:00
derive(o, rdfType, c, out);
subClass(o, rdfType, c, out);
});
if (!mapper.isLiteral(o)) {
@Aklakan Aklakan Oct 11, 2025


The question is how to minimize materialization in this check. One can check whether a NodeId is an inlined value and return early. But I'm not sure whether it can be decided from the NodeId alone if it's a literal or not.
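
To make the purpose of the guard under review concrete, here is a minimal sketch of a range rule (rdfs3) that skips literal objects, so no triple with a literal in subject position is produced. It works on plain strings rather than Node or NodeId, the names RangeRuleSketch and rangeInferences are hypothetical, and the convention that a literal starts with a double quote is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of a range rule (rdfs3) that skips literal objects. */
class RangeRuleSketch {
    /** Assumption for this sketch: a term is a literal if it starts with a double quote. */
    static boolean isLiteral(String term) {
        return term.startsWith("\"");
    }

    /** For (s, p, o) with p rdfs:range c, derive (o, rdf:type, c) unless o is a literal. */
    static List<List<String>> rangeInferences(Map<String, String> rangeOf,
                                              List<List<String>> data) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> t : data) {
            String c = rangeOf.get(t.get(1));
            // Without the literal check, the derived triple would have a literal subject.
            if (c != null && !isLiteral(t.get(2))) {
                out.add(List.of(t.get(2), "rdf:type", c));
            }
        }
        return out;
    }
}
```

In this string-based version the literal check is trivially cheap; the open question in the comment above is whether an equally cheap check exists when the term is a NodeId.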

@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch 9 times, most recently from 7bfcbe7 to 5f8ffbc Compare October 11, 2025 15:41
@Aklakan Aklakan changed the title GH-3507: RDFS testing/wrapping/benchmark framework. GH-3507: RDFS testing/wrapping framework. Oct 11, 2025
@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch 3 times, most recently from 9592a5e to 26593dd Compare October 12, 2025 12:55
@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch from 26593dd to d3e311a Compare October 23, 2025 14:35
@Aklakan Aklakan force-pushed the 20250808_rdfs-testing branch from d3e311a to 1746776 Compare October 27, 2025 13:56
Development

Successfully merging this pull request may close these issues.

RDFS Issues