Account for `FILTER`s when considering greedy query planning #1705

joka921 · 2025-01-09T13:59:20Z

Since #1442, QLever switches to greedy query planning for large connected components. A connected component is considered large when the number of connected subgraphs is above the threshold determined by the runtime parameter query-planning-budget.
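To make the threshold rule concrete, here is a minimal sketch in Python (QLever itself is written in C++; this is not its implementation). It assumes that each triple is a node identified by its set of variables, that two nodes are adjacent when they share a variable, and that "connected subgraphs" means connected induced subgraphs. The names `count_connected_subgraphs` and `use_greedy_planning` are hypothetical, and the brute-force enumeration is for illustration only.

```python
def count_connected_subgraphs(nodes):
    """Count non-empty connected induced subgraphs (brute force).

    `nodes` is a list of sets of variable names. Two nodes are treated as
    adjacent when they share at least one variable (an assumption of this
    sketch, not necessarily QLever's exact graph definition).
    """
    n = len(nodes)
    adjacent = [
        [i != j and bool(nodes[i] & nodes[j]) for j in range(n)]
        for i in range(n)
    ]
    count = 0
    for mask in range(1, 1 << n):
        members = [i for i in range(n) if (mask >> i) & 1]
        # Flood fill from the first member, staying inside the subset.
        seen = {members[0]}
        stack = [members[0]]
        while stack:
            cur = stack.pop()
            for other in members:
                if other not in seen and adjacent[cur][other]:
                    seen.add(other)
                    stack.append(other)
        if len(seen) == len(members):
            count += 1
    return count


def use_greedy_planning(nodes, budget):
    """Hypothetical decision rule: greedy when the count exceeds the budget."""
    return count_connected_subgraphs(nodes) > budget


# A chain of three triples: ?a-?b, ?b-?c, ?c-?d.
chain = [{"?a", "?b"}, {"?b", "?c"}, {"?c", "?d"}]
# Three triples that all share ?x, so every pair is adjacent.
star = [{"?x", "?a"}, {"?x", "?b"}, {"?x", "?c"}]
print(count_connected_subgraphs(chain), count_connected_subgraphs(star))
```

In this sketch the chain has fewer connected subgraphs than the star, even though both have three triples, which is why the count rather than the number of triples drives the decision.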

So far, FILTERs were simply ignored when counting the number of subgraphs. However, FILTERs can add significant complexity to the standard query planning because, for each subplan, our query planner considers either adding all applicable FILTERs to it or none of them. As a result, for certain queries with a medium-sized component but a significant number of FILTERs, the query planning complexity was underestimated: the query was not planned greedily, and the standard query planning took very long.

This is now fixed by replacing, for the purpose of query planning, each FILTER by a dummy VALUES clause which uses the set of distinct variables from the FILTER. A FILTER that has many variables in common with the triples of the query will then increase the subgraph count substantially. If multiple FILTERs have the same set of distinct variables, the dummy VALUES clause is added only once (because our query planner either adds all applicable FILTERs at a certain point or none of them). Note that this trick overestimates the true query planning complexity. That is, the worst that can happen now is that with many FILTERs, we switch to greedy planning even though standard query planning would have still been feasible.
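The replacement step described above can be sketched as follows. This is an illustrative Python sketch, not QLever's C++ code: the function name `planning_nodes` and the representation of triples and FILTERs as plain sets of variable names are assumptions made for the example.

```python
def planning_nodes(triple_vars, filter_vars):
    """Build the node list used for counting subgraphs.

    Each triple contributes one node. Each FILTER contributes one dummy
    node carrying the FILTER's distinct variables, but FILTERs with an
    identical variable set share a single dummy node.
    """
    nodes = [frozenset(variables) for variables in triple_vars]
    seen = set()
    for variables in filter_vars:
        key = frozenset(variables)
        if key not in seen:
            seen.add(key)
            nodes.append(key)
    return nodes


triples = [{"?x", "?y"}, {"?y", "?z"}]
# The first two FILTERs use the same variables, so they yield one dummy node.
filters = [{"?x", "?z"}, {"?z", "?x"}, {"?y"}]
nodes = planning_nodes(triples, filters)
print(len(nodes))
```

With only the two triples, the graph is a single edge and has 3 connected subgraphs. Adding just the dummy node for `{?x, ?z}`, which shares a variable with each triple, turns it into a triangle with 7 connected subgraphs, so a single FILTER can more than double the count.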

Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>

codecov · 2025-01-09T16:07:34Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.87%. Comparing base (acb6633) to head (982cff7).
Report is 19 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1705   +/-   ##
=======================================
  Coverage   89.86%   89.87%           
=======================================
  Files         389      389           
  Lines       37308    37329   +21     
  Branches     4204     4209    +5     
=======================================
+ Hits        33527    33549   +22     
+ Misses       2485     2484    -1     
  Partials     1296     1296



hannahbast

Looks great + nice trick with the filters. A further optimization would be to not have the dummy VALUES "plans" (one for each FILTER) have edges between them.

hannahbast

After discussion with Johannes: only add one dummy VALUES clause for each distinct set of variables in a FILTER. Then we can still overestimate, but the cases where that happens will be rare, and the only bad outcome then is that we compute a greedy query plan in a case where non-greedy planning would have worked as well.


sparql-conformance · 2025-01-10T07:43:14Z

Conformance check passed ✅

No test result changes.

Details: https://qlever.cs.uni-freiburg.de/sparql-conformance-ui?cur=982cff756dc2d031e44813db62ea73aeb6ac33a7&prev=acb6633debc7341985341aff147b5038cc8d951b

sonarqubecloud · 2025-01-10T08:04:00Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

hannahbast

Thanks a lot + I wrote a proper description and will merge this now!

joka921 added 2 commits January 9, 2025 14:58

Also account for the filters when counting the subgraphs.

f29efc6

Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>

Added some comments.

740d186

Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>

A small fix.

2050af8

Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>

hannahbast requested changes Jan 9, 2025


Clean up, this should work with a reasonable threshold for the query-…

982cff7

…planning-budget. Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>

hannahbast changed the title ~~Also account for the filters when counting the subgraphs.~~ Account for FILTERs when considering switching to greedy query planning Feb 4, 2025

hannahbast changed the title ~~Account for FILTERs when considering switching to greedy query planning~~ Account for FILTERs when considering greedy query planning Feb 4, 2025

hannahbast approved these changes Feb 4, 2025


hannahbast merged commit aa55057 into ad-freiburg:master Feb 4, 2025
24 checks passed
