Poor performance of nested OPTIONALs #72

Open
jindrichmynarz opened this issue Sep 5, 2019 · 1 comment
Collaborator

jindrichmynarz commented Sep 5, 2019

When you have a SPARQL query with nested OPTIONAL clauses, such as the following, its performance is poor, typically causing timeouts.

PREFIX bibo:    <http://purl.org/ontology/bibo/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  {
    SELECT ?article
    WHERE {
      ?article a bibo:Article .
    }
    LIMIT 10
  }

  OPTIONAL {
    OPTIONAL {
      ?article dcterms:issued ?article_issued .
    }
  }
}

Output of Halyard Profile for this query:

Optimized query:
    Projection [2,955,991,897,878,706.5]
        ProjectionElemList
            ProjectionElem "article"
            ProjectionElem "article_issued"
        LeftJoin [2,955,991,897,878,706.5]
            Slice ( limit=10 ) [3,614,563.841]
                Projection [3,614,563.841]
                    ProjectionElemList
                        ProjectionElem "article"
                    StatementPattern [3,614,563.841]
                        Var (name=article)
                        Var (name=_const_f5e5585a_uri, value=http://www.w3.org/1999/02/22-rdf-syntax-ns#type, anonymous)
                        Var (name=_const_6dd7acd3_uri, value=http://purl.org/ontology/bibo/Article, anonymous)
            LeftJoin [226.251]
                SingletonSet [1]
                StatementPattern [226.251]
                    Var (name=article)
                    Var (name=_const_884f353b_uri, value=http://purl.org/dc/terms/issued, anonymous)
                    Var (name=article_issued)

The nested OPTIONAL in this query is unnecessary, but it allows the issue to be replicated in a minimal way.
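
For comparison, here is a sketch of the same query with the redundant nesting removed, using a single OPTIONAL around the dcterms:issued pattern. Under SPARQL algebra it should return the same solutions as the query above. This form was not profiled in this report, so its query plan and run time are assumptions to verify.

```sparql
PREFIX bibo:    <http://purl.org/ontology/bibo/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  {
    # Same sub-select as in the report: at most 10 articles.
    SELECT ?article
    WHERE {
      ?article a bibo:Article .
    }
    LIMIT 10
  }

  # Single OPTIONAL instead of OPTIONAL { OPTIONAL { ... } }.
  OPTIONAL {
    ?article dcterms:issued ?article_issued .
  }
}
```

Profiling both forms side by side would show whether the inner LeftJoin with SingletonSet is what differs between the two plans.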

asotona self-assigned this Sep 9, 2019
Collaborator

asotona commented Sep 9, 2019

Mapping a nested OPTIONAL to a LeftJoin with a SingletonSet is correct and should not cause any issue.
I see a minor issue with the cardinality of the sub-select with Slice; however, it does not affect the final query tree.
I'm aware of some specific queries that cause performance issues; unfortunately, the cause is not as simple as a nested OPTIONAL.
It requires further investigation.
