tests to document current definition of EXISTS in SPARQL #42
Comments
Thanks Peter, could you do a PR to pull them into this repo? I'll promote these on public-sparql-dev and rdf-tests for consensus. We should get at least two other implementations to pass them. |
OK, #43 I think that virtuoso is going to have the best coverage and it really only peter On 06/18/2016 04:10 PM, Gregg Kellogg wrote:
it may be that the file 'existsBlank01.srq' is intended to be 'existsBlank01.srx'. |
Certainly. I've made the change in my fork. I think that the pull request I don't have a test harness, so I haven't verified that everything conforms to peter On 06/19/2016 02:11 PM, james anderson wrote:
it may be, that existsMinus01.srx is not a properly encoded result set. |
Peter's changes were also added to the pfps-sparql-exists branch in this repo for convenience. We may end up replacing PR #43 with a new PR based on that branch from this repository. |
Six results files were missing result tags. I've fixed them, I think, and peter On 06/19/2016 04:55 PM, james anderson wrote:
now transcribed to the branch which gregg created and pushed. |
I know that Andy prototyped EXISTS in Jena. I would expect it to
@pfps could you provide a bit more explanation of your table of virtuoso results above? I'm confused by "Syntax" vs. "Works", and what it means when Works="Y" but Result="wrong". |
On 06/20/2016 05:50 PM, Gregory Todd Williams wrote:
Yeah, I should have put more explanation there. I just put my internal The four columns are:
I've repeated the results here so that you don't have to look them up.
Status of tests for Virtuoso
existsScope01.rq      Y  Y  correct - no semantic issue reported
existsBlank01.rq      Y  Y  wrong
existsSubquery01.rq   Y  Y  correct - no semantic issue reported
existsHernandez01.rq  Y  Y  correct
As far as I can tell, virtuoso is acting as if EXISTS means putting the
peter |
Thanks. So does "correct" correlate with the tests included in #43 (That is, a "correct" result listed here indicates that the corresponding test passed in the test suite.)? |
On 06/20/2016 06:46 PM, Gregory Todd Williams wrote:
Yes, these correlate with the tests in #43. What counts as a pass is a bit hard to determine. Any test that has a "correct" or an "admissible" by itself is a pass. A strict interpretation would be that "no semantic issue reported" is a In the strict interpretation virtuoso gets only 4 out of 18 correct but in the peter |
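As a rough cross-check of the counting rule described above, here is a minimal Python sketch that tallies the Virtuoso table from this issue. The rows are transcribed from the issue description; the helper name `count_passes` and the reading of "strict" as "a row annotated with 'no semantic issue reported' does not count as a pass" are assumptions made for illustration. That reading is consistent with the 4 out of 18 figure quoted above, while the lenient total is simply what the table yields under this sketch, not a number taken from the thread.

```python
# Rows transcribed from the "Status of tests for Virtuoso" table in this issue.
# Each entry: (test name, result, annotated with "no semantic issue reported").
RESULTS = [
    ("existsScope01", "correct", True),
    ("existsScope02", "admissible", True),
    ("existsValues01", "wrong", True),
    ("existsValues02", "correct", True),
    ("existsBlank01", "wrong", False),
    ("existsBound", "wrong", False),
    ("existsMinus01", "wrong", False),
    ("existsSubquery01", "correct", True),
    ("existsSubquery02", "correct", True),
    ("existsSubquery03", "correct", True),
    ("existsSubquery04", "correct", False),
    ("existsSubquery05", "correct", False),
    ("existsSubquery06", "correct", True),
    ("existsSubquery07", "wrong", True),
    ("existsSubquery08", "wrong", True),
    ("existsSubquery09", "correct", True),
    ("existsHernandez01", "correct", False),
    ("existsHernandez02", "correct", False),
]


def count_passes(results, strict):
    """Count passing tests (hypothetical helper, not part of any test harness).

    Lenient: "correct" or "admissible" is a pass.
    Strict (assumed reading): a row carrying the annotation
    "no semantic issue reported" is not counted as a pass.
    """
    passes = 0
    for _name, result, annotated in results:
        ok = result in ("correct", "admissible")
        if strict and annotated:
            ok = False
        if ok:
            passes += 1
    return passes


print(len(RESULTS))                        # 18 tests in total
print(count_passes(RESULTS, strict=True))  # 4
print(count_passes(RESULTS, strict=False)) # 12
```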
OK. With a few bug fixes for applying
On 06/20/2016 07:31 PM, Gregory Todd Williams wrote:
Not existsScope02? That's the one where SPARQL says that the result is
Sounds like a good candidate for a community-supported erratum.
Hmm. I may have messed this one up when I switched the predicate I was using
Which erratum? This test requires substitution of disconnected variables to peter |
Ah. My mistake. I fail both existsScope01 and existsScope02. I mistook the latter for the former, and also had trouble in modifying my test harness to overlook tests in the manifest that don't have an mf:result. I'm not sure I understand your reasoning for having both of these tests (though I haven't dug too deeply into them).
I now pass this test.
It was an educated guess as to why I was failing the test. That may not be the reason. I'll try to have a deeper look and see what's going on. |
On 06/20/2016 08:11 PM, Gregory Todd Williams wrote:
One of the problems with EXISTS is that it substitutes everywhere, including So the exists in This is semantically suspect as it creates a solution mapping that maps :b to In peter |
Ah. Understood. Those are good tests to have to flesh out the weirdness in the spec as written, but I think that's another case where I don't believe the spec text accurately represents the intention, so wouldn't be making any changes to my systems to deal with the difference between existsScope01 and existsScope02. |
On 06/20/2016 08:28 PM, Gregory Todd Williams wrote:
Another case where a community-backed erratum should be created. You may believe that the spec is wrong, but others may not. That's a bad peter |
in order that these tests be useful, it would help if they were more explicit. the existing test suite suffers severely from test definitions which do not explain their behaviour, but instead presume that the results are self-evident. the test declarations would be much improved were they not just to claim how the purported interpretation of the recommendation is to be changed, but also to record the intended interpretation and detail the behaviour for which the test produces the indicated result. |
it is also possible that this is not a “problem with exists”. the word “substitute” appears in another place in the recommendation text, in addition to the passage which concerns exists. nothing in these tests has yet succeeded in changing my view that the recommendation is incomplete, that some of the behaviour wrt exists is underspecified, and that careful specification suffices. |
andy seaborne responded to the related thread on public-rdf-tests@w3.org with a pointer to issue 68 in the shapes working group. |
On 06/21/2016 02:54 AM, james anderson wrote:
I added an analysis.text file that analyzes many of the tests to show just peter |
On 06/21/2016 06:00 AM, james anderson wrote:
The W3C Data Shapes Working Group can proceed, even with the current SPARQL To be viable with the current SPARQL recommendation, the SHACL specification The normative definition of SHACL depends on a particular behaviour of SPARQL That's not a happy place to be in. The reason that this would be required is that SHACL heavily uses EXISTS. As far as issue 68 goes, that hits another problem with SPARQL. Many SPARQL The first definition was like the definition of EXISTS, and had at least all So the working group could proceed by coming up with a definition of This definition of pre-binding does not necessarily correspond to the Also not a happy place to be in. Pre-binding is used for just about every construct in SHACL. The extension mechanism of SHACL is even more intertwined with SPARQL and peter |
Does that mean SHACL won't work with SPARQL systems that don't also support pre-binding? I don't know anything about SHACL, but that doesn't sound like a good requirement. |
On 06/21/2016 10:54 AM, Gregory Todd Williams wrote:
SHACL is (more or less) defined as an extension to SPARQL. To implement SHACL So it's not the case that the lack of pre-binding is a special problem. In Whether it is a good idea to define SHACL as an extension to SPARQL is a peter |
thank you. |
i read these paragraphs, above, as contradicting each other.
please give a concise concrete example of a query which demonstrates this issue and describe the use case which requires it.
please give a concise concrete example of a query which demonstrates this issue and describe the use case which requires it. |
On 06/21/2016 02:29 PM, james anderson wrote:
How so? It's not going to be a happy solution of course. SHACL will be only I would personally vote against advancing something that has this to
Take a look at the current SHACL spec, at SELECT $this ($this AS ?subject) $predicate (?value AS ?object) This query has problems when ?value is a blank node, where instead of checking
The above query is kicked off in SHACL with initial mappings for $this, peter |
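The comment above is cut off, so the following is only a guess at the kind of problem meant. In SPARQL, a blank node written inside a graph pattern behaves like a variable, so literally substituting a blank node for ?value can turn a check about one specific node into a wildcard match. The toy Python model below illustrates that effect; the term encoding (strings starting with "?" or "_:") and the helpers `is_wildcard`, `matches` and `exists` are hypothetical and are not taken from SHACL, the SPARQL specification text, or any implementation.

```python
def is_wildcard(term):
    # Toy-model assumption: variables ("?x") and blank-node labels ("_:b")
    # both match any term when they appear inside a pattern.
    return term.startswith("?") or term.startswith("_:")


def matches(triple_pattern, triple):
    """Match one triple pattern against one triple of the graph."""
    bound = {}
    for p_term, g_term in zip(triple_pattern, triple):
        if is_wildcard(p_term):
            # A repeated wildcard must match the same term each time.
            if bound.setdefault(p_term, g_term) != g_term:
                return False
        elif p_term != g_term:
            return False
    return True


def exists(triple_pattern, mu, graph):
    """EXISTS by literal substitution of the mapping mu into the pattern."""
    substituted = tuple(mu.get(term, term) for term in triple_pattern)
    return any(matches(substituted, triple) for triple in graph)


graph = {(":s", ":p", "_:b1"), (":t", ":q", ":o")}
pattern = (":t", ":q", "?value")

# With an IRI the check is about that specific term.
print(exists(pattern, {"?value": ":x"}, graph))    # False
# With a blank node the substituted pattern matches anything in that position,
# although the graph does not contain the triple (:t :q _:b1).
print(exists(pattern, {"?value": "_:b1"}, graph))  # True
```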
this could be said to be true, but only if one misunderstands “substitute” in a manner which diverges from that in section two. |
This was discussed during the rdf-star meeting on 26 September 2024.
Addressing SPARQL EXISTS errata 4
ora: Are there people fine with the current syntax?
ora: In any case, chairs will discuss this, let's move on
AndyS: [about SPARQL EXISTS] There are two proposals
AndyS: 1. substitution based on various existing errata
AndyS: 2. another one based on ANTIJOIN. We already have MINUS. Except the behavior with disjoint domain. But outside of it it's ANTIJOIN
AndyS: On another note, there are other things that might go to SPARQL like LATERAL that can be based on substitution. And pure form of anti join and semi join
AndyS: It's a possibility to move these additions (LATERAL, anti join...) to sparql dev
pchampin: we would add more subtle differences between operators like FILTER NOT EXISTS vs MINUS
pchampin: Your point of having multiple ways might create problems
ora: SPARQL spec spends a bit of time presenting this difference
AndyS: It was quite contentious in SPARQL 1.1
<pchampin> I'm more than happy to let the editors decide on that
AndyS: I am not aware of any outgoing opinion, I think it ends up to a choice on which way to go
tl: is it related to triple terms in any way or is it a SPARQL errata
AndyS: it has nothing to do with triple terms
tl: what is the criteria of SPARQL errata to discuss now?
tl: it's a central issue, is that the argument?
pfps: There are a bunch of problems with SPARQL, the ones with EXISTS are the biggies
pfps: They end up splitting the SPARQL implementation space
pfps: The decision that has to be made is to move SPARQL EXISTS toward a more database-like implementation and keep it more consistent with the existing
AndyS: The current implementation is present in SQL with correlated subqueries
pfps: if you use the semi/anti join interpretation of EXISTS you change SPARQL more than the other option
pfps: In the end people who will see and understand the differences are very few
ora: I would like to know preferences
AndyS: My preference is for substitution and applying errata (option 1)
pfps: I don't have much of a horse in this race
pfps: Ideally I would love to get more SPARQL developers on board
ora: we could talk outside of the group
ktk: I reached out to stardog but not got an input
gtw: I am not sure much value to reach out to more developers. sparql-dev has been opened for a long time
<pchampin> Tpt: I have a significant preference for option 1; option 2 is basically equivalent to MINUS
pfps: One way to check the issue would be to pull some tests
<pfps> which PR?
<gkellogg> w3c/rdf-tests#42
<gb> Issue 42 tests to document current definition of EXISTS in SPARQL (by pfps) [SPARQL]
<gkellogg> w3c/rdf-tests#43
<gb> CLOSED Pull Request 43 Add tests to document current definition of EXISTS (by pfps)
ora: Whatever solutions we pick, someone will ask why we pick it
AndyS: picking substitution breaks the least queries
ora: That seems to me as good a reason as any, let's make a decision
tl: I would like to ask james about it
ora: Let's vote on it next Thursday
ora: Let's do it
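To make the "disjoint domain" remark in the minutes concrete, here is a small Python sketch of a toy evaluator. It contrasts FILTER NOT EXISTS evaluated by substitution, MINUS, and a plain anti-join on the well-known case where the two patterns share no variables. The data model (triples as tuples of strings, variables starting with "?") and all function names are assumptions made for illustration; this is not either errata proposal and not any engine's implementation.

```python
def match(pattern, graph):
    """Return the solution mappings of a list of triple patterns (toy BGP matcher)."""
    solutions = [{}]
    for triple_pattern in pattern:
        next_solutions = []
        for sol in solutions:
            for triple in graph:
                candidate = dict(sol)
                for p_term, g_term in zip(triple_pattern, triple):
                    if p_term.startswith("?"):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(p_term, g_term) != g_term:
                            break
                    elif p_term != g_term:
                        break
                else:
                    next_solutions.append(candidate)
        solutions = next_solutions
    return solutions


def substitute(pattern, mu):
    """Replace every variable bound in mu by its value."""
    return [tuple(mu.get(term, term) for term in tp) for tp in pattern]


def filter_not_exists(solutions, pattern, graph):
    """Keep mu when the substituted pattern has no match (substitution reading)."""
    return [mu for mu in solutions if not match(substitute(pattern, mu), graph)]


def compatible(mu, nu):
    return all(mu[v] == nu[v] for v in mu.keys() & nu.keys())


def minus(left, right):
    """MINUS: drop mu only if some compatible nu shares at least one variable."""
    return [mu for mu in left
            if not any(compatible(mu, nu) and (mu.keys() & nu.keys()) for nu in right)]


def anti_join(left, right):
    """Plain anti-join: drop mu if any compatible nu exists, shared variables or not."""
    return [mu for mu in left if not any(compatible(mu, nu) for nu in right)]


graph = {(":a", ":b", ":c")}
left = match([("?s", "?p", "?o")], graph)
inner = [("?x", "?y", "?z")]          # shares no variable with the outer pattern
right = match(inner, graph)

print(filter_not_exists(left, inner, graph))  # []
print(minus(left, right))                     # the one outer solution survives
print(anti_join(left, right))                 # []
```

In this toy model the three operators only disagree when the domains are disjoint, which is the exception called out in the minutes.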
SPARQL EXISTS has lots of problems. It produces invalid algebraic structures, hits explicitly undefined situations in the algebra, and produces counterintuitive results. I don't know of any SPARQL implementation that implements it correctly, and different implementations of SPARQL implement it differently.
I have added a bunch of tests to document the correct behaviour of EXISTS according to the SPARQL specification. These tests are available at
https://github.com/pfps/rdf-tests/tree/gh-pages/sparql11/data-sparql11/exists
I have manually run all these tests on Virtuoso Open Source 7, with the following results:
Status of tests for Virtuoso
Test                  Syntax  Works  Result
existsScope01.rq      Y       Y      correct - no semantic issue reported
existsScope02.rq      Y       Y      admissible - no semantic issue reported
existsValues01.rq     Y       Y      wrong - no semantic issue reported
existsValues02.rq     Y       Y      correct - no semantic issue reported
existsBlank01.rq      Y       Y      wrong
existsBound.rq        Y       Y      wrong
existsMinus01.rq      Y       Y      wrong
existsSubquery01.rq   Y       Y      correct - no semantic issue reported
existsSubquery02.rq   Y       Y      correct - no semantic issue reported
existsSubquery03.rq   Y       Y      correct - no semantic issue reported
existsSubquery04.rq   Y       Y      correct
existsSubquery05.rq   Y       Y      correct
existsSubquery06.rq   Y       Y      correct - no semantic issue reported
existsSubquery07.rq   Y       Y      wrong - no semantic issue reported
existsSubquery08.rq   Y       Y      wrong - no semantic issue reported
existsSubquery09.rq   Y       Y      correct - no semantic issue reported
existsHernandez01.rq  Y       Y      correct
existsHernandez02.rq  Y       Y      correct
Note that one test hits an explicitly undefined part of the SPARQL algebra. I don't know how to indicate that. Quite a few other tests produce invalid algebraic structures which don't show up in the output. I don't know how to indicate that. I have added comments to indicate when this happens.
I have also added my suggestions on what changes should be made to fix EXISTS.