Formally describe filter syntax #248
Comments
I'm planning to extend the path expression in one of the next releases. This includes at least one breaking change. Maybe this could be a good time to align the two implementations.
I tried to summarize all syntax features I could find. Not all of them need to become part of the standard. This looks more complex than it is:
Basic
PICA::Data supports all but 3, 8 and 9; 13 is only supported internally.
Extended
PICA::Data supports all but 18 and 14 (but 14 needs to be added for sure).
Filter conditions
Filter conditions can get quite complicated, and that is one of the strengths of pica-rs. I thought to limit them in PICA::Data to basic cases (see gbv/PICA-Data#108), so maybe we define a simple subset as the standard and let the rest evolve as optional extensions as you like. The most basic conditions include (details to be discussed):
I'd also optionally allow
By the way, we already had a discussion with @cKlee and @jorol about extending the syntax. In addition to the implementation in PICA::Data, which is used in Catmandu, and the implementation in pica-rs, I plan to implement a subset of the PICA Path filter syntax to query a Solr index filled with PICA data instead of running filters on a stream of records (e.g. get me all records that have some specific fields, lack some specific subfields and have a given value in another subfield).
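The kind of query described above (records that have some fields, lack some subfields and carry a given value in another subfield) can be sketched without reference to either implementation. This is only an illustration: the record layout (a list of `(tag, subfields)` tuples), the helper names `subfield_values`, `has_field` and `matches`, and the sample tags and values are assumptions made for this example, not the API of PICA::Data or pica-rs.

```python
# Hypothetical record model (an assumption for this sketch):
# a record is a list of fields, each field is (tag, [(subfield_code, value), ...]).

def subfield_values(record, tag, code):
    """Return all values of subfield `code` found in fields with the given tag."""
    return [v for t, sfs in record if t == tag for c, v in sfs if c == code]

def has_field(record, tag):
    """True if at least one field with the given tag exists."""
    return any(t == tag for t, _ in record)

def matches(record, required_fields=(), missing_subfields=(), equals=()):
    """Basic conditions only: field existence, subfield absence, exact value."""
    return (
        all(has_field(record, tag) for tag in required_fields)
        and all(not subfield_values(record, tag, code)
                for tag, code in missing_subfields)
        and all(value in subfield_values(record, tag, code)
                for tag, code, value in equals)
    )

# Made-up sample record for illustration.
record = [
    ("003@", [("0", "123456789")]),
    ("021A", [("a", "Example title")]),
]

print(matches(
    record,
    required_fields=["003@"],
    missing_subfields=[("021A", "d")],
    equals=[("003@", "0", "123456789")],
))
```

Restricting a shared standard to these three condition types would keep it small enough to map onto both a record stream and an index query, while more expressive conditions could remain implementation-specific extensions.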
Many thanks for the summary. I'm on vacation until the end of July! After my return I'll take a closer look.
pica-rs now supports multiple subfields in path and filter expressions (see #255)
pica-rs now supports occurrence ranges (ex
You can find the specification of the filter syntax (also known as Record Matcher) here:
PICA has no official query language. I started a specification and implementation based on the more complex MARCSpec by @cKlee:
I was just about to extend this "PICA Path" language with methods to filter on subfield existence and values when I discovered pica-rs. As far as I understand, the pica-rs documentation refers to the filter language both as "query expressions" and as "select expression". The building blocks are:
{...})
What's the formal syntax?
We should try to define one common (subset) language, at least for referencing a field or a subfield without additional conditions. This should be possible either by introducing alternative syntax elements in both of our implementations (e.g. both `.` and `$` could be used before the subfield code) or by modifying one of the implementations (breaking backwards compatibility). Either way, it should be worth the effort.
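To make the idea of alternative syntax elements concrete, here is a minimal sketch of a parser that accepts either `.` or `$` before the subfield code. The grammar is a deliberately simplified assumption (a tag of three digits plus a letter or `@`, then an optional single alphanumeric subfield code, no occurrences or filters), and the names `PATH` and `parse_path` are hypothetical; neither PICA::Data nor pica-rs is claimed to work this way.

```python
import re

# Assumed, simplified grammar for illustration only:
#   tag       = digit 0-2, two more digits, then an uppercase letter or "@"
#   separator = "." or "$" (both accepted as equivalent)
#   code      = one alphanumeric subfield code
PATH = re.compile(r"(?P<tag>[0-2][0-9]{2}[A-Z@])(?:[.$](?P<code>[A-Za-z0-9]))?")

def parse_path(expr):
    """Return (tag, subfield_code or None); raise ValueError if malformed."""
    m = PATH.fullmatch(expr)
    if m is None:
        raise ValueError(f"not a valid path expression: {expr!r}")
    return m.group("tag"), m.group("code")

# Both separators yield the same parsed result.
print(parse_path("003@.0") == parse_path("003@$0"))
```

Accepting both separators keeps existing expressions valid in each implementation, at the cost of two spellings for the same path.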