XPath Simple Map and Arrow Operator
The simple map operator (!) takes the items in a sequence one at a time and applies a function to each, so that every item maps to its own result:
Examples: //* ! name(.)
This yields the same results as placing the name() function at the end of the XPath expression:
//*/name()
//placeName ! normalize-space(.)
Notice how this function works by comparing its results to those of //placeName in the Georg Forster file.
You can use simple map whenever you're applying a function that works in a one-to-one (or one-by-one) relationship over your sequence.
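To see the one-to-one behavior without opening a file, here is a small sketch that uses literal sequences in place of nodes. The values are made up for illustration. Each line is a separate expression to try on its own in an XPath 3.1 processor.

```xpath
(: Each number is doubled on its own: returns 2, 4, 6 :)
(1, 2, 3) ! (. * 2)

(: Each string is measured on its own: returns 1, 2, 3 :)
('x', 'yy', 'zzz') ! string-length(.)

(: Each string is cleaned up on its own: returns 'Tahiti', 'New Zealand' :)
('  Tahiti ', 'New   Zealand') ! normalize-space(.)
```

In every case, three (or two) items go in and the same number of results come out, one per item.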
The arrow operator (=>) takes the results of any XPath sequence and hands the whole sequence to a function that produces a single calculation. (Think: e pluribus unum => from the many, one.) Use it when you want a count of the nodes you're retrieving, or a sum of a series of numerical values. You can also use it to calculate the distinct-values() of a sequence. Using the arrow operator collates the results of a sequence. This is the same as taking a function and wrapping it around an XPath expression.
Examples:
//placeName => distinct-values()
same as distinct-values(//placeName)
//placeName => distinct-values() => count()
same as count(distinct-values(//placeName))
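The same pattern can be tried on literal sequences, as a sketch with made-up values. The sequence on the left of => becomes the first argument of the function on the right, and any arguments written inside the parentheses follow it. Each line is a separate expression.

```xpath
(: Same as sum((1, 2, 3)): returns 6 :)
(1, 2, 3) => sum()

(: Same as count(distinct-values(...)): returns 2 :)
('Tahiti', 'Dusky Bay', 'Tahiti') => distinct-values() => count()

(: Same as string-join(('Tahiti', 'Dusky Bay'), '; '): returns 'Tahiti; Dusky Bay' :)
('Tahiti', 'Dusky Bay') => string-join('; ')
```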
Sometimes you need to process an XPath sequence one result at a time, and then process the whole sequence to return a calculation. If you're combining these, you'd start with a simple map before using the arrow operator. Here's an example to try on the Georg Forster voyage file:
//placeName ! normalize-space(.) => distinct-values()
Compare the results of this to:
//placeName => distinct-values()
Which of these expressions yields a more accurate result?
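One way to check your answer is to try both patterns on a small literal sequence. This is a sketch with made-up strings that differ only in whitespace, and it assumes the default collation. The simple map binds more tightly than the arrow, so the second expression is read as (sequence ! normalize-space(.)) => distinct-values() => count().

```xpath
(: The two strings differ in whitespace, so both are kept: returns 2 :)
('Tahiti', ' Tahiti ') => distinct-values() => count()

(: Whitespace is normalized item by item first, so the duplicates collapse: returns 1 :)
('Tahiti', ' Tahiti ') ! normalize-space(.) => distinct-values() => count()

(: Raises a type error: normalize-space() accepts one string, not a sequence of two :)
('Tahiti', ' Tahiti ') => normalize-space()
```

The last expression shows why the simple map has to come first: normalize-space() works on one item at a time, so it cannot take the whole sequence through the arrow.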
You can use the arrow operator in predicates where appropriate. Say you're looking for the last paragraph in the Forster file that contains more than 2 descendant placeName elements:
Step 1: Find the paragraphs that have descendant placeName elements:
//p[descendant::placeName]
Step 2: Limit to the paragraphs where the count of those placeName elements is greater than 2:
//p[descendant::placeName => count() gt 2]
Step 3: Walk the whole tree and get the very last of these in the sequence:
(//p[descendant::placeName => count() gt 2])[last()]
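For comparison, here is a sketch of the Step 3 expression next to two variants. No results are shown, since they depend on the file you run them against.

```xpath
(: Arrow form, as in Step 3 :)
(//p[descendant::placeName => count() gt 2])[last()]

(: The same selection with count() wrapped around the path :)
(//p[count(descendant::placeName) gt 2])[last()]

(: Without the outer parentheses, last() is evaluated separately for each
   parent element, so this can return several paragraphs: the last
   qualifying p child of every parent :)
//p[descendant::placeName => count() gt 2][last()]
```

The outer parentheses are what make the expression gather every qualifying paragraph in the document into one sequence before [last()] picks the final one.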
Simple map and the arrow operator were introduced to improve the legibility of complex XPath expressions. If you find yourself getting tangled up in parentheses and square brackets and can't easily edit your code, try rewriting it with the simple map and arrow operators as detanglers!