From cc0e853cf0ffcc13490ec447e61b3dcc0a05e9f7 Mon Sep 17 00:00:00 2001
From: Michael Kay
Date: Wed, 29 Jan 2025 10:18:24 +0000
Subject: [PATCH 1/4] Further elaboration of duplicates handling in maps

---
 .../src/function-catalog.xml | 437 ++++++++----------
 .../xslt-40/src/element-catalog.xml | 2 +-
 specifications/xslt-40/src/xslt.xml | 52 ++-
 3 files changed, 230 insertions(+), 261 deletions(-)

diff --git a/specifications/xpath-functions-40/src/function-catalog.xml b/specifications/xpath-functions-40/src/function-catalog.xml
index 748137ea5..7e651fdeb 100644
--- a/specifications/xpath-functions-40/src/function-catalog.xml
+++ b/specifications/xpath-functions-40/src/function-catalog.xml
@@ -22998,8 +22998,7 @@ xs:QName('xs:double')

The function map:merge - returns a map that - is formed by combining the contents of the maps supplied in the $maps + returns a map that is formed by combining the contents of the maps supplied in the $maps argument.

@@ -23008,130 +23007,52 @@ xs:QName('xs:double')

There is one entry in the returned map for each distinct key present in the union - of the input maps, where two keys are distinct if they are not the same key.

+ of the input maps, where two keys are distinct if they are not the same key. The order of the input maps, + and of the entries within these input maps, is retained in the + entry order of the result map.

If there are duplicate keys, that is, if two or more maps contain entries having the same key, then the way this is handled is - controlled by the $options argument.

-
-
- - -

The definitive specification is as follows.

- - - -

If the second argument is omitted or an empty sequence, the effect is the same as - calling the two-argument function with an empty map as the value of $options.

-
- -

The $options argument can be used to control the way in which duplicate keys are handled. - The option parameter conventions apply. -

-
-

In the event that two or more entries in the input maps have the - :

+ >same key, then the relevant entries are combined in a way + that is controlled by the supplied $options.

+ + +

The $options argument takes the same values (with the same meanings) + as the map:of-pairs function, except that the default is different: + for map:merge, the default for duplicate keys is use-first.

+ +

The difference is for backwards compatibility reasons.

+ +

With the default options, when duplicate entries occur:

+ -

A single entry is created by combining the values of the duplicates, - in a way determined by the supplied $options.

-

The key of the combined entry is one of the duplicate keys: - which one is chosen is . - (Two keys that are deemed duplicates may differ: for example they may have +

There will be a single entry in the result + corresponding to a set of duplicate entries in the input. +

+

The value of that entry will be taken from the first + of the duplicates.

+

The position of that entry in the entry order of the result map will correspond to the + position of the first of the duplicates.

+

The key of that entry will be the key used + in the last of the duplicates. (Keys may be + duplicates even though they differ: for example, they may have different type annotations, or they may be xs:dateTime - values with different timezones.)

-

The position of the combined entry in the - of the result map corresponds to the position of the first appearance of - the corresponding key value in the input.

+ values in different timezones.)
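The default behaviour listed above (value from the first duplicate, position of the first duplicate, key from the last duplicate) can be sketched as follows. This is an illustrative Python model, not XPath and not the specification's formal equivalent: the name `merge_use_first` is hypothetical, each input map is modelled as a list of (key, value) pairs, and Python equality of `1` and `1.0` stands in for keys that are the same key but differ in type annotation.

```python
# Illustrative model only (assumption: Python dict equality approximates "same key").
def merge_use_first(maps):
    """Each input map is a list of (key, value) pairs in entry order."""
    slots = {}  # insertion order of this dict models the position of the first duplicate
    for entries in maps:
        for key, value in entries:
            if key in slots:
                # Duplicate: keep the first value and position, adopt the latest key.
                slots[key] = (key, slots[key][1])
            else:
                slots[key] = (key, value)
    return list(slots.values())

result = merge_use_first([
    [(1, "first"), ("x", "only")],
    [(1.0, "second")],  # 1 and 1.0 compare equal but differ in type
])
print(result)  # [(1.0, 'first'), ('x', 'only')]
```

The merged entry stays in first position and keeps the value "first", but its key is the float from the second map.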

- - - -

The entries that may appear in the $options map are as follows:

- - - - Determines the policy for handling duplicate keys: specifically, the action to be - taken if two maps in the input sequence $maps contain entries with key values - K1 and K2 where K1 and K2 are the - same key. This option and the combine - option are mutually exclusive. - - xs:string - use-first - - - Equivalent to specifying "combine": fn(){error(xs:QName("err:FOJS0003"), ...) - (the remaining arguments to fn:error being - ). - - Equivalent to specifying "combine": fn($a, $b){ $a }. - - Equivalent to specifying "combine": fn($a, $b){ $b }. - - Equivalent to specifying "combine": fn($a, $b){ one-of($a, $b) } - where one-of chooses either $a or $b in - an way. - - Equivalent to specifying "combine": fn($a, $b){ $a, $b }. - - - - - - Supplies a function for handling duplicate keys: specifically, the action to be - taken if two maps in the input sequence $maps contain entries with key values - K1 and K2 where K1 and K2 are the - same key. This option and the duplicates - option are mutually exclusive. - - (fn($existing-value as item()*, $new-value as item()*) as item()*)? - fn($a, $b){ $a } - - - A function with signature fn(item()*, item()*) as item()*. - The function is called for any entry in an input map that has the - as a previous entry. The first argument - is the existing value associated with the key; the second argument - is the value associated with the key in the duplicate input entry, - and the result is the new value to be associated with the key. - - - - - +
- - -
- -let $FOJS0003 := QName("http://www.w3.org/2005/xqt-errors", "FOJS0003") -let $combiner := $options?combine - otherwise { - "use-first": fn($a, $b) { $a }, - "use-last": fn($a, $b) { $b }, - "combine": fn($a, $b) { $a, $b }, - "reject": fn($a, $b) { fn:error($FOJS0003) }, - "use-any": fn($a, $b) { fn:random-number-generator()?permute(($a, $b))[1] } - } ($options?duplicates) - otherwise fn($a, $b) { $a } - -return map:of-pairs($maps =!> map:pairs(), { "combine": $combiner }); + +map:of-pairs($maps =!> map:pairs(), + $options[exists((?duplicates, ?combine))] + otherwise { "duplicates": "use-first" }); @@ -23150,32 +23071,11 @@ return map:of-pairs($maps =!> map:pairs(), { "combine": $combiner }); - -

By way of explanation, the function first reduces the sequence of input maps - to a sequence of key-value pairs, retaining order of both the maps and of the - entries within each map. It then combines key-value pairs having the - by applying the $combine function - successively to pairs of duplicates. The position in the - of the result map of an entry formed by combining duplicates corresponds to the - position of the first occurrence of the key in the input sequence. This is true - even whien the option use-last is used: the value of the resulting - entry corresponds to the last entry with a given key, but the position of the entry - in the result map corresponds to the position of the first entry with that key. -

- -

The use of fn:random-number-generator represents one possible conformant - implementation for "duplicates": "use-any", but it is not the only conformant - implementation and is not intended to be a realistic implementation. The purpose of this - option is to allow the implementation to use whatever strategy is most efficient; for example, - if the input maps are processed in parallel, then specifying "duplicates": "use-any" - means that the implementation does not need to keep track of the original order of the sequence of input - maps.

- -
+

If the input is an empty sequence, the result is an empty map.

If the input is a sequence of length one, the result map is - indistinguishable from the supplied map.

+ indistinguishable from the input map.

There is no requirement that the supplied input maps should have the same or compatible types. The type of a map (for example map(xs:integer, xs:string)) is @@ -23297,55 +23197,139 @@ return map:of-pairs($maps =!> map:pairs(), { "combine": $combiner });

The $options argument can be used to control the way in which duplicate keys are handled. The option parameter conventions apply. - The handling of duplicates is defined to be the same as in an equivalent call of - the map:build function: see the formal equivalent below. +

The entries that may appear in the $options map are as follows:

- - A function that is used to combine two different values that are supplied - for the same key. The default is to combine the two values using - , retaining their order - in the input sequence. + + + Determines the policy for handling duplicate keys: specifically, the action to be + taken if two entries in the input sequence have key values + K1 and K2 where K1 and K2 are the + same key. This option and the combine + option are mutually exclusive. - (fn($existing-value as item()*, $new-value as item()*) as item()*)? - fn:op(',') + xs:string + combine - - The function is called for any entry in an input map that has the + + Equivalent to specifying "combine": fn(){error(xs:QName("err:FOJS0003"), ...) + (the remaining arguments to fn:error being + ). + + Equivalent to specifying "combine": fn($a, $b){ $a }. + + Equivalent to specifying "combine": fn($a, $b){ $b }. + + Equivalent to specifying "combine": fn($a, $b){ one-of($a, $b) } + where one-of chooses either $a or $b in + an way. + + Equivalent to specifying "combine": fn($a, $b){ $a, $b }. + + + + + + Supplies a function for handling duplicate keys: specifically, the action to be + taken if entries in the input sequence contain entries with key values + K1 and K2 where K1 and K2 are the + same key. This option and the duplicates + option are mutually exclusive. + + (fn($existing-value as item()*, $new-value as item()*) as item()*)? + fn($a, $b){ $a, $b } + + + A function with signature fn(item()*, item()*) as item()*. + The function is called for any entry in the input sequence that has the as a previous entry. The first argument is the existing value associated with the key; the second argument is the value associated with the key in the duplicate input entry, and the result is the new value to be associated with the key. 
- - + + + - + -
-map:build($input, map:get(?, 'key'), map:get(?, 'value'), $combine)
+
+let $one-of := fn($a, $b) {
+  (: select either $a or $b at implementation option :)
+  if (environment-variable("X")) then $a else $b
+}
+let $combine as function(item()*, item()*) as item()* :=
+  { "reject": fn($a, $b){ error(xs:QName("err:FOJS0003")) },
+    "use-first": fn($a, $b){ $a },
+    "use-last": fn($a, $b){ $b },
+    "use-any": fn($a, $b){ $one-of($a, $b) },
+    "combine": fn($a, $b){ $a, $b }
+  } ? ($options?duplicates)
+  otherwise $options?combine
+  otherwise fn($a, $b) { $a, $b }
+return fold-left( $input, {},
+  fn ( $out, $next ) {
+    let $newVal :=
+      if (map:contains( $out, $next?key ))
+      then $combine( $out?($next?key), $next?value )
+      else $next?value
+    return map:put( $out, $next?key, $newVal )
+  })
+

An error is raised if both the combine and duplicates + options are present.

+

An error is raised if the value of + $options indicates that duplicates are to be rejected, and a duplicate key is encountered.

+

An error is raised if the value of + $options includes an entry whose key is defined + in this specification, and whose value is not a permitted value for that key.

-

The function can be made to fail with a dynamic error in the event that - duplicate keys are present in the input sequence by supplying a $combine - function that invokes the fn:error function.

+

In the formal equivalent shown above:

+ +

The call on error() is indicative; the implementation + is free to raise the error in its own way.

+

The function $one-of($a, $b) is intended to + illustrate that either $a or $b is returned, + at the discretion of the implementation. A function body is provided + for completeness, but it is not intended as a realistic implementation.

+

If the input is an empty sequence, the result is an empty map.

There is no requirement that the supplied key-value pairs should have the same or compatible types. The type of a map (for example map(xs:integer, xs:string)) is descriptive of the entries it currently contains, but is not a constraint on how the map may be combined with other maps.

+

When duplicate keys are encountered, the effect is that:

+ +

In the + entry order of the result map, the position of the entry containing the result + of combining a set of entries with duplicate keys corresponds to + the position of the first of the duplicates in the input sequence.

+

The key of the entry containing the combined value is the last of + the several duplicates. (Keys may be duplicates even though they differ: + for example they may have different type annotations, or they might be + xs:dateTime values in different timezones.)
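The fold described by the formal equivalent, together with the two effects just listed, can be modelled as below. This is a sketch in Python rather than XPath, offered only as an illustration: `of_pairs` and `sequence_concat` are hypothetical names, values are modelled as tuples so that sequence concatenation has an obvious counterpart, and only the callback form of duplicate handling is covered.

```python
from functools import reduce

def sequence_concat(existing, new):
    """Stand-in for sequence concatenation; values are modelled as tuples."""
    return existing + new

def of_pairs(pairs, combine=sequence_concat):
    """Fold (key, value) pairs into an ordered list of entries."""
    def step(out, pair):
        key, value = pair
        if key in out:
            value = combine(out[key][1], value)
        # Assigning to an existing dict key keeps its original position,
        # while the stored pair records the key of the latest duplicate.
        out[key] = (key, value)
        return out
    return list(reduce(step, pairs, {}).values())

pairs = [("a", (1,)), ("b", (2,)), ("a", (3,)), ("a", (4,))]
default_result = of_pairs(pairs)
last_wins = of_pairs(pairs, lambda existing, new: new)
print(default_result)  # [('a', (1, 3, 4)), ('b', (2,))]
print(last_wins)       # [('a', (4,)), ('b', (2,))]
```

In both calls the entry for "a" stays ahead of "b", because its position is fixed by its first appearance in the input.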

+
{ 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch", 4: "Donnerstag", 5: "Freitag", 6: "Samstag", 7: "Unbekannt" } The value of the existing map is unchanged; the returned map - contains all the entries from $week, supplemented with an additional - entry. + contains all the entries from $week, supplemented + with an additional entry. map:of-pairs(( @@ -23404,7 +23388,7 @@ map:build($input, map:get(?, 'key'), map:get(?, 'value'), $combine) map:of-pairs( (map:pairs($week), { "key": 6, "value": "Sonnabend" }), - fn($old, $new) { $new } + { "combine": fn($old, $new) { $new } } ) { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch", 4: "Donnerstag", 5: "Freitag", 6: "Sonnabend" } @@ -23417,7 +23401,7 @@ map:build($input, map:get(?, 'key'), map:get(?, 'value'), $combine) map:of-pairs( (map:pairs($week), { "key": 6, "value": "Sonnabend" }), - fn($old, $new) { `{ $old }|{ $new }` } + { "combine": concat(?, '|', ?) } ) { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch", 4: "Donnerstag", 5: "Freitag", 6: "Samstag|Sonnabend" } @@ -24120,28 +24104,7 @@ declare function map:find($input as item()*, then the new value replaces the old value and the position of the entry is not changed; otherwise, the new entry is added after all existing entries.

- + @@ -24157,6 +24120,13 @@ declare function map:find($input as item()*,

It is possible to force the new entry to go at the end of the sequence by calling map:remove before calling map:put.

+ +

It can happen that the supplied $key is the + same key as some existing key present in $map, but nevertheless + differs from the existing key in some way: + for example, it might have a different type annotation, or it might be an xs:dateTime + value in a different timezone. In this situation the key that appears in the result map + is always the supplied $key, not the existing key.
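A small model of this rule, assuming an ordered list of (key, value) pairs as the map representation and Python `==` as an approximation of the same-key comparison (the name `map_put` is hypothetical, and this is not the specification's definition):

```python
def map_put(entries, key, value):
    """Return a new entry list; `entries` is a list of (key, value) pairs in entry order."""
    out, replaced = [], False
    for existing_key, existing_value in entries:
        if existing_key == key:
            out.append((key, value))  # same position, but the supplied key and new value
            replaced = True
        else:
            out.append((existing_key, existing_value))
    if not replaced:
        out.append((key, value))      # a new entry goes after all existing entries
    return out

original = [(1, "a"), (2, "b")]
replaced = map_put(original, 1.0, "z")
appended = map_put(original, 3, "c")
print(replaced)  # [(1.0, 'z'), (2, 'b')]
print(appended)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

The replaced entry keeps its position, but its key is now the supplied float rather than the original integer.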

@@ -24564,6 +24534,14 @@ if (map:contains($map, $key)) then map:put($map, $key, $action(map:get($map, $key))) else map:put($map, $key, $action(()))
+ +

It can happen that the supplied $key is the + same key as some existing key present in $map, but nevertheless + differs from the existing key in some way: + for example, it might have a different type annotation, or it might be an xs:dateTime + value in a different timezone. In this situation the key that appears in the result map + is always the supplied $key, not the existing key.

+
@@ -24620,65 +24598,46 @@ else map:put($map, $key, $action(()))

If the key is not already present in the target map, the processor adds a new key-value pair to the map, with that key and that value.

-

If the key is already present, the processor calls the combine - function in the $options argument to combine the existing value for the key with the new value, - and replaces the entry with this combined value.

-

The key of the combined entry is taken from one of the duplicate entries: - it is which one is used. (It is - possible for two keys to be considered duplicates even if they differ: +

If the key is already present, the processor combines the new value for the key + with the existing value as determined by the combine + and duplicates options.

+

By default, when two duplicate entries occur:

+ +

A single combined entry will be present in the result.

+

This entry will contain the + sequence concatenation + of the supplied values.

+

The position of the combined entry in the + entry order of the result map + will correspond to the position of the first of the duplicates.

+

The key of the combined entry in the + result map + will be the key of the last of the duplicates. + (It is possible for two keys to be considered duplicates even if they differ: for example, they may have different type annotations, or they may - be xs:dateTime values with different timezones.) -

-

The position of the combined entry in the - of the result map is based on the position of the first entry having that key - in the input sequence (that is, the order of keys in the result is the order - of first appearance in the input.

+ be xs:dateTime values in different timezones.)

+
+

The $options argument can be used to control the + way in which duplicate keys are handled. The allowed options, and their + meanings, are the same as for the map:of-pairs + function. + The option parameter conventions apply. +

-

The $options argument can be used to control the - and the way in which duplicate keys are handled. - The option parameter conventions apply. -

+ -

The entries that may appear in the $options map are as follows:

- - - - A function that is used to combine two different values that are supplied - for the same key. The default is to combine the two values using - , retaining their order - in the input sequence. - - (fn($existing-value as item()*, $new-value as item()*) as item()*)? - fn:op(',') - - - The function is called for any entry in an input map that has the - as a previous entry. The first argument - is the existing value associated with the key; the second argument - is the value associated with the key in the duplicate input entry, - and the result is the new value to be associated with the key. - - - - - - + -fold-left($input, {}, fn($map, $item, $pos) { - let $v := $value($item, $pos) - return fold-left($keys($item, $pos), $map, fn($m, $k) { - if (map:contains($m, $k)) then ( - map:put($m, $k, $combine($m($k), $v)) - ) else ( - map:put($m, $k, $v) - ) - }) -}) +( for $item at $pos in $input + let $val := $value($item, $pos) + for $key in $keys($item, $pos) + return map:pair($key, $val) +) => map:of-pairs($options) @@ -24687,21 +24646,7 @@ fold-left($input, {}, fn($map, $item, $pos) {

The default function for both $keys and $value is the identity function. Although it is permitted to default both, this serves little purpose: usually at least one of these arguments will be supplied.
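The revised formal equivalent for map:build (generate one key-value pair per key of each input item, then hand the pairs to map:of-pairs) can be modelled as below. This is a Python sketch under stated assumptions, not the specification's code: `build` is a hypothetical name, the `keys` and `value` callbacks take the item and its 1-based position, and duplicate handling is reduced to an explicit `combine` callback.

```python
import operator

def build(items, keys, value, combine):
    """Model of map:build: derive pairs from items, then combine duplicate keys."""
    pairs = []
    for pos, item in enumerate(items, start=1):
        val = value(item, pos)
        for key in keys(item, pos):      # an item may contribute several keys
            pairs.append((key, val))
    out = {}                             # dict order models first-appearance order
    for key, val in pairs:
        out[key] = (key, combine(out[key][1], val) if key in out else val)
    return list(out.values())

fruit = ["apple", "apricot", "banana", "blueberry", "cherry"]
totals = build(fruit, lambda s, pos: [s[0]], lambda s, pos: len(s), operator.add)
print(totals)  # [('a', 12), ('b', 15), ('c', 6)]
```

This mirrors the example elsewhere in the patch that sums string lengths by first letter using op("+").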

-

The default action for combining entries with duplicate keys is to perform a - sequence concatenation - of the corresponding values, - equivalent to the duplicates: combine option on map:merge. Other potentially useful - functions for combining duplicates include:

- -

fn($a, $b) { $a } Use the first value and discard the remainder

-

fn($a, $b) { $b } Use the last value and discard the remainder

-

fn:concat(?, ",", ?) Form the string-concatenation of the values, comma-separated

-

fn:op('+') Compute the sum of the values

-
-

The order of entries in the result reflects - the order of the items in $input from which they were derived. In the - event that two entries have duplicate keys, the position of the combined entry - in the result reflects the position of the first input item with that key.

+ diff --git a/specifications/xslt-40/src/element-catalog.xml b/specifications/xslt-40/src/element-catalog.xml index cd6c7af27..96973de4a 100644 --- a/specifications/xslt-40/src/element-catalog.xml +++ b/specifications/xslt-40/src/element-catalog.xml @@ -1702,7 +1702,7 @@ - + all the keys to upper case. A dynamic error occurs if this results in duplicate keys:

<xsl:map> - <xsl:for-each select="map:key-value-pairs($map)"> + <xsl:for-each select="map:pairs($map)"> <xsl:map-entry key="upper-case(?key)" select="?value"/> </xsl:for-each> </xsl:map> @@ -36208,7 +36208,7 @@ the same group, and the-->

The following example modifies a supplied map $input by wrapping each of the values in an array:

<xsl:map> - <xsl:for-each select="map:key-value-pairs($map)"> + <xsl:for-each select="map:pairs($map)"> <xsl:map-entry key="?key"> <xsl:array select="?value"/> </xsl:map-entry> @@ -36217,7 +36217,7 @@ the same group, and the-->

This could also be written:

+ map:pairs($map) ! { ?key : array{ ?value } }"/> ]]> @@ -36235,7 +36235,7 @@ the same group, and the-->

This section describes what happens when two or more maps in the input sequence of within an xsl:map instruction contain duplicate keys: that is, when one of these maps contains an entry with key K, and another contains an entry with key L, - and fn:atomic-equal(K, L) returns true.

+ and fn:atomic-equal(K, L) returns true.

In the absence of the on-duplicates attribute, @@ -36251,6 +36251,15 @@ the same group, and the--> arguments to this function, and the function returns the value that should be associated with this key in the final map.

+

More formally, the result of the xsl:map instruction is defined by reference to + the function map:merge. Specifically, if $maps + is the input sequence to xsl:map, and $combine + is the effective value of the on-duplicates + attribute, then the result of the instruction is the result of the function + call map:merge($maps, { "combine": $combine }).

+ +

Thus, if the values are all singleton items (which is not necessarily the case), and if the sequence of values is S, then the final result is fold-left(tail(S), head(S), F).

- -

For example, the following table shows some useful callback functions that might be supplied, - and explains their effect:

+ --> +

For example, the following table shows some useful callback functions that might be supplied + as the value of the on-duplicates attribute, and explains their effect:

@@ -36294,9 +36303,9 @@ the same group, and the-->
 fn($a, $b) { $a, $b }
-The sequence-concatenation of the duplicate values is used. This could also be expressed as on-duplicates="op(',')".
+The sequence concatenation of the duplicate values is used. This could also be expressed as on-duplicates="op(',')".
 fn($a, $b) { max(($a, $b)) }
@@ -36307,7 +36316,7 @@ the same group, and the-->
 The lowest of the duplicate values is used.
-fn($a, $b) { string-join(($a, $b), ', ') }
+concat(?, ', ', ?)
 The comma-separated string concatenation of the duplicate values is used.
@@ -36318,8 +36327,8 @@ the same group, and the-->
 fn($a, $b) { error() }
-Duplicates are rejected as an error (this is the default in the absence of a callback function).
+Duplicates are rejected as an error (this is the default in the absence of the on-duplicates attribute).
@@ -36345,6 +36354,21 @@ the same group, and the--> ]]>
+ +

Specifying the effect by reference to map:merge has + the following consequences when duplicates are combined + into a merged entry:

+ +

The position of the merged entry in the result corresponds + to the position of the first of the duplicate keys in the input.

+

The key used for the merged entry in the result corresponds + to the last of the duplicate keys in the input. This is relevant when + the duplicate keys differ in some way, for example when they have + different type annotations, or when they are xs:dateTime + values in different timezones.

+
+
+
From f8417b0a6622aecdaf8f68bd002fb509e10b94ae Mon Sep 17 00:00:00 2001
From: Michael Kay
Date: Thu, 30 Jan 2025 00:18:59 +0000
Subject: [PATCH 2/4] Add note about XSLT 3.0 aberration

---
 .../src/function-catalog.xml | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/specifications/xpath-functions-40/src/function-catalog.xml b/specifications/xpath-functions-40/src/function-catalog.xml
index 7e651fdeb..a20013e2f 100644
--- a/specifications/xpath-functions-40/src/function-catalog.xml
+++ b/specifications/xpath-functions-40/src/function-catalog.xml
@@ -23081,6 +23081,11 @@ map:of-pairs($maps =!> map:pairs(),
 types. The type of a map (for example map(xs:integer, xs:string)) is descriptive of the entries it currently contains, but is not a constraint on how the map may be combined with other maps.

+

The XSLT 3.0 recommendation included a specification of this function that incorrectly used + the option value {'duplicates':'unspecified'} in place of + {'duplicates':'use-any'}. XSLT implementations wishing to + preserve backwards compatibility may choose to retain support + for this setting.

map:of-pairs($options) - + +

An error is raised if both the combine and duplicates + options are present.

+ +

An error is raised if the value of + $options indicates that duplicates are to be rejected, and a duplicate key is encountered.

+

An error is raised if the value of + $options includes an entry whose key is defined + in this specification, and whose value is not a permitted value for that key.

+ +
From 47cb099ede3734b6e502ab434dbe9f5f7c91badb Mon Sep 17 00:00:00 2001
From: Michael Kay
Date: Fri, 31 Jan 2025 09:49:18 +0000
Subject: [PATCH 3/4] Combine the "duplicates" and "combine" options

---
 .../src/function-catalog.xml | 158 ++++++++++--------
 .../src/xpath-functions.xml | 8 +-
 2 files changed, 88 insertions(+), 78 deletions(-)

diff --git a/specifications/xpath-functions-40/src/function-catalog.xml b/specifications/xpath-functions-40/src/function-catalog.xml
index a20013e2f..60455bcbb 100644
--- a/specifications/xpath-functions-40/src/function-catalog.xml
+++ b/specifications/xpath-functions-40/src/function-catalog.xml
@@ -23056,9 +23056,7 @@ map:of-pairs($maps =!> map:pairs(),
-

An error is raised if both the combine and duplicates - options are present.

+

An error is raised if the value of @@ -23104,6 +23102,12 @@ map:of-pairs($maps =!> map:pairs(), { 0: "no", 1: "yes" } Returns a map with two entries + + map:merge(({ "red": 0 }, { "green": 1}, { "blue": 2 })) + => map:keys() + "red", "green", "blue" + Note the order of the result. + map:merge( ($week, { 7: "Unbekannt" }) @@ -23153,21 +23157,26 @@ map:of-pairs($maps =!> map:pairs(), entry that appears in the result is the sequence concatenation of the entries in the input maps, retaining order. - - map:merge(({ "red": 0 }, { "green": 1}, { "blue": 2 })) - => map:keys() - "red", "green", "blue" + map:merge( + ({ "oxygen": 0.22, "hydrogen": 0.68, "nitrogen": 0.1 }, + { "oxygen": 0.24, "hydrogen": 0.70, "nitrogen": 0.06 }), + { "duplicates": fn($a, $b){ max(($a, $b)) } }) + + { "oxygen": 0.24, "hydrogen": 0.70, "nitrogen": 0.1 } + The result map holds, for each distinct key, the maximum of the values + for that key in the input. + - +

For consistency with the new functions map:build and map:of-pairs, the handling of duplicates - may now be controlled by the combine option as an alternative - to the existing duplicates option.

+ may now be controlled by supplying a user-defined callback function as an alternative + to the fixed values for the earlier duplicates option.

@@ -23214,54 +23223,49 @@ map:of-pairs($maps =!> map:pairs(), taken if two entries in the input sequence have key values K1 and K2 where K1 and K2 are the same key. This option and the combine - option are mutually exclusive. + def="dt-same-key">same key. - xs:string - combine + (enum( "reject", "use-first", "use-last", "use-any", "combine") | fn(item()*, item()*) as item()*)? + "combine" - - Equivalent to specifying "combine": fn(){error(xs:QName("err:FOJS0003"), ...) - (the remaining arguments to fn:error being - ). + + Equivalent to supplying a function that raises a dynamic error + with error code "FOJS0003". The effect is that duplicate keys + result in an error. - Equivalent to specifying "combine": fn($a, $b){ $a }. + Equivalent to supplying the function fn($a, $b){ $a }. + The effect is that the first of the duplicates is chosen. - Equivalent to specifying "combine": fn($a, $b){ $b }. + Equivalent to supplying the function fn($a, $b){ $b }. + The effect is that the last of the duplicates is chosen. - Equivalent to specifying "combine": fn($a, $b){ one-of($a, $b) } + Equivalent to supplying the function fn($a, $b){ one-of($a, $b) } where one-of chooses either $a or $b in - an way. + an way. The effect is that it is + which of the duplicates is chosen. + + Equivalent to supplying the function fn($a, $b){ $a, $b } + (or equivalently, the function op(",")). + The effect is that the result contains the + of the values having the same key, retaining order. - Equivalent to specifying "combine": fn($a, $b){ $a, $b }. + + A function with signature fn(item()*, item()*) as item()*. + The function is called for any entry in the input sequence that has the + as a previous entry. The first argument + is the existing value associated with the key; the second argument + is the value associated with the key in the duplicate input entry, + and the result is the new value to be associated with the key. 
The effect + is cumulative: for example if there are three values X, Y, + and Z associated with the same key, and the supplied function is + F, then the result is an entry whose value is + X => F(Y) => F(Z). - - - - Supplies a function for handling duplicate keys: specifically, the action to be - taken if entries in the input sequence contain entries with key values - K1 and K2 where K1 and K2 are the - same key. This option and the duplicates - option are mutually exclusive. - - (fn($existing-value as item()*, $new-value as item()*) as item()*)? - fn($a, $b){ $a, $b } - - - A function with signature fn(item()*, item()*) as item()*. - The function is called for any entry in the input sequence that has the - as a previous entry. The first argument - is the existing value associated with the key; the second argument - is the value associated with the key in the duplicate input entry, - and the result is the new value to be associated with the key. - - @@ -23276,15 +23280,19 @@ let $one-of := fn($a, $b) { (: select either $a or $b at implementation option :) if (environment-variable("X")) then $a else $b } +let $duplicates := $options ? duplicates let $combine as function(item()*, item()*) as item()* := - { "reject": fn($a, $b){ error(xs:QName("err:FOJS0003")) }, - "use-first": fn($a, $b){ $a }, - "use-last": fn($a, $b){ $b }, - "use-any": fn($a, $b){ $one-of($a, $b) }, - "combine": fn($a, $b){ $a, $b } - } ? ($options?duplicates) - otherwise $options?combine - otherwise fn($a, $b) { $a, $b } + if ($duplicates instance of xs:string) + then + { "reject": fn($a, $b){ error(xs:QName("err:FOJS0003")) }, + "use-first": fn($a, $b){ $a }, + "use-last": fn($a, $b){ $b }, + "use-any": fn($a, $b){ $one-of($a, $b) }, + "combine": fn($a, $b){ $a, $b } + } ? 
$duplicates + else if ($duplicates instance of function(*)) + then $duplicates + else fn($a, $b) { $a, $b } return fold-left( $input, {}, fn ( $out, $next ) { let $newVal := @@ -23295,17 +23303,11 @@ return fold-left( $input, {}, }) -

An error is raised if both the combine and duplicates - options are present.

An error is raised if the value of $options indicates that duplicates are to be rejected, and a duplicate key is encountered.

-

An error is raised if the value of - $options includes an entry whose key is defined - in this specification, and whose value is not a permitted value for that key.

+
@@ -23393,20 +23395,20 @@ return fold-left( $input, {},
 map:of-pairs(
   (map:pairs($week), { "key": 6, "value": "Sonnabend" }),
-  { "combine": fn($old, $new) { $new } }
+  { "duplicates": "use-last" }
 )
 { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch", 4: "Donnerstag", 5: "Freitag", 6: "Sonnabend" }
 The value of the existing map is unchanged; the returned map contains all the entries from $week, with one entry replaced by a new entry. Both input maps contain an entry with the key 6; the
-supplied $combine function ensures that the one used in the result
+supplied $duplicates option ensures that the one used in the result
 is the one that comes last in the input sequence.
 map:of-pairs(
   (map:pairs($week), { "key": 6, "value": "Sonnabend" }),
-  { "combine": concat(?, '|', ?) }
+  { "duplicates": concat(?, '|', ?) }
 )
 { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch", 4: "Donnerstag", 5: "Freitag", 6: "Samstag|Sonnabend" }
@@ -23414,6 +23416,17 @@ return fold-left( $input, {},
 from the two input maps, with a separator character.
+
+
+map:of-pairs(
+  ( map:pairs({ "England": 2, "Germany": 1 }),
+    map:pairs({ "France": 2, "Germany": 2 }),
+    map:pairs({ "England": 0, "France": 1 }) ),
+  { "duplicates": op("+") })
+
+{ "England": 2, "Germany": 3, "France": 3 }
+The values for each distinct key are summed.
+
+
 map:of-pairs((map:pair("red": 0), map:pair("green": 1), map:pair("blue": 2 )) => map:keys()
@@ -24604,8 +24617,8 @@ else map:put($map, $key, $action(()))

If the key is not already present in the target map, the processor adds a new key-value pair to the map, with that key and that value.

If the key is already present, the processor combines the new value for the key - with the existing value as determined by the combine - and duplicates options.

+ with the existing value as determined by the + duplicates option.

By default, when two duplicate entries occur:

A single combined entry will be present in the result.

@@ -24645,9 +24658,6 @@ else map:put($map, $key, $action(())) ) => map:of-pairs($options) -

An error is raised if both the combine and duplicates - options are present.

An error is raised if the value of @@ -24710,7 +24720,7 @@ else map:put($map, $key, $action(())) ("apple", "apricot", "banana", "blueberry", "cherry"), substring(?, 1, 1), string-length#1, - { "combine": op("+") } + { "duplicates": op("+") } ) { "a": 12, "b": 15, "c": 6 } Constructs a map where the key is the first character of an input item, and where the corresponding value @@ -24766,7 +24776,7 @@ return map:build($titles/title, fn($title) { $title/ix })

The following expression creates a map whose keys are employee @location values, and whose corresponding values represent the number of employees at each distinct location. Any employees that lack an @location attribute will be excluded from the result.

- map:build(//employee, fn { @location }, fn { 1 }, { "combine": op("+") }) + map:build(//employee, fn { @location }, fn { 1 }, { "duplicates": op("+") })
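The grouping behaviour described for map:build can be modelled outside XPath. The following is a minimal Python sketch, not the specification's definition: the name `build_map` and the sample employee records are hypothetical, a `None` key stands in for an empty key sequence, and the `duplicates` argument is assumed to be a two-argument function (the string values of the option are not modelled). It also assumes each item yields at most one key.

```python
import operator
from typing import Any, Callable, Dict, Iterable, Optional


def build_map(
    items: Iterable[Any],
    key_fn: Callable[[Any], Optional[Any]],
    value_fn: Callable[[Any], Any],
    duplicates: Callable[[Any, Any], Any],
) -> Dict[Any, Any]:
    """Hypothetical model of map:build with a function-valued duplicates option."""
    result: Dict[Any, Any] = {}
    for item in items:
        key = key_fn(item)
        if key is None:
            # Models an item that yields no key: it contributes no entry.
            continue
        value = value_fn(item)
        if key in result:
            # Duplicate key: combine the existing value with the new one.
            result[key] = duplicates(result[key], value)
        else:
            result[key] = value
    return result


# Mirrors the example that sums string lengths by first character.
fruit = ["apple", "apricot", "banana", "blueberry", "cherry"]
lengths = build_map(fruit, lambda s: s[0], len, operator.add)

# Mirrors the employee count by location; the records are invented for illustration.
employees = [
    {"name": "Ana", "location": "Paris"},
    {"name": "Bo", "location": "Oslo"},
    {"name": "Cy"},  # no location, so excluded from the result
    {"name": "Di", "location": "Paris"},
]
headcount = build_map(employees, lambda e: e.get("location"), lambda e: 1, operator.add)

print(lengths)
print(headcount)
```

Passing `operator.add` plays the role of `op("+")` in the expressions above; any other two-argument function could be substituted in the same position.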

The following expression creates a map whose keys are employee @location values, and whose diff --git a/specifications/xpath-functions-40/src/xpath-functions.xml b/specifications/xpath-functions-40/src/xpath-functions.xml index 40e591837..9ddd5ce31 100644 --- a/specifications/xpath-functions-40/src/xpath-functions.xml +++ b/specifications/xpath-functions-40/src/xpath-functions.xml @@ -771,8 +771,7 @@ Michael Sperberg-McQueen (1954–2024).

The type of the options parameter in the function signature is always given as map(*).

Although option names are described above as strings, the actual key may be - any value that compares equal to the required string (using the eq operator - with Unicode codepoint collation; or equivalently, the fn:atomic-equal relation). + any value that is the same key as the required string. For example, instances of xs:untypedAtomic or xs:anyURI are equally acceptable.

This means that the implementation of the function can check for the @@ -805,6 +804,7 @@ Michael Sperberg-McQueen (1954–2024).

A dynamic error occurs if the supplied value after conversion is not one of the permitted values for the option in question: the error codes for this error are defined in the specification of each function.

+

It is the responsibility of each function implementation to invoke this conversion; it does not happen automatically as a consequence of the function-calling rules.

@@ -13278,12 +13278,12 @@ ISBN 0 521 77752 6.

Raised when the digits in the string supplied to fn:parse-integer are not in the range appropriate to the chosen radix.

-

Raised if an inconsistent set of options is supplied in an option map.

-
+ -->

Raised by regular expression functions such as fn:matches and fn:replace if the From fada1467d49089af6710b9fa149fe05edf0ebeaa Mon Sep 17 00:00:00 2001 From: Michael Kay Date: Tue, 4 Feb 2025 00:22:21 +0000 Subject: [PATCH 4/4] map:put may use either the new or the old key --- .../src/function-catalog.xml | 57 ++++++++------- .../src/xpath-functions.xml | 6 ++ .../xslt-40/src/element-catalog.xml | 2 +- .../xslt-40/src/schema-for-xslt40.rnc | 5 +- .../xslt-40/src/schema-for-xslt40.xsd | 8 +-- specifications/xslt-40/src/xslt.xml | 71 +++++++++++-------- 6 files changed, 84 insertions(+), 65 deletions(-) diff --git a/specifications/xpath-functions-40/src/function-catalog.xml b/specifications/xpath-functions-40/src/function-catalog.xml index 60455bcbb..624917ecc 100644 --- a/specifications/xpath-functions-40/src/function-catalog.xml +++ b/specifications/xpath-functions-40/src/function-catalog.xml @@ -23037,8 +23037,9 @@ xs:QName('xs:double')

The position of that entry in the entry order of the result map will correspond to the position of the first of the duplicates.

-

The key of that entry will be the key used - in the last of the duplicates. (Keys may be +

The key of the combined entry + will correspond to the key of one of the duplicates: it is + implementation-dependent which one is chosen. (Keys may be duplicates even though they differ: for example, they may have different type annotations, or they may be xs:dateTime values in different timezones.)

@@ -23051,7 +23052,7 @@ xs:QName('xs:double') map:of-pairs($maps =!> map:pairs(), - $options[exists((?duplicates, ?combine))] + $options[exists(?duplicates)] otherwise { "duplicates": "use-first" }); @@ -23203,7 +23204,7 @@ map:of-pairs($maps =!> map:pairs(),

The function map:of-pairs - returns a map that + returns a map which is formed by combining key-value pair maps supplied in the $input argument.

@@ -23332,8 +23333,10 @@ return fold-left( $input, {}, of the result map, the position of the entry containing the result of combining a set of entries with duplicate keys corresponds to the position of the first of the duplicates in the input sequence.

-

The key of the entry containing the combined value is the last of - the several duplicates. (Keys may be duplicates even though they differ: +

The key of the combined entry + will correspond to the key of one of the duplicates: it is + implementation-dependent which one is chosen. + (Keys may be duplicates even though they differ: for example, they may have different type annotations, or they might be xs:dateTime values in different timezones.)
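The duplicate-handling rules for map:of-pairs can be illustrated with a small model. This Python sketch is an assumption-laden illustration rather than the normative algorithm: `of_pairs`, `resolve` and `DuplicateKeyError` are hypothetical names, sequences are modelled as tuples, and "use-any" is modelled as "use-first", which is one of the outcomes that option permits. Python's `dict` happens to keep the position and the key object of the first duplicate, which is one permitted choice of key.

```python
import operator
from typing import Any, Callable, Dict, Iterable, Tuple, Union

Policy = Union[str, Callable[[Any, Any], Any]]


class DuplicateKeyError(Exception):
    """Hypothetical stand-in for the dynamic error raised by "reject"."""


def _seq(value: Any) -> tuple:
    # Model an XPath sequence as a tuple; a single item becomes a 1-tuple.
    return value if isinstance(value, tuple) else (value,)


def _reject(a: Any, b: Any) -> Any:
    raise DuplicateKeyError("duplicate key encountered")


def resolve(policy: Policy) -> Callable[[Any, Any], Any]:
    """Turn a duplicates option value into a two-argument combining function."""
    if callable(policy):
        return policy
    table = {
        "use-first": lambda a, b: a,
        "use-any": lambda a, b: a,  # assumption: any choice is allowed, pick the first
        "use-last": lambda a, b: b,
        "combine": lambda a, b: _seq(a) + _seq(b),
        "reject": _reject,
    }
    return table[policy]


def of_pairs(pairs: Iterable[Tuple[Any, Any]], duplicates: Policy) -> Dict[Any, Any]:
    combine = resolve(duplicates)
    out: Dict[Any, Any] = {}
    for key, value in pairs:
        # Assigning to an existing key keeps the entry at its original position.
        out[key] = combine(out[key], value) if key in out else value
    return out


scores = [("England", 2), ("Germany", 1), ("France", 2),
          ("Germany", 2), ("England", 0), ("France", 1)]

totals = of_pairs(scores, operator.add)
latest = of_pairs(scores, "use-last")
merged = of_pairs(scores, "combine")
print(totals, latest, merged)
```

The `scores` data reproduces the England/Germany/France example, with `operator.add` standing in for `op("+")`.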

@@ -24109,20 +24112,18 @@ declare function map:find($input as item()*, any existing entry for the same key.

-

The function map:put returns a map that contains all entries from the supplied $map, - with the exception of any entry whose key is the same key as $key, together with a new - entry whose key is $key and whose associated value is $value.

+

If $map contains an entry whose key is the same key as $key, the function returns + a map in which that entry is replaced (at the same relative position) + with a new entry whose value is $value. It is + implementation-dependent whether the key in the new entry + takes its original value or is replaced by the supplied $key. + All other entries in the map are unchanged, and retain their relative order.

-

The entry order - of the entries in the returned map is as follows: - if $map contains an entry whose key is $key, - then the new value replaces the old value and the position of the entry is not changed; - otherwise, the new entry is added after all existing entries.

- - +

Otherwise, when $map contains no such entry, the function + returns a map containing all entries from the supplied $map + (retaining their relative position) followed by a new entry whose key + is $key and whose associated value is $value.

@@ -24143,8 +24144,9 @@ declare function map:find($input as item()*, as some existing key present in $map, but nevertheless differs from the existing key in some way: for example, it might have a different type annotation, or it might be an xs:dateTime - value in a different timezone. In this situation the key that appears in the result map - is always the supplied $key, not the existing key.

+ value in a different timezone. In this situation it is + implementation-dependent whether the key that appears in the result map + is the supplied $key or the existing key.
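The two cases of map:put (replace in place, or append) can be modelled with an insertion-ordered dictionary. This Python sketch is only an analogy: `map_put` is a hypothetical name, and Python's equality of `1` and `1.0` stands in for two keys that are the same key yet differ in type annotation. Python retains the existing key object on replacement, which corresponds to one of the two outcomes the revised text allows.

```python
from typing import Any, Dict


def map_put(m: Dict[Any, Any], key: Any, value: Any) -> Dict[Any, Any]:
    """Hypothetical model of map:put on an ordered map; the input is not modified."""
    result = dict(m)      # shallow copy preserves the entry order
    result[key] = value   # existing key: replaced in place; new key: appended last
    return result


week = {0: "Sonntag", 5: "Freitag", 6: "Samstag"}

replaced = map_put(week, 6, "Sonnabend")   # same position, new value
extended = map_put(week, 7, "Feiertag")    # new entry goes after existing entries

# Keys that compare equal but differ in type: this model keeps the existing key.
retyped = map_put({1: "a"}, 1.0, "b")

print(replaced, extended, retyped)
```

A caller that depends on which of the two keys survives would need to normalise keys before the call, since this model shows only one of the permitted results.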

@@ -24180,7 +24182,12 @@ declare function map:find($input as item()*,
-

Enhanced to allow for ordered maps.

+ +

Enhanced to allow for ordered maps.

+
+ +

It is no longer guaranteed that the new key replaces the existing key.

+
@@ -24628,9 +24635,9 @@ else map:put($map, $key, $action(()))

The position of the combined entry in the entry order of the result map will correspond to the position of the first of the duplicates.

-

The key of the combined entry in the - of the result map - will correspond to the key of the last of the duplicates. +

The key of the combined entry + will correspond to the key of one of the duplicates: it is + implementation-dependent which one is chosen. (It is possible for two keys to be considered duplicates even if they differ: for example, they may have different type annotations, or they may be xs:dateTime values in different timezones.)

diff --git a/specifications/xpath-functions-40/src/xpath-functions.xml b/specifications/xpath-functions-40/src/xpath-functions.xml index 9ddd5ce31..6a86b7cf9 100644 --- a/specifications/xpath-functions-40/src/xpath-functions.xml +++ b/specifications/xpath-functions-40/src/xpath-functions.xml @@ -13796,6 +13796,12 @@ ISBN 0 521 77752 6. longer possible to supply an instance of xs:anyURI or (when XPath 1.0 compatibility mode is in force) an instance of xs:boolean or xs:duration.

+ +

When map:put replaces an entry in a map with a new value for an + existing key, in the case where the existing key and the new key differ (for example, + if they have different type annotations), it is no longer guaranteed that the new + entry includes the new key rather than the existing key.

+

For compatibility issues regarding earlier versions, see the 3.1 version of this specification.

diff --git a/specifications/xslt-40/src/element-catalog.xml b/specifications/xslt-40/src/element-catalog.xml index 96973de4a..6b87e08a6 100644 --- a/specifications/xslt-40/src/element-catalog.xml +++ b/specifications/xslt-40/src/element-catalog.xml @@ -1702,7 +1702,7 @@ - + - A new attribute xsl:map/@on-duplicates is available, + A new attribute xsl:map/@duplicates is available, allowing control over how duplicate keys are handled by the xsl:map instruction. @@ -36238,25 +36238,26 @@ the same group, and the--> and fn:atomic-equal(K, L) returns true.

-

In the absence of the on-duplicates attribute, +

In the absence of the duplicates attribute, a dynamic error occurs if the set of keys in the maps making up the input sequence of an xsl:map instruction contains duplicates.

-

The result of evaluating the on-duplicates attribute, if present, must - be a function with arity 2. When the xsl:map instruction encounters two - map entries having the same key, the two values associated with this key are passed as - arguments to this function, and the function returns the value that should be associated - with this key in the final map.

- -

More formally, the result of the xsl:map instruction is defined by reference to - the function map:merge. Specifically, if $maps - is the input sequence to xsl:map, and $combine - is the of the on-duplicates +

The result of evaluating the duplicates attribute, if present, must + be either one of the strings "use-first", "use-last", + "use-any", "combine", or "reject", + or a function with arity 2. These values correspond to the permitted + values of the duplicates option of the + map:of-pairs function.

+ +

The result of the xsl:map instruction is defined by reference to + the function map:of-pairs. Specifically, if $maps + is the input sequence to xsl:map, and $duplicates + is the effective value of the duplicates + attribute, then the result of the instruction is the result of the function - call map:merge($maps, { "combine": $combine }).

+ call map:of-pairs(map:pairs($maps), { "duplicates": $duplicates }).

Thus, if the values are all singleton items (which is not necessarily the case), and if the sequence of values is S, then the final result is fold-left(tail(S), head(S), F).

--> -

For example, the following table shows some useful callback functions that might be supplied - as the value of the on-duplicates attribute, and explains their effect:

+

The following table shows some possible values + of the duplicates attribute, and explains their effect:

@@ -36294,41 +36295,47 @@ the same group, and the-->
- fn($a, $b) { $a } + duplicates="use-first"  The first of the duplicate values is used.
- fn($a, $b) { $b } + duplicates="use-last"  The last of the duplicate values is used.
- fn($a, $b) { $a, $b } + duplicates="combine"  The sequence concatenation of the duplicate values is used. This could also be expressed as duplicates="op(',')".
- fn($a, $b) { max(($a, $b)) } + duplicates="fn($a, $b) { max(($a, $b)) }"  The highest of the duplicate values is used.
- fn($a, $b) { min(($a, $b)) } + duplicates="fn($a, $b) { min(($a, $b)) }"  The lowest of the duplicate values is used.
- concat(?, ', ', ?) } + duplicates="concat(?, ', ', ?)"  The comma-separated string concatenation of the duplicate values is used.
- fn($a, $b) { $a + $b }  The sum of the duplicate values is used. - This could also be expressed as on-duplicates="op('+')"
+ duplicates="op('+')"  The sum of the duplicate values is used.
+ duplicates="fn($a, $b) { subsequence(($a, $b), 1, 4) }"  The first four of the duplicates are retained; any further duplicates + are discarded.
- fn($a, $b) { error() }  Duplicates are rejected as an error (this is the default in the absence of the - on-duplicates attribute).
+ duplicates="fn($a, $b) { distinct-values(($a, $b)) }"  When multiple entries have the same key, the corresponding values + are retained only if they are distinct from other values having the + same key.
@@ -36346,7 +36353,7 @@ the same group, and the-->

The logic is:

- + @@ -36355,14 +36362,16 @@ the same group, and the--> -

Specifying the effect by reference to map:merge has +

Specifying the effect by reference to map:of-pairs has the following consequences when duplicates are combined into a merged entry:

The position of the merged entry in the result corresponds to the position of the first of the duplicate keys in the input.

The key used for the merged entry in the result corresponds - to the last of the duplicate keys in the input. This is relevant when + to one of the duplicate keys in the input: it is + implementation-dependent which one is chosen. + This is relevant when the duplicate keys differ in some way, for example when they have different type annotations, or when they are xs:dateTime values in different timezones.