1725b Further elaboration of duplicates handling in maps #1740

Open
wants to merge 4 commits into
base: master
540 changes: 260 additions & 280 deletions specifications/xpath-functions-40/src/function-catalog.xml

Large diffs are not rendered by default.

14 changes: 10 additions & 4 deletions specifications/xpath-functions-40/src/xpath-functions.xml
@@ -771,8 +771,7 @@ Michael Sperberg-McQueen (1954–2024).</p>
<item><p>The type of the options parameter in the function signature is always
given as <code>map(*)</code>.</p></item>
<item><p>Although option names are described above as strings, the actual key may be
any value that compares equal to the required string (using the <code>eq</code> operator
with Unicode codepoint collation; or equivalently, the <code diff="chg" at="2023-01-25">fn:atomic-equal</code> relation).
any value that is the <termref def="dt-same-key"/> as the required string.
For example, instances of <code>xs:untypedAtomic</code>
or <code>xs:anyURI</code> are equally acceptable.</p>
<note><p>This means that the implementation of the function can check for the
@@ -805,6 +804,7 @@ Michael Sperberg-McQueen (1954–2024).</p>
A dynamic error occurs if the supplied value
after conversion is not one of the permitted values for the option in question: the error codes
for this error are defined in the specification of each function.</p>

<note><p>It is the responsibility of each function implementation to invoke this conversion; it
does not happen automatically as a consequence of the function-calling rules.</p></note></item>

@@ -13278,12 +13278,12 @@ ISBN 0 521 77752 6.</bibl>
<p>Raised when the digits in the string supplied to <function>fn:parse-integer</function> are not in the range appropriate
to the chosen radix.</p>
</error>
<error class="RG" code="0013"
<!--<error class="RG" code="0013"
label="Inconsistent options."
type="dynamic">
<p>Raised if an inconsistent set of options is supplied
in an <termref def="options">option map</termref>.</p>
</error>
</error>-->

<error class="RX" code="0001" label="Invalid regular expression flags." type="static">
<p>Raised by regular expression functions such as <function>fn:matches</function> and <function>fn:replace</function> if the
@@ -13796,6 +13796,12 @@ ISBN 0 521 77752 6.</bibl>
longer possible to supply an instance of <code>xs:anyURI</code> or (when XPath 1.0 compatibility
mode is in force) an instance of <code>xs:boolean</code> or <code>xs:duration</code>.</p>
</item>
<item diff="add" at="issue1725">
<p>When <function>map:put</function> replaces an entry in a map with a new value for an
existing key, in the case where the existing key and the new key differ (for example,
if they have different type annotations), it is no longer guaranteed that the new
entry includes the new key rather than the existing key.</p>
</item>
</olist>

<p>For compatibility issues regarding earlier versions, see the 3.1 version of this specification.</p>
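Review note on the compatibility item above (replacing an entry under an existing key): the distinction between "the existing key" and "the new key" can be seen in any map type that treats differently typed but equal values as the same key. The following is only an analogy using Python's built-in `dict`, not XPath semantics: Python happens to keep the existing key, whereas the new text above deliberately leaves the choice open.

```python
# Analogy only: Python's dict treats 1 and 1.0 as the same key,
# because 1 == 1.0 and hash(1) == hash(1.0).
scores = {1: "first"}      # the existing key is the int 1
scores[1.0] = "second"     # the new key is the float 1.0; the value is replaced

# Python retains the existing key (the int), not the new one (the float).
key = next(iter(scores))
print(type(key).__name__, scores[key])   # int second
```

Code that inspects the type of a key after a replacement therefore depends on a choice made by the map implementation, which is the situation the compatibility note warns about.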
2 changes: 1 addition & 1 deletion specifications/xslt-40/src/element-catalog.xml
@@ -1702,7 +1702,7 @@
<e:attribute name="select">
<e:data-type name="expression"/>
</e:attribute>
<e:attribute name="on-duplicates" required="no" default="error(xs:QName(err:XTDE3365))">
<e:attribute name="duplicates" required="no" default="fn($a, $b) { error(xs:QName(err:XTDE3365)) }">
<e:data-type name="expression"/>
</e:attribute>
<!--<e:attribute name="ordered" default="no">
5 changes: 2 additions & 3 deletions specifications/xslt-40/src/schema-for-xslt40.rnc
@@ -1125,10 +1125,9 @@ map.element =
element map {
extension.atts,
global.atts,
sequence-constructor.model
select-or-sequence-constructor.model
}
# TODO: add @on-duplicates
# TODO: add @ordering
# TODO: add @duplicates

map-entry.element =
element map-entry {
8 changes: 3 additions & 5 deletions specifications/xslt-40/src/schema-for-xslt40.xsd
@@ -1128,11 +1128,9 @@ of problems processing the schema using various tools
substitutionGroup="xsl:instruction">
<xs:complexType>
<xs:complexContent mixed="true">
<xs:extension base="xsl:sequence-constructor">
<xs:attribute name="on-duplicates" type="xsl:expression"/>
<xs:attribute name="ordering" type="xsl:avt"/>
<xs:attribute name="_on-duplicates" type="xs:string"/>
<xs:attribute name="_ordering" type="xs:string"/>
<xs:extension base="xsl:sequence-constructor-and-select">
<xs:attribute name="duplicates" type="xsl:expression"/>
<xs:attribute name="_duplicates" type="xs:string"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
97 changes: 65 additions & 32 deletions specifications/xslt-40/src/xslt.xml
@@ -15349,7 +15349,7 @@ return $tree =?> depth()]]></eg>
}</eg>
<p>The following code processes this map to produce an XML representation of the same information. The
cities are sorted by name:</p>
<eg><![CDATA[<xsl:for-each select="map:key-value-pairs(json-doc('input.json'))">
<eg><![CDATA[<xsl:for-each select="map:pairs(json-doc('input.json'))">
<xsl:sort select="?key"/>
<city number="{position()}"
name="{?key}"
@@ -36196,7 +36196,7 @@ the same group, and the-->
all the keys to upper case. A dynamic error occurs if this results in duplicate
keys:</p>
<eg role="xslt-declaration" xml:space="preserve">&lt;xsl:map&gt;
&lt;xsl:for-each select="map:key-value-pairs($map)"&gt;
&lt;xsl:for-each select="map:pairs($map)"&gt;
&lt;xsl:map-entry key="upper-case(?key)" select="?value"/&gt;
&lt;/xsl:for-each&gt;
&lt;/xsl:map&gt;
@@ -36208,7 +36208,7 @@ the same group, and the-->
<p>The following example modifies a supplied map <code>$map</code> by wrapping
each of the values in an array:</p>
<eg role="xslt-declaration" xml:space="preserve">&lt;xsl:map&gt;
&lt;xsl:for-each select="map:key-value-pairs($map)"&gt;
&lt;xsl:for-each select="map:pairs($map)"&gt;
&lt;xsl:map-entry key="?key"&gt;
&lt;xsl:array select="?value"/&gt;
&lt;/xsl:map-entry&gt;
@@ -36217,7 +36217,7 @@ the same group, and the-->
</eg>
<p>This could also be written:</p>
<eg role="xslt-declaration" xml:space="preserve"><![CDATA[<xsl:map select="
map:key-value-pairs($map) ! { ?key : array{ ?value } }"/>
map:pairs($map) ! { ?key : array{ ?value } }"/>
]]></eg>
</example>

@@ -36226,7 +36226,7 @@ the same group, and the-->

<changes>
<change issue="169" date="2023-11-28">
A new attribute <code>xsl:map/@on-duplicates</code> is available,
A new attribute <code>xsl:map/@duplicates</code> is available,
allowing control over how duplicate keys are handled by the <elcode>xsl:map</elcode>
instruction.
</change>
@@ -36235,21 +36235,31 @@ the same group, and the-->
<p>This section describes what happens when two or more maps in the input sequence of
an <elcode>xsl:map</elcode> instruction contain duplicate keys: that is, when one of these
maps contains an entry with key <var>K</var>, and another contains an entry with key <var>L</var>,
and <code diff="chg" at="2023-01-25">fn:atomic-equal(K, L)</code> returns <code>true</code>.</p>
and <code>fn:atomic-equal(<var>K</var>, <var>L</var>)</code> returns <code>true</code>.</p>

<p><error spec="XT" class="DE" code="3365" type="dynamic">
<p>In the absence of the <code>on-duplicates</code> attribute,
<p>In the absence of the <code>duplicates</code> attribute,
a <termref def="dt-dynamic-error">dynamic error</termref> occurs if the set of
keys in the maps making up the input sequence
<error.extra>of an <elcode>xsl:map</elcode> instruction</error.extra>
contains duplicates.</p>
</error></p>

<p>The result of evaluating the <code>on-duplicates</code> attribute, if present, <rfc2119>must</rfc2119>
be a function with arity 2. When the <elcode>xsl:map</elcode> instruction encounters two
map entries having the same key, the two values associated with this key are passed as
arguments to this function, and the function returns the value that should be associated
with this key in the final map.</p>
<p>The result of evaluating the <code>duplicates</code> attribute, if present, <rfc2119>must</rfc2119>
be either one of the strings <code>"use-first"</code>, <code>"use-last"</code>,
<code>"use-any"</code>, <code>"combine"</code>, or <code>"reject"</code>,
or a function with arity 2. These values correspond to the permitted
values of the <code>duplicates</code> option of the
<xfunction>map:of-pairs</xfunction> function.</p>

<p>The result of the <elcode>xsl:map</elcode> instruction is defined by reference to
the function <xfunction>map:of-pairs</xfunction>. Specifically, if <code>$maps</code>
is the input sequence to <elcode>xsl:map</elcode>, and <code>$duplicates</code>
is the <termref def="dt-effective-value"/> of the <code>duplicates</code>
attribute, then the result of the instruction is the result of the function
call <code>map:of-pairs(map:pairs($maps), { "duplicates": $duplicates })</code>.</p>

<!--

<p>The order of the arguments passed to the function reflects the order of the maps in which
the duplicate entries appear: if map <var>M</var> and map <var>N</var> contain values <var>V/M</var>
@@ -36272,9 +36282,9 @@ the same group, and the-->

<p>Thus, if the values are all singleton items (which is not necessarily the case), and if the sequence
of values is <var>S</var>, then the final result is <code>fold-left(tail(S), head(S), F)</code>.</p>

<p>For example, the following table shows some useful callback functions that might be supplied,
and explains their effect:</p>
-->
<p>The following table shows some possible values
of the <code>duplicates</code> attribute, and explains their effect:</p>

<table>
<thead>
@@ -36285,41 +36295,47 @@ the same group, and the-->
</thead>
<tbody>
<tr>
<td><code>fn($a, $b) { $a }</code></td>
<td><code>duplicates="use-first"</code></td>
<td>The first of the duplicate values is used.</td>
</tr>
<tr>
<td><code>fn($a, $b) { $b }</code></td>
<td><code>duplicates="use-last"</code></td>
<td>The last of the duplicate values is used.</td>
</tr>
<tr>
<td><code>fn($a, $b) { $a, $b }</code></td>
<td>The sequence-concatenation of the duplicate values is used.
<phrase diff="add" at="2023-04-04">This could
also be expressed as <code>on-duplicates="op(',')"</code>.</phrase></td>
<td><code>duplicates="combine"</code></td>
<td>The <xtermref spec="XP40" ref="dt-sequence-concatenation"/>
of the duplicate values is used. This could
also be expressed as <code>duplicates="op(',')"</code>.</td>
</tr>
<tr>
<td><code>fn($a, $b) { max(($a, $b)) }</code></td>
<td><code>duplicates="fn($a, $b) { max(($a, $b)) }"</code></td>
<td>The highest of the duplicate values is used.</td>
</tr>
<tr>
<td><code>fn($a, $b) { min(($a, $b)) }</code></td>
<td><code>duplicates="fn($a, $b) { min(($a, $b)) }"</code></td>
<td>The lowest of the duplicate values is used.</td>
</tr>
<tr>
<td><code>fn($a, $b) { string-join(($a, $b), ', ') }</code></td>
<td><code>duplicates="concat(?, ', ', ?)"</code></td>
<td>The comma-separated string concatenation of the duplicate values is used.</td>
</tr>
<tr diff="add" at="2023-04-04">
<td><code>fn($a, $b) { $a + $b }</code></td>
<td>The sum of the duplicate values is used.
This could also be expressed as <code>on-duplicates="op('+')"</code>
<tr>
<td><code>duplicates="op('+')"</code></td>
<td>The sum of the duplicate values is used.</td>
</tr>
<tr>
<td><code>duplicates="fn($a, $b) { subsequence(($a, $b), 1, 4) }"</code></td>
<td>The first four of the duplicates are retained; any further duplicates
are discarded.
</td>
</tr>
<tr>
<td><code>fn($a, $b) { error() }</code></td>
<td>Duplicates are rejected as an error (this is the default in the absence of a
callback function).</td>
<td><code>duplicates="fn($a, $b) { distinct-values(($a, $b)) }"</code></td>
<td>When multiple entries have the same key, the corresponding values
are retained only if they are distinct from other values having the
same key.
</td>
</tr>
</tbody>
</table>
@@ -36337,14 +36353,31 @@ the same group, and the-->
<eg><![CDATA[{ "A23": [ 12, 2 ], "A24": [ 5 ], "A23": [ 9 ] }]]></eg>
<p>The logic is:</p>
<eg><![CDATA[<xsl:template match="data">
<xsl:map on-duplicates="fn($a, $b) { array:join(($a, $b)) }">
<xsl:map duplicates="fn($a, $b) { array:join(($a, $b)) }">
<xsl:for-each select="event">
<xsl:map-entry key="@id" select="[xs:integer(@value)]"/>
</xsl:for-each>
</xsl:map>
</xsl:template>]]></eg>
</example>

<note>
<p>Specifying the effect by reference to <xfunction>map:of-pairs</xfunction> has
the following consequences when duplicates are combined
into a merged entry:</p>
<ulist>
<item><p>The position of the merged entry in the result corresponds
to the position of the first of the duplicate keys in the input.</p></item>
<item><p>The key used for the merged entry in the result corresponds
to one of the duplicate keys in the input: it is
<termref def="dt-implementation-dependent"/> which one is chosen.
This is relevant when
the duplicate keys differ in some way, for example when they have
different type annotations, or when they are <code>xs:dateTime</code>
values in different timezones.</p></item>
</ulist>
</note>


</div3>
</div2>
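Review note: the rule that `xsl:map` behaves like `map:of-pairs(map:pairs($maps), { "duplicates": $duplicates })` can be modelled in a few lines. The sketch below is a hypothetical Python model, not the specification's definition and not any processor's implementation. It makes several assumptions:

- `of_pairs` and `DuplicateKeyError` are names invented for illustration.
- Values are Python lists standing in for XDM sequences.
- Key equality is Python's `==`, not `fn:atomic-equal`.
- The default `"reject"` mirrors the default of the `xsl:map` attribute shown in the diff.
- `"use-any"` arbitrarily picks the first value, and the first key seen is the one retained (the text above leaves both choices open).

```python
class DuplicateKeyError(Exception):
    """Stands in for the dynamic error XTDE3365 in this model."""


def of_pairs(pairs, duplicates="reject"):
    """Build a dict from (key, value-list) pairs, resolving duplicate keys.

    `duplicates` is one of the five policy strings, or a two-argument
    function that combines the value so far with the next duplicate value.
    """
    named = {
        "use-first": lambda a, b: a,
        "use-last": lambda a, b: b,
        "use-any": lambda a, b: a,      # any choice is permitted; this model keeps the first
        "combine": lambda a, b: a + b,  # list concatenation models sequence concatenation
    }
    if callable(duplicates):
        resolve = duplicates
    elif duplicates == "reject":
        resolve = None
    elif duplicates in named:
        resolve = named[duplicates]
    else:
        raise ValueError(f"unknown duplicates policy: {duplicates!r}")

    result = {}
    for key, value in pairs:
        if key in result:
            if resolve is None:
                raise DuplicateKeyError(f"duplicate key: {key!r}")
            # Fold from the left: combine the value so far with the next one.
            # Assigning to an existing key keeps its original position in the dict.
            result[key] = resolve(result[key], value)
        else:
            result[key] = value
    return result


events = [("A23", [12]), ("A24", [5]), ("A23", [2]), ("A23", [9])]
print(of_pairs(events, "combine"))                      # {'A23': [12, 2, 9], 'A24': [5]}
print(of_pairs(events, lambda a, b: [max(a + b)]))      # {'A23': [12], 'A24': [5]}
```

In this model the merged `A23` entry stays in the position of its first occurrence, which matches the first consequence listed in the note above. Because Python's `dict` always keeps the first key, the model cannot show the second consequence, that the retained key is implementation-dependent.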