Commit

Deploying to gh-pages from @ afac53c 🚀
ndw committed Aug 10, 2024
1 parent 8caae6c commit ed2b2a8
Showing 13 changed files with 578 additions and 665 deletions.
2 changes: 1 addition & 1 deletion docs/announcements/2024/07/saxon-12.5.html
@@ -177,5 +177,5 @@ <h3>Issues in SaxonC</h3>
<a href="https://saxonica.plan.io/projects/saxon/issues">report them</a>
on our issue tracker.</p>

</main><footer><div class="prev-uri"><a href="/announcements/2024/01/saxonc-12.4.2.html">Announcing SaxonC 12.4.2!</a></div></footer></body>
</main><footer><div class="prev-uri"><a href="/announcements/2024/01/saxonc-12.4.2.html">Announcing SaxonC 12.4.2!</a></div><div class="next-uri"><a href="/mike/2024/08/maps-and-records.html">Maps and Records</a></div></footer></body>
</html>
2 changes: 1 addition & 1 deletion docs/announcements/atom.xml
@@ -1,4 +1,4 @@
<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xml:lang="EN-us"><title>Saxonica announcements</title><link href="https://blog.saxonica.com/announcements/" rel="alternate" type="text/html"/><link href="https://blog.saxonica.com/announcements/atom.xml" rel="self"/><id>https://blog.saxonica.com/announcements/atom.xml</id><updated>2024-07-02T10:30:40.612616Z</updated><entry><title>Announcing Saxon 12.5!</title><link href="https://blog.saxonica.com/announcements/2024/07/saxon-12.5.html" rel="alternate" type="text/html"/><id>https://blog.saxonica.com/announcements/2024/07/saxon-12.5.html</id><published>2024-07-02T11:30:00Z</published><content type="xhtml" xml:base="https://blog.saxonica.com/announcements/2024/07/saxon-12.5.html"><div xmlns="http://www.w3.org/1999/xhtml">
<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xml:lang="EN-us"><title>Saxonica announcements</title><link href="https://blog.saxonica.com/announcements/" rel="alternate" type="text/html"/><link href="https://blog.saxonica.com/announcements/atom.xml" rel="self"/><id>https://blog.saxonica.com/announcements/atom.xml</id><updated>2024-08-10T10:20:43.391447Z</updated><entry><title>Announcing Saxon 12.5!</title><link href="https://blog.saxonica.com/announcements/2024/07/saxon-12.5.html" rel="alternate" type="text/html"/><id>https://blog.saxonica.com/announcements/2024/07/saxon-12.5.html</id><published>2024-07-02T11:30:00Z</published><content type="xhtml" xml:base="https://blog.saxonica.com/announcements/2024/07/saxon-12.5.html"><div xmlns="http://www.w3.org/1999/xhtml">
<h1>Announcing Saxon 12.5!</h1>

<p>The Saxon 12.5 maintenance release has been published. This is a
224 changes: 110 additions & 114 deletions docs/atom.xml

Large diffs are not rendered by default.

11 changes: 10 additions & 1 deletion docs/authors.html
@@ -16,14 +16,23 @@ <h1>Author index</h1>
<div class="blogroll"><a href="/">Home (combined archives)</a><br><a href="/announcements/">Announcements</a><br><a href="/mike/">Saxon diaries</a><br><a href="/oneil/">O’Neil Delpratt’s Blog</a><br><a href="/norm/">Saxon Chronicles</a></div>
</div>
</aside>
<div class="byline"><span class="date" time="2024-07-02T10:30:46.372804Z">July&nbsp;02, 2024 at 10:30a.m.</span></div>
<div class="byline"><span class="date" time="2024-08-10T10:20:49.092304Z">August&nbsp;10, 2024 at 10:20a.m.</span></div>
</header>
<main>
<p>Authors: <a href="#michael-kay">Michael Kay</a>, <a href="#norm-tovey-walsh">Norm Tovey-Walsh</a>, <a href="#oneil-delpratt">O’Neil Delpratt</a></p>
<div class="author">
<h2 id="michael-kay">Michael Kay</h2>
<div class="post-index">
<ul class="years">
<li id="michael-kay-D2024">2024
<ul class="month" id="michael-kay-D2024-08">
<li>August
<ul>
<li class="mike"><span class="title"><a href="/mike/2024/08/maps-and-records.html">Maps and Records</a></span><span class="author">, Michael Kay</span><span class="date">, August&nbsp;10, 2024 at 09:00a.m.</span></li>
</ul>
</li>
</ul>
</li>
<li id="michael-kay-D2023">2023
<ul class="month" id="michael-kay-D2023-10">
<li>October
9 changes: 8 additions & 1 deletion docs/index.html
@@ -28,12 +28,19 @@
<div class="blogroll"><a href="/announcements/">Announcements</a><br><a href="/mike/">Saxon diaries</a><br><a href="/oneil/">O’Neil Delpratt’s Blog</a><br><a href="/norm/">Saxon Chronicles</a></div>
</div>
</aside>
<div class="byline"><span class="date" time="2024-07-02T10:30:46.372804Z">July&nbsp;02, 2024 at 10:30a.m.</span></div>
<div class="byline"><span class="date" time="2024-08-10T10:20:49.092304Z">August&nbsp;10, 2024 at 10:20a.m.</span></div>
</header>
<main>
<div class="post-index">
<ul class="years">
<li id="D2024">2024
<ul class="month" id="D2024-08">
<li>August
<ul>
<li class="mike"><span class="title"><a href="/mike/2024/08/maps-and-records.html">Maps and Records</a></span><span class="author">, Michael Kay</span><span class="date">, August&nbsp;10, 2024 at 09:00a.m.</span></li>
</ul>
</li>
</ul>
<ul class="month" id="D2024-07">
<li>July
<ul>
9 changes: 9 additions & 0 deletions docs/mike/2024/08/index.html
@@ -0,0 +1,9 @@
<!DOCTYPE HTML><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Refresh" content="0; url=/mike/2024/08/maps-and-records.html">
<title>Redirect</title>
</head>
<body><a href="/mike/2024/08/maps-and-records.html">Redirect</a>.
</body>
</html>
119 changes: 119 additions & 0 deletions docs/mike/2024/08/maps-and-records.html
@@ -0,0 +1,119 @@
<!DOCTYPE HTML><html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Maps and Records</title>

<meta name="author" content="Michael Kay">
<meta name="pubdate" content="2024-08-10T09:00:00">

<link rel="stylesheet" type="text/css" href="/css/blog.css"><link rel="stylesheet" type="text/css" href="/css/michael-kay.css"><meta content="Saxon diaries" property="og:site_name"><meta content="https://blog.saxonica.com/img/sitecard.png" property="og:image"><meta content="Michael Kay's blog." property="og:description"><meta content="https://blog.saxonica.com/mike/" property="og:url"><meta content="600" property="og:image:width"><meta content="315" property="og:image:height"><meta content="Maps and Records" property="og:title"><meta content="en_GB" property="og:locale"><meta content="website" property="og:type"><meta name="viewport" content="width=device-width, initial-scale=1.0"></head>
<body class="michael-kay"><header><div class="banner"><h1><a href="/mike/">Saxon diaries</a></h1><div class="tagline">Michael Kay’s blog</div></div><h2>Maps and Records</h2><aside class="nav"><div class="navlinks"><div id="search"><form action="https://www.google.com/search" target="_parent"><span>Search: </span><input size="20" name="as_q"><input type="hidden" name="hl" value="en"><input type="hidden" name="ie" value="UTF-8"><input type="hidden" name="btnG" value="Google+Search"><input type="hidden" name="as_qdr" value="all"><input type="hidden" name="as_occt" value="any"><input type="hidden" name="as_dt" value="i"><input type="hidden" name="as_sitesearch" value="blog.saxonica.com"></form></div><div id="blogroll"><a href="/">Home (combined archives)</a><br><a href="/announcements/">Announcements</a><br><a href="/mike/">Saxon diaries</a><br><a href="/oneil/">O’Neil Delpratt’s Blog</a><br><a href="/norm/">Saxon Chronicles</a></div></div></aside><div class="byline"><span class="by">By </span><span class="name"><a href="/authors.html#michael-kay">Michael Kay</a></span><span class="on"> on </span><a href="/authors.html#michael-kay-D2024-08"><span class="date" time="2024-08-10T09:00:00">August&nbsp;10, 2024 at 09:00a.m.</span></a></div></header><main>


<p>Maps have proved to be one of the most powerful new features in the 3.0/3.1 family of standards,
and records, which extend the capability, will probably prove one of the most powerful in 4.0.
Under the name <i>tuple types</i>, the feature has been available as a proprietary Saxon extension
since Saxon 9.8, which came out on the same day as the XSLT 3.0 Recommendation in June 2017.
The feature is now well established, but the details are still being refined.</p>

<p>A record type is declared like this:</p>

<pre>record(longitude as xs:double, latitude as xs:double)</pre>

<p>A record type is simply a new way of constraining maps; the instances of the type
are maps (in this case a map with two entries, one with key "longitude" and one with
key "latitude"). You can use a record type to declare the types of variables and function
arguments, but the actual value of the variable is a map, and all the standard map operations
are available, such as the lookup operator: <code>$location?longitude</code>.</p>
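<p>To make the point concrete outside XPath, here is a minimal Python sketch (not Saxon code)
of a record type acting purely as a constraint on ordinary maps. The names <code>RecordType</code>,
<code>is_double</code> and <code>matches</code> are hypothetical, and the sketch assumes a record type
that allows exactly the declared fields, with no optional ones.</p>

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a record type: field name -> predicate on the value.
RecordType = Dict[str, Callable[[Any], bool]]


def is_double(value: Any) -> bool:
    # Rough analogue of xs:double: only Python floats qualify.
    return isinstance(value, float)


# Counterpart of record(longitude as xs:double, latitude as xs:double).
LOCATION: RecordType = {"longitude": is_double, "latitude": is_double}


def matches(value: Any, record_type: RecordType) -> bool:
    """True if value is a map with exactly the declared fields, each passing its check."""
    if not isinstance(value, dict):
        return False
    if set(value.keys()) != set(record_type.keys()):
        return False
    return all(check(value[name]) for name, check in record_type.items())


# The instance is just a map, so ordinary map operations still apply to it.
location = {"longitude": -0.97, "latitude": 51.45}
```

<p>The value that satisfies the type is still a plain map: <code>location["longitude"]</code> plays
the role of <code>$location?longitude</code>, and nothing about the representation has changed.</p>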

<p>We're working on an extension that allows named record types to be declared globally:</p>

<pre>&lt;xsl:record name="my:location"&gt;
&lt;xsl:field name="longitude" as="xs:double"/&gt;
&lt;xsl:field name="latitude" as="xs:double"/&gt;
&lt;/xsl:record&gt;</pre>

<p>This would also give you a constructor function: <code>my:location($long, $lat)</code>.</p>

<p>The main thing I want to talk about in this article is how records can be efficiently
implemented.</p>

<p>Until recently, a record type simply constrained the contents of a map, and had no
bearing on the way the map was implemented.</p>

<p>Internally, Saxon represents maps using the interface <code>net.sf.saxon.ma.map.MapItem</code>
(actually an abstract class), and there are several implementations of this interface:</p>

<ul>
<li><code>EmptyMap</code> for an empty map</li>
<li><code>SingleEntryMap</code> for a map with one entry, such as the map created by <code>map:entry()</code></li>
<li><code>DictionaryMap</code> for a map whose keys are all strings, and that isn't likely to be modified
(for example maps derived by parsing JSON, or maps written using literal constructors as option parameter values)</li>
<li><code>HashTrieMap</code> as the general implementation that handles everything.</li>
</ul>

<p>For the next release, Saxon 13, we've written a new implementation called a
<i>ShapedMap</i>. There are two parts to this: a <i>Shape</i> is a mapping from field names to
integer slot numbers, and a <i>ShapedMap</i> is a reference to a <i>Shape</i>, plus an array of slots.
This works well when many maps have exactly the same structure, because the keys are held only
once.</p>
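<p>The two parts can be sketched in a few lines of Python. This is an illustration of the idea only,
not the Saxon implementation: the class and attribute names (<code>slot_of</code>, <code>slots</code>,
<code>get</code>) are assumptions made for the example.</p>

```python
from typing import Any, Dict, List, Tuple


class Shape:
    """Maps field names to slot numbers; shared by every map with this structure."""

    def __init__(self, field_names: Tuple[str, ...]) -> None:
        self.field_names = field_names
        self.slot_of: Dict[str, int] = {name: i for i, name in enumerate(field_names)}


class ShapedMap:
    """A reference to a Shape plus one value per slot."""

    def __init__(self, shape: Shape, slots: List[Any]) -> None:
        if len(slots) != len(shape.field_names):
            raise ValueError("one value is needed per field in the shape")
        self.shape = shape
        self.slots = slots

    def get(self, key: str) -> Any:
        # Ordinary lookup: hash the key in the shared Shape, then index the slots.
        return self.slots[self.shape.slot_of[key]]


# Two maps with the same structure hold the keys only once, in the shared Shape.
location_shape = Shape(("longitude", "latitude"))
reading = ShapedMap(location_shape, [-0.97, 51.45])
other = ShapedMap(location_shape, [2.35, 48.86])
```

<p>Each additional map costs one reference to the shape and one array of values; the key strings and
their hash index live only in <code>location_shape</code>.</p>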

<p>So far we're mainly using shaped maps where the structure of the map is defined by the language specification,
for example for the key-value pairs returned by <code>map:pairs()</code>, for the results of functions such as
<code>parse-csv()</code>, <code>random-number-generator()</code>, and <code>load-xquery-module()</code>,
and for the labels attached to values by the new deep-lookup operator (plenty of scope there for future articles).
I would love to use them also for the result of <code>parse-json()</code> if we can detect the common case
of a JSON file containing thousands of maps (JSON objects) with exactly the same structure. And of course, once
we have record constructor functions as described above, they are an obvious candidate for the result
of such a function.</p>

<p>Shaped maps immediately give a space saving because the keys and their hash index are shared between instances.
The next challenge after that is to make lookup on shaped maps more efficient. Given a lookup expression
such as <code>$location?longitude</code>, we <i>ought</i> to be able to extract the corresponding value directly
from slot 0 of the <code>ShapedMap</code> object, without the overhead of doing a run-time hash lookup of the string
<code>"longitude"</code> in the corresponding <code>Shape</code> in order to establish that this field is always
in slot 0.</p>

<p>The obvious, classic way of doing that is through static type inference: if we know the static type of the
<code>$location</code> variable, then we can know at compile time what the mapping of field names to slots will be,
and can generate an execution plan accordingly.</p>

<p>But I'm becoming a bit disillusioned with relying on static type analysis. Users, in general, are lazy: they
want good performance without doing extra work, like declaring the types of all their variables. That's particularly
true when you start writing code that relies heavily on higher-order functions, which we want to encourage.
So I'm looking increasingly at options that decide the execution plan at run-time, modifying it in the light
of actual experience. Given an expression like <code>$location?longitude</code> that is executed repeatedly,
the chances are that if <code>$location</code> is a shaped map with <code>longitude</code> in slot 0 on one occasion,
then the same will be true next time you execute the same expression.</p>

<p>We've quietly introduced this kind of approach in recent releases, and it's working well. For example, with
lazy evaluation of variables and function results we now use a learning approach: we start with lazy evaluation,
but if on the first 20 executions the value is immediately read to completion, we switch to eager evaluation.
That's because lazy evaluation has a significant set-up overhead to retain the parts of the context on which the
expression depends, and there is no benefit in doing this if the caller is going to immediately materialise
the value anyway.</p>
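<p>A rough Python sketch of that learning approach follows. It is a simplification under stated
assumptions: <code>AdaptiveEvaluator</code> and its fields are invented for the example, "read to
completion" is detected by the consumer exhausting a generator, and no attempt is made to model
whether the read happened immediately.</p>

```python
from typing import Callable, Iterator, List, Union


class AdaptiveEvaluator:
    """Starts lazy; switches to eager if every trial run is read to completion."""

    def __init__(self, produce: Callable[[], Iterator[int]], trial_runs: int = 20) -> None:
        self.produce = produce
        self.trial_runs = trial_runs
        self.runs = 0          # lazy evaluations started so far
        self.fully_read = 0    # trial evaluations the caller consumed to the end
        self.eager = False

    def evaluate(self) -> Union[List[int], Iterator[int]]:
        if self.eager:
            # Learned behaviour: materialise at once, skipping the lazy set-up.
            return list(self.produce())
        self.runs += 1
        in_trial = self.trial_runs >= self.runs
        return self._tracked(self.produce(), in_trial)

    def _tracked(self, items: Iterator[int], in_trial: bool) -> Iterator[int]:
        yield from items
        # Reached only when the caller has exhausted the sequence.
        if in_trial:
            self.fully_read += 1
            if self.fully_read == self.trial_runs:
                self.eager = True
```

<p>If any of the trial runs is abandoned part-way through, the counter never reaches the threshold
and the evaluator simply stays lazy.</p>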

<p>The concrete design for shaped record access goes something like this. We augment the
<code>MapItem</code> interface with a method
<code>map.lookup(key, plan)</code>. This method returns the requested value from the map, but also updates the
value of <code>plan</code> with information that will be retained the next time the same expression is evaluated.
If the map is a shaped map, the returned plan can include the <code>Shape</code> and the slot number; if an incoming
request comes with a plan that identifies the same <code>Shape</code> (which it usually will),
then we can access the relevant slot number directly,
ignoring the value of the key. That only works, of course, for a lookup expression where the key is a literal
constant; but that's the normal case when working with records.</p>
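<p>As a hedged illustration of this design, the Python sketch below gives each lookup expression its
own <code>Plan</code> object that remembers the last <code>Shape</code> and slot. Apart from the
<code>lookup(key, plan)</code> signature, every name here is an assumption, and the
<code>hash_lookups</code> counter exists only to make the effect visible.</p>

```python
from typing import Any, Dict, List, Optional


class Shape:
    """Shared mapping from field names to slot numbers."""

    def __init__(self, *field_names: str) -> None:
        self.slot_of: Dict[str, int] = {name: i for i, name in enumerate(field_names)}


class Plan:
    """Per-expression cache: the Shape seen last time and the slot it resolved to."""

    def __init__(self) -> None:
        self.shape: Optional[Shape] = None
        self.slot = -1


class ShapedMap:
    hash_lookups = 0  # counts slow-path lookups, for illustration only

    def __init__(self, shape: Shape, slots: List[Any]) -> None:
        self.shape = shape
        self.slots = slots

    def lookup(self, key: str, plan: Plan) -> Any:
        if plan.shape is self.shape:
            # Fast path: same Shape as last time, so the key is not consulted at all.
            return self.slots[plan.slot]
        # Slow path: hash the key in the Shape, then record the answer in the plan.
        ShapedMap.hash_lookups += 1
        slot = self.shape.slot_of.get(key)
        if slot is None:
            return None  # absent key; stands in for the empty sequence
        plan.shape, plan.slot = self.shape, slot
        return self.slots[slot]


shape = Shape("longitude", "latitude")
maps = [ShapedMap(shape, [float(i), 50.0]) for i in range(1000)]
plan = Plan()  # one plan per lookup expression with a literal key
longitudes = [m.lookup("longitude", plan) for m in maps]
```

<p>Across the thousand lookups only the first takes the slow path. Because the fast path ignores the
key, a plan must never be shared between expressions that use different keys, which is why the
technique is limited to lookups whose key is a literal constant.</p>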

<p>If we can make this work (and it seems straightforward), then the same approach might have other applications.
For example, can we make path expressions go faster if we optimize for the tree model in use? Or could we get rid
of statically-allocated fingerprints (with the inconvenience they cause by not allowing documents and stylesheets
to be shared across configurations), and instead have the expression discover the fingerprint and NamePool at
execution time?</p>

<p>Saxon is now 25 years old. It seems there are still plenty of exciting ways to make it better.</p>






</main><footer><div class="prev-uri"><a href="/announcements/2024/07/saxon-12.5.html">Announcing Saxon 12.5!</a></div></footer></body>
</html>
9 changes: 9 additions & 0 deletions docs/mike/2024/index.html
@@ -0,0 +1,9 @@
<!DOCTYPE HTML><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Refresh" content="0; url=/mike/2024/08/maps-and-records.html">
<title>Redirect</title>
</head>
<body><a href="/mike/2024/08/maps-and-records.html">Redirect</a>.
</body>
</html>