This project provides XSLT stylesheets for those endpoints of Distributed Text Services (DTS) that can be implemented generically based on evaluating <citeStructure>.
- navigation endpoint
- document endpoint
The other endpoints are not targeted by this project, but recommendations for them are given at the end of this document.
Implemented version: 1.0rc1
Query parameters for the endpoints are supported through stylesheet parameters:
parameter | navigation | document |
---|---|---|
resource | ✅ ³ | ✅ ³ |
ref | ✅ | ✅ |
start | ✅ | ✅ |
end | ✅ | ✅ |
down | ✅ | not used |
tree | ✅ | ✅ |
page | ❌ | not used |
mediaType | not used | ✅ ¹ |
Evaluated TEI elements:
element | navigation | document |
---|---|---|
<refsDecl> | ✅ | ✅ |
<citeStructure> | ✅ | ✅ |
<citeData> | ✅ ² | not used |
Notes
1. see the section about mediaType
2. supported, but dcterms do not yet come out as a map as shown in the specification's examples
3. for the mapping of the values of resource to document URIs, see the resource section
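Because query parameters are passed on as stylesheet parameters of the same name, a web service mainly has to turn the query string into name=value arguments. The following is a minimal sketch under stated assumptions: the helper name query_to_params is hypothetical and not part of this project, percent-decoding is left out, and the accepted names are copied from the parameter table above (page is omitted because it is not supported). The sketch does not distinguish between the navigation and the document endpoint.

```shell
#!/bin/sh
# Hypothetical helper (not part of this project): print one name=value
# argument per line for every known DTS query parameter in a query string.
# Unknown parameters are dropped. Values are NOT percent-decoded here.
query_to_params() {
    old_ifs=$IFS
    IFS='&'
    set -f    # disable globbing while the query string is split at "&"
    for pair in $1; do
        case $pair in
            resource=*|ref=*|start=*|end=*|down=*|tree=*|mediaType=*)
                printf '%s\n' "$pair" ;;
        esac
    done
    set +f
    IFS=$old_ifs
}

# Prints "resource=test/matt.xml" and "ref=1"; "debug=yes" is dropped.
query_to_params 'resource=test/matt.xml&ref=1&debug=yes'
```

The printed arguments can then be appended to a transformation call such as the ones shown in the following sections. A real service would also have to decode and validate the values first.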
The xsl/navigation.xsl XSLT package generates the dts:Navigation JSON-LD object as required by the navigation endpoint. Members are generated by evaluating tei:citeStructure elements in the processed TEI document.
xsl/navigation.xsl can either be applied to a TEI source document, e.g. test/matt.xml:
$SAXON_CMD -config:saxon-local.xml -xsl:xsl/navigation.xsl -s:test/matt.xml
... or it can be called with an initial template (the default initial template xsl:initial-template), where the source URL is then passed as the resource stylesheet parameter:
$SAXON_CMD -config:saxon-local.xml -xsl:xsl/navigation.xsl -it resource=test/matt.xml
When a source document is processed, the resource stylesheet parameter can be used to set the source's URI in multiple properties of the JSON-LD output.
The xsl/document.xsl XSLT package implements the delivery of a TEI document, either in full or in part.
Just like xsl/navigation.xsl, xsl/document.xsl can be applied to a source document (where the resource parameter can be used to reset the resource identifier):
$SAXON_CMD -config:saxon-local.xml -xsl:xsl/document.xsl -s:test/matt.xml
... or it can be called with the default initial template:
$SAXON_CMD -config:saxon-local.xml -xsl:xsl/document.xsl -it resource=test/matt.xml
To deliver only a part of a document, pass the tree, start and end parameters:
$SAXON_CMD -config:saxon.he.xml -xsl:xsl/document.xsl -s:test/john.xml tree=page-hateoas start=p.1 end=p.1.end
This selects the content of the first page of test/john.xml, i.e. the nodes from <pb n="1"/> to the last node before <pb n="2"/>:
<?xml version="1.0" encoding="UTF-8"?><TEI xmlns="http://www.tei-c.org/ns/1.0"><dts:wrapper xmlns:dts="https://w3id.org/api/dts#"><pb n="1"/>
<head>The book of John</head>
<milestone unit="theme" xml:id="creation-start"/>
<l n="1">In the beginning was the Word, and the Word was with God, and the Word was
God.</l>
<l n="2">He was with God in the beginning.</l>
<l n="3">Through him all things were made; without him nothing was made that has been
made.</l>
In him was life, and that life was the light</dts:wrapper></TEI>
The output is well-formed and contains the nodes (trees) from the node identified by the start parameter through the node identified by the end parameter. More about cutting out text based on milestone-like markup is written in the project's Wiki.
If you have Saxon HE at hand, simply use it as follows.
- Download a released zip package of the project (the packages are available as release assets) and unpack it:
unzip dts-transformations-VERSION-package.zip
- Set up the class path for Saxon:
export SAXON_CMD="java -cp ... net.sf.saxon.Transform"
- Transform:
$SAXON_CMD -config:dts-transformations/saxon.he.xml -xsl:dts-transformations/xsl/navigation.xsl -s:YOUR_TEI.xml
You can install the transformations bundled in an Oxygen framework. The framework works on top of the TEI P5 framework, and its transformation scenarios support you in writing cite structure declarations with <refsDecl> and <citeStructure> elements. To install the framework, open Help > Install new add-ons... and put the following URL into the dialog box:
https://scdh.github.io/dts-transformations/descriptor.xml
There is a detailed installation guide in the Wiki.
Errors may occur on older versions of Oxygen, see Issue 10. Consider installing a plugin with a newer version of Saxon.
You can also clone this repo and set up and use its convenient tooling like so:
Setup:
# git clone ...
cd dts-transformations
./mvnw package # sets up tooling
Besides a wrapper script for Saxon-HE under target/bin/xslt.sh, this also provides you with Apache Jena RIOT under target/bin/riot.sh and the command line interface of Titanium JSON-LD under target/bin/ld-cli.
Transforming:
target/bin/xslt.sh -config:saxon.he.xml -xsl:xsl/navigation.xsl -s:test/matt.xml
Other RDF serializations (e.g. expanded JSON-LD):
target/bin/xslt.sh -config:saxon.he.xml -xsl:xsl/navigation.xsl -s:test/matt.xml | target/bin/ld-cli expand -op
To provide DTS endpoints, the XSL transformations from this package need to be deployed in a web service. There are several options and we will publish a PoC for a deployment very soon.
You can use the initial templates of xsl/navigation.xsl and xsl/document.xsl to get the document via the resource parameter. You can use URIs as resource identifiers, or you can override dts:resource-uri#0 from xsl/resource.xsl to map arbitrary resource identifiers to document locations.
The value coming in via the resource parameter must somehow be mapped to a document URI (at least when calling the initial template). There is a mapping function that can easily be replaced. It is called dts:resource-uri#1 and is defined in xsl/resource.xsl. This package can be replaced with one that suits your needs via the Saxon configuration file.
To add custom citation tree constructions not based on <citeStructure>, you can add templates to the citationTrees mode defined in xsl/tree.xsl. It is initiated on every refsDecl and is first called on self::refsDecl. It runs in shallow-skip mode.
To get the HTTP status codes that the DTS specification prescribes for certain errors, the static parameters in xsl/errors.xsl can be used. They define error codes that a web service can catch and then return HTTP status codes accordingly.
Please note that the XSLT uses <xsl:assert> in some places, which does not throw errors by default; the XSLT processor needs to be configured to do so. Saxon HE can be told to enable assertions by the -ea:on command line switch or by the /configuration/xslt/@enableAssertions configuration file option.
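How a service might combine the two points above (error codes and enabled assertions) can be sketched in a few lines of shell. This is an illustration under assumptions, not the project's deployment: the helper names are hypothetical, the two error-code names are placeholders for whatever the static parameters in xsl/errors.xsl are set to, and 400 and 404 are only example statuses. The sketch prints just the status and discards the transformation result.

```shell
#!/bin/sh
# Placeholder error-code names (assumptions); use the values that the static
# parameters in xsl/errors.xsl define in your deployment.
BAD_REQUEST_CODE=${BAD_REQUEST_CODE:-BadRequest}
NOT_FOUND_CODE=${NOT_FOUND_CODE:-NotFound}

# Map the error output of a failed transformation to an HTTP status code.
http_status_for() {
    case $1 in
        *"$BAD_REQUEST_CODE"*) echo 400 ;;
        *"$NOT_FOUND_CODE"*)   echo 404 ;;
        *)                     echo 500 ;;
    esac
}

# Run a transformation with assertions enabled (-ea:on) and print the HTTP
# status a service could return. Only stderr is captured; stdout is discarded.
dts_status() {
    if err=$($SAXON_CMD -ea:on "$@" 2>&1 >/dev/null); then
        echo 200
    else
        http_status_for "$err"
    fi
}
```

With SAXON_CMD set up as described in the setup section, a call could look like `dts_status -config:saxon.he.xml -xsl:xsl/document.xsl -s:test/john.xml start=p.1 end=p.1.end`.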
Processing of the mediaType parameter is a matter of post-processing the result of applying xsl/document.xsl. It is thus up to customization. There are several approaches:
- chaining the output of xsl/document.xsl to another transformation which evaluates the mediaType parameter
- importing parts of xsl/document.xsl in a third stylesheet that processes mediaType
- compile-time customization of xsl/document.xsl through its static parameters, which determine a media-type-package, its version, and how it is called for processing mediaType
The first option is the most straightforward, but may have a downside: the source-document context of the nodes will probably be lost during the post-processing phase. The other approaches benefit from the fact that the nodes returned by the two dts:cut-...#1 functions in xsl/document.xsl are still in the context of the source document (node identity). So you can probably use your existing stylesheets for getting HTML, plain text, LaTeX, etc., even for parts of your documents.
For the third option, see the example post-proc-(apply|call|fun).xsl packages in the test folder.
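The first option (chaining) needs no changes to the stylesheets and can be sketched as a shell pipeline. The helper name and the post-processing stylesheet are hypothetical; the sketch assumes that SAXON_CMD is set up as described in the setup section and relies on Saxon reading the source from standard input when given -s:-.

```shell
#!/bin/sh
# Hypothetical helper: cut a passage with xsl/document.xsl and pipe it into a
# second, user-supplied stylesheet that evaluates the mediaType parameter.
# Usage: dts_document_as MEDIA_TYPE POSTPROC_XSL SOURCE [name=value ...]
dts_document_as() {
    media_type=$1
    postproc_xsl=$2
    source_doc=$3
    shift 3
    # Stage 1 cuts the passage; the remaining arguments are stylesheet
    # parameters such as ref, start, end or tree.
    # Stage 2 reads stage 1's output from standard input (-s:-).
    $SAXON_CMD -config:saxon.he.xml -xsl:xsl/document.xsl "-s:$source_doc" "$@" |
        $SAXON_CMD "-xsl:$postproc_xsl" -s:- "mediaType=$media_type"
}
```

A call could look like `dts_document_as text/html my-html.xsl test/john.xml tree=page-hateoas start=p.1 end=p.1.end`, where my-html.xsl stands for your own stylesheet. As noted above, the second stage only sees the cut-out fragment, not the source document.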
URI templates, which are required for the output of the dts:Resource LOD object, must of course be adaptable to specific project needs.
The adaptation can be done by providing a custom XSLT package to xsl/navigation.xsl through its static parameters uri-template-package and uri-template-package-version. An implementation must expose two functions:
dts:uri-template-map-entries($resource as document-node(), $parameters as map(xs:string, item()*)) as item()*
dts:navigation-uri($resource as document-node(), $parameters as map(xs:string, item()*)) as xs:anyURI?
They get the resource document and the query parameters for maximum flexibility. The first function must return a sequence of <xsl:map-entry> elements.
The xsl/uri-templates/ folder offers different implementations.
xsl/navigation.xsl offers customization points for adding metadata and other LOD properties to the member objects.
- The mode member-metadata can be used to add additional elements to the intermediate <dts:member> elements. The mode is called for each of the source's nodes (forests) selected by a citeStructure/@match. This mode does not contain any templates other than the default shallow-skip-like ones.
- The function dts:member-metadata-json#1 can be used to access these additional elements in order to output additional LOD properties to the member objects.
The value of the JSON-LD @context property can be configured through the context parameters in xsl/dts.xsl.
The JSON-LD output has a guaranteed order only where order matters: in arrays. The members array is in document order.
The order of object properties does not carry any information and
there are no guarantees about it. So the @context
property of the
root object may occur as the first or the last property or somewhere
in the middle.
Saxon's JSON serializer by default escapes slashes with backslashes. If this matters, first think about configuring the serializer: there's an escape-solidus option.
The collection endpoint is not targeted by this project. We recommend first extracting an RDF-based knowledge graph from your set of documents using xtriples-micro and then using SPARQL and JSON-LD Framing to generate the collection objects from it. We have documented this approach in the xtriples-micro Wiki.
The entry endpoint and its use of URI templates is really the killer feature of DTS. Do not underestimate it! It even allows you to have different base URLs for the different endpoints, and it can serve as an extensible service registry for your edition. Imagine serving collection from a static web server like GitHub Pages and having a single generic service for navigation and document with a different base URL that serves these endpoints for multiple editions or even a whole community. There's no generic solution for the entry endpoint.
Contributions of all kinds are welcome. Please see the contributing guide.
There's also a Wiki which lives from community content.
MIT