From 6019279203373e76c95ba258e70811d861932ced Mon Sep 17 00:00:00 2001
From: Michael Kay
Date: Sun, 26 Jan 2025 12:21:47 +0000
Subject: [PATCH 1/2] WIP
---
specifications/grammar-40/grammar.dtd | 70 +-
specifications/grammar-40/xpath-grammar.xml | 688 ++++++--------------
style/assemble-spec.xsl | 8 +-
style/grammar2spec.xsl | 130 +---
4 files changed, 248 insertions(+), 648 deletions(-)
diff --git a/specifications/grammar-40/grammar.dtd b/specifications/grammar-40/grammar.dtd
index 788e19239..dbcf20312 100644
--- a/specifications/grammar-40/grammar.dtd
+++ b/specifications/grammar-40/grammar.dtd
@@ -6,16 +6,16 @@
or via an XSLT stylesheet or other transformation, may generate
a parser compiler specification such as for YACC or JavaCC.
-Norm and Scott moved this file, and added an explicit prefix, as part of the
-transition toward a unified build process for last call and beyond. This involved
-moving the location of the CVS repository, For earlier history information,
-see /WWW/XML/Group/xpath-query-src/grammar.dtd
+ In 2025 Michael Kay simplified the DTD to remove parts that were no
+ longer used or maintained.
=========================================================================-->
-
+
@@ -146,44 +146,44 @@ see /WWW/XML/Group/xpath-query-src/grammar.dtd
process-value (no | yes) #IMPLIED
>
-
+
-
+
+
-
+
-
+
-
+
-
+
-
+
-
+
-
+
-
+
-
+
-
+>-->
-
+
@@ -580,13 +580,13 @@ VersionDecl ::= "xquery" (("encoding" StringLiteral) | ("version" StringLiteral
-
+
+
+ declare
@@ -600,7 +600,7 @@ VersionDecl ::= "xquery" (("encoding" StringLiteral) | ("version" StringLiteral
- -->
+ updating
@@ -1408,7 +1408,180 @@ ErrorVal ::= "$" VarName
-
+
+
+
+ or
+
+
+
+
+
+
+
+ and
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ otherwise
+
+
+
+
+
+
+
+ ||
+
+
+
+
+
+
+
+ to
+
+
+
+
+
+
+
+
+ +
+ -
+
+
+
+
+
+
+
+
+
+ *
+ ×
+ div
+ ÷
+ idiv
+ mod
+
+
+
+
+
+
+
+
+
+ union
+ |
+
+
+
+
+
+
+
+
+
+ intersect
+ except
+
+
+
+
+
+
+
+
+ instance
+ of
+
+
+
+
+
+
+
+ treat
+ as
+
+
+
+
+
+
+
+ castable
+ as
+
+
+ ?
+
+
+
+
+
+
+
+ cast
+ as
+
+
+ ?
+
+
+
+
+
+
+
+ ->
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ -
+ +
+
+
+
+
+
+
+
+
+
+
=>
@@ -2121,22 +2294,21 @@ ErrorVal ::= "$" VarName
-
+
-
+ />
-
+ >
-
-
+ </
+
-
+ >
@@ -2145,11 +2317,11 @@ ErrorVal ::= "$" VarName
-
+
-
+ =
@@ -2217,9 +2389,9 @@ ErrorVal ::= "$" VarName
-
+ <!--
-
+ -->
@@ -2232,13 +2404,13 @@ ErrorVal ::= "$" VarName
-
+ <?
-
+ ?>
@@ -2248,9 +2420,9 @@ ErrorVal ::= "$" VarName
-
+ <![CDATA[
-
+ ]]>
@@ -2494,9 +2666,9 @@ ErrorVal ::= "$" VarName
-
+ ``[
-
+ ]``
@@ -2514,11 +2686,11 @@ ErrorVal ::= "$" VarName
-
+ `{
-
+ }`
@@ -2880,7 +3052,7 @@ ErrorVal ::= "$" VarName
-
+
-
+
+
@@ -3097,13 +3269,13 @@ ErrorVal ::= "$" VarName
-
+-->
-
-
+
+
withoutcontent
-
+ -->
@@ -4073,34 +4245,6 @@ ErrorVal ::= "$" VarName
-
-
-
-
-
- >
-
-
-
- />
-
-
-
- </
-
-
-
-
-
-
-
- >
-
-
-
- =
-
-
"
@@ -4125,14 +4269,6 @@ ErrorVal ::= "$" VarName
}}
-
- <!--
-
-
-
- -->
-
-
@@ -4142,38 +4278,7 @@ ErrorVal ::= "$" VarName
-
- <?
-
-
-
- ?>
-
-
-
- <![CDATA[
-
-
-
-
- ]]>
-
-
-
- ``[
-
-
-
- ]``
-
-
-
- `{
-
-
-
- }`
-
+
-
-
-
-
- This is not an actual state, but rather a collection of
- sub-terminals that are referenced by g:token rules.
- In the file that is generated for input to JavaCC,
- each becomes a "private regular expression".
- (It would be better to make this distinction
- in the g:token element.)
-
-
- No state change.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- XXX
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- The "(:" token marks the beginning of an expression
- Comment, and the ":)" token marks the end. This allows no special
- interpretation of other characters in this state.
-
-
-
-
-
-
-
- No state change.
-
-
-
-
-
-
-
- This state allows attributes in the native XML syntax,
- and marks the beginning of an element construction. Element
- constructors also push the current state, popping it at the
- conclusion of an end tag. In the START_TAG state, the string ">" is
- recognized as a token which is associated with the transition to
- the original state.
-
-
-
-
-
-
-
-
-
-
-
-
-
- No state change.
-
-
-
-
-
-
-
- This state allows content valid for attributes. The
- character "{" marks a transition to the OPERAND state, i.e. the
- start of an embedded expression, and the "}" character pops back to
- the original state. To allow curly braces to be used as character
- content, a double left or right curly brace is interpreted as a
- single curly brace character. This state is the same as
- APOS_ATTRIBUTE_CONTENT, except that apostrophes are allowed without
- escaping, and an unescaped quote marks the end of the
- state.
-
-
-
-
- Transition to an Attribute Value
- Template.
-
-
-
-
-
- No state change.
-
-
-
-
-
-
-
-
-
-
-
- This state is the same as QUOT_ATTRIBUTE_CONTENT, except
- that quotes are allowed, and an unescaped apostrophe marks the end
- of the state.
-
-
-
-
- Transition to an Attribute Value
- Template.
-
-
-
-
-
- No state change.
-
-
-
-
-
-
-
-
-
-
-
- This state allows XML-like content, without these
- characters being misinterpreted as expressions. The character "{"
- marks a transition to the OPERAND state, i.e. the start of an
- embedded expression, and the "}" character pops back to the
- ELEMENT_CONTENT state. To allow curly braces to be used as
- character content, a double left or right curly brace is
- interpreted as a single curly brace character. The string "</"
- is interpreted as the beginning of an end tag, which is associated
- with a transition to the END_TAG state.
-
-
-
-
- Transition to an Element Value
- Template.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- No state change.
-
-
-
-
-
-
-
-
-
-
- When the end tag is terminated, the state is popped to
- the state that was pushed at the start of the corresponding start
- tag.
-
-
-
-
- No state change.
-
-
-
-
-
-
-
-
- The "<--" token marks the beginning of an XML
- Comment, and the "-->" token marks the end. This allows no special
- interpretation of other characters in this state.
-
-
-
-
- No state change.
-
-
-
-
-
-
-
-
-
- In this state, only patterns that are valid in a
- processing instruction name are recognized.
-
-
-
-
-
-
-
- No state change.
-
-
-
-
-
-
- In this state, only characters are that are valid in
- processing instruction content are recognized.
-
-
-
-
- No state change.
-
-
-
-
-
-
-
- In this state, only lexemes that are valid in a CDATA
- section are recognized.
-
-
-
-
- No state change.
-
-
-
-
-
-
-
- This state is entered in a a pragma expression, and recognizes a
- QName that transits to a PRAGMA_3 state rather than a OPERATOR state.
-
-
-
-
-
- No state change.
-
-
-
-
-
- This state recognizes the space(s) required to preceed pragma contents. If you do not have
- this, and try to recognize S in PRAGMA_3, then Char will be recognized first,
- and the pragma production will not work properly.
-
-
-
-
-
-
-
-
-
- This state recognizes characters in pragma content, and transits out of this
- state when a “#)” pattern is recognized.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
-
+
diff --git a/style/assemble-spec.xsl b/style/assemble-spec.xsl
index c16355770..6ab07ab92 100644
--- a/style/assemble-spec.xsl
+++ b/style/assemble-spec.xsl
@@ -179,13 +179,7 @@
-
-
-
-
-
-
-
+
diff --git a/style/grammar2spec.xsl b/style/grammar2spec.xsl
index dc77c5654..20cfb60be 100644
--- a/style/grammar2spec.xsl
+++ b/style/grammar2spec.xsl
@@ -298,135 +298,7 @@
-
-
-
-
-
-
-
-
- state-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-