Commit bb8b822

resolved conflict in software list
2 parents 50b6b90 + aba3da0 commit bb8b822

6 files changed, +105 -51 lines changed

data/JTEI/14_2021-23/jtei-cc-pn-erjavec-195-source.xml

Lines changed: 54 additions & 41 deletions
@@ -207,10 +207,11 @@
 <div xml:id="schema">
 <head>The Parla-CLARIN Schema</head>
 <p>Parla-CLARIN is written as a TEI ODD document, consisting of the prose guidelines and
-the schema specification, on the basis of which it is possible, using the standard TEI
-XSLT stylesheets, to derive an XML schema expressed either as a RelaxNG schema, a DTD,
-or a W3C schema, which is then used for formal validations of a Parla-CLARIN
-parliamentary corpus.</p>
+the schema specification, on the basis of which it is possible, using the <ptr
+type="software" xml:id="R5" target="#teistylesheets"/><rs type="soft.name" ref="#R5"
+>standard TEI XSLT stylesheets</rs>, to derive an XML schema expressed either as a
+RelaxNG schema, a DTD, or a W3C schema, which is then used for formal validations of a
+Parla-CLARIN parliamentary corpus.</p>
 <p>While the proposal tries to cater for many encoding needs, it is possible that new
 users will have to use TEI elements or attributes that are not discussed in the prose
 guidelines. Since the recommendations are still under development, the formal schema
@@ -324,20 +325,22 @@
 <div xml:id="presentation">
 <head>Presentation of Parla-CLARIN</head>
 <p>Like the TEI Guidelines, the Parla-CLARIN recommendations are available on <ref
-target="https://github.com/clarin-eric/parla-clarin/"><ptr type="software"
-xml:id="GitHub" target="#GitHub"/><rs type="soft.name" ref="#GitHub"
->GitHub</rs></ref>, as a project<note>Tomaž Erjavec and Andrej Pančur, Parla-CLARIN
-project <ptr type="software" xml:id="GitHub" target="#GitHub"/><rs type="soft.name"
-ref="#GitHub">GitHub</rs> site, last updated March 17, 2021, <ptr
-target="https://github.com/clarin-eric/parla-clarin/"/>.</note> of the CLARIN ERIC
-collection. The project contains a folder for the schema (i.e., the Parla-CLARIN ODD
-document and XML schemas derived from it), a folder for the programs that convert the
-ODD into the XML schemas and to the HTML of the prose and schema definitions, and a
-folder for examples, which contains an artificial but fully worked out example of a
-Parla-CLARIN document and subfolders with various example resources, where each should
-contain: <list rend="ordered">
+target="https://github.com/clarin-eric/parla-clarin/"><ptr type="software" xml:id="R1"
+target="#GitHub"/><rs type="soft.name" ref="#R1">GitHub</rs></ref>, as a
+project<note>Tomaž Erjavec and Andrej Pančur, Parla-CLARIN project <ptr
+type="software" xml:id="R2" target="#GitHub"/><rs type="soft.name" ref="#R2"
+>GitHub</rs> site, last updated March 17, 2021, <ptr type="software" xml:id="R9"
+target="#parlaclarinscripts"/><rs type="soft.url" ref="#R9"><ptr
+target="https://github.com/clarin-eric/parla-clarin/"/></rs>.</note> of the CLARIN
+ERIC collection. The project contains a folder for the schema (i.e., the Parla-CLARIN
+ODD document and XML schemas derived from it), a folder for the <rs type="soft.name"
+ref="#R9">programs that convert the ODD into the XML schemas and to the HTML of the
+prose and schema definitions</rs>, and a folder for examples, which contains an
+artificial but fully worked out example of a Parla-CLARIN document and subfolders with
+various example resources, where each should contain: <list rend="ordered">
 <item>a sample of a corpus in its source encoding;</item>
-<item>XSLT script to convert it into Parla-CLARIN; and</item>
+<item><rs type="soft.name" ref="#R9">XSLT script to convert it into Parla-CLARIN</rs>;
+and</item>
 <item>the output of the conversion.</item>
 </list>
 </p>
@@ -495,12 +498,15 @@
 <p>Nevertheless, AKN is an important schema for modeling parliamentary proceedings,
 especially as the primary encoding standard used by various legislative bodies, so some
 of AKN’s solutions were used in developing the Parla-CLARIN proposal, in particular the
-typology of divisions of a document. Also developed was a partial, but non-trivial,
-conversion from AKN to Parla-CLARIN, which covers several AKN example documents. As
-mentioned in <ptr type="crossref" target="#presentation"/>, the example documents and
-conversion script can be found in the <ident>Examples</ident> folder of the Parla-CLARIN
-Git repository. The <ident>akn2tei.xsl</ident> script attempts to preserve the IDs of
-the source AKN document, converts the AKN addressee, role, and questions and answers to
+typology of divisions of a document. Also developed was a partial, but non-trivial, <ptr
+type="software" xml:id="R10" target="#parlaclarinscripts"/><rs type="soft.name"
+ref="#R10">conversion from AKN to Parla-CLARIN</rs>, which covers several AKN example
+documents. As mentioned in <ptr type="crossref" target="#presentation"/>, the example
+documents and conversion script can be found in the <ident>Examples</ident> folder of
+the Parla-CLARIN Git repository. The <ptr type="software" xml:id="R11"
+target="#parlaclarinscripts"/><rs type="soft.name" ref="#R11"
+><ident>akn2tei.xsl</ident></rs> script attempts to preserve the IDs of the source
+AKN document, converts the AKN addressee, role, and questions and answers to
 Parla-CLARIN, and maps FRBR data (which distinguishes a <soCalled>work</soCalled> from
 its <soCalled>expression</soCalled> and its expression from its
 <soCalled>manifestation</soCalled>) to the appropriate TEI elements and attributes.
@@ -572,9 +578,10 @@
 parliamentary proceedings meant for scholarly investigations. This scheme is currently a
 straightforward customization of the TEI Guidelines, with the majority of the effort
 having gone into the writing of the prose guidelines of the Parla-CLARIN recommendations
-and into developing the conversion from Akoma Ntoso to Parla-CLARIN. We have not included
-examples of the encoding, as these are readily available on the <ptr type="software"
-xml:id="GitHub" target="#GitHub"/><rs type="soft.name" ref="#GitHub">GitHub</rs>
+and into developing the <ptr type="software" xml:id="R12" target="#parlaclarinscripts"
+/><rs type="soft.name" ref="#R12">conversion from Akoma Ntoso to Parla-CLARIN</rs>. We
+have not included examples of the encoding, as these are readily available on the <ptr
+type="software" xml:id="R3" target="#GitHub"/><rs type="soft.name" ref="#R3">GitHub</rs>
 documentation page of the project, and large Parla-CLARIN encoded corpora are openly
 available.</p>
 <p>Apart from the siParl 2.0 corpus mentioned above (<ptr type="crossref"
@@ -601,15 +608,21 @@
 <p>As we wanted to have corpora that are not only interchangeable but interoperable as well,
 we created a bespoke ParlaMint XML schema directly in RelaxNG – the schema is compatible
 with Parla-CLARIN as it validates a subset of documents that would be validated against
-Parla-CLARIN. We produced common scripts that can convert any of the four corpora to plain
-text, to CoNLL-U format as used by the Universal Dependencies project, and to vertical
-format as used by the <ref target="http://cwb.sourceforge.net/">CWB</ref><note>The IMS
-Open Corpus Workbench (CWB), last modified March 30, 2021, <ptr
-target="http://cwb.sourceforge.net/"/>.</note> and <ref
-target="http://www.sketchengine.eu/">Sketch Engine</ref><note>Accessed January 13, 2022,
-<ptr target="http://www.sketchengine.eu/"/>.</note> (<ref type="bibl"
-target="#kilgarriff14">Kilgarriff et al. 2014</ref>) concordancers, as well as to
-extract complete speech metadata into TSV files.</p>
+Parla-CLARIN. We produced <ptr type="software" xml:id="R13" target="#parlaclarinscripts"
+/><rs type="soft.url" ref="#R13">common scripts that can convert any of the four corpora
+to plain text, to CoNLL-U format as used by the Universal Dependencies project, and to
+vertical format as used by the <ptr type="software" xml:id="R14" target="#cwb"/><rs
+type="soft.url" ref="#R14"><ref target="http://cwb.sourceforge.net/"
+>CWB</ref></rs></rs><note>The <rs type="soft.name" ref="#R14">IMS Open Corpus Workbench
+(CWB)</rs>, last modified March 30, 2021, <rs type="soft.url" ref="#R14"><ptr
+target="http://cwb.sourceforge.net/"/></rs>.</note> and <ptr type="software"
+xml:id="R15" target="#sketchengine"/><rs type="soft.url" ref="#R15"><ref
+target="http://www.sketchengine.eu/"><rs type="soft.name" ref="#R15">Sketch
+Engine</rs></ref></rs><note>Accessed January 13, 2022, <rs type="soft.url"
+ref="#R15"><ptr target="http://www.sketchengine.eu/"/></rs>.</note> (<rs
+type="soft.bib.ref" ref="#R15"><ref type="bibl" target="#kilgarriff14">Kilgarriff et al.
+2014</ref></rs>) concordancers, as well as to extract complete speech metadata into
+TSV files.</p>
 <p>In order for Parla-CLARIN to achieve its goal of becoming a widely recognized encoding
 format for corpora of parliamentary proceedings, significant work remains to be done. On
 the basis of the lessons learned in creating ParlaMint, we plan to revise the prose
@@ -619,10 +632,10 @@
 specification from the default ones in the TEI Guidelines to ones taken or adapted from
 the collected parliamentary corpora.</p>
 <p>Second, as we have already done for ParlaMint, we plan to add to the <ptr type="software"
-xml:id="GitHub" target="#GitHub"/><rs type="soft.name" ref="#GitHub">GitHub</rs>
-Parla-CLARIN project more down-conversion scripts with which we would increase the
-usability of the Parla-CLARIN corpora. As mentioned, work also needs to be done to develop
-a conversion to RDF.</p>
+xml:id="R4" target="#GitHub"/><rs type="soft.name" ref="#R4">GitHub</rs> Parla-CLARIN
+project more down-conversion scripts with which we would increase the usability of the
+Parla-CLARIN corpora. As mentioned, work also needs to be done to develop a conversion to
+RDF.</p>
 <p>Last, but not least, one of the great benefits of Git is the ability to support
 collaborative work, be it through posting issues, or through using pull requests to
 incorporate changes. While the community has not so far made use of these options, we hope
@@ -790,8 +803,8 @@
 <bibl xml:id="kilgarriff14"><author>Kilgarriff, Adam</author>, <author>Vít Baisa</author>,
 <author>Jan Bušta</author>, <author>Miloš Jakubíček</author>, <author>Vojtěch
 Kovář</author>, <author>Jan Michelfeit</author>, <author>Pavel Rychlý</author>, and
-<author>Vít Suchomel</author>. <date>2014</date>. <title level="a">The Sketch Engine:
-Ten Years On.</title>
+<author>Vít Suchomel</author>. <rs type="soft.bib.ref" ref="ewfew"><date>2014</date>.
+<title level="a">The Sketch Engine: Ten Years On.</title></rs>
 <title level="j">Lexicography: Journal of ASIALEX</title>
 <biblScope unit="volume">1</biblScope> (<biblScope unit="issue">1</biblScope>):
 <biblScope unit="page">7–36</biblScope>. doi:<idno type="DOI"

schema/tei_jtei_annotated.odd

Lines changed: 7 additions & 1 deletion
@@ -2277,7 +2277,9 @@
 <valItem mode="add" ident="#webpack"/>
 <valItem mode="add" ident="#elem"/>
 <valItem mode="add" ident="#literaturetranslation"/>
+<valItem mode="add" ident="#teitok"/>
 <valItem mode="add" ident="#github"/>
+<valItem mode="add" ident="#githubpages"/>
 <valItem mode="add" ident="#leaflet"/>
 <valItem mode="add" ident="#ugarit"/>
 <valItem mode="add" ident="#smartcompose"/>
@@ -2384,6 +2386,9 @@
 <valItem mode="add" ident="#azurecloud"/>
 <valItem mode="add" ident="#gate"/>
 <valItem mode="add" ident="#r"/>
+<valItem mode="add" ident="#textualcommunities"/>
+<valItem mode="add" ident="#visualstudiocode"/>
+<valItem mode="add" ident="#scholarlyxml"/>
 <valItem mode="add" ident="#igraph"/>
 <valItem mode="add" ident="#textal"/>
 <valItem mode="add" ident="#planthumanitiesworkbench"/>
@@ -2422,7 +2427,6 @@
 <valItem mode="add" ident="#eppt"/>
 <valItem mode="add" ident="#elwoodviewer"/>
 <valItem mode="add" ident="#evt"/>
-<valItem mode="add" ident="#boilerplate"/>
 <valItem mode="add" ident="#tei2html"/>
 <valItem mode="add" ident="#basex"/>
 <valItem mode="add" ident="#tipuesearch"/>
@@ -2447,6 +2451,7 @@
 <valItem mode="add" ident="#ediarum"/>
 <valItem mode="add" ident="#transkribus"/>
 <valItem mode="add" ident="#xproc"/>
+<valItem mode="add" ident="#ceteicean"/>
 <valItem mode="add" ident="#imagemarkuptool"/>
 <valItem mode="add" ident="#tustep"/>
 <valItem mode="add" ident="#netbeans"/>
@@ -2481,6 +2486,7 @@
 <valItem mode="add" ident="#teipelican"/>
 <valItem mode="add" ident="#odd2odd"/>
 <valItem mode="add" ident="#zenodo"/>
+<valItem mode="add" ident="#parlaclarinscripts"/>
 </valList>
 <dataRef name="anyURI" restriction="http.+|#.+|@.+|hdl.+|mailto.+"/>
 </alternate>

schema/tei_jtei_annotated.rng

Lines changed: 15 additions & 3 deletions
@@ -5,7 +5,7 @@
 xmlns="http://relaxng.org/ns/structure/1.0"
 datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
 ns="http://www.tei-c.org/ns/1.0"><!--
-Schema generated from ODD source 2024-02-20T13:55:46Z. 2014.
+Schema generated from ODD source 2024-02-20T15:55:29Z. 2014.
 TEI Edition: P5 Version 4.7.0. Last updated on 16th November 2023, revision e5dd73ed0
 TEI Edition Location: https://www.tei-c.org/Vault/P5/4.7.0/
 
@@ -2745,8 +2745,12 @@ attributes @target and @cRef may be supplied on <name/>.</report>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#literaturetranslation</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+<value>#teitok</value>
+<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#github</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+<value>#githubpages</value>
+<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#leaflet</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#ugarit</value>
@@ -2959,6 +2963,12 @@ attributes @target and @cRef may be supplied on <name/>.</report>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#r</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+<value>#textualcommunities</value>
+<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+<value>#visualstudiocode</value>
+<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+<value>#scholarlyxml</value>
+<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#igraph</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#textal</value>
@@ -3035,8 +3045,6 @@ attributes @target and @cRef may be supplied on <name/>.</report>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#evt</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
-<value>#boilerplate</value>
-<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#tei2html</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#basex</value>
@@ -3085,6 +3093,8 @@ attributes @target and @cRef may be supplied on <name/>.</report>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#xproc</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+<value>#ceteicean</value>
+<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#imagemarkuptool</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#tustep</value>
@@ -3153,6 +3163,8 @@ attributes @target and @cRef may be supplied on <name/>.</report>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 <value>#zenodo</value>
 <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+<value>#parlaclarinscripts</value>
+<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
 </choice>
 <data type="anyURI">
 <param name="pattern">http.+|#.+|@.+|hdl.+|mailto.+</param>
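
Review note: the annotations in this commit pair each <ptr type="software" xml:id="R…"/> with one or more <rs type="soft.*" ref="#R…"> elements, so a quick consistency check is to confirm that every such ref resolves to a software ptr. The following is a minimal sketch of that check in Python, not part of the repository's tooling; the function names and the sample fragment are hypothetical, and the assumption that a valid ref is "#" plus the xml:id of a software ptr is inferred from the pattern in the hunks above. Under that assumption, the value ref="ewfew" in the @@ -790,8 +803,8 @@ hunk would be reported as dangling.

```python
import xml.etree.ElementTree as ET

# Clark-notation name that ElementTree uses for the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(element):
    """Return the tag name without any namespace part."""
    return element.tag.rsplit("}", 1)[-1]


def dangling_rs_refs(xml_text):
    """List @ref values on <rs type="soft.*"> that match no software <ptr> xml:id."""
    root = ET.fromstring(xml_text)
    elements = list(root.iter())
    known_ids = {
        el.get(XML_ID)
        for el in elements
        if local_name(el) == "ptr" and el.get("type") == "software"
    }
    dangling = []
    for el in elements:
        if local_name(el) != "rs" or not el.get("type", "").startswith("soft."):
            continue
        ref = el.get("ref", "")
        # Assumed rule: a valid reference is "#" followed by a known xml:id.
        if not ref.startswith("#") or ref[1:] not in known_ids:
            dangling.append(ref)
    return dangling


# Hypothetical fragment modelled on the annotation pattern in this diff.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <ptr type="software" xml:id="R5" target="#teistylesheets"/>
  <rs type="soft.name" ref="#R5">standard TEI XSLT stylesheets</rs>
  <rs type="soft.bib.ref" ref="ewfew">The Sketch Engine: Ten Years On.</rs>
</p>"""

print(dangling_rs_refs(SAMPLE))  # ['ewfew']
```

The check only looks at rs elements whose type starts with "soft.", so ordinary cross-references such as <ptr type="crossref"/> are ignored.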
