
Entering Inscriptions for the "Inscriptions of Israel Palestine Project"

Elli Mylonas edited this page Jan 15, 2020 · 12 revisions

IIP inscriptions are encoded using EpiDoc, a set of guidelines and an XML (eXtensible Markup Language) schema for encoding inscriptions (see: https://sourceforge.net/p/epidoc/wiki/Home/ ; you may also have a look here: http://www.stoa.org/wordpress/wp-content/uploads/2010/09/Chapter05_EpiDoc_Bodard.pdf). The Oxygen program is XML aware and is designed to make editing XML easier. To help encoders work more quickly, we have also customized the Oxygen interface so that they do not have to deal with all the details of XML and can concentrate, instead, on entering accurate information. You will still see various artifacts of the underlying schema.

The first step in entering or editing any inscription file is to make Oxygen aware of our schema and load the template that drives the interface. To do this, follow the instructions on the page "customizing..."

When you first open the template, you will see some shaded areas and a lot of red lines under the text and on the right. The lines indicate that the underlying XML isn’t correct. As you start adding information, the XML in the file will be more correct, and the red lines will disappear.

At the top of the Author screen, you will see a variety of icons. You will be able to work most easily if you set the interface not to display any of the underlying tags. To do this, click and hold the tags icon (the second icon from the left on the icon bar, with thick arrows) and select “No Tags” from the menu that appears.

[Image: Tags menu]

You are now ready to begin editing.

Note that you may need a different template when entering Hebrew and Aramaic (or other right-to-left, rtl) inscriptions. To enter an rtl inscription, open the iip_template-Hebrew file in the Template folder in the Oxygen project folder. To avoid corrupting the template, immediately use “Save As” to save the file under the inscription's file name. You can, of course, also work from a renamed copy of an rtl inscription that has already been entered. For more information on entering texts in rtl, see below.

Identifiers

  1. Enter the machine readable inscription name, which is the same as the file name, in the box next to the title. The machine readable ID consists of 4 lowercase letters indicating the location of the inscription followed by 4 digits for the inscription number. Pad with leading zeros if necessary. Example: zoor was used for Zoora, and 0002 indicates inscription number 2, so zoor0002 is the machine readable ID. It will also be the file name (zoor0002.xml).

  2. Enter the Display ID in the Identifiers box. The Display ID is the more human readable version of the machine readable ID. Capitalize the first letter, and put a space between the place and the number. Example: Zoor 0002.
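
The two identifier rules above can be checked mechanically. The following is a minimal sketch in Python, not part of the IIP tooling; the function names `make_machine_id` and `display_id` are hypothetical and chosen only for illustration.

```python
import re

# Rule from the steps above: 4 lowercase letters followed by 4 digits.
MACHINE_ID = re.compile(r"[a-z]{4}[0-9]{4}")


def make_machine_id(place_code: str, number: int) -> str:
    """Build a machine readable ID, padding the number with leading zeros."""
    machine_id = f"{place_code.lower()}{number:04d}"
    if not MACHINE_ID.fullmatch(machine_id):
        raise ValueError(f"not a valid machine readable ID: {machine_id!r}")
    return machine_id


def display_id(machine_id: str) -> str:
    """Derive the Display ID: capitalized place code, a space, then the number."""
    if not MACHINE_ID.fullmatch(machine_id):
        raise ValueError(f"not a valid machine readable ID: {machine_id!r}")
    return f"{machine_id[:4].capitalize()} {machine_id[4:]}"
```

For the example in the text, `make_machine_id("zoor", 2)` gives the ID used for the file name, and `display_id` turns that ID into the human readable form.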

Summary Box

Each inscription file has two major parts. The first part is information about the inscription (called the metadata), and the second contains the texts (transcriptions and translation). In text mode, you can see that the metadata goes up to line 150, at which point the texts appear. Most of the boxes in the Oxygen Author mode deal with the metadata.

  • Language

    In the Summary box, enter the language or languages used in the inscription using the pull-down menu. The present choices are Aramaic, Greek, Hebrew, and Latin. If you come across an inscription in another language, please bring it to our attention! We have space for two languages in the template. If you need to do something that the template does not accommodate (e.g., add a third language or delete a language if you made a mistake), go into Text mode in Oxygen, find the tag “textLang”, and make the adjustment there. You can, for example, add a further language code to the otherLangs="" value before the / (an XML element can carry only one otherLangs attribute, so separate multiple codes with a space), or delete the one that is there. There must always be a mainLang.

    We are developing a mechanism for entering unconventional scripts (e.g., Aramaic in proto-Jewish script) that involves using a hyphen between the language and the script (e.g., <textLang mainLang="arc-pj">).

  • Genre

    This is where you indicate what kind of text the inscription is. There are now seven options (plus “Other” and “Unknown”), although we have used more in the past and will most likely do so again in the future. We are working on developing a more comprehensive controlled vocabulary list for this field.

  • Summary description of form (in the <p> tag within the <msItem> element)

    The Summary description is a prose description of essential information about the inscription. Note that the information in this description is also found in other parts of the xml file – in those places it is indexed and searched, not here.

    The format of the summary description is:

    City, Date. Object. Genre. [Note: the value of "City" is the content of the <settlement>]

    Example: Zoora, April 4, 417 CE. Tombstone. Epitaph.

    Example: Jerusalem, North Talpiot, Peace Forest, First century CE. Ossuary. Funerary.

Physical Description Box

The Physical description (<physDesc>) focuses on the inscription as an object, not the text of the actual inscription. Much of the metadata is contained within this tag.

  • Type of Object (<objectDesc>).

    This is chosen from a pull-down menu. Note that this is not the type or genre of inscription, but of the object upon which it is inscribed (e.g. ossuary). These values are not a closed list; new values can be added as needed by the project manager(s).

  • Material (<supportDesc> within <objectDesc>).

    This pull-down menu operates much as the one above, with a two-level hierarchical controlled vocabulary. These values are not a closed list; new values can be added as needed by the project manager(s).

  • More Detail (<p> within <support>).

    Any prose that is relevant to the type of object or its material. This field is not currently displayed. We do not use it often, but feel free to use it if there is something exceptional about the object that it is helpful to share internally. Please comment only on the object and its material.

  • Dimensions (<dimensions>).

    All measurements for these three dimensions should be in centimeters. If no dimensions are indicated leave the field(s) blank. If the text you are using uses a different unit (e.g. millimeters), make sure to convert it into centimeters. If the object has dimensions that vary along a single side, or is uneven, enter the minimum and maximum separated by a dash.

    Example: <height>22-27</height>

    Example: <width>18</width>

  • Condition

    Indicate the condition of the inscription by selecting a value from the pull-down menu.

    • Complete > The object is complete (generally)

    • Complete.broken > We have all the parts, but the object is broken

    • Complete.intact > The object is complete and intact

    • Fragment.single > We have a single fragment

    • Fragments > We have a number of fragments

    • Fragments.contig > We have a number of fragments, and they are contiguous

    • Fragments.non_contig > We have a number of fragments, and they are not contiguous

  • More detail (<p> in <condition>)

    Prose details relating to the condition of the inscription. This field is not currently displayed.

  • Columns and Written Lines (attributes in <layout> in <layoutDesc>)

    Enter the number of columns and lines of the inscription.

  • More detail (<p> in <layout>).

    Prose details relating to the layout of the inscription. This field is not currently displayed.

  • How was this written (<handNote> within <handDesc>)

    Select from the pull-down menu indicating the means by which the inscription was made, e.g., chiseled or inset in a mosaic. Most inscriptions on stone or on ostraka are either inscribed or painted.

  • Average dimensions of the letters (another <handNote> within <handDesc>)

    Enter the average height of the letters, as a range of cm values in the "At Least" and "At Most" fields. If there is no variation, enter a single value in the "Quantity" field.

  • More detail (<p> in <handNote>)

    Prose description of any additional details relating to the writing. This field is not currently displayed.

  • Decorations (<decoDesc>)

    As with <handDesc>, each decoration is contained within a <decoNote>. The controlled vocabulary for this field is in development. Describe each decoration in prose as best you can (this goes in the <ab> element within the <decoNote>). Then, also in prose, describe where the decoration is on the object (<locus> within the <decoNote>). If there is more than one decoration, click the “New Decoration” button and enter the information. Repeat as necessary.
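
For the Dimensions field described earlier in this section, the conversion to centimeters and the minimum-maximum notation can be sketched as follows. This is only an illustration in Python under stated assumptions: the helper name `dimension_field` and the small unit table are hypothetical and are not part of the Oxygen template.

```python
from typing import Optional

# How many of each unit make up one centimeter (assumed: only cm and mm occur).
UNITS_PER_CM = {"cm": 1, "mm": 10}


def dimension_field(minimum: Optional[float] = None,
                    maximum: Optional[float] = None,
                    unit: str = "cm") -> str:
    """Return the text to enter in a dimension field, always in centimeters.

    A missing measurement gives an empty string (leave the field blank);
    a varying measurement gives "minimum-maximum".
    """
    if minimum is None:
        return ""
    divisor = UNITS_PER_CM[unit]  # raises KeyError for an unknown unit
    low = minimum / divisor
    if maximum is None or maximum == minimum:
        return f"{low:g}"
    high = maximum / divisor
    if high < low:
        low, high = high, low
    return f"{low:g}-{high:g}"
```

With the values from the examples, `dimension_field(22, 27)` corresponds to the height and `dimension_field(18)` to the width; a source that gives millimeters can be passed with `unit="mm"`.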

History and Provenance

This section describes the history of the inscription.

  • Is this a fake?

    Except in very rare circumstances, the value will remain “Genuine.”
    If you look at the XML, you will see an empty <rs> element if the inscription is genuine, and you will see <rs>Fake</rs> if the inscription is not genuine.

  • Place and Date of Origin (<origin>)

    These fields record where and when the inscription was first made and displayed. Sometimes there is doubt about this, which is indicated primarily in the prose description.

    • Dates: notBefore and notAfter (attributes in <date> element). These values will not be visible to users, but will be used for finding and sorting. Please use 4 digit dates (300 will be entered as 0300). If the date is BCE, then it will have a minus sign. Most dates are approximate, so if something is in the 1st half of the 2nd c. BCE then you will enter notBefore="-0200" and notAfter="-0150". If the date is exact, then both values will be the same.

    • Display Date (prose in <date> element)

      Enter the date as it should display in the text box, e.g. “1st half of 2nd c. BCE”.

    • Region, etc. (<placeName>)

This indicates the original site of the inscription, to the degree known, in a hierarchical fashion moving from broadest, region (<region>), to settlement (<settlement>), to two progressively more specific locations (<geogName> and <geogFeat>).

The spelling of the names generally follows those in Yoram Tsafrir, et al., Tabula Imperii Romani: Iudaea/Palaestina (Jerusalem: The Israel Academy of Sciences and Humanities, 1994). At times, however, we depart from the names here to use more popular and recognizable names. We are investigating linking these names to Pleiades.

Incorporation of geographical coordinates?

Current Location (<placeName> in <provenance>)

In prose, whatever is known about the current location of the inscription.

If in original location, use in situ. If the inventory number is available, include it.
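
The notBefore and notAfter conventions described in this section (4 digits, minus sign for BCE, identical values for an exact date) can be sketched in Python as follows. The function names are hypothetical, and rejecting year 0 is an assumption, since the text above does not define it.

```python
from typing import Optional


def year_attr(year: int) -> str:
    """Format a year as 4 digits; BCE years are negative and keep a minus sign."""
    if year == 0:
        # Assumption: the guidelines above do not define a year 0, so refuse it.
        raise ValueError("year 0 is not defined by these conventions")
    sign = "-" if year < 0 else ""
    return f"{sign}{abs(year):04d}"


def date_attrs(not_before: int, not_after: Optional[int] = None) -> str:
    """Return the attribute text for <date>; an exact date repeats the same year."""
    if not_after is None:
        not_after = not_before
    if not_before > not_after:
        raise ValueError("notBefore must not be later than notAfter")
    return f'notBefore="{year_attr(not_before)}" notAfter="{year_attr(not_after)}"'
```

For the first half of the 2nd c. BCE, `date_attrs(-200, -150)` reproduces the values given above; an exact date such as 417 CE is passed as a single year.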

Revision Log

This provides an internal history of who did what to the inscription. Each entry creates a new <revisionDesc> section. Add some brief info indicating what you changed, or that you created the file. You can add more lines to the revision log by clicking the “Add Change” button.

Images

At present, since we are focusing on entering the inscriptions and not the images, there is no section in Oxygen’s “Author” mode to enter images. There is, however, a section in the template (<facsimile>) that allows for linking an image if available.

Can we also indicate the source of the image?

Bibliography

Although this comes at the end of the template, when entering inscriptions it is best at this point to enter the bibliography. The full bibliography (usually found in the source you are using) should be entered. Follow these steps to do so:

  1. In your browser (you can open another window or tab), go to the IIP website, Bibliography, and then click on the bibliography link.

  2. Search for your citation(s).

    a. If the record is present, record the value in “Loc. In Archive” (IIP-xxx: please make sure that only three numbers appear after the dash) and see below.

  3. If we do not have a record for that citation, go back to the Bibliography page and click the “Add Citation” link. The username is: iip, the password: xxxxx. Then follow the instructions to add a new citation. Remember to record the “Loc. In Archive” value, which is automatically generated.

  4. Now, knowing the “Loc. In Archive” values, go back to the entry form that you are working on in Oxygen. Designate a unique Citation ID for each citation (b1, b2, b3); enter the Loc. In Archive ID and the appropriate page or item number (checking which value you are entering).

  5. For the case of titles in languages that use Latin letters, use the original title. In the case of non-Latin titles (e.g., Hebrew or Greek) use the English title that is indicated in the publication with a notation of language in the appropriate field. If there is no English translation, use the original script to copy the title.

  6. If you have more than 3 references, you can type a number (b4) into the box where the pulldown value displays.
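
The “Loc. In Archive” format (IIP- followed by exactly three digits) and the b1, b2, b3 numbering of Citation IDs from the steps above can be checked with a short Python sketch. Both function names are hypothetical and serve only as an illustration.

```python
import re
from typing import List

# "IIP-xxx": exactly three digits after the dash, as required in step 2.
LOC_IN_ARCHIVE = re.compile(r"IIP-[0-9]{3}")
# Citation IDs are "b" followed by a number without leading zeros.
CITATION_ID = re.compile(r"b([1-9][0-9]*)")


def is_valid_loc_in_archive(value: str) -> bool:
    """True if the value has the form IIP-xxx with exactly three digits."""
    return LOC_IN_ARCHIVE.fullmatch(value) is not None


def next_citation_id(existing: List[str]) -> str:
    """Return the next unused Citation ID after those already entered."""
    numbers = []
    for citation_id in existing:
        match = CITATION_ID.fullmatch(citation_id)
        if match is None:
            raise ValueError(f"not a Citation ID: {citation_id!r}")
        numbers.append(int(match.group(1)))
    return f"b{max(numbers, default=0) + 1}"
```

When the three default references are already in use, `next_citation_id(["b1", "b2", "b3"])` gives the value to type into the box in step 6.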

Encoding an Inscription-Transcription

    **N.B. More detailed and thorough information can be found here:**

    http://www.stoa.org/epidoc/gl/latest/

    http://www.stoa.org/epidoc/gl/dev/app-alltrans.html

    Use these links as your main reference tool as far as encoding is concerned.

    The actual transcriptions, translation (into English), and display note are found in the <text> element (which also contains the bibliographical links). Each is contained in its own <div> element. The diplomatic transcription is where the inscription is transcribed as written (e.g., there is no separation between words; all is in capital letters; no accents). The simple transcription is the inscription as rendered by an editor. The translation is, of course, simply the English translation. Note that it is entirely possible that one or more of these items do not appear in the printed source: publications of Hebrew inscriptions, for example, often contain a transcription but not a diplomatic transcription. Just include what is in the source.

    We tag each of these things differently. The diplomatic transcription is tagged only for features relating to the appearance of the text. For example, we would tag gaps, lines, uncertain readings, and deletions (or interlinear additions) original to the text. We would not note abbreviations, etc. In the transcription, we tag all of these things in addition to whatever the editor has done, such as filling in abbreviations, noting original spelling mistakes or missing or extraneous words, etc. Almost all published inscriptions typographically render their texts according to something known as the Leiden convention (see: http://papyri.info/docs/leiden_plus), in which editorial comments are represented with various punctuation symbols. We convert these typographical symbols into their equivalent elements (although they are rendered back according to the Leiden convention on display). For example, a printed version might say: A[B]C. We would transcribe that as: A<supplied reason="lost">B</supplied>C.
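
    The A[B]C conversion described above can be sketched in Python. This is only a rough illustration, not a complete Leiden converter: the function name `leiden_to_epidoc` is hypothetical, it handles only square brackets that contain letters (optionally followed by a question mark for an uncertain restoration), and it leaves everything else, such as gaps written [----] or [ca. 13], untouched for manual tagging.

```python
import re

# Any bracketed run without nested brackets, e.g. "[B]" or "[g?]".
BRACKETED = re.compile(r"\[([^\[\]]*)\]")


def _supplied(match: "re.Match[str]") -> str:
    content = match.group(1)
    uncertain = content.endswith("?")
    letters = content.rstrip("?")
    # Only restored letters (spaces allowed) become <supplied>; gaps such as
    # [----] or [ca. 13] are returned unchanged.
    if not letters.replace(" ", "").isalpha():
        return match.group(0)
    cert = ' cert="low"' if uncertain else ""
    return f'<supplied reason="lost"{cert}>{letters}</supplied>'


def leiden_to_epidoc(text: str) -> str:
    """Replace [restored letters] with <supplied reason="lost"> elements."""
    return BRACKETED.sub(_supplied, text)
```

    The result should always be checked against the printed source, since brackets can also mark gaps and other features that need different tags.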

    For the translation, we simply copy what is provided, with the typographical symbols as written. These are not generally tagged, except for line breaks where noted.

    Usually, the diplomatic transcription, transcription, translation, and commentary are from the same source. Occasionally, though, they are not. In either case you need to note the source for each by selecting a value in the Reference Citation pull-down menu. This menu will display existing b1, b2, etc. values. Often, there will only be one citation, and all information will derive from that one citation. In that case, the citation will be “b1,” and all Reference Citations will be “b1.” Clicking on the default b1 will bring up a checklist of all the citations you have entered for this inscription. Check one or more.

    To enter transcriptions, turn on Full Tags in Oxygen. The processes for entering rtl (e.g., Hebrew and Aramaic) inscriptions and Greek and Latin ones are slightly different.

    For Hebrew and Aramaic, bidirectional editing should be selected (usually when you start Oxygen you have this choice). You should be working from the iip_template-Hebrew (or the equivalent). The only difference between this and the other template is the inclusion of the appropriate element in the “diplomatic” and “transcription” elements: <foreign xml:lang="heb"></foreign>. This makes Oxygen aware that you are using Hebrew. You would then change the keyboard layout to Hebrew. Position the cursor between the “p” tags (i.e. within the box). Enter the text of the inscription. When a tag is necessary, press “Enter” and select from the available tags. Some tags (these are called XML elements) require attributes; for these, select the element, press alt-enter (Windows) or option-enter (Mac), and then enter the attribute value (from a pulldown list). You can look at the “Epidoc Cheat Sheet” (available in our DropBox) or consult Appendix 1 for a description of the tags and how we use them.

    For Greek: For a Windows machine, there are two ways to enter polytonic (accented) Greek. One way is to type the transcription in a Word document (in Unicode Greek) and then to paste it into the appropriate box in Oxygen. The downside of this approach is that one must then enter the tags around the words.

    A second approach is to change the keyboard layout to Greek. Accented characters are accessed via the Edit Menu, “Insert From Character Map.” An insert window appears. Select Arial Unicode MS. You can then scroll down to the accented Greek characters, highlight them, and press Insert. One potential problem with this approach is that it can sometimes take a while from the time you press Insert until the character actually appears.

    A few notes about Greek:

  • The lunate sigma (C) is not an English C, but Unicode hex U+03F2 (the capital is U+03F9). You should be able to insert this directly from the Greek keyboard with Option+s and Shift+Option+s (for the capital). We encode these in the diplomatic with the lunate sigma character, and then use standard sigmas in the transcription.

  • The koppa is U+03DE

  • Even if something looks like an apostrophe in the Greek transcription, it isn't. An apostrophe-like symbol is used to denote (among other things) that the letters are being used as numerals. Check the character palette for the Greek Numeral Sign (U+0374) and use that instead. Though a Greek year mark exists, we don't use it. Instead, we use the numeral sign (ʹ).

  • Greek has a final sigma and a sigma used in all other places throughout the word. Be sure to mark the final sigma where appropriate.

  • Greek inscriptions sometimes make use of a large lowercase omega (which looks like a curvy "W") in the place of a majuscule (that is, capital) Omega. In this instance, we do not encode the large lower case omega with the Cyrillic letter which resembles it! Instead, we encode a standard majuscule omega and record in the note that the inscription uses large minuscule omegas in place of their majuscule counterparts.
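
The notes above lend themselves to a simple automated check of a Greek-only transcription. The following Python sketch is hypothetical and deliberately rough: it assumes plain text without tags, flags every Latin or Cyrillic letter (so it is not suitable for bilingual texts), and treats any sigma followed by a non-letter as word-final.

```python
import unicodedata

MEDIAL_SIGMA = "\u03c3"  # σ
FINAL_SIGMA = "\u03c2"   # ς


def suspicious_characters(text: str):
    """Return (index, character, reason) for characters the notes above warn about."""
    findings = []
    for index, char in enumerate(text):
        name = unicodedata.name(char, "")
        next_char = text[index + 1] if index + 1 < len(text) else ""
        if char in ("'", "\u2019"):
            findings.append((index, char, "apostrophe: use the Greek numeral sign (U+0374)"))
        elif name.startswith("LATIN"):
            findings.append((index, char, "Latin letter (a lunate sigma is U+03F2/U+03F9, not C)"))
        elif name.startswith("CYRILLIC"):
            findings.append((index, char, "Cyrillic letter: encode a standard Greek omega"))
        elif char == MEDIAL_SIGMA and not next_char.isalpha():
            findings.append((index, char, "medial sigma at the end of a word"))
        elif char == FINAL_SIGMA and next_char.isalpha():
            findings.append((index, char, "final sigma inside a word"))
    return findings
```

An empty result does not prove the transcription is correct; it only means none of these particular look-alike characters were found.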

Important! Our program assumes a white space anytime you hit "Return." Therefore, make sure you don't hit return in the middle of a word; this is crucial for searching purposes. This means that even when an author has used soft hyphens to mark words split over two lines, you should stick to our line-break conventions and not enter the soft hyphens. If you have no idea where word breaks are, ask somebody!

The Notes box (<div type="commentary" ana="b1">) is where you include any vital information otherwise not included that you feel is important for the viewer to have. These notes will appear in the Inscription display. Generally these deal with details about the inscription and not commentary on its contents. Try to be concise.

Tagging

A summary of the most popular tags we use can be found in the Epidoc_cheatsheet (available through our DropBox and here: http://www.stoa.org/epidoc/gl/latest/). Below is a slightly more expansive version of this with some comments relevant to our encoding practices. Please also refer to the links provided above, repeated here for convenience: http://www.stoa.org/epidoc/gl/latest/

http://www.stoa.org/epidoc/gl/dev/app-alltrans.html

  • Abbreviation (example for Latin word "viro"):
A standard way to publish an abbreviation in an inscription is: v(iro) In XML we represent it as: <expan><abbr>v</abbr><ex>iro</ex></expan>.

    Used in transcription only.

  • Alternative Readings: In the case when the editor provides an alternative reading, encode like this: <app type="alternative"><lem> the first reading </lem> <rdg> the second possible reading</rdg></app>

  • Ambiguous characters with alternatives offered: use the following tagging

    <choice><unclear>α</unclear><unclear>β</unclear></choice>

    See here: http://www.stoa.org/epidoc/gl/latest/trans-ambiguousalt.html

  • Columns: In case one encounters an inscription which employs columns, there is a simple tag to render the columns correctly.

    Type the text in the order in which it is read. For example, an inscription whose columns are read horizontally looks as follows:

    | Thus shall be done | To the man |
    | Whom the King | Wishes to honor. |

    "Thus shall be done to the man whom the King wishes to honor." In XML, this inscription would be tagged as: <p>Thus shall be done<colSpace/>To the man<lb/>Whom the King<colSpace/>Wishes to honor.</p>

    An inscription read vertically, on the other hand, looks as follows and is tagged differently:


    | Thus shall be done | Whom the King |
    | To the man | Wishes to honor. |

    <p>Thus shall be done<lb/>To the man<lb/><colSpace/>Whom the King<lb/>Wishes to honor.</p>

This is for searchability purposes. As of now, our web page doesn't display columns, so be sure to describe how the columns look and are read in your Notes section.

  • Delete: If an object preserves traces of an inscription which has been rubbed out or otherwise removed, some authors represent the text in double brackets: [[Legio]]. In XML, deleted text is represented as: <del rend="erasure">Legio</del>.

    Used in both diplomatic and transcription.

  • Gap: A standard way to represent gaps when publishing inscriptions is [----]. Sometimes it is represented as an ellipsis. In XML we represent this as follows: <gap reason="illegible" extent="4" unit="character"></gap> (or reason="lost", depending on the case). If the extent of the gap is unknown, we simply use the gap tag with the reason attribute alone. For the XML to validate properly, you must include a reason.

    If the extent is given as a circa value, for example [ca. 13], mark as follows: <gap reason="xyz" extent="12-14" unit="character"></gap>

    If the extent is a range, for example [-5-6-], mark as follows: <gap reason="xyz" extent="5-6" unit="character"></gap>

    If the gap is represented with units other than characters, mark as such: [-5 cm-] becomes <gap reason="xyz" extent="5" unit="cm"></gap>

    An intentional blank space is called a vacat. See below about this.

    We also tag missing lines. For example, a gap of three lines should read <gap reason="xyz" extent="3" unit="line"></gap>.

    Used in both diplomatic and transcription.

  • Ligature (example for a ligated "ou"):
<hi rend="ligature">ou</hi>. More than two characters can appear nested in this one tag. One can also code a space between letters if they are ligated on the object, but the former is the end of a word and the latter the beginning of a word in the transcription:

    <hi rend="ligature">???</hi>

    <hi rend="ligature">? ?</hi>. Used in both diplomatic and transcription.

  • Line break: <lb/>
Probably our most commonly used tag. Mark a line break on every line, except the final line. If the inscription is only one line long, then do not enter a line break tag.

    Note that a <lb/> tag can occur in the middle of a word. When this is the case, it is important not to put any spaces between the tag and the continuation and not to actually type a hyphen. Instead, encode as <lb break="no"/>.

    Used in diplomatic and transcription, but not in translations, where the typography follows the source.

  • Phoenician (paleo-Hebrew) letters and/or numerals (cf. e.g. ostraca from Masada): glyph tagging is used in these cases. Each letter needs to be tagged separately as a glyph, and identified using the Unicode name as shown in the link below. So, encode as follows: <g ref="#PHOENICIAN-NUMBER-THREE"/> or <g ref="#Phoenician-letter-gaml"/>. Please note that no spaces are allowed within “ref”, so a dash needs to be added between the words that compose each definition. For definitions, use this as a reference tool: http://unicode.org/charts/PDF/U10900.pdf

  • Supplied: Letters are usually missing for one of two reasons: the area is damaged or the writer of the inscription made a mistake.

    Letters that are supplied by an editor due to damage are typographically marked by brackets: Le[g]. We would typically encode this as Le<supplied reason="lost">g</supplied>. In cases where the restoration is less certain, we can add a certainty attribute. So, Le[g?] would be encoded as Le<supplied reason="lost" cert="low">g</supplied>.

    Letters omitted by the writer of the inscription that are supplied by the editor are typographically indicated by angle brackets. L<e>g would thus render the diplomatic Lg. This is tagged: L<supplied reason="omitted">e</supplied>g.

    This example is particularly illustrative because it is likely that Le[g] also requires an abbreviation tag: <expan><abbr>Le<supplied reason="lost">g</supplied></abbr><ex>io</ex></expan>.

    Used only in the transcription.

  • Sic. This is used to indicate words that the writer misspelled. Editors often typographically mark this with (!). We use the <sic> tag in order to indicate that a word was misspelled, which is a departure from the cheat sheet. For an example, let’s imagine that the writer recorded the name Augostus when the correct spelling was Augustus. The author of the article indicates this error, so it's our job to record it. We do so like this: <sic corr="Augustus">Augostus</sic>. The corrected spelling goes inside of the corr="", while the original spelling as recorded in the inscription goes between the > and </sic>.

    Note that <sic> is not used for grammatical mistakes, but only spelling errors. Instead, grammatical mistakes can be recorded in the "note" section of the xml file. Specifics are rarely necessary - "The editor notes that the inscription contains grammatical errors" is fine.

    Used in diplomatic and transcription.

  • Unclear:
A standard way to indicate an unclear character when publishing inscriptions is to have the letter appear with a dot below it. In XML, we represent this with a tag: <unclear>i</unclear>.


    Traces of letters are indicated differently. Use the format:

    <gap reason="illegible" quantity="x" unit="y"/>, where x is a number and y is usually “character”.

    Used in diplomatic and transcription.

  • Vacat. Spaces that are left blank intentionally are often indicated as (vac.). We tag this with the space element: <space quantity="x" unit="y"/>, where x is a number and y is usually “character”. If the extent is unknown, the tag is: <space extent="unknown" unit="character"/>.
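
The Gap and Vacat rules in this list can be summarized in a small Python sketch that builds the tags as strings. The function names are hypothetical; the attribute names follow the Gap and Vacat entries above (extent and unit for gaps, quantity or extent="unknown" for spaces), and reading “ca. 13” as 12-14 is the one-character margin implied by the example in the text.

```python
from typing import Optional, Union
from xml.sax.saxutils import quoteattr


def circa_extent(value: int, margin: int = 1) -> str:
    """Turn a circa value into a range, e.g. ca. 13 becomes "12-14"."""
    return f"{value - margin}-{value + margin}"


def gap(reason: str, extent: Optional[Union[int, str]] = None,
        unit: str = "character") -> str:
    """Build a <gap> tag; a reason is always required, the extent is optional."""
    if reason not in ("lost", "illegible"):
        raise ValueError('reason must be "lost" or "illegible"')
    attrs = f"reason={quoteattr(reason)}"
    if extent is not None:
        attrs += f" extent={quoteattr(str(extent))} unit={quoteattr(unit)}"
    return f"<gap {attrs}/>"


def vacat(quantity: Optional[int] = None, unit: str = "character") -> str:
    """Build a <space> tag for an intentional blank of known or unknown size."""
    if quantity is None:
        return f'<space extent="unknown" unit={quoteattr(unit)}/>'
    return f"<space quantity={quoteattr(str(quantity))} unit={quoteattr(unit)}/>"
```

The self-closing form `<gap .../>` produced here is equivalent in XML to the `<gap ...></gap>` form shown in the Gap entry.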


Special Characters


  • In order to enter a cross, simply type a plus sign: +. We also note this cross as a figure.

    It is tagged as follows: <g type="cross">

  • Letter “z” as symbol for drachma(s): tag as follows: <g type="drachma">z</g>
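
For the Phoenician glyph references described under Tagging, the ref value can be derived from the Unicode character name rather than typed by hand. The following Python sketch uses the standard `unicodedata` module; the function name `glyph_ref` is hypothetical, and the output uses the uppercase Unicode name, as in the PHOENICIAN-NUMBER-THREE example.

```python
import unicodedata


def glyph_ref(char: str) -> str:
    """Build a <g ref="..."/> tag from the Unicode name of one Phoenician character."""
    name = unicodedata.name(char)  # raises ValueError for an unnamed character
    if not name.startswith("PHOENICIAN"):
        raise ValueError(f"not a Phoenician character: {name}")
    # No spaces are allowed within ref, so the words are joined with dashes.
    return f'<g ref="#{name.replace(" ", "-")}"/>'
```

Each letter or numeral is passed separately, matching the rule that every glyph is tagged on its own.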

Best Practices

  • Oxygen has a proofreading tool, which previews how your document will display on the web and helps you catch errors. In Oxygen, under "Documents," select "Transformation," then "Apply Transformation Scenario." (You can also get there by clicking the icon of a red, side-facing triangle in a circle somewhere at the top of your document window.) Select "proofreading" and hit "Transform now." It should display in Firefox or Safari. (If it's not working, see bottom of page.)

Workflow

  • Inscriptions that you are working on should be in your folder;

  • When you have finished working on an inscription, move it into the “Finished” subfolder (in your folder);

  • Gaia will check the inscription and move it into the “Finished Inscriptions” folder;

  • Periodically, Elli will move the files from the “Finished Inscriptions” to the “IIP-versioned” folder in Dropbox, and then upload these files to the server;

  • Michael or Gaia will log into the server from the “admin” button in the “search” page. All inscriptions that were new or changed in IIP-versioned have a status of “To Approve.” We check that box and search for all unapproved inscriptions. Individual inscriptions are opened by right-clicking “Link to this Inscription” and opening it in another tab or window. If it looks good, it is then changed with the buttons on top to “Approved”. A process is run at regular intervals (several times a day) that will then add the inscription to the public database.

Correcting Inscriptions

  • Inscriptions are generally corrected from the IIP-versioned file. Open this file in Oxygen, make the corrections, and then close and save it. At regular intervals newly modified inscriptions will be uploaded to the server.

  • How can I upload inscriptions with SVN when I want?

Global Changes in Oxygen

  • Select the folder or highlight several files in the project list

  • Right click on the selection, select Find/Replace in Files

  • Fill out dialog as in picture attached below.

    • Your goal is to constrain the parts of the file that can be changed as much as possible. You can use an XPath to do this as well as to restrict changes to element content and so on, in XML search options.
  • Test your expression by just doing a "Find All" and see if the results are the string you expect.

  • Try again with "Replace All"

  • In the dialog that appears, select "Preview" (see image below)  instead of "Replace"

  • Step through some of the changes to make sure you are seeing what you expect.

  • Click "OK" to change everything.

NOTE: There is no undo if you are changing many files at once. If you are really worried, use the "Make Backup File" option in the dialog box. Don't forget to remove the backups once you are done. (need further documentation for doing that)
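
The "find first, preview, then replace" idea behind these steps can also be illustrated outside Oxygen. The Python sketch below is not the Oxygen dialog and not part of the IIP workflow; it is a hypothetical helper that takes file contents as strings and only reports what a regular-expression replacement would change, without modifying anything.

```python
import re
from typing import Dict, List, Tuple


def preview_replacements(files: Dict[str, str], pattern: str,
                         replacement: str) -> List[Tuple[str, int, str, str]]:
    """List (file name, line number, old line, new line) for every line that would change."""
    regex = re.compile(pattern)
    changes = []
    for name in sorted(files):
        for line_number, line in enumerate(files[name].splitlines(), start=1):
            new_line = regex.sub(replacement, line)
            if new_line != line:
                changes.append((name, line_number, line, new_line))
    return changes
```

Reviewing such a list before changing any file serves the same purpose as the "Preview" step: you confirm that only the intended strings are affected.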