Home
djbpitt edited this page Jul 8, 2014
Welcome to the collatex wiki!
TEI input:
Discussion 2014-07-08 (Lausanne) David J. Birnbaum / Ronald Haentjens Dekker:
- each witness in a separate TEI document
- take the `<body>` element (ignore the rest)
- get rid of the hierarchy by converting tags into ranges or milestones
- tokenize on whitespace and punctuation (djb: is this what we should do with punctuation?)
- create normalized version
- collate
- generate variant graph
- TEI output issue: you can't raise the hierarchy again in a direct way because the collation markup introduces an overlapping hierarchy
- Solution: not responsibility of CollateX to raise hierarchy again; output with the milestones in place (attach milestone to the nearest token - with "nearest" still to be defined)
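
The preprocessing steps above (take `<body>`, flatten tags into milestones, tokenize, normalize) could be sketched roughly as follows. This is only an illustration using the Python standard library, not CollateX code: the names `flatten`, `witness_tokens`, and `TOKEN_RE` are hypothetical, normalization is assumed to be simple lowercasing, and "nearest" is assumed here to mean the next token (with trailing milestones attached to the last token), which the discussion leaves undefined. Punctuation is split into separate tokens, which is one possible answer to the open question above. The sketch also splits a word wherever a tag falls inside it.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: witnesses use the standard TEI namespace.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Words, or single punctuation characters, each become one token.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def flatten(elem, events):
    """Depth-first walk of elem's content, replacing the hierarchy with a
    flat list of ('text', str) and ('start' | 'end', tag name) events."""
    if elem.text:
        events.append(("text", elem.text))
    for child in elem:
        name = local_name(child.tag)
        events.append(("start", name))
        flatten(child, events)
        events.append(("end", name))
        if child.tail:
            events.append(("text", child.tail))
    return events


def normalize(token):
    """Assumed normalization: lowercase only."""
    return token.lower()


def witness_tokens(tei_xml):
    """Return tokens from the TEI <body>, each carrying the milestones
    that immediately precede it ('before') or trail the text ('after')."""
    root = ET.fromstring(tei_xml)
    body = next(root.iter(TEI_NS + "body"), None)
    if body is None:
        raise ValueError("no TEI <body> element found")
    tokens = []
    pending = []  # milestones waiting for the next token
    for kind, value in flatten(body, []):
        if kind == "text":
            for t in TOKEN_RE.findall(value):
                tokens.append(
                    {"t": t, "n": normalize(t), "before": pending, "after": []}
                )
                pending = []
        else:
            pending.append((kind, value))
    if pending and tokens:
        # No following token exists, so fall back to the last one.
        tokens[-1]["after"].extend(pending)
    return tokens


# Illustrative witness; the header content is ignored.
sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<teiHeader><title>Witness A</title></teiHeader>"
    "<text><body><p>The <hi>quick</hi> fox, it ran.</p></body></text></TEI>"
)

for token in witness_tokens(sample):
    print(token)
```

Each witness would be processed this way separately, and the resulting lists of normalized tokens handed to the collation step. Because the milestones travel with the tokens, they can be written back out alongside the collation result without trying to rebuild the original hierarchy.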