djbpitt edited this page Jul 8, 2014 · 6 revisions

Welcome to the collatex wiki!

TEI input:

Discussion 2014-07-08 (Lausanne) David J. Birnbaum / Ronald Haentjens Dekker:

  • each witness in a separate TEI document
  • take the <body> element (ignore the rest)
  • get rid of the hierarchy by converting tags into ranges or milestones
  • tokenize on whitespace and punctuation (djb: is this what we should do with punctuation?)
  • create a normalized version of each token
  • collate
  • generate variant graph
  • TEI output issue: the hierarchy cannot be raised again directly, because the collation markup introduces a hierarchy that overlaps with the original one
  • Solution: it is not the responsibility of CollateX to raise the hierarchy again; output with the milestones in place (attach each milestone to the nearest token, with "nearest" still to be defined)
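
The input steps above (take the <body>, flatten the hierarchy into milestones, tokenize, normalize) can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not CollateX code: the function names (`tokenize`, `flatten`, `witness_to_stream`, `attach_milestones`) and the `before`/`after` keys are hypothetical, lowercasing stands in for whatever normalization is eventually chosen, and "nearest" is read here as "start milestones go to the following token, end milestones to the preceding one", which the discussion leaves undefined. The `t`/`n` keys are meant to resemble the token objects of CollateX's pretokenized JSON input.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
# Assumption: a token is a run of word characters or one punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text on whitespace, keeping punctuation as separate tokens."""
    return TOKEN_RE.findall(text or "")


def flatten(element, out):
    """Replace the element hierarchy with start/end milestones in a flat stream."""
    tag = element.tag.replace(TEI_NS, "")
    out.append(("start", tag))
    out.extend(("token", tok) for tok in tokenize(element.text))
    for child in element:
        flatten(child, out)
        out.extend(("token", tok) for tok in tokenize(child.tail))
    out.append(("end", tag))
    return out


def witness_to_stream(tei_xml):
    """Take only the <body> of one TEI witness and flatten it."""
    body = ET.fromstring(tei_xml).find(f".//{TEI_NS}body")
    if body is None:
        raise ValueError("no TEI <body> element found")
    return flatten(body, [])


def attach_milestones(stream):
    """Attach milestones to tokens (hypothetical reading of 'nearest').

    Milestones that precede a token go into its 'before' list; end
    milestones directly following a token go into its 'after' list.
    If the stream contains no tokens at all, milestones are dropped.
    """
    tokens, pending = [], []
    for kind, value in stream:
        if kind == "token":
            # Lowercasing is a placeholder for the real normalization step.
            tokens.append({"t": value, "n": value.lower(),
                           "before": pending, "after": []})
            pending = []
        elif kind == "end" and tokens and not pending:
            tokens[-1]["after"].append((kind, value))
        else:
            pending.append((kind, value))
    if pending and tokens:
        tokens[-1]["after"].extend(pending)
    return tokens
```

For a witness whose body is `<p>The <hi>Black</hi> cat.</p>`, `attach_milestones(witness_to_stream(xml))` yields four tokens with normalized forms `the`, `black`, `cat`, and `.`; the `hi` start and end milestones both travel with `Black`. Because the milestones ride along on the tokens, they survive collation without CollateX having to rebuild the hierarchy.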