Allow running with incomplete descriptions #58
(fall back to physical pages, sorted if possible, or error with empty text)
27dffe8 is warranted IMO because of DTABf details for 71fd269 is useful if your input does not contain the images under |
- for `sourceDesc/biblFull/titleStmt/title/@level`, only use allowed values (m/a/j/s/u), and try mapping from top-level logical `div/@TYPE`
- for `sourceDesc/bibl/@type`, try mapping from top-level logical `div/@TYPE`
- instead of ignoring `titleInfo` main and part/volume titles,
  - prefer main title from titleInfo over top-level logical `div/@LABEL`
  - prefer `titleInfo/@type=uniform` or empty over abbrev/alternative/translated
  - also parse and add `partNumber/partName` or `part`
- instead of spilling titleInfo between `fileDesc/titleStmt` and `biblFull/titleStmt`, copy the former to the latter when complete, and then add `@level` etc
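The first and fifth points above could look roughly like the following. This is only a minimal sketch, not the code in these commits: the `DIV_TYPE_TO_LEVEL` table, its keys, and both function names are hypothetical, and ranking `uniform` equal to an untyped `titleInfo` is an assumption.

```python
# Hypothetical mapping from the top-level logical div/@TYPE to title/@level.
# The keys are illustrative guesses, not the table used in this PR.
DIV_TYPE_TO_LEVEL = {
    "monograph": "m",
    "article": "a",
    "periodical": "j",
    "multivolume_work": "s",
    "manuscript": "u",
}
ALLOWED_LEVELS = {"m", "a", "j", "s", "u"}


def title_level(div_type):
    """Return an allowed @level for a div/@TYPE, or None if it cannot be mapped."""
    if not div_type:
        return None
    level = DIV_TYPE_TO_LEVEL.get(div_type.lower())
    return level if level in ALLOWED_LEVELS else None


# Lower rank is preferred; any other type (abbreviated, alternative,
# translated, ...) falls back to rank 1.
PREFERRED_TYPES = {"uniform": 0, None: 0, "": 0}


def pick_title_info(title_infos):
    """Pick one (type, title) pair, preferring uniform or untyped entries.

    Ties keep document order, because min() returns the first minimal item.
    """
    if not title_infos:
        return None
    return min(title_infos, key=lambda item: PREFERRED_TYPES.get(item[0], 1))


candidates = [("abbreviated", "Abbr."), (None, "Main title"), ("uniform", "Uniform title")]
print(title_level("Monograph"), pick_title_info(candidates))
```

Returning `None` for an unmapped type lets the caller omit `@level` instead of writing a value that the schema would reject.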
The last 2 commits improve the coverage and conformance of title and identifier metadata. This affects #36, but there is still much to do.
…idno from mods:url
…t | mods:classification)
creation.text = collection
profile_desc = self.tree.xpath('//tei:msDesc/tei:msIdentifier', namespaces=ns)[0]
coll = etree.SubElement(profile_desc, "%scollection" % TEI)
coll.text = collection
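For readers without the surrounding class, here is a self-contained sketch of the same idea. It is an approximation under stated assumptions: it uses the standard library's `xml.etree.ElementTree` rather than the `etree` module in the diff, the function name `add_collection` and the sample document are hypothetical, and the guard for a missing `msIdentifier` is an addition in the spirit of this PR's title, not something the diff shows.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ns = {"tei": TEI_NS}
TEI = "{%s}" % TEI_NS  # Clark-notation prefix, as in the diff's "%scollection" % TEI


def add_collection(tree, collection):
    """Append <collection> to the first msDesc/msIdentifier, if there is one."""
    ms_identifier = tree.find(".//tei:msDesc/tei:msIdentifier", ns)
    if ms_identifier is None:
        # Incomplete description: skip instead of failing on a missing element.
        return None
    coll = ET.SubElement(ms_identifier, TEI + "collection")
    coll.text = collection
    return coll


# Hypothetical minimal header, just enough to exercise the function.
doc = ET.fromstring(
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc><sourceDesc>'
    "<msDesc><msIdentifier/></msDesc>"
    "</sourceDesc></fileDesc></teiHeader></TEI>"
)
add_collection(doc, "Example collection")
print(ET.tostring(doc, encoding="unicode"))
```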
Did you check whether this is DTABf? I think I used this more abstract solution because the collection of the digital work is not necessarily the collection of the original.
No, it's not in DTABf at all, sorry. But neither is `creation` (and it seems like a misfit).
You're right in that it's not clear whether `relatedItem/@type=series` applies to the physical copy or digital presentation (which could perhaps be differentiated on the TEI side by `msIdentifier` vs `objectIdentifier` IIUC).
Perhaps the whole thing should rather enter `biblFull/seriesStmt`?
> You're right in that it's not clear whether `relatedItem/@type=series` applies to the physical copy or digital presentation (which could perhaps be differentiated on the TEI side by `msIdentifier` vs `objectIdentifier` IIUC).
> Perhaps the whole thing should rather enter `biblFull/seriesStmt`?
I am now certain that's the right place. And to differentiate between the series of the physical copy and the series of the digital presentation, we could use `fileDesc/sourceDesc/biblFull/seriesStmt` vs. `fileDesc/seriesStmt` (which is also allowed by DTABf RNG, but not documented).
Let's discuss further under #44!
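To make the two placements concrete, here is a rough sketch of what the proposal might produce. It uses the standard library's `xml.etree.ElementTree`; the helper names `q` and `add_series` and the placeholder titles are hypothetical, and the resulting fragment omits the other required `fileDesc` children, so it is not schema-valid on its own.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ns = {"tei": TEI_NS}


def q(tag):
    """Qualify a tag name with the TEI namespace."""
    return "{%s}%s" % (TEI_NS, tag)


def add_series(parent, title):
    """Append a seriesStmt with a single title to the given parent element."""
    stmt = ET.SubElement(parent, q("seriesStmt"))
    ET.SubElement(stmt, q("title")).text = title
    return stmt


file_desc = ET.Element(q("fileDesc"))
# Series of the digital presentation: directly under fileDesc.
add_series(file_desc, "Digital series (placeholder)")
# Series of the physical copy: under sourceDesc/biblFull.
source_desc = ET.SubElement(file_desc, q("sourceDesc"))
bibl_full = ET.SubElement(source_desc, q("biblFull"))
add_series(bibl_full, "Print series (placeholder)")

print(ET.tostring(file_desc, encoding="unicode"))
```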
First of all: Many thanks for this great contribution.
Personal communication on some details requested.
Expands on #56, additionally fixes #57.