dse-as/workflow_IIIF-ATR-TEI

DSE-AS document workflow: IIIF-ATR-TEI ⚙️

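The preparation workflow for images and transcriptions consists of the following steps:

- Upload IIIF images to Transkribus for ATR
- Download PAGE
- PAGE to raw TEI

Auxiliary methods:

- Delete documents from Transkribus collection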

To facilitate handling, most scripts can be run directly on GitHub, either by opening an issue (using the appropriate template) or by committing a metadata file to the repository.


Upload IIIF images to Transkribus for ATR

Initiate IIIF upload to Transkribus

Automated upload workflow of IIIF images into a Transkribus collection.
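
To illustrate what an upload of this kind has to do first, the sketch below collects the image URLs listed in a IIIF presentation manifest. It is not the repository's upload script: the function name `image_urls_from_manifest` and the `example.org` URLs are hypothetical, and the traversal only assumes the standard manifest layouts of the IIIF Presentation API 2.x and 3.x.

```python
from __future__ import annotations

from typing import Any


def image_urls_from_manifest(manifest: dict[str, Any]) -> list[str]:
    """Collect image URLs from a IIIF Presentation manifest (2.x or 3.x)."""
    urls: list[str] = []
    if "sequences" in manifest:
        # Presentation API 2.x: sequences -> canvases -> images -> resource
        for sequence in manifest.get("sequences", []):
            for canvas in sequence.get("canvases", []):
                for image in canvas.get("images", []):
                    resource = image.get("resource", {})
                    if "@id" in resource:
                        urls.append(resource["@id"])
    else:
        # Presentation API 3.x: canvases -> annotation pages -> annotations -> body
        for canvas in manifest.get("items", []):
            for page in canvas.get("items", []):
                for annotation in page.get("items", []):
                    body = annotation.get("body", {})
                    if isinstance(body, dict) and "id" in body:
                        urls.append(body["id"])
    return urls


# Hypothetical one-page manifest in the 3.x layout, for illustration only.
EXAMPLE_MANIFEST = {
    "type": "Manifest",
    "items": [
        {
            "type": "Canvas",
            "items": [
                {
                    "type": "AnnotationPage",
                    "items": [
                        {
                            "type": "Annotation",
                            "body": {"id": "https://example.org/iiif/page1/full/max/0/default.jpg"},
                        }
                    ],
                }
            ],
        }
    ],
}

print(image_urls_from_manifest(EXAMPLE_MANIFEST))
```

The resulting list is what would then be handed to Transkribus; how the repository's workflow performs that step is not shown here.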


Delete documents from Transkribus collection

Delete Transkribus document



Download PAGE

python scripts/PAGE-from-Transkribus/download_latest_pagexml.py -u 'USERNAME' -p 'PASSWORD' -c 'COLLECTION-ID-1' 'COLLECTION-ID-2' -o 'OUTFOLDER'
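
When the download is called from another script, it can be convenient to assemble the command programmatically and keep the credentials out of the shell history. The following is a minimal sketch under stated assumptions: it only uses the flags shown in the command above (`-u`, `-p`, `-c`, `-o`), while the helper names and the environment variables `TRANSKRIBUS_USER` and `TRANSKRIBUS_PASSWORD` are hypothetical and not defined by this repository.

```python
import os
import subprocess
import sys

SCRIPT = "scripts/PAGE-from-Transkribus/download_latest_pagexml.py"


def build_download_command(username, password, collection_ids, outfolder):
    """Return the argument list for the download script shown above."""
    if not collection_ids:
        raise ValueError("at least one collection ID is required")
    return [
        sys.executable,
        SCRIPT,
        "-u", username,
        "-p", password,
        "-c", *[str(c) for c in collection_ids],  # one or more collection IDs
        "-o", outfolder,
    ]


def run_download_from_env(collection_ids, outfolder):
    """Run the download with credentials read from (hypothetical) env variables."""
    command = build_download_command(
        os.environ["TRANSKRIBUS_USER"],
        os.environ["TRANSKRIBUS_PASSWORD"],
        collection_ids,
        outfolder,
    )
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(command, check=True)
```

Calling `run_download_from_env(["COLLECTION-ID-1", "COLLECTION-ID-2"], "OUTFOLDER")` from the repository root would then correspond to the command above.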

PAGE to raw TEI

python scripts/PAGE-to-raw-TEI/page2TEI.py -i download -o download_out
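
To give an idea of what a PAGE-to-TEI conversion involves, the sketch below turns the text lines of a PAGE XML file into a bare TEI body, with one `<p>` per text region and one `<lb/>` per line. It is not the logic of `page2TEI.py`: the function name `page_to_raw_tei` is hypothetical, the PAGE namespace is assumed to be the 2013-07-15 version, and the output has no `teiHeader`, so it is not schema-valid TEI.

```python
import xml.etree.ElementTree as ET

# Assumption: the PAGE files use the 2013-07-15 PAGE content namespace.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"
TEI_NS = "http://www.tei-c.org/ns/1.0"


def page_to_raw_tei(page_xml: str) -> str:
    """Convert PAGE XML text lines into a minimal TEI fragment (as a string)."""
    ns = {"pc": PAGE_NS}
    root = ET.fromstring(page_xml)
    ET.register_namespace("", TEI_NS)  # serialise TEI as the default namespace
    tei = ET.Element(f"{{{TEI_NS}}}TEI")
    text = ET.SubElement(tei, f"{{{TEI_NS}}}text")
    body = ET.SubElement(text, f"{{{TEI_NS}}}body")
    for region in root.iterfind(".//pc:TextRegion", ns):
        paragraph = ET.SubElement(body, f"{{{TEI_NS}}}p")
        for line in region.iterfind("pc:TextLine", ns):
            unicode_el = line.find("pc:TextEquiv/pc:Unicode", ns)
            content = unicode_el.text or "" if unicode_el is not None else ""
            # Each PAGE text line becomes <lb/> followed by the line's text.
            lb = ET.SubElement(paragraph, f"{{{TEI_NS}}}lb")
            lb.tail = content
    return ET.tostring(tei, encoding="unicode")


# Hypothetical two-line PAGE document, for illustration only.
SAMPLE_PAGE = f"""<PcGts xmlns="{PAGE_NS}">
  <Page imageFilename="page1.jpg">
    <TextRegion id="r1">
      <TextLine id="l1"><TextEquiv><Unicode>first line</Unicode></TextEquiv></TextLine>
      <TextLine id="l2"><TextEquiv><Unicode>second line</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

print(page_to_raw_tei(SAMPLE_PAGE))
```

A real conversion would also need to carry over document metadata, page breaks and the structural annotations made in Transkribus, which this sketch leaves out.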

Schematic

        ───────────────────────────────────────────────────────────────────────────────────────╮
                           document and image identifiers remain stable after initial creation │
                                                                                               │
  ┌─┬──┬─┬─┬──┬──┬─┬──┬─┬─┐                                                                    │
  │small forms   │ │  │ │ │                                                                    │
  │ID  │metadata      IIIF│                                                                    │
  │ │ ┌┴┬──┬─┬─┬──┬──┬─┬──┼─┬─┐                   iiif.annemarie-schwarzenbach.ch/presentation │
  │ │ │letters       │ │  │ │ │                                                                │
  │ │ │ID  │metadata      IIIF│                     ┌─────┐                                    │
  │ │ │ │  │ │ │  │  │ │  │ │ │   ━━━━━━━━━━━━━━▶  ┌┴────┐│    one .toml file per document     │
  └─┴─┤ │  │ │ │  │  │ │  │ │ │                   ┌┴────┐││                                    │
      │ │  │ │ │  │  │ │  │ │ │                   │     │├┘                                    │
      │ │  │ │ │  │  │ │  │ │ │                   │     ├┘                                     │
      └─┴──┴─┴─┴──┴──┴─┴──┴─┴─┘                   └─────┘                                      │
  docs.google.com/spreadsheets                           commit to dse-as.github.io/i3f        │
                                                      generates IIIF presentation manifest     │
                                                                                               │
                                                                          ┃                    │
                                                                          ┃                    │
                                                                          ┃                    │
                                                                          ┃                    │
                                                                          ┃                    │
                                                                          ▼                    │
                                                                                               │
                                                  dse-as.github.io/workflow_IIIF-Transkribus-AT│
┌────────────────────────────┐                                                ┌─────────────┐  │
│ ┌──────────┐ ═══════════   │                                                │             │  │
│ │          │ ══════════    │                       form-based image upload  ├─────────────┤  │
│ │          │ ═══════════   │    ◀━━━━━━━━━━━━━━       into Transkribus      │─────        │  │
│ │          │ ═════════════ │                             collection         │─────        │  │
│ │          │ ═══════════   │                                                ├─────────────┤  │
│ │          │ ═════════════ │                                                └─────────────┘  │
│ │          │ ═══════════   │                                                                 │
│ └──────────┘ ══════════    │                                                                 │
│                            │                                                                 │
│                            │                                                                 ▼
└────────────────────────────┘                                                                 │
 app.transkribus.org                                                                           │
                                                                                               │
 text recognition, (rough) structural annotation                                               │
                                                                                               │
  ┃                                                                                            │
  ┃                                                                                            │
  ┃   3 Transkribus collections                                                                │
  ┃                                                                                            │
  ┃                                                                                            │
  ┗━━━━▶  as-dse_wait                                                                          │
      ┃                                                                                        │
      ┃                                                                                        │
      ┃                                                                                        │
      ┃                                                                                        │
      ┃                                                                                        │
      ┗━━━━▶  as-dse_work                                                                      │
          ┃                                                  TEI-XML data                      │
          ┃                                                                                    │
          ┃                                                  ┌───────┐                         │
          ┃                                                  │       ├─┐                       │
          ┃                                                  │       │ ├─┐  1 file per text    │
          ┗━━━━▶  as-dse_finalised     ━━━━━━━━━━━━━━━▶      │       │ │ │                     │
                                                             │       │ │ │                     │
                                                             └─┬─────┘ │ │                     │
                                                               └─┬─────┘ │                     ▼
                                     script-based export from    └───────┘                      
                                        Transkribus and data          ║                         
                                      transformation (raw TEI,        ║                         
                                            project TEI)         ╔════╩════════════════╗        
                                                                 ║                     ║        
                                                                 ║                     ║        
                                                                 ║                     ║        
                                                                 ║                     ║        
                                                                 ║                     ║        
                                                                 ║                     ║        
                                                                 ▼                     ▼        
                                                                                                
┌──────────────────────────────────────────────────────────────────────────┐  ┌───────────────┐ 
│                     development of web presentation                      │  │               │ 
│                                                                          │  │  FAIR data    │ 
├────────────────────────┬────────────────────────┬────────────────────────┤  │  repository   │ 
│                        │                        │                        │  │               │ 
│   data transformation, │    index, register     │        frontend        │  │               │ 
│     (static) backend   │                        │                        │  │               │ 
│                        │                        │                        │  │               │ 
└────────────────────────┴────────────────────────┴────────────────────────┘  └───────────────┘ 

Credits

The code in this repository is based on

License