Document Stream Builder

Python script that builds consecutive document streams from a collection of PDF documents.

Usage

python document_stream_builder.py \
    --input <Input Dir>           \
    --output <Output Dir>         \
    --random <True/False>         \
    --limit <Number>
  • input: Input directory (Default: "./input/")
  • output: Output directory (Default: "./output/")
  • random: Randomize the document order in the page stream (Default: True)
  • limit: Maximum number of input documents to process
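
The options above could be parsed and applied roughly as follows. This is a minimal sketch, not the script's actual implementation: the helper names `str_to_bool`, `build_parser`, and `select_documents` are hypothetical, and it assumes that input documents are matched by a `*.pdf` pattern and that the limit is applied after any shuffling.

```python
import argparse
import random
from pathlib import Path


def str_to_bool(value: str) -> bool:
    """Parse 'True'/'False' strings; plain bool() would treat 'False' as True."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the four documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Build document streams from a collection of PDF documents."
    )
    parser.add_argument("--input", default="./input/", help="Input directory")
    parser.add_argument("--output", default="./output/", help="Output directory")
    parser.add_argument("--random", type=str_to_bool, default=True,
                        help="Randomize the document order in the page stream")
    parser.add_argument("--limit", type=int, default=None,
                        help="Maximum number of input documents to process")
    return parser


def select_documents(input_dir, shuffle=True, limit=None, seed=None):
    """Return the PDF paths to process (assumed behavior: shuffle first, then limit)."""
    paths = sorted(Path(input_dir).glob("*.pdf"))
    if shuffle:
        # A seeded Random instance keeps runs reproducible when a seed is given.
        random.Random(seed).shuffle(paths)
    return paths if limit is None else paths[:limit]
```

Passing `--random False --limit 3` to such a parser would yield `args.random is False` and `args.limit == 3`; a custom converter is used because `type=bool` would turn the string "False" into `True`.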