Addition of a PDF checker
This release includes six tools:
naming_conventions.py & naming_conventions_do_rename.py to enforce strict rules on the directory names in a file tree. naming_conventions.py is a preview only and renames nothing.
check_jpegs for a fast sanity check of a JPEG file tree & check_jpegs_full for a deeper, slower sanity check.
scandir2pdf for bulk conversion of JPEGs to PDFs.
scandirpdf2txt for bulk conversion of OCRed PDFs to txt files, so that dedicated tools (Google or others) can run fast full-text searches.
new: check_pdfs for a sanity check of a PDF file tree. It can detect corrupted files, including some that can still be opened with Acrobat Reader. There are likely several causes of "false" positives: less standard formats, and Acrobat Reader's robustness to corrupted files.
new: naming_conventions_files.py & naming_conventions_do_rename_files.py to enforce strict rules on the file names in a file tree. naming_conventions_files.py is a preview only and renames nothing.
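The preview / do-rename split can be sketched as below. This is only an illustration of the pattern, not the code of these tools: the rule in `normalize` (lowercase, runs of other characters replaced by `_`) and the function names `plan_renames` and `apply_renames` are assumptions made for the example.

```python
import os
import re


def normalize(name):
    """Hypothetical rule: lowercase, runs of non-alphanumerics become '_'."""
    if name.startswith("."):
        return name  # leave hidden files alone
    stem, ext = os.path.splitext(name)
    stem = re.sub(r"[^a-z0-9]+", "_", stem.lower()).strip("_")
    if not stem:
        return name  # nothing usable left, keep the original name
    return stem + ext.lower()


def plan_renames(root):
    """Preview step: list (old_path, new_path) pairs without touching the disk."""
    plan = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            new_name = normalize(name)
            if new_name != name:
                plan.append((os.path.join(dirpath, name),
                             os.path.join(dirpath, new_name)))
    return plan


def apply_renames(plan):
    """Do-rename step: apply a plan, skipping targets that already exist."""
    for old, new in plan:
        if os.path.exists(new):
            print("skipped (target exists):", old)
            continue
        os.rename(old, new)
```

Printing the result of `plan_renames(root)` gives the preview; passing the same list to `apply_renames` performs the renaming, so what gets renamed is exactly what was shown.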
Validated on 27K+ JPEG files and 3K+ PDFs.
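For illustration, a fast sanity check of the kind described above could look only at the start and end of each file. The sketch below is not the method used by check_jpegs or check_pdfs. It assumes that a sound JPEG starts with the SOI marker (FF D8) and ends with the EOI marker (FF D9), and that a sound PDF starts with `%PDF-` and has `%%EOF` in its last 1024 bytes. The names `looks_like_jpeg`, `looks_like_pdf` and `find_suspects` are made up for the example.

```python
from pathlib import Path


def looks_like_jpeg(data: bytes) -> bool:
    """True if data starts with the SOI marker and ends with the EOI marker."""
    return data.startswith(b"\xff\xd8") and data.endswith(b"\xff\xd9")


def looks_like_pdf(data: bytes) -> bool:
    """True if data starts with '%PDF-' and has '%%EOF' in its last 1024 bytes."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


# Which check applies to which file extension (compared in lowercase).
CHECKS = {".jpg": looks_like_jpeg, ".jpeg": looks_like_jpeg, ".pdf": looks_like_pdf}


def find_suspects(root):
    """Return paths under root whose content fails the check for their extension."""
    suspects = []
    for path in sorted(Path(root).rglob("*")):
        check = CHECKS.get(path.suffix.lower())
        # The whole file is read for simplicity; reading only head and tail would be faster.
        if check is not None and path.is_file() and not check(path.read_bytes()):
            suspects.append(path)
    return suspects
```

A check this strict also flags files that viewers open without complaint, for example a JPEG with extra bytes after the EOI marker, which is one way "false" positives arise.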