Releases: dataiku/dss-plugin-tesseract-ocr
Version 2.3.3
Version 2.3.2
Fix reading temporary file for pypandoc conversion
Version 2.3.1
Fix text extraction from HTML files with line wraps when chunking
Version 2.3.0
Improve markdown text extraction when chunking (#76)
* Improve markdown text extraction when chunking
* Only use pandoc for text block conversion of non-markdown files
* Fix typo
Version 2.2.0
Merge pull request #73 from dataiku/feature/extract-text-chunks: Extract text chunks
Version 2.1.1
Merge pull request #72 from dataiku/chore/rename-recipes: Improve recipes title and description wording
Version 2.1.0
Merge pull request #71 from dataiku/feature/text-extraction-pandoc: Text extraction with pandoc
Version 2.0.0
Merge pull request #70 from dataiku/feature/add-easyocr: Add EasyOCR and accept PDF
Release v1.0.3
Update code env description to support Python versions 3.8, 3.9, 3.10 and 3.11
Release v1.0.2
Update CHANGELOG.md