PDF Renderer: allow to specify an alternate image or a custom resolution. #4171
Conversation
CMake does not install a PDF font file. That was the old way of handling the font in PDF output; it is now included in the library automatically.
@jbreiden @jbreiden2 : Jeff, can you have a look at this?
Thanks, @zdenop, for the explanation. I was confused with the @jbreiden @jbreiden2, a better way to check generated PDF files than by maximum file size is welcome.
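One possible direction for a check that goes beyond file size is a structural sanity test on the generated bytes. The following is only a minimal sketch, not what the pull request's tests do: `LooksLikePdf` is a hypothetical helper, and it assumes the whole file has already been read into a string. It verifies that the data starts with the `%PDF-` header and ends with the `%%EOF` marker followed only by whitespace.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Tesseract): returns true when `data`
// starts with a PDF header and ends with the "%%EOF" marker, optionally
// followed by whitespace. This is a shallow sanity check, not a full parse.
bool LooksLikePdf(const std::string& data) {
  // A PDF file begins with "%PDF-" followed by the version number.
  if (data.compare(0, 5, "%PDF-") != 0) {
    return false;
  }
  // The last "%%EOF" marker closes the file.
  const std::size_t eof = data.rfind("%%EOF");
  if (eof == std::string::npos) {
    return false;
  }
  // Nothing but whitespace may follow the marker.
  return data.find_first_not_of(" \r\n\t", eof + 5) == std::string::npos;
}
```

A truncated or empty output fails this check even when its size is below any chosen maximum, which is the case a size-only test cannot catch.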
It looks like there is little interest; that happens :) Thanks, all.
Hi, it's not unusual that pull requests take some time before they are merged. That does not necessarily mean that there is little interest, but there is only a small number of people who contribute to pull requests by adding comments or testing them.
No worries at all. I just saw it open on my to-do list for a while, so I preferred to close it. Thanks for your feedback. I understand; reopened, no hurry.
Since it extends the API functionality, it should be included in the 5.4.0 release.
…ammatically Support new rendering_dpi api params. Add pdf renderer tests. Install pdf font in cmake tool chain. resolves tesseract-ocr#210 resolves tesseract-ocr#3798
13bccc0 to 94b95b1
I rebased this pull request and fixed a merge conflict.
What about also implementing this feature in the tesseract executable as a command line option?
Isn't that already possible with
pdf: tests add lib leptonica dependency in the make toolchain
With
Tesseract can create multi-page PDF files when it is called with a list of images. Ideally that should also work with alternate images.
Would it be possible to implement the desired features by only adding new Tesseract parameters, without any change to the C / C++ API?
Motivation
Input images passed to OCR are often pre-processed (upscaled to a higher resolution, converted to grayscale, etc.).
It can be useful to specify an alternate image or a lower resolution in the renderer, especially for searchable PDF export.
Proposed changes
- TessResultRenderer::SetRenderingImage or TessResultRenderer::SetRenderingResolution methods allow changing the image or resolution to render programmatically, before the image is added to the renderer
- rendering_dpi param allows overriding the output resolution by scaling the source image
- pdf.ttf font added to the cmake install target
These changes might resolve feature requests #210 and #3798.
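To illustrate what "overriding the output resolution by scaling the source image" implies arithmetically, here is a minimal sketch. It is not the code from this pull request: `PixelSize` and `ScaleForRenderingDpi` are hypothetical names, and the sketch assumes the scaling keeps the physical page size constant, so pixel dimensions change by the ratio of target DPI to source DPI.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical helper type (not part of Tesseract): image size in pixels.
struct PixelSize {
  int width;
  int height;
};

// Hypothetical helper: computes the pixel dimensions an image would need at
// `target_dpi` so that it covers the same physical area it covers at
// `source_dpi`. Assumption: scaling is uniform in both directions.
PixelSize ScaleForRenderingDpi(PixelSize source, int source_dpi,
                               int target_dpi) {
  if (source_dpi <= 0 || target_dpi <= 0) {
    throw std::invalid_argument("DPI values must be positive");
  }
  const double factor = static_cast<double>(target_dpi) / source_dpi;
  return {static_cast<int>(std::lround(source.width * factor)),
          static_cast<int>(std::lround(source.height * factor))};
}
```

For example, an A4 page scanned at 300 DPI (2480 x 3508 pixels) rendered at 150 DPI would be scaled to 1240 x 1754 pixels, a quarter of the pixel count, while the page size in the PDF stays the same.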
Checks
make check passed locally on Ubuntu 23.10