This repository has been archived by the owner on Feb 16, 2023. It is now read-only.
OCR errors on pre-generated pdf's #1299
Unanswered
davidmlafuente asked this question in Support
Replies: 0 comments
Hi all!
This is my first post here after deploying paperless-ng for my own paper management, and (of course) I have a problem.
I frequently receive my power supply invoices by email and want to store them in paperless. I put some of these invoices in the consume folder and paperless picks them all up fine, but when I go to the "Content" tab of a document I see the following text:
(cid:2)(cid:3)(cid:4)(cid:5)(cid:6)(cid:7)(cid:8)(cid:9)(cid:10)(cid:5)(cid:11)(cid:12)(cid:13)(cid:7)(cid:14)(cid:7)(cid:15)(cid:6)(cid:3)(cid:16)(cid:17)(cid:18)(cid:6)(cid:9)(cid:10)(cid:15)(cid:19)(cid:7)(cid:20)(cid:21)(cid:22) (cid:9)(cid:19)(cid:20)(cid:12) (cid:23)(cid:24)(cid:25)(cid:26)(cid:27)(cid:26)(cid:25)(cid:28)(cid:26)(cid:29)
If I add a PDF generated by my scanner, everything goes well and OCR works fine. Even a web-app-generated PDF from another supplier works fine.
So I think the problem could be the way my power supplier generates the PDF in their application.
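For anyone triaging similar documents: content made up of `(cid:N)` tokens, like the sample above, usually suggests that the text extractor could not map the PDF's glyph IDs to Unicode. That is an assumption here, since the file itself is not attached. Below is a minimal Python sketch for flagging such text so the affected files can be picked out for re-processing. The names `cid_ratio` and `looks_unmapped` and the 0.5 threshold are hypothetical and are not part of paperless-ng.

```python
import re

# Matches placeholder tokens such as "(cid:23)".
CID_TOKEN = re.compile(r"\(cid:\d+\)")


def cid_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that sit inside (cid:N) tokens."""
    stripped = re.sub(r"\s+", "", text)
    if not stripped:
        return 0.0
    cid_chars = sum(len(m.group(0)) for m in CID_TOKEN.finditer(stripped))
    return cid_chars / len(stripped)


def looks_unmapped(text: str, threshold: float = 0.5) -> bool:
    """Heuristic: treat the text as unusable when most of it is (cid:N) tokens.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return cid_ratio(text) >= threshold


if __name__ == "__main__":
    sample = "(cid:2)(cid:3)(cid:4) (cid:9)(cid:19)"
    print(looks_unmapped(sample))
    print(looks_unmapped("Invoice total: 42.10 EUR"))
```

A check like this only detects the symptom. It does not repair the PDF; the text would still have to be recovered another way, for example by running OCR on the rendered pages.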
I figure that talking to the company will not solve anything, but... does anyone know what I can do to solve this?
I have attached a screenshot that shows the problem.
Thanks to all for your time!!
![screenshot](https://user-images.githubusercontent.com/78075239/132848102-5877a7e0-3f03-4292-b284-40d64c3c6330.jpg)
*** UPDATE ***
Sorry, I made a mistake. The PDF file didn't go directly from the email to paperless. In this case, I received a lot of invoices in a single PDF file and used the macOS Preview app to split it into separate files, one per invoice.
So I thought... well, maybe if I go to my supplier's webpage and download one invoice from the customer panel... without touching anything... MAYBE it will work. But no. It throws a different error.