Double characters/strange values in table #626
jakobdo
started this conversation in
Ask for help with specific PDFs
Replies: 1 comment 2 replies
-
Some of the values can be corrected with `row[0][1::2]`, but this does not work perfectly for every column. So I guess this is a "buggy" PDF, but I really hope one of you PDF experts has a bulletproof solution for this issue. :)
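To illustrate why the slicing trick only partly works, here is a minimal sketch applied to values quoted in the original post. `split_interleaved` is a hypothetical helper name, not part of any library. It assumes the characters of the two overlapping cells alternate strictly, which holds for the fixed-width date and ID cells but apparently not for the free-text columns.

```python
def split_interleaved(value):
    """Split a string whose characters strictly alternate between two sources.

    Hypothetical helper: returns (even-indexed chars, odd-indexed chars).
    """
    return value[0::2], value[1::2]


# Cells where both overlapping values have the same length and glyph widths
# de-interleave cleanly:
print(split_interleaved("TT..0116"))              # ('T.01', 'T.16')
print(split_interleaved("0210..0037..22001166"))  # ('01.03.2016', '20.07.2016')

# Free-text cells do not alternate strictly (the two strings differ in
# length and character widths), so the same slicing yields garbage:
print(split_interleaved("SStaytsiotenm w toerskts"))
```

So slicing is only safe for columns where both underlying values have identical layout. For the other columns the two text layers would need to be separated before the table is extracted.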
-
Hello again, in this PDF: https://www.taggmbh.at/fileadmin/content/TAG-Website-Content-SM/2016_Maintenance_list_PROD_PDF.pdf
When I extract the tables on page 2, I get the following values from one of the rows:
00:'TT..0116'
01:'SStaytsiotenm w toerskts'
02:'MCSS AWrneoiteldnsdteoirnf'
03:'0210..0037..22001166'
04:'0068::0000 hh'
05:'299'
06:'0280..0057..22001166'
07:'0165::0000 hh'
08:'1289'
09:'678 h doauyrss'
etc...
In the PDF it looks like:
![image](https://user-images.githubusercontent.com/278270/158545749-8aad8cb2-31c2-42a5-9310-5f6cc70fcda3.png)
Can I fix this issue somehow, or is this PDF just buggy?
It seems like line 1 from page 1 and line 1 from page 2 are mixed together.
Page 1 line 1:
![image](https://user-images.githubusercontent.com/278270/158546504-b48099be-58cd-40dd-89b5-87f1c91f8cf8.png)
Page 2 line 1:
![image](https://user-images.githubusercontent.com/278270/158546618-8e86ce2d-8b51-41cc-9a0c-ff973f6be643.png)
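One way to check the "two lines mixed together" suspicion is to look for characters drawn on top of each other. The sketch below is only a diagnostic under stated assumptions: `find_overlapping_chars` is a hypothetical helper, the sample data is made up, and the input is assumed to be a list of dicts with `text`, `x0`, `x1` and `top` keys, as in pdfplumber's `page.chars`.

```python
from collections import defaultdict


def find_overlapping_chars(chars, y_tolerance=1.0):
    """Return (left, right) text pairs for neighbouring characters on the
    same row whose horizontal extents overlap.

    Hypothetical diagnostic helper; `chars` is a list of dicts with
    "text", "x0", "x1" and "top" keys.
    """
    # Bucket characters into rows by their rounded vertical position.
    rows = defaultdict(list)
    for ch in chars:
        rows[round(ch["top"] / y_tolerance)].append(ch)

    overlaps = []
    for row in rows.values():
        row.sort(key=lambda c: c["x0"])
        # A character starting before its left neighbour ends is an overlap.
        for left, right in zip(row, row[1:]):
            if right["x0"] < left["x1"]:
                overlaps.append((left["text"], right["text"]))
    return overlaps


# Made-up sample: two "T" glyphs at nearly the same position, then a ".".
sample = [
    {"text": "T", "x0": 10.0, "x1": 15.0, "top": 100.0},
    {"text": "T", "x0": 10.2, "x1": 15.2, "top": 100.1},
    {"text": ".", "x0": 16.0, "x1": 18.0, "top": 100.0},
]
print(find_overlapping_chars(sample))  # [('T', 'T')]
```

On the real file the input would be something like `pdf.pages[1].chars` after opening the document with `pdfplumber.open(...)`. Many overlapping pairs in one row would support the idea that a second text layer sits under the visible one. Tight kerning can also produce small overlaps, so the result is a hint and not proof.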