Fix the bug where Python scripts fail to execute PDF text recognition… #11994
Conversation
guangyunms
commented
Apr 24, 2024
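- Fix the bug where Python scripts fail to run PDF text recognition tasks, and improve the logic that decides whether a file is a PDF.
- Add an example to the layout analysis quickstart document.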
… tasks, optimize the logic of judging PDF files, and add cases to the quickstart document for layout analysis.
Thanks for your contribution!
The original PR was #11984. Its commit messages were too messy, so I recreated the PR.
Lines 845 to 960 in 00f0d42
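The relevant logic already exists here. How does what the PR adds differ from the existing code?
The layout analysis examples in the quickstart document indeed provide no demo for PDF input. I think the relevant parts of main could be pulled out and adapted into a PDF demo.
My contribution is based on the code here. The difference is that the existing code runs from the command line, whereas my contribution runs from a Python script. Developers may be more used to the Python script approach.
Indeed. I have written a demo based on the existing examples in the quickstart document.
main first parses the PDF file into individual images and then processes each image. It may not be necessary to pass the PDF directly to the PPStructure engine; the demo only needs to parse the PDF first and then process the images. That is the smallest change, and it also resolves the users' confusion.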
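The "parse first, then process each page" flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: `run_on_pages` and `fake_engine` are hypothetical names, and the pages are placeholder strings. In a real demo each page would be an image rendered from the PDF and the engine would be a `PPStructure` instance.

```python
from typing import Any, Callable, List


def run_on_pages(pages: List[Any], engine: Callable[[Any], list]) -> List[list]:
    """Call a single-image engine once per page and collect results in page order."""
    results = []
    for page in pages:
        # The engine only ever sees one page at a time, never the whole PDF.
        results.append(engine(page))
    return results


def fake_engine(page):
    # Hypothetical stand-in for a layout analysis engine; returns one region per page.
    return [{"type": "text", "page": page}]


per_page = run_on_pages(["page-0", "page-1"], fake_engine)
```

With this shape the engine itself needs no PDF awareness, which is why the change to the demo stays small.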
ppstructure/docs/quickstart.md
@@ -189,7 +190,29 @@ im_show.save('result.jpg') | |||
``` | |||
|
|||
<a name="223"></a> | |||
#### 2.2.3 Layout analysis | |||
#### 2.2.3 Layout analysis + text recognition |
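Layout analysis should still be covered here as well; just merge this directly into the layout analysis section.
That works too. I think both approaches could be documented. One passes the PDF directly: the command-line usage in the existing docs already takes a PDF file path, so users may find this more intuitive. The other has users parse the PDF into images themselves and then process the images. Once it is written, I will merge it into the earlier layout analysis section. What do you think?
@guangyunms I took a look: the OCR part does support PDF inference, so this change is reasonable. Go ahead with your approach.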
# for infer pdf file | ||
if isinstance(img, list): | ||
if isinstance(img, list) and flag_pdf: |
Would this change affect GIF handling?
Is it really necessary to return flag_gif and flag_pdf? Checking whether it is a list here should also achieve the goal.
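- I tested it, and it does not affect GIFs. From the code, GIF files are detected only because they are read differently from ordinary images; the subsequent processing should be the same, so they do not hit the multi-page errors that PDFs do.
- Yes, checking whether it is a list currently also achieves the goal, but deciding by data type alone does not feel robust. Since the code already has flag_pdf as a criterion, adding it to the condition fits the definition of the related functions and the code logic better, and it will not affect later design.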
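To make the second point concrete, here is a minimal sketch of the combined check. The function `call_structure` and the helper `describe` are hypothetical and exist only for illustration; only the condition `isinstance(img, list) and flag_pdf` comes from the diff under review.

```python
def call_structure(img, flag_pdf, process_one):
    """Hypothetical dispatcher mirroring the condition in the diff under review."""
    # Iterate page by page only when the input is a list AND it came from a PDF.
    if isinstance(img, list) and flag_pdf:
        return [process_one(page) for page in img]
    # Anything else is treated as a single image.
    return process_one(img)


def describe(x):
    # Placeholder for the real per-image processing.
    return f"processed {x!r}"


pdf_result = call_structure(["p0", "p1"], True, describe)
single_result = call_structure("img", False, describe)
```

The explicit flag ties the multi-page branch to how the file was read rather than to the container type alone.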
return res_list | ||
res, _ = super().__call__(img, return_ocr_result_in_table, img_idx=img_idx) | ||
return res |
The return type has changed here. Could this confuse users? I suggest handling it the way the OCR part does.
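On whether the changed return type could confuse users: the OCR part handles this by leaving img unchanged if it is a list and otherwise wrapping it in a list, so everything ends up as a list. The PPStructure class, however, is defined differently from OCR and appears to be designed to return the result for a single page. The main function confirms this: the command-line path currently calls PPStructure expecting a single value. Handling it the OCR way would inevitably mean changing main, so I think it is better to leave it alone for now. You may have other plans for how this should be written later, so I am trying not to change the existing behavior.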
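The two return conventions contrasted in this reply can be sketched as follows. Both functions are hypothetical simplifications written for illustration; they are not the actual `PaddleOCR` or `PPStructure` implementations.

```python
def ocr_style(img, recognize):
    """OCR-style convention as described in this reply: normalize to a list, return a list."""
    imgs = img if isinstance(img, list) else [img]
    return [recognize(i) for i in imgs]


def structure_style(img, recognize):
    """Single-page convention: one input, one result, no list wrapper."""
    return recognize(img)


wrapped = ocr_style("a", str.upper)
unwrapped = structure_style("a", str.upper)
```

A caller written against the single-page convention would break if it suddenly received a list, which is the compatibility concern raised about main.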
paddleocr.py
@@ -561,6 +561,7 @@ def check_img(img, alpha_color=(255, 255, 255)): | |||
alpha_color: Background color in images in RGBA format | |||
return: numpy.array (h, w, 3) |
The description of this return type also needs to be updated.
Updated.
I tested both demos, and they work correctly.
LGTM
@guangyunms Thanks for your contribution! You will receive a beautiful PaddlePaddle gift. Please provide your mailing address by filling out the following questionnaire before October 18th. Looking forward to the future, we will walk further together in the world of open source! |
hi, @guangyunms
|