Splitting into paragraphs
#17
Replies: 1 comment
-
Looks great. A quick question: how did you do the cross-column merging?
-
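1. Organizing PDF segments by heading level makes sense. The hard part is recognizing the headings: once the headings and their levels are found, the segmentation follows naturally.
2. Layout models scan one page at a time, so they miss paragraphs and tables that continue across pages or across the columns of a two-column layout. I suggest stitching the detection boxes across columns and pages first, and then running recognition.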
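For point 1, a minimal sketch of how segmentation could follow once headings are known. It assumes the blocks are already in reading order and already tagged as `title` or `text` with a heading level; the `Section` class, the `group_by_headings` function, and the tuple layout are hypothetical names for illustration, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    level: int
    paragraphs: list = field(default_factory=list)


def group_by_headings(blocks):
    """Group body text under the most recent heading.

    blocks: iterable of (kind, level, text) in reading order, where kind is
    'title' or 'text' (assumed labels; real layout models may name them differently).
    """
    # Section with an empty title collects any text that precedes the first heading.
    sections = [Section(title="", level=0)]
    for kind, level, text in blocks:
        if kind == "title":
            sections.append(Section(title=text, level=level))
        else:
            sections[-1].paragraphs.append(text)
    # Drop the preamble section if nothing landed in it.
    return [s for s in sections if s.title or s.paragraphs]
```

The heading level is kept on each section so a caller can rebuild the outline tree afterwards if needed.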
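Here is my attempt at segmentation with RapidLayout: I merged across columns first, then across pages.
Original detection:
After cross-column merging:
After cross-page merging: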
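A rough sketch of one way the box stitching could work, not necessarily what I actually ran. It assumes a two-column page, boxes given as `(x0, y0, x1, y1)` with the origin at the top left, and the labels `text` and `table`; the `Box` class, the function names, and the `margin` threshold are all hypothetical. The heuristic: when reading order moves on to the next column or page, and the previous box ends near the bottom while the next one starts near the top with the same label, treat them as one block.

```python
from dataclasses import dataclass


@dataclass
class Box:
    page: int
    x0: float
    y0: float
    x1: float
    y1: float
    label: str


def column_of(box, page_width):
    # Assumption: two columns, assigned by the horizontal center of the box.
    return 0 if (box.x0 + box.x1) / 2 < page_width / 2 else 1


def merge_fragments(boxes, page_width, page_height, margin=0.15):
    """Return groups of boxes; a group with several boxes is one stitched block."""
    # Reading order: page, then left column before right, then top to bottom.
    ordered = sorted(
        boxes, key=lambda b: (b.page, column_of(b, page_width), b.y0)
    )
    groups = []
    for box in ordered:
        if groups:
            prev = groups[-1][-1]
            same_label = prev.label == box.label and box.label in ("text", "table")
            prev_at_bottom = prev.y1 > page_height * (1 - margin)
            box_at_top = box.y0 < page_height * margin
            moved_on = (prev.page, column_of(prev, page_width)) != (
                box.page,
                column_of(box, page_width),
            )
            if same_label and prev_at_bottom and box_at_top and moved_on:
                groups[-1].append(box)  # continuation across a column or page
                continue
        groups.append([box])
    return groups
```

Each group can then be cropped and passed to recognition as a single unit. Full-width elements such as a title spanning both columns are not handled here and would need their own rule.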