Words are missed when detecting with word_detection_totaltext.pth #24

Open
anderuier opened this issue Feb 5, 2025 · 3 comments

Comments

@anderuier

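I ran python demo_text_detection.py --checkpoint pretrained_checkpoint/word_detection_totaltext.pth --model-type vit_h --input demo/001.jpg --output demo/ --dataset totaltext on my own images (images that contain only plain text) and found that many words were missed. The first image is missing a few words, and the second is missing most of them. Is this because the model was not trained on plain-text data of this kind?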

Image

Image

@ymy-k
Owner

ymy-k commented Feb 5, 2025

For this kind of data, you should use the Hi-SAM model trained on HierText.

@anderuier
Author

For this kind of data, you should use the Hi-SAM model trained on HierText.

Do you mean running python demo_hisam.py --checkpoint pretrained_checkpoint/hi_sam_h.pth --model-type vit_h --input demo/001.jpg --output demo/ ? The result of that run is shown below:

Image

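I have two more questions: 1. How can I get the positions of all the boxes after segmentation? 2. I want segmentation at the level of individual characters, with a bbox returned for each character. Would I need to build a dataset with per-character position annotations and retrain the model?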

@ymy-k
Copy link
Owner

ymy-k commented Feb 19, 2025

Follow step 1 of README section 2.2 to save the results as jsonl, then visualize them.
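
For the first question (getting box positions), here is a minimal sketch of how boxes could be read back out of a saved jsonl file. It is not code from the Hi-SAM repository. It assumes that each line of the file is one JSON object and that polygons are stored as lists of [x, y] points under a key named "vertices" (as in HierText-style annotations); both are assumptions that should be checked against the actual file. The function names polygon_to_bbox, iter_polygons and load_bboxes are hypothetical helpers introduced only for this illustration.

```python
import json
from typing import Any, Iterator, List, Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max).
BBox = Tuple[float, float, float, float]


def polygon_to_bbox(vertices: List[List[float]]) -> BBox:
    """Return the axis-aligned bounding box of a polygon given as [x, y] points."""
    xs = [p[0] for p in vertices]
    ys = [p[1] for p in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def iter_polygons(node: Any, key: str = "vertices") -> Iterator[List[List[float]]]:
    """Walk a nested JSON structure and yield every non-empty list stored under `key`.

    ASSUMPTION: the key name "vertices" is a guess based on HierText-style
    annotations; pass a different `key` if the saved file uses another name.
    """
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, list) and v:
                yield v
            else:
                yield from iter_polygons(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_polygons(item, key)


def load_bboxes(jsonl_path: str, key: str = "vertices") -> List[BBox]:
    """Read a file with one JSON object per line and collect a bbox per polygon."""
    boxes: List[BBox] = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            boxes.extend(polygon_to_bbox(poly) for poly in iter_polygons(record, key))
    return boxes
```

Because the walk is recursive, it returns a box for every polygon it finds at any nesting level, so if the file stores paragraphs, lines and words together, the result mixes all three levels. Filtering to one level would require knowing the actual structure of the saved file.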
