Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with PaddePaddle framework and PaddleOCR library.
This MangaOCR is inspired by an old project called manga-ocr built by kha-white and other contributors. We admit that although kha-white's manga-ocr model has excellent performance, the size of the model is large (444Mb) and the recognition speed is a bit slow. Thus, this project is built with our goal to provide another end-to-end lightweight model for text recognition in manga. And to achieve performance equivalent to manga-ocr, we also aim to handle a variety of special text types in manga:
- both vertical and horizontal text
- text with furigana
- text overlaid on images
- wide variety of fonts and font styles
- low quality images
Up until now, our model has achieved the following:
- Small model size with only about 8Mb
- Recognize both vertical and horizontal text
- Recognize text overlapping the image
- Recognize with different fonts and font styles
And things that haven't been achieved yet:
- Better recognition results when compared with manga-ocr. Currently, our model has a CER(Character Error Rate) result of about 14.4% ( 2-10% is good for printed text and 10-20% for handwritten text ).
- Full recognition of unusually long text. With texts of unusually long length (usually over 100 characters), the model is able to predict the first 25-50% of characters, then errors gradually appear.
Here are some currently cherry-picked examples showing the capability of the model.
image | Our MangaOCR result |
---|---|
軍のテストフライトセンター | |
つぎの日曜日は | |
成魚」はこれくらい | |
うわああああああああああああ | |
付き合い始めたきっかけだって | |
極東方面に所属不明機を多数確認! | |
永遠が見えるまで駆けぬけていきたい | |
極道やった経験が生かされてるやろ | |
まだダイエットあきらめてなかったわけ? |
This project is currently working with the usage of:
- Manga109-s dataset
- CC-100 dataset