How can I run the Orientation model solely to determine the page rotation angle? #1797
-
Hi Team, FYI: I'm using docTR (0.10.0) with the following configuration to pass an image and retrieve the page orientation. I'm satisfied with the results from this configuration.
Also, is there any guide available for training a custom orientation model? I checked this page, but it doesn't clearly explain how to properly prepare the dataset for training, e.g. what should our images look like? Thanks!!
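(On the dataset question: as far as I know docTR doesn't mandate a specific layout, but a common way to build a page-orientation classification set is to start from upright pages and generate the four rotated variants yourself. Below is a minimal numpy sketch; the function name and the 4-angle class set `{0, 90, 180, 270}` are my assumptions, not an official docTR format — check the classification training scripts in the docTR repo for the exact expected layout.)

```python
import numpy as np

ANGLES = (0, 90, 180, 270)  # assumed orientation classes, one label per rotated copy

def make_rotation_samples(page: np.ndarray):
    """Yield (rotated_page, angle_label) pairs for one upright page image."""
    for angle in ANGLES:
        # np.rot90 rotates counter-clockwise in 90-degree steps
        yield np.rot90(page, k=angle // 90), angle

# Usage: feed each (image, label) pair to your classification training pipeline
page = np.zeros((4, 6))  # stand-in for an upright page image
samples = list(make_rotation_samples(page))
```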
-
Hi @sanjay-nit 👋,

That's right, our orientation prediction depends on the detection model. Here is a short snippet showing how you could use it without the recognition part:

```python
import requests
import numpy as np

from doctr.io import DocumentFile
from doctr.models import detection_predictor, page_orientation_predictor
from doctr.models._utils import estimate_orientation

url = "https://www.francetvinfo.fr/pictures/uGwaNE-aJq7zHLhZJdzdCd9nyjE/1200x900/2021/03/16/phpCDwGn0.jpg"

det_predictor = detection_predictor(
    arch="fast_base",
    pretrained=True,
    assume_straight_pages=False,
)  # .cuda().half()  # Uncomment if running on GPU
page_orient_predictor = page_orientation_predictor(pretrained=True)  # .cuda().half()  # Uncomment if running on GPU

# Tune the detection postprocessor thresholds
det_predictor.model.postprocessor.bin_thresh = 0.3
det_predictor.model.postprocessor.box_thresh = 0.65

docs = DocumentFile.from_images([requests.get(url).content])

# Run detection and keep the raw output probability maps
loc_preds, out_maps = det_predictor(docs, return_maps=True)

# Binarize the output maps into uint8 segmentation masks
seg_maps = [
    np.where(out_map > det_predictor.model.postprocessor.bin_thresh, 255, 0).astype(np.uint8)
    for out_map in out_maps
]

# Coarse page orientation with confidence, one entry per page
_, classes, probs = page_orient_predictor(docs)
page_orientations = list(zip(classes, probs))

# Refine the angle estimate from the segmentation masks
origin_pages_orientations = [
    estimate_orientation(seg_map, general_orientation)
    for seg_map, general_orientation in zip(seg_maps, page_orientations)
]

orientations = [
    {"value": orientation, "confidence": prob}
    for orientation, prob in zip(origin_pages_orientations, probs)
]
print(orientations)
```
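As an aside, the binarization step in the snippet can be tried in isolation with a toy probability map (a self-contained numpy sketch; the values are made up, not real detector output):

```python
import numpy as np

# Toy probability map standing in for one of the detector's out_maps
out_map = np.array([[0.1, 0.4], [0.7, 0.2]])
bin_thresh = 0.3  # same threshold set on the detection postprocessor above

# Pixels above the threshold become 255, the rest 0
seg_map = np.where(out_map > bin_thresh, 255, 0).astype(np.uint8)
print(seg_map.tolist())  # [[0, 255], [255, 0]]
```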