[FEA]: Add ability to detect bounding boxes for compound images during extraction #353

drobison00 · 2025-01-20T18:44:05Z

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Significant improvement

Please provide a clear description of problem this feature solves

Description

During the extraction phase for PDF, PPTX, and DOCX files, we often encounter a number of small images that collectively make up a single compound image. Currently, we treat each of these images independently and return each as its own primitive in the extraction results.

We want to improve this behavior and make it configurable, so that users can either preserve the existing behavior or instruct the extraction phase to attempt to identify a single bounding box for each set of connected images.

Note

We do not know a priori whether a collection of images forms a single compound image.
There may be multiple distinct collections on a page.

Describe the feature, and optionally a solution or implementation and any alternatives

This issue aims to define an approach to:

Identify clusters of bounding boxes that belong to the same compound image.
Compute the overall bounding box for each identified compound image.

Required Behavior

Group bounding boxes that belong to the same compound image.
Handle scenarios where bounding boxes may be close but not necessarily overlapping.
Compute a minimal bounding box that encapsulates all grouped bounding boxes.
Support configurable proximity thresholds for clustering.
Ensure that detected table/chart images are not included in connected component bounding boxes.
Output detected compound image bounding boxes.
Provide visualization for debugging grouped bounding boxes.
Perform efficiently on large collections of bounding boxes.
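To make the configurability and exclusion requirements concrete, here is a minimal sketch of what the options could look like. Everything in it is an assumption for illustration: `CompoundImageConfig`, its field names and defaults, and the idea that each extracted primitive is a dict with a `type` key are all hypothetical, not the existing extraction API.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List


@dataclass(frozen=True)
class CompoundImageConfig:
    """Hypothetical options; names and defaults are illustrative only."""
    enabled: bool = False               # False preserves the existing behavior
    proximity_threshold: float = 10.0   # max gap between boxes, in page units
    excluded_types: FrozenSet[str] = frozenset({"table", "chart"})
    visualize: bool = False             # emit debug overlays when True


def select_candidates(
    primitives: List[Dict[str, Any]], config: CompoundImageConfig
) -> List[Dict[str, Any]]:
    """Return the primitives that are eligible for grouping.

    Assumes each primitive is a dict with a "type" key (hypothetical schema).
    Table/chart primitives are dropped so they never join a compound box.
    """
    if not config.enabled:
        return []
    return [p for p in primitives if p.get("type") not in config.excluded_types]
```

With `enabled=False` nothing is selected for grouping, so every image would still be returned as its own primitive, as it is today.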

Example Approach: Bounding Box Expansion

Steps:

Initialization:
- Identify all bounding boxes on a page.
- Exclude table/chart bounding boxes from processing.
Expansion:
- Expand each bounding box by a configurable margin.
- Merge overlapping or adjacent bounding boxes iteratively.
Refinement:
- Apply post-processing to fine-tune the final bounding box.
- Remove potential over-grouping by applying size and aspect ratio constraints.
Output:
- Store and visualize final compound bounding boxes.
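The expansion and merge steps above could be sketched roughly as follows. This is one possible reading, not a definitive implementation: it treats "merge iteratively" as finding connected components (via union-find) over the expanded boxes, assumes boxes are `(x0, y0, x1, y1)` tuples with table/chart boxes already removed, and skips the refinement step. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1); assumed layout


def expand(box: Box, margin: float) -> Box:
    """Grow a box by `margin` on every side."""
    x0, y0, x1, y1 = box
    return (x0 - margin, y0 - margin, x1 + margin, y1 + margin)


def intersects(a: Box, b: Box) -> bool:
    """True if two boxes overlap or touch (touching counts as adjacent)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def group_boxes(boxes: List[Box], margin: float) -> List[List[int]]:
    """Return index groups whose expanded boxes are transitively connected."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    expanded = [expand(b, margin) for b in boxes]
    # Pairwise check is O(n^2); fine for a sketch, not for very large pages.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if intersects(expanded[i], expanded[j]):
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def compound_boxes(boxes: List[Box], margin: float) -> List[Box]:
    """Minimal enclosing box of the original (unexpanded) boxes per group."""
    out = []
    for members in group_boxes(boxes, margin):
        out.append((
            min(boxes[i][0] for i in members),
            min(boxes[i][1] for i in members),
            max(boxes[i][2] for i in members),
            max(boxes[i][3] for i in members),
        ))
    return out
```

Because every box is expanded by `margin`, two boxes end up linked when the gap between them is at most `2 * margin` on both axes. The final box is computed from the original coordinates so the margin does not inflate the result.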

Example Scenarios

Scenario 1: Adjacent Boxes Forming a Compound Image

graph TD
    subgraph Compound Image 1
        A[Box 1] -->|Close to| B[Box 2]
        B -->|Close to| C[Box 3]
    end
    subgraph Compound Image 2
        D[Box 4] -- No Connection --> E[Box 5]
    end

Expected output:

Compound Image 1: { Box 1, Box 2, Box 3 }
Compound Image 2: { Box 4, Box 5 }

Scenario 2: Distant Boxes Forming Separate Images

graph TD
    A[Box 1] -- Far --> B[Box 2]
    subgraph Compound Image 1
        C[Box 3] -->|Close to| D[Box 4]
    end

Expected output:

Compound Image 1: { Box 3, Box 4 }
Compound Image 2: { Box 1 }
Compound Image 3: { Box 2 }
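As a worked illustration of Scenario 2, the "close" and "far" relations could be expressed with a gap-based predicate like the one below. The coordinates and the threshold of 10 are made up for the example, and `box_gap` / `is_close` are hypothetical helper names. Boxes are assumed to be `(x0, y0, x1, y1)` tuples.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1); assumed layout


def axis_gap(a_lo: float, a_hi: float, b_lo: float, b_hi: float) -> float:
    """Separation of two intervals; zero when they overlap or touch."""
    return max(0.0, b_lo - a_hi, a_lo - b_hi)


def box_gap(a: Box, b: Box) -> float:
    """Larger of the horizontal and vertical separations between two boxes."""
    return max(
        axis_gap(a[0], a[2], b[0], b[2]),
        axis_gap(a[1], a[3], b[1], b[3]),
    )


def is_close(a: Box, b: Box, threshold: float) -> bool:
    """True when the boxes are within `threshold` of each other on both axes."""
    return box_gap(a, b) <= threshold


# Hypothetical coordinates mirroring Scenario 2.
box1 = (0, 0, 50, 50)
box2 = (400, 0, 450, 50)      # far to the right of Box 1
box3 = (0, 200, 50, 250)
box4 = (55, 200, 105, 250)    # 5 units to the right of Box 3

threshold = 10  # illustrative value
print(is_close(box3, box4, threshold))  # Box 3 and Box 4: gap of 5
print(is_close(box1, box2, threshold))  # Box 1 and Box 2: gap of 350
```

Taking the larger of the two axis gaps gives the same result as expanding each box by half the threshold and testing for intersection, so this predicate is compatible with the expansion approach described earlier.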

Scenario 3: Overlapping Boxes

graph TD
    subgraph Compound Image 1
        A[Box 1] -->|Overlapping| B[Box 2]
        B -->|Overlapping| C[Box 3]
    end

Expected output:

Compound Image 1: { Box 1, Box 2, Box 3 }

Acceptance Criteria

Bounding boxes are correctly grouped based on proximity and overlap.
The correct minimal bounding box is computed for each detected cluster.
Performance remains efficient with increasing numbers of bounding boxes.
Clustering and visualization options are configurable.
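For the performance criterion, one option (an assumption on my part, not something this issue prescribes) is to avoid comparing every pair of boxes by sorting on the left edge and stopping early once later boxes start too far to the right. The sketch below only generates candidate pairs that are horizontally within the threshold; the caller would still need to check the vertical gap. `candidate_pairs` is a hypothetical name, and boxes are assumed to be `(x0, y0, x1, y1)` tuples.

```python
from typing import Iterator, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1); assumed layout


def candidate_pairs(
    boxes: List[Box], threshold: float
) -> Iterator[Tuple[int, int]]:
    """Yield index pairs (i < j) whose horizontal gap is at most `threshold`.

    Boxes are visited in order of their left edge. For a given box, once a
    later box starts beyond `x1 + threshold`, every box after it does too,
    so the inner loop can stop.
    """
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0])
    for pos, i in enumerate(order):
        reach = boxes[i][2] + threshold
        for k in range(pos + 1, len(order)):
            j = order[k]
            if boxes[j][0] > reach:
                break  # all remaining boxes start even further right
            yield (i, j) if i < j else (j, i)
```

This helps most when boxes are spread across the page. If many boxes overlap horizontally (for example, a tall column of images), the number of candidates can still grow quadratically, so a grid or spatial index may be worth considering.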

Additional context

drobison00 added the "feature request" label (New feature or request) Jan 20, 2025

drobison00 assigned edknv Jan 20, 2025

edknv mentioned this issue Feb 5, 2025

Add ability to group collections of image bounding boxes #386

Open

3 tasks
