Merge pull request #388 from Living-with-machines/ocr

Add text spotting
maps-as-data · Apr 29, 2024 · f256ae9 · f256ae9
2 parents 186f219 + 6237341
commit f256ae9
Show file tree

Hide file tree

Showing 13 changed files with 3,805 additions and 2 deletions.
diff --git a/.gitignore b/.gitignore
@@ -11,3 +11,4 @@ docs/source/api/mapreader/
 dist/
 .coverage
 .coverage.*
+build/
diff --git a/docs/source/Install.rst b/docs/source/Install.rst
@@ -31,7 +31,7 @@ Once you have installed either Ananconda or miniconda, open your terminal and us
 
    .. code-block:: bash
 
-      conda create -n mapreader
+      conda create -n mapreader python=3.10
 
    This will create a conda enviroment for you to install MapReader and its dependencies into.
 

diff --git a/docs/source/User-guide/Spot-text.rst b/docs/source/User-guide/Spot-text.rst
@@ -0,0 +1,227 @@
+Spot text
+=========
+
+MapReader now contains two new methods of spotting text on maps:
+
+- ``DPTextDETRRunner`` - This is used to detect text on maps using `DPTextDETR <https://github.com/ymy-k/DPText-DETR/tree/main>`__ and outputs bounding boxes and scores.
+- ``DeepSoloRunner`` - This is used to detect and recognise text on maps using `DeepSolo <https://github.com/ViTAE-Transformer/DeepSolo/tree/main>`__ and outputs bounding boxes, text and scores.
+
+Install dependencies
+--------------------
+
+To run these, you will need to install the required dependencies.
+
+.. note::
+
+    We have our own forks of these repos to enable them to work on CPU!
+
+Detectron2
+~~~~~~~~~~~
+
+Detectron2 is a popular object detection library built by Facebook AI Research.
+The main repo is available `here <https://github.com/facebookresearch/detectron2>`__.
+
+To install, run the following commands in your terminal:
+
+.. code:: bash
+
+    git clone https://github.com/facebookresearch/detectron2.git
+    cd detectron2
+    pip install .
+
+
+.. note::
+
+    Since both the DPText-DETR and DeepSolo repos are built ontop of `AdelaiDet <https://github.com/aim-uofa/AdelaiDet>`__, you won't be able to install both at the same. To get around this, you can set up two different conda environments, one for each.
+
+You should then pick one of the following to install:
+
+DPTextDETR
+~~~~~~~~~~~
+
+Our fork for DPText-DETR is available `here <https://github.com/rwood-97/DPText-DETR>`__.
+
+To install, run the following commands in your terminal:
+
+.. code:: bash
+
+    git clone https://github.com/rwood-97/DPText-DETR.git
+    cd DPText-DETR
+    pip install .
+
+DeepSolo
+~~~~~~~~
+
+Our fork for DeepSolo is available `here <https://github.com/rwood-97/DeepSolo>`__
+
+To install, run the following commands in your terminal:
+
+.. code:: bash
+
+    git clone https://github.com/rwood-97/DeepSolo.git
+    cd DeepSolo
+    pip install .
+
+
+Set-up the runner
+-----------------
+
+Once you have installed the dependencies, you can set up your chosen "runner".
+
+You will need to choose a model configuration and download the corresponding model weights.
+
+- Config files can be found in the model repositories under the ``configs`` directory.
+- Weights files should be downloaded from the github repositories (links to the downloads are in the README).
+
+e.g. for the ``DPTextDETRRunner``, if you choose the "ArT/R_50_poly.yaml", you should download the "art_final.pth" model weights file from the DPTextDETR repo.
+
+e.g. for the ``DeepSoloRunner``, if you choose the "R_50/IC15/finetune_150k_tt_mlt_13_15_textocr.yaml", you should download the "ic15_res50_finetune_synth-tt-mlt-13-15-textocr.pth" model weights file from the DeepSolo repo.
+
+You will also need to load your patch and parent dataframes.
+Assuming you have saved them, as shown in the :doc:`Load </User-guide/Load>` user guide, you can load them like so:
+
+.. code-block:: python
+
+    import pandas as pd
+
+    patch_df = pd.read_csv("patch_df.csv")
+    parent_df = pd.read_csv("parent_df.csv")
+
+You can then instantiate your runner.
+
+For the DPTextDETRRunner, use:
+
+.. code-block:: python
+
+    from map_reader.spot_text import DPTextDETRRunner
+
+    my_runner = DeepSoloRunner(
+        patch_df,
+        parent_df,
+        cfg_file = "DPText-DETR/configs/DPText_DETR/ArT/R_50_poly.yaml",
+        weights_file = "./art_final.pth",
+    )
+
+or, for the DeepSoloRunner, use:
+
+.. code-block:: python
+
+    from map_reader.spot_text import DeepSoloRunner
+
+    my_runner = DeepSoloRunner(
+        patch_df,
+        parent_df,
+        cfg_file = "DeepSolo/configs/R_50/IC15/finetune_150k_tt_mlt_13_15_textocr.yaml",
+        weights_file = "./ic15_res50_finetune_synth-tt-mlt-13-15-textocr.pth"
+    )
+
+You'll need to adjust the paths to the config and weights files to match your own set-up.
+
+Run the runner
+--------------
+
+You can then run the runner on all patches in your patch dataframe:
+
+.. code-block:: python
+
+    patch_preds = my_runner.run_all()
+
+By default, this will return a dictionary containing all the predictions for each patch.
+If you'd like to return a dataframe instead, use the ``return_dataframe`` argument:
+
+.. code-block:: python
+
+    patch_preds_df = my_runner.run_all(return_dataframe=True)
+
+If you'd like to run the runner on a single patch, you can also just run on one image:
+
+.. code-block:: python
+
+    patch_preds = my_runner.run_on_image("path/to/your/image.png")
+
+Again, this will return a dictionary by default but you can use the ``return_dataframe`` argument to return a dataframe instead.
+
+To view the patch predictions, you can use the ``show`` method.
+This takes an image ID as an argument, and will show you all the predictions for that image:
+
+.. code-block:: python
+
+    #EXAMPLE
+    my_runner.show(
+        "patch-0-0-1000-1000-#map_74488689.png#.png"
+    )
+
+By default, this will show the image with the bounding boxes drawn on in red and text in blue.
+You can change these by setting the ``border_color`` and ``text_color`` arguments:
+
+.. code-block:: python
+
+    my_runner.show(
+        "patch-0-0-1000-1000-#map_74488689.png#.png",
+        border_color = "green",
+        text_color = "yellow",
+    )
+
+You can also change the size of the figure with the ``figsize`` argument.
+
+
+Scale-up to whole map
+---------------------
+
+Once you've got your patch-level predictions, you can scale these up to the parent image using the ``convert_to_parent_pixel_bounds`` method:
+
+.. code-block:: python
+
+    parent_preds = my_runner.convert_to_parent_pixel_bounds()
+
+This will return a dictionary containing the predictions for the parent image.
+If you'd like to return a dataframe instead, use the ``return_dataframe`` argument:
+
+.. code-block:: python
+
+    parent_preds_df = my_runner.convert_to_parent_pixel_bounds(return_dataframe=True)
+
+Again, to view the predictions, you can use the ``show`` method.
+You should pass a parent image ID as the ``image_id`` argument:
+
+.. code-block:: python
+
+    #EXAMPLE
+    my_runner.show(
+        "map_74488689.png"
+    )
+
+As above, use the ``border_color``, ``text_color`` and ``figsize`` arguments to customize the appearance of the image.
+
+.. code-block:: python
+
+    my_runner.show(
+        "map_74488689.png",
+        border_color = "green",
+        text_color = "yellow",
+        figsize = (20, 20),
+    )
+
+
+You can then save these predictions to a csv file:
+
+.. code-block:: python
+
+    parent_preds_df.to_csv("text_preds.csv")
+
+Geo-reference
+-------------
+
+If you maps are georeferenced in your ``parent_df``, you can also convert the pixel bounds to georeferenced coordinates using the ``convert_to_coords`` method:
+
+.. code-block:: python
+
+    geo_preds_df = my_runner.convert_to_coords(return_dataframe=True)
+
+Again, you can save these to a csv file as above, or, you can save them to a geojson file for loading into GIS software:
+
+.. code-block:: python
+
+    my_runner.save_to_geojson("text_preds.geojson")
+
+This will save the predictions to a geojson file, with each text prediction as a separate feature.
diff --git a/docs/source/User-guide/User-guide.rst b/docs/source/User-guide/User-guide.rst
@@ -20,3 +20,4 @@ Please read this User Guide **before** looking through the worked examples.
    Annotate
    Classify/Classify
    Post-process
+   Spot-text
diff --git a/mapreader/__init__.py b/mapreader/__init__.py
@@ -12,6 +12,16 @@
 from mapreader.classify.classifier import ClassifierContainer
 from mapreader.classify import custom_models
 
+try:
+    from mapreader.spot_text.deepsolo_runner import DeepSoloRunner
+except ImportError:
+    pass
+
+try:
+    from mapreader.spot_text.dptext_detr_runner import DPTextDETRRunner
+except ImportError:
+    pass
+
 from mapreader.process import process
 
 from mapreader.annotate.annotator import Annotator

diff --git a/mapreader/load/images.py b/mapreader/load/images.py
@@ -1753,7 +1753,7 @@ def show(
                         counter=-1,
                     )
 
-                ax.set_title(image_id)
+                ax.set_title(parent_id)
                 figures.append(fig)
 
                 if column_to_plot and plot_histogram:

diff --git a/mapreader/spot_text/__init__.py b/mapreader/spot_text/__init__.py
-Original file line number
+Diff line change
@@ Expand Up / @@ -11,3 +11,4 @@ docs/source/api/mapreader/ @@
     dist/
     .coverage
     .coverage.*
+    build/