AgropontosRegex

Agropontos Regex is a small Python program that extracts geolocation coordinates from PDF files, e.g. rural property registration documents.

It works with two types of coordinates, UTM and Lat-Long, and generates a CSV file that can be imported directly into GIS software such as QGIS.
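As a rough illustration of the idea, here is a minimal sketch of pulling both coordinate types out of OCR text with regular expressions and writing them to CSV. The text layouts, the patterns, the CSV column names and the helper names (`extract_points`, `points_to_csv`) are assumptions made for this example; they are not taken from the program's source, and real documents will likely need different patterns.

```python
import csv
import io
import re

# Assumed UTM layout: "E 512345.67 m N 7412345.89 m" (comma or dot decimals).
UTM_PATTERN = re.compile(
    r"E\s*(?P<east>\d{6}(?:[.,]\d+)?)\s*m?\W+N\s*(?P<north>\d{7}(?:[.,]\d+)?)"
)

# Assumed Lat-Long layout: decimal degrees, "lat, lon", e.g. "-23.550520, -46.633308".
LATLON_PATTERN = re.compile(
    r"(?<![\d.])(?P<lat>-?\d{1,2}\.\d+)\s*[,;]\s*(?P<lon>-?\d{1,3}\.\d+)"
)


def to_float(number: str) -> float:
    """Convert a number that may use a decimal comma into a float."""
    return float(number.replace(",", "."))


def extract_points(text: str) -> list:
    """Return (type, x, y) tuples for every coordinate pair found in text."""
    points = []
    for match in UTM_PATTERN.finditer(text):
        points.append(("utm", to_float(match["east"]), to_float(match["north"])))
    for match in LATLON_PATTERN.finditer(text):
        # GIS tools usually expect x = longitude, y = latitude.
        points.append(("latlon", float(match["lon"]), float(match["lat"])))
    return points


def points_to_csv(points: list) -> str:
    """Serialize the points as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["type", "x", "y"])
    writer.writerows(points)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = "V1: E 512345.67 m N 7412345.89 m\nSede: -23.550520, -46.633308"
    print(points_to_csv(extract_points(sample)))
```

A CSV with `x` and `y` columns like this can be loaded in QGIS as a delimited text layer, provided the matching coordinate reference system is selected on import.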

The program interface can be used like a notepad to correct any errors or wrong characters introduced by the OCR scan. It also generates a new PDF file with the page tilt and rotation corrected.
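For reference, OCRmyPDF (listed in the installation requirements) can produce a deskewed, correctly rotated PDF through its `--deskew` and `--rotate-pages` options. The sketch below shows one way to call it from Python. Whether Agropontos Regex invokes OCRmyPDF with exactly these options is an assumption, and the function names and file names are hypothetical.

```python
import subprocess


def build_ocrmypdf_command(source: str, target: str) -> list:
    """Build an OCRmyPDF command line that fixes page tilt and rotation."""
    return ["ocrmypdf", "--deskew", "--rotate-pages", source, target]


def straighten_pdf(source: str, target: str) -> None:
    """Run OCRmyPDF; raises CalledProcessError if the tool reports a failure."""
    subprocess.run(build_ocrmypdf_command(source, target), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_ocrmypdf_command("scan.pdf", "scan_fixed.pdf")))
```

Building the command as a list, rather than a single string, avoids quoting problems with paths that contain spaces.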

Screenshot

Installation

The following packages are required on Windows. I recommend using the Chocolatey package manager to install some of them (run the commands in an Administrator command prompt):

  • Python 3.8 (64-bit) or later
    • choco install python3
  • Tesseract 4.1.1 (64-bit) or later
  • Ghostscript 9.50 (64-bit) or later
    • choco install ghostscript
  • OCRmyPDF 14.2.0 (64-bit) or later
    • pip install ocrmypdf
  • pypdf 3.9.0 or later
    • pip install pypdf
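After installing, a quick check like the one below can confirm that the tools are reachable. This is only a sketch: the executable names (`tesseract`, and `gswin64c` for 64-bit Ghostscript on Windows) are assumptions about a typical setup, and `check_prerequisites` is a hypothetical helper, not part of the program.

```python
import importlib.util
import shutil

# Assumed executable names on a typical 64-bit Windows install.
REQUIRED_COMMANDS = ("tesseract", "gswin64c")
# Python packages installed with pip.
REQUIRED_MODULES = ("ocrmypdf", "pypdf")


def check_prerequisites() -> dict:
    """Map each requirement name to True if it was found, False otherwise."""
    status = {}
    for command in REQUIRED_COMMANDS:
        # shutil.which returns None when the executable is not on PATH.
        status[command] = shutil.which(command) is not None
    for module in REQUIRED_MODULES:
        # find_spec returns None when the top-level module is not installed.
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'OK' if found else 'missing'}")
```

If an executable is reported as missing even though it is installed, its folder is probably not on the `PATH` environment variable.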