A curated list of awesome projects to simplify and improve paper scanning.
Sponsored by:
Perspec - Desktop app to correct the perspective of images.
🌐 Get Perspec
🖥️ github.com/ad-si/Perspec
- DIY Book Scanner - Community of people who build book scanners.
- Docutain - SDK for document & barcode scanning and data capturing.
- Eagle Doc - Invoice and receipt recognition as a service.
- Book Scanning - Book scanner software for home-made scanners (no license).
- BookDrive Editor Pro - Software for post-processing images of books (commercial).
- Voussoir - Single-camera solution for book scanning (open source).
- Booksorber - Processes camera images of book pages (commercial).
- Decapod - Web application frontend for image processing and capture tools.
- DxO Viewpoint - Correct perspective distortions in images (commercial).
- Easy Scan - Scanning software for book2net scanners (commercial).
- LIMB - Project inventory, image processing, quality control, OCR, document structuring and multiple format exporting for long-term archiving (commercial).
- Nidaba - Expandable and scalable OCR pipeline.
- OpenCV-Document-Scanner - Interactive document scanner built with Python and OpenCV.
- Page Improver - Automatic image enhancing software for page scanning.
- Perspec - Manually correct the perspective of images.
- Readiris 17 - OCR software to digitize papers, images, or PDF files.
- ScanGate LWF - Stand-alone software for book digitization (commercial).
- ScanTailor - Interactive post-processing tool for scanned pages (open source).
- ScanTailor Advanced - Merges features of forks, adds new features, and includes fixes.
- Skarynka - Software to scan and process images to build books.
- YASW - Yet Another Scan Wizard (open source).
- scanner - Document scanner for the web built in Rust.
- Plumb-Bob - Perspective rectifier (macOS app).
- Prizmo - Turn photos into scans by adjusting perspective, cropping, etc. (macOS app).
- CamScanner - Scan any kind of document.
- Doc OCR - PDF scanner with document image dewarping.
- Doc Scan - Turn your iPhone / iPad into a portable scanner and PDF editor.
- Genius Scan - A scanner in your pocket.
- IRIScan - Scan documents with your iPhone or iPad. Trims, enhances and makes pictures of whiteboards and docs readable.
- Quick Scan - Scan, Recognize, Automate.
- Scanbot - High-quality scans with one tap.
- Scannable - Scan contracts, receipts and business cards.
- Scanner Pro - Scan paper documents into PDFs.
- vFlat - Capture documents, forms, receipts, books and convert them into high-quality PDFs.
- Adobe Scan - Scan, OCR, and edit documents (account required).
- Google Drive - Use the camera to scan documents (does not support loading existing photos).
- Microsoft Lens - Trims, enhances, and makes photos of whiteboards and documents readable.
- Open Camera - Extensive open source camera app.
- PDF-Doc-Scan - Open source Android PDF document scanning app.
- Stack - PDF scanner, document organizer, and detail finder by Google's Area 120.
- Dewarping pages
- Document scanner - How to Build a Kick-Ass Mobile Document Scanner in Just 5 Minutes.
- Genetic programming in the cloud
- Keypoint Detection with Transfer Learning
- math.stackexchange - Compute ratio of a rectangle seen from an unknown perspective.
- Noteshrink - Compressing and enhancing hand-written notes.
- Page dewarping - Flattening images of curled pages.
- Perspective transform - 4 Point OpenCV getPerspective Transform Example.
- Stack Overflow - Proportions of a perspective-deformed rectangle.
- Unpaper - Post-processing tool for scanned sheets of paper.
- pdfsandwich - CLI tool using OCR to add text to image PDFs.
- rbgg - Remove background from images of paper.
- Unprojecting text with ellipses - Using transformed ellipses to estimate perspective transformations of text.
- Document-Dewarping with Control-Points - Dewarping of document images using control points.
- Building an image processing pipeline with Python
- Methods To Sense The 3D Surface/Structure Of A Book
- Robust Reading Competition - Detection and recognition challenges for text in scene images.
- What are the most common algorithms for adaptive thresholding?
- CornerAPI - Detect torn corners and edges in document images.
- doc2text - Bulk detect text blocks and OCR scanned PDFs.
- Empty_training - Train neural network to detect empty pages in document images.
- EmptyAPI - Detect empty pages in document images.
- FaultyImageAPI - Combines CornerAPI, EmptyAPI, PostitAPI, and WritingtypeAPI.
- imgwarp-js - Warp images using JavaScript.
- Laser Book Scanning - Experimental methods for dewarping document images based on the use of lasers.
- LCNN - End-to-End Wireframe Parsing.
- Pixelnetica - Document Scanning SDK for business apps.
- PostitAPI - Detect post-it/sticky notes in document images.
- PyThreshold - Implementations of state-of-the-art image thresholding algorithms.
- Segment Anything - AI model that can cut out any object in any image.
- Table_segmentation - Segment table structures and detect text content in document images.
- Train_document_classification - Train a neural network to classify input documents based on the type/format.
- Train_fault_detection - Train a neural network to detect faults (e.g. folded corners, sticky notes, …) in document images.
- Train_writing_type - Train a neural network to classify document images by writing type (handwritten, typewritten).
- WeScan - Library to add scanning functionalities to an iOS app.
- WritingtypeAPI - Classify document based on the writing type (handwritten, typewritten).
- Docspell - Document management system for private and small business use.
- Hermes - Open source document management system by HashiCorp.
- Mayan EDMS - Libre document management system.
- OpenPaper.work - Scan and import personal documents.
- Paperless NGX - Scan, index, and archive paper documents.
- Papermerge - Open source document management system for digital archives.
- Polar - Knowledge manager for web pages, textbooks, and PDFs.
- TagSpaces - Offline & open source document manager with tagging support.
- Teedy - Lightweight document management system for individuals & businesses.
- Whiteboard scanning - Whiteboard scanning and image enhancement by Zhengyou Zhang and Li-Wei He.
- Cam params - Determining camera parameters from the perspective projection of a rectangle by Robert M. Haralick. (PDF)
- Dewarping of document images - Two-Step Dewarping of Camera Document Images by N. Stamatopoulos, B. Gatos, I. Pratikakis & S. J. Perantonis.
- Dewarping of Document Images using Coupled-Snakes
- DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks
- Image and Depth from a Conventional Camera with a Coded Aperture
- The IUPR Dataset of Camera-Captured Document Images
- Image processing via level set curvature flow
- OCR Datasets - Collection of OCR-related datasets.
- Book Flipping Scanning
- BFS-Auto: High Speed & High Definition Book Scanner
- Real-time 3D Page Tracking and Book Status Recognition
- High-speed and High-definition Document Digitalization System based on Adaptive Scanning using Real-time 3D Sensing
- Automatic page turner machine for Book Flipping Scanning
- Document Digitization and its Quality Improvement using a Multi-camera Array
- Digitization of Deformed Documents using a High-speed Multi-camera Array
- BFS-Solo: High Speed Book Digitization using Monocular Video
- Reconstruction of 3D Surface and Restoration of Flat document Image from Monocular Image Sequence
- High-accuracy rectification technique of deformed document image using Tiled Rectangle Fragments (TRFs)
- Document Image Rectification using Advance Knowledge of 3D Deformation
- Estimation of Non-rigid Surface Deformation using Developable Surface Model
- Proof-of-concept prototype for Book Flipping Scanning
- Archivist - V-shaped, platen-based book scanner (open source).
- book2net - Book scanners for libraries and archives (commercial).
- Linear Book Scanner - Low-cost page-turning book scanner (open source).
- Portable Scanners - Several portable scanning devices (commercial).