GitHub - anujhsrsaini/Aadhar-OCR: Aadhar OCR using Python and Tesseract : Fetch name, gender, date of birth, aadhaar number, address from Aadhaar card using OCR

Aadhaar OCR using Tesseract

About The Project

This project is a Python-based tool that extracts and digitizes text from Aadhaar cards, the unique identification cards issued by the Government of India. It aims to automate data extraction from Aadhaar cards, making it easier to integrate that data into applications, databases, and other systems.

Disclaimer: The Aadhaar samples used were found through a Google search.

Features

Aadhaar Card Text Extraction: The project uses OCR to extract text from Aadhaar cards, including the Aadhaar number, holder's name, date of birth, and address. The OCR is powered by Tesseract, an open-source OCR engine known for its accuracy and versatility in text recognition.
Customization: Users can customize the OCR process to accommodate variations in Aadhaar card formats and designs.
Open Source: This project is open source and can be freely used, modified, and extended by the community.
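
As an illustration of the text-extraction feature, the sketch below pulls a few fields out of raw OCR text with regular expressions. It is not the repository's actual parsing code: the function name parse_front_text, the patterns, and the sample text are assumptions made for this example, and real cards may need different patterns.

```python
import re

# Hypothetical patterns for illustration; the project's own parsing may differ.
# An Aadhaar number is 12 digits, usually printed as three groups of four.
AADHAAR_RE = re.compile(r"\b(\d{4})[ ]?(\d{4})[ ]?(\d{4})\b")
DOB_RE = re.compile(r"\b(\d{2}/\d{2}/\d{4})\b")
GENDER_RE = re.compile(r"\b(male|female|transgender)\b", re.IGNORECASE)


def parse_front_text(text):
    """Extract Aadhaar number, date of birth, and gender from raw OCR text.

    Fields that cannot be found are returned as None.
    """
    number = AADHAAR_RE.search(text)
    dob = DOB_RE.search(text)
    gender = GENDER_RE.search(text)
    return {
        "aadhaar_number": "".join(number.groups()) if number else None,
        "date_of_birth": dob.group(1) if dob else None,
        "gender": gender.group(1).capitalize() if gender else None,
    }


# Made-up sample text standing in for Tesseract output.
sample = "Government of India\nDOB: 01/01/1990\nFemale\n1234 5678 9012"
print(parse_front_text(sample))
```

Keeping the parsing separate from the OCR call makes it possible to adjust the patterns for a different card layout without touching the image-handling code.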

Getting Started

Prerequisites

Python 3.7.9: Make sure you have Python 3.7.9 installed on your system. You can download and install Python from the official Python website.
Git
Tesseract OCR: Tesseract is used for text extraction. Install Tesseract for your operating system by following the instructions on the Tesseract GitHub repository.
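
One way to confirm that the command-line prerequisites are reachable before continuing is a small check like the one below. This is only a sketch: the helper name find_missing_tools is hypothetical, and it assumes that Git and Tesseract are expected on your PATH under the executable names git and tesseract. If Tesseract is installed elsewhere, the check reports it as missing even though it can still be used by giving its full path.

```python
import shutil
import sys


def find_missing_tools(tools=("git", "tesseract")):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Report the interpreter version and any tools that could not be located.
print("Python version:", sys.version.split()[0])
missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("git and tesseract were both found on PATH.")
```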

Installation

Clone the repo

git clone https://github.com/anujhsrsaini/Aadhar-OCR.git

It is recommended that you set up a completely new Python 3.7.9 environment for this project, as the library versions in requirements.txt may conflict with your existing installations. Create the environment, activate it, and install the required libraries into it using the commands below.

python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
pip install -r requirements.txt

Edit main.py to set your own Tesseract path and the paths to the front and back images of the Aadhaar card you want to process. Depending on the format of your Aadhaar card, you may need to make slight changes to the back-side image handling, as described in the comments in the code.
Now you can run the code, and it will print the extracted information.
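
The kind of change described above can be sketched as follows. This is not the contents of main.py: it assumes the pytesseract wrapper and Pillow are the libraries in use, the helper ocr_image is hypothetical, and the Tesseract path shown is only a typical Windows install location that you should replace with your own.

```python
from pathlib import Path

# Assumed values for illustration; replace them with the paths on your machine.
TESSERACT_CMD = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
FRONT_IMAGE = "Front_Sample.jpg"
BACK_IMAGE = "Back_Sample.jpg"


def ocr_image(image_path, tesseract_cmd=None):
    """Return the raw text Tesseract reads from a single image file."""
    if not Path(image_path).is_file():
        raise FileNotFoundError(f"Image not found: {image_path}")

    # Imported here so the path check above works even before the
    # OCR dependencies are installed.
    import pytesseract
    from PIL import Image

    if tesseract_cmd:
        # Tell pytesseract where the Tesseract executable lives.
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)
```

With the paths set, calling ocr_image(FRONT_IMAGE, TESSERACT_CMD) and ocr_image(BACK_IMAGE, TESSERACT_CMD) returns the raw text for each side, which can then be parsed into individual fields.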

Authors

Anuj Saini
