http://caaca1b72e90.ngrok.io/OCR712/ (link is down) [Please use the Choose File button; hosting was causing issues with drag and drop :)]
This project is aimed at helping the Global Parli Foundation in their mission to improve rural India through a replicable model of Rural Rejuvenation.
To achieve this, we aim to build a hosted OCR web service that converts the 7/12 extract (“Saath Baara Utara”) into an editable Excel file.
Winning project of the Code For Change Hackathon
- Upload a single Saath Baara Utara PDF, or several at once, using drag and drop or browse.
- Google OCR converts each of these files to a text document in Devanagari script.
- Using pandas and Python, we extract useful information from the converted text: we pull variables out of the text documents and turn them into columns of an Excel file, for easy readability and comparison between multiple 7/12 extracts.
- The result is the final Excel file.
- In our opinion, the biggest advantage we provide is converting multiple files into a single Excel file, as this will help the NGO compare thousands of 7/12 extracts at a glance using Excel's functionality.
- Our next advantage is a fast and reliable OCR service, with an error rate of only 1.2%.
- No one in the market currently provides all these functionalities bundled together.
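The extraction step described above (OCR text in, Excel columns out) could be sketched roughly as follows. This is only an illustration, not the project's actual code: the label-to-column mapping `FIELD_LABELS`, the helper names `extract_fields`, `build_records` and `write_excel`, and the assumption that each field appears as `<label> : <value>` on its own line are all hypothetical.

```python
import re

# Hypothetical mapping from Devanagari labels to Excel column names.
# The labels the project really extracts are not listed in this README.
FIELD_LABELS = {
    "गाव": "village",
    "तालुका": "taluka",
    "जिल्हा": "district",
}


def extract_fields(ocr_text):
    """Return {column name: value or None} for one OCR'd document."""
    record = {}
    for label, column in FIELD_LABELS.items():
        # Assumes the field is written as "<label> : <value>" on one line.
        pattern = rf"{re.escape(label)}[ \t]*[:\-]+[ \t]*([^\n]+)"
        match = re.search(pattern, ocr_text)
        record[column] = match.group(1).strip() if match else None
    return record


def build_records(ocr_texts):
    """ocr_texts maps a source file name to its OCR text; one row per file."""
    records = []
    for name, text in ocr_texts.items():
        record = {"source_file": name}
        record.update(extract_fields(text))
        records.append(record)
    return records


def write_excel(records, path):
    """Write one row per document to an Excel file."""
    # pandas is imported here so the parsing helpers above work without it;
    # writing .xlsx also needs an engine such as openpyxl to be installed.
    import pandas as pd

    pd.DataFrame(records).to_excel(path, index=False)
```

With OCR output collected in a dict, `write_excel(build_records(texts), "extracts.xlsx")` would produce one row per uploaded file, which is what makes side-by-side comparison in Excel possible.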
Clone the repo
pip install -r requirements.txt
Refer to requirements_NOTE.txt
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
The app then runs at
http://127.0.0.1:8000/OCR712/