This project uses OCR (Optical Character Recognition), regular-expression search, and web scraping to analyze sample First Information Reports (FIRs) and extract key information. The goal is to provide insights into crime types, crime details, and the related legal sections.
Note: The project is still under development. Some components are already in place; others are still being built.
The project involves the following key components:
- OCR and Text Extraction: Utilizes the EasyOCR library for OCR to extract text and bounding boxes from images of FIRs.
- Crime Information Extraction: Parses the OCR results to identify the crime type and details. Draws bounding boxes around relevant information on the image.
- PDF Search: Converts PDFs containing Indian crime laws into images and performs keyword searches using regular expressions. Identifies matches and extracts related legal section numbers.
- IndiaCode Integration: Searches on the IndiaCode platform to find additional information related to the identified crime type.
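The OCR and crime-information components can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the names `FIELD_PATTERNS`, `run_ocr`, and `extract_crime_info` are hypothetical, and the field labels ("Crime Type", "Details") are assumptions about how a sample FIR might be worded. The only EasyOCR calls used are `easyocr.Reader` and `Reader.readtext`, which returns `(bounding_box, text, confidence)` tuples.

```python
import re

# Hypothetical field labels; a real FIR layout may use different wording.
FIELD_PATTERNS = {
    "crime_type": re.compile(
        r"(?:type of crime|crime type|offence)\s*[:\-]\s*(.+)", re.IGNORECASE
    ),
    "crime_details": re.compile(
        r"(?:details|description)\s*[:\-]\s*(.+)", re.IGNORECASE
    ),
}


def run_ocr(image_path, languages=("en",)):
    """Run EasyOCR on an image and return (bbox, text, confidence) tuples."""
    import easyocr  # imported lazily so the parsing helper works without it

    reader = easyocr.Reader(list(languages))
    return reader.readtext(image_path)


def extract_crime_info(ocr_results, min_confidence=0.3):
    """Pick out labelled fields from OCR results, keeping their bounding boxes."""
    found = {}
    for bbox, text, confidence in ocr_results:
        if confidence < min_confidence:
            continue  # skip low-confidence OCR fragments
        for field, pattern in FIELD_PATTERNS.items():
            match = pattern.search(text)
            if match and field not in found:
                found[field] = {"value": match.group(1).strip(), "bbox": bbox}
    return found
```

Calling `extract_crime_info(run_ocr("SampleFIR.png"))` would then give a dictionary whose `bbox` entries can be passed to a plotting library to draw the boxes on the image.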
The analysis runs in the following steps:
- Image Processing: Provide the path to the image of the FIR (SampleFIR.png) to initiate the analysis.
- Visualization: The image is displayed with bounding boxes around the identified crime type and details.
- PDF Search: If the crime information is complete, the project searches a PDF file (criminal_laws.pdf) for matches on the crime type and details, highlights them, and reports the corresponding section numbers.
- IndiaCode Integration: Additional information related to the crime type is fetched from the IndiaCode platform.
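The PDF search step could look something like the sketch below. It assumes the pages of criminal_laws.pdf have already been turned into text, for example by converting them to images with `pdf2image.convert_from_path` and running OCR on each image. The function name `find_sections` and the `Section <number>` pattern are illustrative assumptions, not the project's confirmed implementation.

```python
import re

# Assumed citation format: "Section 378", "Section 420A", and so on.
SECTION_RE = re.compile(r"\bSection\s+(\d+[A-Z]?)\b", re.IGNORECASE)


def find_sections(pages, keyword):
    """Return (page_number, section) pairs for every page that mentions keyword.

    pages is a list of strings, one per PDF page, in page order.
    """
    keyword_re = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for page_number, text in enumerate(pages, start=1):
        if keyword_re.search(text):
            # Collect every section number cited on a matching page.
            for section in SECTION_RE.findall(text):
                hits.append((page_number, section))
    return hits
```

Matching at page granularity is a deliberately coarse choice: it reports every section cited on a page that mentions the keyword, so results may need narrowing to the sentence or paragraph level.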
Notes on the current state of the project:
- Sample Data: The project currently uses a sample FIR image (SampleFIR.png) and a document of crime laws (criminal_laws.pdf) for analysis.
- Section Identification: In further development, the project aims to identify the specific legal sections related to the identified crime types.
- Web Scraping: The project scrapes the IndiaCode site to enhance the information retrieval process.
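As a rough illustration of the scraping step, the sketch below fetches a page and collects links whose text mentions the crime type. Several things here are assumptions: the helper names `fetch`, `LinkCollector`, and `find_links` are hypothetical, the page URL is left to the caller because the exact IndiaCode search endpoint is not specified here, and link extraction uses the standard library's `html.parser` in place of BeautifulSoup so the example runs without extra dependencies.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


def fetch(url):
    """Download a page and return its HTML (requires the requests package)."""
    import requests  # imported lazily so the parser below works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


class LinkCollector(HTMLParser):
    """Collect (title, absolute_url) pairs for links whose text contains keyword."""

    def __init__(self, base_url, keyword):
        super().__init__()
        self.base_url = base_url
        self.keyword = keyword.lower()
        self.links = []
        self._href = None  # href of the <a> tag currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())
            if self.keyword in title.lower():
                self.links.append((title, urljoin(self.base_url, self._href)))
            self._href = None


def find_links(html, base_url, keyword):
    """Parse html and return links whose visible text mentions keyword."""
    collector = LinkCollector(base_url, keyword)
    collector.feed(html)
    return collector.links
```

A caller would pass the HTML returned by `fetch` to `find_links`, together with the site's base URL and the crime type identified from the FIR.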
Make sure the following Python libraries are available (re ships with the Python standard library; the others need to be installed):
- re
- easyocr
- requests
- bs4 (BeautifulSoup)
- pdf2image
- matplotlib
- PIL (Pillow)
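A quick way to check which of these are missing is sketched below. The `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced for illustration; the mapping pairs each import name with the package name usually passed to pip.

```python
from importlib.util import find_spec

# Import name -> pip package name (re is omitted: it is in the standard library).
REQUIRED = {
    "easyocr": "easyocr",
    "requests": "requests",
    "bs4": "beautifulsoup4",
    "pdf2image": "pdf2image",
    "matplotlib": "matplotlib",
    "PIL": "Pillow",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of required modules that cannot be imported."""
    return [
        pip_name
        for module, pip_name in required.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All required libraries are available.")
```

Note that pdf2image also relies on the Poppler utilities being installed on the system, which this check does not cover.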
The main page (index.html) allows users to upload an image of an FIR for analysis. The result page (result.html) displays crime details and relevant laws extracted from the analysis.
Please note that the project is still in progress and incomplete; functionality will improve as the remaining components are finished.
This web application is developed by Team Adhunikta, consisting of Om Dabral and Saksham Srivastava.
Feel free to explore the web application and contribute to its further development!