Developed an OCR Image-to-Text application using Python and Streamlit, focusing on accurate text extraction and image preprocessing. Enhanced reliability and performance, enabling seamless conversion of diverse image formats into editable text.

Rayyan9477/OCR-Image-to-text

Intelligent OCR and Text Analysis Tool

Description

This application runs Optical Character Recognition (OCR) on uploaded images, reads the text layer of uploaded PDFs, and provides a question-answering interface over the extracted text. It uses pre-trained machine learning models and a Retrieval-Augmented Generation (RAG) module so that users can query their documents interactively.

Techniques and Tools Used

  • Streamlit: For building the interactive web application.
  • PyPDF2: To read and extract text from PDF files.
  • Pillow (PIL): For image processing and manipulation.
  • OCR Module: Custom module (ocr_module.py) for performing OCR on images.
  • RAG Module: Custom module (rag_module.py) implementing Retrieval-Augmented Generation for processing queries.
  • Transformers: HuggingFace library for loading pre-trained models.
  • SentenceTransformers: For generating sentence embeddings.
  • PyTorch: Deep learning framework underpinning the ML models.

Features

  • Upload Images or PDFs: Accepts multiple image formats and PDFs for text extraction.
  • Perform OCR: Extracts text from images using the perform_ocr function.
  • Text Analysis: Enables users to ask questions about the extracted text using the process_query function.
  • Custom Styling: Utilizes custom CSS and JavaScript for an enhanced UI/UX.

Code Snippets

Loading Custom CSS

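import streamlit as st

def load_css():
    # Inject the stylesheet into the page as an inline <style> block.
    with open('static/styles.css', encoding='utf-8') as f:
        st.markdown(f'<style>{f.read()}</style>', unsafe_allow_html=True)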

Handling File Uploads

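import PyPDF2
from PIL import Image
from ocr_module import perform_ocr  # assumed to be defined in ocr_module.py

def handle_file_upload(uploaded_file):
    if uploaded_file.type == "application/pdf":
        # Read the PDF's embedded text layer page by page (no OCR here).
        pdf_reader = PyPDF2.PdfReader(uploaded_file)
        text = ""
        for page in pdf_reader.pages:
            # Guard against pages that yield no text.
            text += page.extract_text() or ""
    else:
        # Any other upload is treated as an image and sent through OCR.
        image = Image.open(uploaded_file)
        text = perform_ocr(image)
    return text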
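
The function above sends every non-PDF upload to the image branch. A minimal sketch of stricter routing follows, assuming the upload exposes its MIME type as a string in `.type`, as the snippet above does. The names `SUPPORTED_IMAGE_TYPES` and `route_upload`, and the particular set of image types, are hypothetical and not taken from the repository.

```python
# Hypothetical helper: decides which extractor an upload should go to.
# The accepted image MIME types below are an assumption for illustration.
SUPPORTED_IMAGE_TYPES = {"image/png", "image/jpeg", "image/bmp", "image/tiff"}


def route_upload(mime_type: str) -> str:
    """Return "pdf" or "image", or raise ValueError for anything else."""
    normalized = (mime_type or "").strip().lower()
    if normalized == "application/pdf":
        return "pdf"
    if normalized in SUPPORTED_IMAGE_TYPES:
        return "image"
    raise ValueError(f"Unsupported file type: {mime_type!r}")
```

Calling `route_upload(uploaded_file.type)` before opening the file would let the app show a clear error for unsupported uploads instead of failing inside `Image.open`.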

Processing User Queries

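from rag_module import process_query  # assumed to be defined in rag_module.py

def get_answer(query, context):
    # Delegate to the RAG module; context is the text extracted from the upload.
    return process_query(query, context)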
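
The internals of process_query are not shown in this README. As a rough illustration of the retrieval half of Retrieval-Augmented Generation, the sketch below splits the extracted text into chunks and ranks them against the query. It deliberately swaps sentence embeddings for a plain bag-of-words cosine similarity so that it runs on the standard library alone. The names `chunk_text` and `top_chunks` and the default chunk size are hypothetical, not the contents of rag_module.py.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def _vectorize(text: str) -> Counter:
    # Lowercased word counts stand in for a sentence embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(query: str, context: str, k: int = 2, max_words: int = 50) -> list[str]:
    """Return the k chunks of context most similar to the query."""
    query_vec = _vectorize(query)
    chunks = chunk_text(context, max_words)
    ranked = sorted(chunks, key=lambda c: _cosine(query_vec, _vectorize(c)), reverse=True)
    return ranked[:k]
```

In a full pipeline, the selected chunks rather than the whole document would be passed to the language model as context for generating the answer.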

Contact

For inquiries or feedback:
