This Django project lets users upload PDF files, extract their text and images, compress the images, and download them as a ZIP file. It uses PyMuPDF (imported as fitz) for PDF processing, Pillow for image compression, and Django for the web interface.
Users can upload PDF files through the web interface.
Text and images are extracted from the uploaded PDFs.
Extracted images are compressed to reduce file size without significant loss in quality.
Users can download all extracted images as a single ZIP file.
Clone this repository:
    git clone https://github.com/kksain/Pdf_Text_Extractor.git
    cd Pdf_Text_Extractor
Install the dependencies:
    pip install -r requirements.txt
Apply the migrations and start the development server:
    python manage.py migrate
    python manage.py runserver
Open your browser and go to http://127.0.0.1:8000/ to access the upload page.
Users can upload a PDF file on the main page.
Upon successful upload, the application extracts text and images from the PDF.
Extracted images are compressed to less than 200 KB each with minimal visible loss of quality.
Users can download all extracted and compressed images in a single ZIP file.
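The compression and ZIP steps described above could look roughly like the following sketch. It assumes images are re-encoded as JPEG, that quality is lowered in steps before the image is downscaled, and that the limit is 200 * 1024 bytes; the names compress_image, build_zip, and MAX_BYTES are hypothetical, and the project's own strategy may differ.

```python
import io
import zipfile

from PIL import Image

MAX_BYTES = 200 * 1024  # assumed meaning of "200KB"


def _encode_jpeg(img, quality):
    """Encode a Pillow image as JPEG and return the bytes."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality, optimize=True)
    return buf.getvalue()


def compress_image(raw_bytes, max_bytes=MAX_BYTES):
    """Re-encode an image as JPEG below max_bytes (illustrative strategy)."""
    # JPEG has no alpha channel, so transparency is dropped here.
    img = Image.open(io.BytesIO(raw_bytes)).convert("RGB")
    quality = 85
    data = _encode_jpeg(img, quality)
    # Step 1: lower the JPEG quality in steps of 10, down to 25.
    while len(data) >= max_bytes and quality > 25:
        quality -= 10
        data = _encode_jpeg(img, quality)
    # Step 2: if still too large, shrink the dimensions by 25% per pass.
    while len(data) >= max_bytes and min(img.size) > 64:
        img = img.resize((img.width * 3 // 4, img.height * 3 // 4))
        data = _encode_jpeg(img, quality)
    return data


def build_zip(images):
    """Compress (filename, raw_bytes) pairs and pack them into one ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, raw in images:
            stem = name.rsplit(".", 1)[0]
            archive.writestr(f"{stem}.jpg", compress_image(raw))
    return buf.getvalue()
```

In a Django view, the bytes returned by build_zip could be sent back with HttpResponse(zip_bytes, content_type="application/zip") plus a Content-Disposition header so the browser offers the archive as a download.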