
Split PDF files by size, by page, and extract email addresses


marioszocs/pdf-splitter


PDF Splitter

PDF Splitter is a Java desktop application that splits PDF files by size or by page count and extracts email addresses from PDF documents. It uses the PDFBox and iTextPDF libraries for these operations.


Features

  • Split PDFs by size: Break large PDF files into smaller chunks of a specified size.
  • Split PDFs by pages: Divide a PDF into multiple parts after a given number of pages.
  • Extract email addresses: Retrieve and save all email addresses found in a PDF document to a .txt file.
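
As a rough illustration of the email extraction feature, the sketch below finds addresses in text that has already been pulled out of a PDF (for instance with PDFBox's `PDFTextStripper`). The class name `EmailExtractor`, the method `extract`, and the regular expression are assumptions made for this example; the pattern the application actually uses is not documented here.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project's documented API.
final class EmailExtractor {

    // Deliberately simple pattern (an assumption for this sketch);
    // it does not cover every address form permitted by RFC 5322.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    /** Returns the distinct addresses found in the text, in order of first appearance. */
    static Set<String> extract(String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }
}
```

Writing the result to a .txt file could then be a single call such as `java.nio.file.Files.write(path, EmailExtractor.extract(text))`, which writes one address per line.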

How It Works

Main Operations:

  1. Split PDF After Specific Pages:

    • Select the number of pages after which the PDF should be split.
    • The resulting PDFs will be saved in the output folder.
  2. Split PDF by Specific Size:

    • Specify the maximum allowable size for each split PDF in kilobytes.
    • The application will create multiple PDFs, ensuring each part adheres to the size limit.
  3. Extract Email Addresses:

    • The application scans the text within PDF files for valid email addresses.
    • Extracted emails are saved in a .txt file for easy access.
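
The two splitting operations in steps 1 and 2 both come down to deciding which pages go into which output file. The sketch below shows only that planning step, without any PDF library calls. `SplitPlanner`, `byPageCount`, and `bySize` are hypothetical names, and the per-page sizes passed to `bySize` are assumed to be estimates supplied by the caller. The size of a real PDF is not simply the sum of its pages, because fonts and images can be shared, so the application may measure size differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planning helper; ranges are 1-based and inclusive: {start, end}.
final class SplitPlanner {

    /** Cuts pages 1..totalPages into consecutive parts of at most pagesPerPart pages. */
    static List<int[]> byPageCount(int totalPages, int pagesPerPart) {
        if (totalPages < 0 || pagesPerPart < 1) {
            throw new IllegalArgumentException("totalPages must be >= 0 and pagesPerPart >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        // long loop variable avoids int overflow for very large pagesPerPart values
        for (long start = 1; start <= totalPages; start += pagesPerPart) {
            long end = Math.min(start + pagesPerPart - 1, totalPages);
            ranges.add(new int[] {(int) start, (int) end});
        }
        return ranges;
    }

    /**
     * Greedy grouping by estimated size: pages are added to the current part
     * until the next page would push it over maxPartKb. A single page larger
     * than the limit still gets a part of its own.
     */
    static List<int[]> bySize(long[] pageSizesKb, long maxPartKb) {
        if (maxPartKb < 1) {
            throw new IllegalArgumentException("maxPartKb must be >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        int start = 1;
        long currentKb = 0;
        for (int i = 0; i < pageSizesKb.length; i++) {
            int page = i + 1;
            boolean partHasPages = page > start;
            if (partHasPages && currentKb + pageSizesKb[i] > maxPartKb) {
                ranges.add(new int[] {start, page - 1}); // close the current part
                start = page;
                currentKb = 0;
            }
            currentKb += pageSizesKb[i];
        }
        if (pageSizesKb.length > 0) {
            ranges.add(new int[] {start, pageSizesKb.length}); // final part
        }
        return ranges;
    }
}
```

For a 10-page document, `SplitPlanner.byPageCount(10, 4)` yields the ranges 1-4, 5-8, and 9-10. Each range could then be handed to whichever PDF library performs the actual page copy.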

Requirements

  • Java 8 or higher.
  • Maven for dependency management.

Installation

  1. Clone the repository:
    git clone https://github.com/marioszocs/pdf-splitter.git
    cd pdf-splitter
  2. Build the project:
    mvn clean install
  3. Run the application:
    java -cp target/pdfsplitting-0.0.1-SNAPSHOT.jar com.pdfsplitting.Main

Example Screenshots

Split PDF by Size

[Screenshot: PDF Split by Size]

Input Selection

[Screenshot: Input Example]

Output Example

[Screenshot: Output Example]


Libraries Used

  • PDFBox
  • iTextPDF

Project Structure

marioszocs-pdf-splitter/
├── pom.xml                  # Maven build configuration
├── README.md                # Project documentation
├── src/main/java/com/pdfsplitting/
│   ├── Main.java            # Entry point of the application
│   ├── PDFFileOperations.java # Interface for PDF operations
│   ├── PDFFileOperationsImp.java # Implementation of PDF operations
│   └── PdfUtilities.java    # Utility methods for PDF handling
└── .gitignore               # Ignored files
