PDF Splitter is a Java desktop application that splits PDF files by size or by page count and extracts email addresses from PDF documents. It uses the PDFBox and iTextPDF libraries to perform these operations.
- Split PDFs by size: Break large PDF files into smaller chunks of a specified size.
- Split PDFs by pages: Divide a PDF into multiple parts after a given number of pages.
- Extract email addresses: Retrieve and save all email addresses found in a PDF document to a `.txt` file.
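The page-based split comes down to working out which pages go into each output part. The following is a minimal sketch of that arithmetic only, using the standard library. `PageRanges` and `split` are hypothetical names, not taken from the project's source, and the application itself may compute its ranges differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the project's code.
final class PageRanges {

    /** Returns 1-based, inclusive {first, last} page ranges, one per output part. */
    static List<int[]> split(int totalPages, int pagesPerPart) {
        if (totalPages < 0 || pagesPerPart < 1) {
            throw new IllegalArgumentException(
                "totalPages must be >= 0 and pagesPerPart must be >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        // long arithmetic avoids int overflow for very large arguments
        for (long first = 1; first <= totalPages; first += pagesPerPart) {
            long last = Math.min(first + pagesPerPart - 1, totalPages);
            ranges.add(new int[] {(int) first, (int) last});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 10-page document split after every 3 pages
        for (int[] range : split(10, 3)) {
            System.out.println(range[0] + "-" + range[1]);
        }
        // prints:
        // 1-3
        // 4-6
        // 7-9
        // 10-10
    }
}
```

The last part simply holds whatever pages remain, so it may be shorter than the others.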
- Split PDF After Specific Pages:
  - Select the number of pages after which the PDF should be split.
  - The resulting PDFs will be saved in the output folder.
- Split PDF by Specific Size:
  - Specify the maximum allowable size for each split PDF in kilobytes.
  - The application will create multiple PDFs, ensuring each part adheres to the size limit.
- Extract Email Addresses:
  - Scans the text within PDF files for valid email addresses.
  - Extracted emails are saved in a `.txt` file for easy access.
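As an illustration of the email-scanning step, here is a minimal sketch that assumes the PDF text has already been extracted into a `String`. `EmailExtractor` and its regular expression are assumptions made for this example. The pattern is deliberately simplified and is not necessarily the one the application uses.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the project's code.
final class EmailExtractor {

    // Simplified pattern: local part, "@", domain labels, then a TLD of 2+ letters.
    // It does not cover every address the email RFCs allow.
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    /** Returns the distinct addresses found in the text, in order of first appearance. */
    static Set<String> extract(String text) {
        Set<String> emails = new LinkedHashSet<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            emails.add(matcher.group());
        }
        return emails;
    }

    public static void main(String[] args) {
        String sample = "Contact alice@example.com or bob.smith@mail.example.org. "
            + "Again: alice@example.com";
        System.out.println(extract(sample));
        // prints: [alice@example.com, bob.smith@mail.example.org]
    }
}
```

A `LinkedHashSet` drops duplicate addresses while keeping them in the order they appear. The resulting set could then be written one address per line to the `.txt` file, for example with `java.nio.file.Files.write`.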
- Java 8 or higher.
- Maven for dependency management.
- Clone the repository:
git clone https://github.com/your-username/pdf-splitter.git
cd pdf-splitter
- Build the project:
mvn clean install
- Run the application:
java -cp target/pdfsplitting-0.0.1-SNAPSHOT.jar com.pdfsplitting.Main
- Apache PDFBox: For handling PDF documents.
- iTextPDF: For advanced PDF processing.
marioszocs-pdf-splitter/
├── pom.xml # Maven build configuration
├── README.md # Project documentation
├── src/main/java/com/pdfsplitting/
│ ├── Main.java # Entry point of the application
│ ├── PDFFileOperations.java # Interface for PDF operations
│ ├── PDFFileOperationsImp.java # Implementation of PDF operations
│ └── PdfUtilities.java # Utility methods for PDF handling
└── .gitignore # Ignored files