A Python-based GUI application for transcribing and processing images containing historical handwritten text using Large Language Models (LLMs) via API services (OpenAI, Google, and Anthropic APIs). Designed for academic and research purposes.
The repository also includes the Transcription Pearl Manual (PDF), which explains how to obtain and enter API keys and documents the application's commands and functions.
MAJOR UPDATE: 1.0 beta Release
- Includes Image Preprocessing Utility
- Includes better functionality for storing settings
- Drag and drop functionality for PDFs
- Automatic rotation of photographs captured with phone cameras
- Ability to manually rotate images
- Ability to manually delete images
- Resizable image and text windows
- Updated requirements.txt (adds opencv-python==4.10.0.84)
Previous Update: 08 November 2024
An error in the prompts file was preventing the transcribed text from being sent to the correction function: the prompt was missing the {text_to_process} placeholder. This has now been fixed.
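A missing placeholder of this kind can be caught before a prompt is ever sent. The following is a minimal sketch, not the application's actual code: the function name `fill_prompt` and the sample prompt are hypothetical, and only the `{text_to_process}` placeholder is taken from the note above.

```python
PLACEHOLDER = "{text_to_process}"  # placeholder named in the update note


def fill_prompt(template: str, text: str) -> str:
    """Insert text into a prompt template, failing loudly if the placeholder is absent."""
    if PLACEHOLDER not in template:
        raise ValueError(f"Prompt is missing the {PLACEHOLDER} placeholder")
    # str.replace is used instead of str.format so any other braces in the prompt are left alone.
    return template.replace(PLACEHOLDER, text)


if __name__ == "__main__":
    # Hypothetical correction prompt, for illustration only.
    sample = "Correct the following transcription:\n{text_to_process}"
    print(fill_prompt(sample, "Dear Sir, I recieved your letter."))
```

Raising an error rather than silently sending the unmodified prompt makes the problem visible the first time the prompt is used.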
PDFs were also being imported at 72 DPI. They are now imported at 300 DPI, which should improve readability.
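To see what the change from 72 to 300 DPI means in pixels: PDF page sizes are measured in points (1/72 inch), so rendering at a given DPI scales each dimension by DPI/72. The sketch below is only the arithmetic, and the function names are hypothetical. It does not show how the application itself renders pages.

```python
POINTS_PER_INCH = 72  # PDF page dimensions are expressed in points (1/72 inch)


def zoom_for_dpi(dpi: int) -> float:
    """Scale factor relative to the 72 DPI baseline."""
    return dpi / POINTS_PER_INCH


def rendered_size(width_pt: float, height_pt: float, dpi: int) -> tuple:
    """Pixel dimensions of a page of the given point size rendered at dpi."""
    return (round(width_pt * dpi / POINTS_PER_INCH),
            round(height_pt * dpi / POINTS_PER_INCH))


if __name__ == "__main__":
    # US Letter is 612 x 792 points.
    print(rendered_size(612, 792, 72))    # (612, 792)
    print(rendered_size(612, 792, 300))   # (2550, 3300)
```

With PyMuPDF, a zoom factor like this is commonly passed as `fitz.Matrix(zoom, zoom)` to `page.get_pixmap(matrix=...)`. Whether the application does it this way is an assumption.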
Transcription Pearl helps researchers process and transcribe image-based documents using various AI services. It provides a user-friendly interface for managing transcription projects and leverages multiple AI providers for optimal results.
- Multi-API OCR capabilities (OpenAI, Google, Anthropic)
- Batch processing of images
- Text correction and validation
- PDF import and processing
- Drag-and-drop interface
- Project management system
- Find and replace functionality
- Progress tracking
- Multiple text draft versions
- Image Pre-Processing Tool
- Python 3.8+
- Active API keys for:
  - OpenAI
  - Google Gemini
  - Anthropic Claude
- tkinter
- tkinterdnd2
- pandas
- PyMuPDF (fitz)
- pillow
- openai
- anthropic
- google.generativeai
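Before launching the application, it can help to confirm that the interpreter and packages listed above are available. The following is a minimal sketch and is not part of the repository: the function name `missing_modules` is hypothetical, and the import names `PIL` (for pillow) and `fitz` (for PyMuPDF) are the names those packages are imported under.

```python
import importlib.util
import sys

# Import names for the packages listed above (pillow imports as PIL, PyMuPDF as fitz).
REQUIRED_MODULES = [
    "tkinter", "tkinterdnd2", "pandas", "fitz",
    "PIL", "openai", "anthropic", "google.generativeai",
]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. `google`) is itself absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        print("Python 3.8 or newer is required.")
    for name in missing_modules(REQUIRED_MODULES):
        print(f"Missing module: {name}")
```

Any module reported as missing can usually be resolved by re-running the `pip install -r requirements.txt` step. The exception is `tkinter`, which ships with the Python installation itself rather than through pip.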
- Clone the repository
git clone https://github.com/mhumphries2323/Transcription_Pearl
- Install required packages
pip install -r requirements.txt
- Configure API keys in the Settings menu.
Launch the application:
python TranscriptionPearl_beta-2024111.py
Basic workflow:
- Create new project or open existing
- Import images or PDF
- Process text using AI services
- Edit and correct transcriptions
- Export processed text
The application uses several configuration files:
- util/API_Keys_and_Logins.txt: API credentials
- util/prompts.csv: AI processing prompts
- util/default_settings.txt: Application settings
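If the application misbehaves at startup, one quick check is whether these files are where they are expected. The sketch below assumes the paths listed above are relative to the repository root. The function name `missing_config_files` is hypothetical, and whether the application recreates absent files on its own is not covered here.

```python
from pathlib import Path

# Paths as listed above, assumed to be relative to the repository root.
CONFIG_FILES = [
    "util/API_Keys_and_Logins.txt",
    "util/prompts.csv",
    "util/default_settings.txt",
]


def missing_config_files(base_dir="."):
    """Return the configuration files that are not present under base_dir."""
    base = Path(base_dir)
    return [rel for rel in CONFIG_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_config_files():
        print(f"Not found: {rel}")
```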
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This means you are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made
- NonCommercial — You may not use the material for commercial purposes
If you use this software in your research, please cite:
Mark Humphries, 2024. Transcription Pearl 1.0 Beta. Department of History: Wilfrid Laurier University.
To cite the paper that explores this research, use:
Mark Humphries, Lianne C. Leddy, Quinn Downton, Meredith Legace, John McConnell, Isabella Murray, and Elizabeth Spence. Unlocking the Archives: Using Large Language Models to Transcribe Handwritten Historical Documents. Preprint: xxx.
Mark Humphries, Wilfrid Laurier University, Waterloo, Ontario
This software is provided "as is", without warranty of any kind, express or implied. The authors assume no liability for its use or any damages resulting from its use.
This project is primarily for academic and research purposes. Please contact the author for collaboration opportunities.
- OpenAI API
- Google Gemini API
- Anthropic Claude API