Usage Guide:

This is a small collection of python files that let you input a PDF, which is then split into images (one page per image), and each image is converted to text using an LLM. Afterwards, the resulting text file can be captioned for RAG purposes, or used for some other purpose.

Usage Guide:

Replace example.pdf with a pdf of your choice
Replace api_key.txt with your API key from the provider of your choice (I used mistral)
Go to example_transcribe_pdf.py and replace the URL with the API URL of the provider of choice, and the model you want to call. Run the script.
Once #3 finishes, go to example_tag_text.py, replace the URL with the API URL of the provider of choice, and the model you want to call, run this script as well.
If you left the filepaths alone, the resulting text should be present in output_text/.

TODO

Streamlit UI
Further processing of tagged text
Better prompts (the current formatting of the transcribed text is a bit weird)

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
libs		libs
output_images		output_images
output_text		output_text
README.md		README.md
api_key.txt		api_key.txt
core.py		core.py
example.pdf		example.pdf
example_tag_text.py		example_tag_text.py
example_transcribe_pdf.py		example_transcribe_pdf.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Usage Guide:

TODO

About

Releases

Packages

Contributors 2

Languages

Green0-0/LLM-PDF-Utils

Folders and files

Latest commit

History

Repository files navigation

Usage Guide:

TODO

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages