epub_filter_tool

A tool that finds the genres associated with epubs by searching Goodreads, with a simple GUI to sort and process them.

Windows Install

git clone https://github.com/secretlycarl/epub_filter_tool

cd epub_filter_tool

python -m venv venv

.\venv\Scripts\activate

pip install -r requirements.txt

python main.py

This is the basic flow of the script -

main.py

User input for the folder of .epubs to process
An LLM cleans up the filenames, and a sanitization function strips any leftover punctuation
The cleaned-up filenames are sent in HTTP requests to a Goodreads search URL
The search page is checked for results, and the book page URL and number of ratings are parsed
If there are fewer than 500 ratings, a txt file matching the original epub filename is saved with the text "unpopular" and any further processing for that book is skipped.
If no book is found, a txt file with "unknown" is saved for the book and the genre logic is skipped.
Otherwise, the script navigates to the book page to find the genres, and saves them to a text file.
Once the entire folder is processed, another folder path can be entered.
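
The per-book decision in the steps above can be sketched roughly like this. This is not the code from main.py, just a minimal illustration: the function names, the punctuation rules, and the one-genre-per-line file format are all assumptions. Only the 500-rating threshold and the "unpopular"/"unknown" markers come from the description above.

```python
import re
from pathlib import Path
from typing import Sequence

MIN_RATINGS = 500  # threshold stated in the flow above


def sanitize_title(name: str) -> str:
    """Hypothetical cleanup: turn punctuation and underscores into spaces."""
    name = re.sub(r"[^\w\s]", " ", name)   # drop punctuation
    name = name.replace("_", " ")          # \w keeps underscores, so handle them here
    return re.sub(r"\s+", " ", name).strip()


def sidecar_text(found: bool, ratings: int, genres: Sequence[str]) -> str:
    """Decide what goes into the txt file saved next to the epub."""
    if not found:
        return "unknown"
    if ratings < MIN_RATINGS:
        return "unpopular"
    # Assumption: one genre per line; the real file format may differ.
    return "\n".join(genres)


def write_sidecar(epub_path: Path, text: str) -> Path:
    """Save the text as <same name>.txt beside the epub."""
    out = epub_path.with_suffix(".txt")
    out.write_text(text, encoding="utf-8")
    return out
```

For example, `write_sidecar(Path("books/Dune.epub"), sidecar_text(True, 120, ["Fantasy"]))` would leave a `books/Dune.txt` containing "unpopular".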

Note - It is set up to process 20 books at a time. It runs ok on my beefy PC, but if you run into any performance issues or rate limiting, reduce BATCH_SIZE near the top of main.py.
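
To show what BATCH_SIZE controls, here is a minimal sketch of batched processing. It assumes a thread pool and a hypothetical `worker` function that handles one book; main.py may well do this differently (for example with asyncio), so treat it as an illustration of the idea only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")

BATCH_SIZE = 20  # lower this if you hit performance issues or rate limiting


def batches(items: Sequence[T], size: int = BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_all(items: Sequence[T], worker: Callable[[T], R],
                size: int = BATCH_SIZE) -> List[R]:
    """Run `worker` on every item, one batch at a time.

    Each batch runs concurrently; the next batch starts only after the
    previous one has finished, so at most `size` books are in flight.
    """
    results: List[R] = []
    for batch in batches(items, size):
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            results.extend(pool.map(worker, batch))  # map keeps input order
    return results
```

A smaller batch means fewer simultaneous requests to Goodreads and less load on the PC, at the cost of a longer total run.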

GUI

Sort by genre and view the books associated with the selected genre
Type-to-search filter box
Button to delete the currently filtered books
Button to move the currently filtered books to a folder named after the genre
The UI updates as books are processed. If you open a folder that has already been processed, click "Update" to load the list of genre tags.
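
The genre list and the filter box boil down to a genre-to-books lookup plus a text match. The sketch below shows one way to build that, assuming the genres for each book have already been read from its txt file into a dict; the function names are hypothetical and the actual GUI code may be organized differently.

```python
from collections import defaultdict
from typing import Dict, List, Mapping, Sequence


def build_genre_index(book_genres: Mapping[str, Sequence[str]]) -> Dict[str, List[str]]:
    """Map each genre tag to the sorted list of books carrying it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for book, genres in book_genres.items():
        for genre in genres:
            index[genre].append(book)
    # Sort genres and book names so the list is stable between refreshes.
    return {genre: sorted(books) for genre, books in sorted(index.items())}


def filter_books(books: Sequence[str], query: str) -> List[str]:
    """Case-insensitive substring match, as a type-to-search box would do."""
    needle = query.casefold().strip()
    return [book for book in books if needle in book.casefold()]
```

With this, selecting a genre is `index[genre]`, and typing in the filter box narrows that list with `filter_books`. An empty query keeps every book.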

Things to Work On

Try to implement a more lightweight LLM. The current model is ~8GB, so a graphics card with at least that much VRAM is needed.

Note

Making thousands of requests to Goodreads servers might get you rate limited/temp banned for a day or so. I can do 3k books/day without issue, but I did get blocked once in my testing with more than 5k books.
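
If you want to be gentler on the servers, one common approach is to wait longer after each failed request. The sketch below is not part of main.py. It assumes a hypothetical `fetch` callable that returns the page text, or None when the request looks rate limited; how Goodreads actually signals a block is not something this README documents.

```python
import time
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Seconds to wait after a failed attempt: base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))


def fetch_with_backoff(fetch: Callable[[], Optional[str]],
                       max_attempts: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call `fetch` until it returns something, pausing longer each time.

    `fetch` is a placeholder for whatever performs the HTTP request.
    Returns None if every attempt fails.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        sleep(backoff_delay(attempt))
    return None
```

With the defaults the waits are 2, 4, 8, 16 and 32 seconds. Passing `sleep` as a parameter makes the function easy to try out without actually waiting.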
