Fix space in workflows dir #3

Merged
merged 12 commits into from
Dec 15, 2023
43 changes: 43 additions & 0 deletions .github/workflows/docker-image.yml
@@ -0,0 +1,43 @@
name: Docker Image CI

on:
  push:
    branches: [ "release" ]
  pull_request:
    branches: [ "release" ]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-push-image:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Log in to the Container registry
        uses: docker/login-action@f054a8b539a109f9f41c372932f1ae047eff08c9
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@98669ae865ea3cffbcbaa878cf57c20bbf1c6c38
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Build and push Docker image
        uses: docker/build-push-action@ad44023a93711e3deb337508980b4b5e9bcdc5dc
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
25 changes: 25 additions & 0 deletions Dockerfile
@@ -0,0 +1,25 @@
FROM python:3-bookworm

LABEL maintainer=LasseR15
LABEL email=lasse.roth@lasse-it.de


RUN apt update
RUN apt install -y tesseract-ocr tesseract-ocr-deu ghostscript


COPY /src /app/src
COPY /requirements.txt /app/requirements.txt

WORKDIR /app

RUN pip3 install -r requirements.txt


RUN python -m playwright install-deps chromium
RUN python -m playwright install chromium

ENV PYTHONPATH=/app/src/
ENV BASE_OUTPUT_PATH=/app/output

ENTRYPOINT ["python3", "/app/src/main.py"]
156 changes: 156 additions & 0 deletions README.md
@@ -0,0 +1,156 @@
<a name="readme-top"></a>

[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![GPL-3.0 License][license-shield]][license-url]



<!-- PROJECT LOGO -->
<br />
<div align="center">
<h3 align="center">BiBox to PDF</h3>

<p align="center">
CLI tool to download a book from BiBox as an OCR PDF
<br />
<br />
<a href="https://github.com/LasseR15/bibox-to-pdf/issues">Report Bug</a>
·
<a href="https://github.com/LasseR15/bibox-to-pdf/issues">Request Feature</a>
</p>
</div>


<!-- DISCLAIMER -->
<div align="center">
<h3>DISCLAIMER: THIS PROJECT IS FOR EDUCATIONAL PURPOSES ONLY</h3>
</div>
<br />

<!-- TABLE OF CONTENTS -->
<details>
<summary>Table of Contents</summary>
<ol>
<li>
<a href="#about-the-project">About The Project</a>
</li>
<li>
<a href="#getting-started">Getting Started</a>
<ul>
<li><a href="#prerequisites-for-manual-setup">Prerequisites for manual setup</a></li>
</ul>
</li>
<li>
<a href="#usage">Usage</a>
<ul>
<li><a href="#usage-with-docker">Usage with Docker</a></li>
<li><a href="#usage-with-manual-setup">Usage with manual setup</a></li>
</ul>
</li>
<li><a href="#license">License</a></li>
</ol>
</details>



<!-- ABOUT THE PROJECT -->
## About The Project

This CLI script lets you download books from [BiBox](https://www.bibox.schule/) and run OCR on them.

You need valid login credentials as well as access to the books you want to download.


<p align="right">(<a href="#readme-top">back to top</a>)</p>



<!-- GETTING STARTED -->
## Getting Started

Currently, the script can only be run via Docker.

Instructions for running it manually will be added in the future.

<p align="right">(<a href="#readme-top">back to top</a>)</p>

<!-- USAGE EXAMPLES -->
## Usage

### Usage with Docker

To run the image, you can either use the Docker CLI directly or, as recommended, Docker Compose.

There are two variants/tags available:
1. `latest`: The latest OCR version of the Docker image (larger than the non-OCR version)
2. `latest-non-ocr`: The latest non-OCR version. This image does not support PDF OCR and is therefore smaller than the OCR version (currently not available)

#### Docker Compose

To use the OCR version of the script with Docker Compose, run the following command:
```bash
docker compose run --rm bibox-to-pdf \
'{USERNAME}' '{PASSWORD}' {BOOK_ID}
```
<!-- CURRENTLY NOT AVAILABLE
If you want to run the non-OCR version, run the following command.

You can also simply add `--no-ocr` before the username in the above command.
```bash
docker compose -f ./docker-compose.non-ocr.yml run --rm bibox-to-pdf \
'{USERNAME}' '{PASSWORD}' {BOOK_ID}
```
-->
#### Docker CLI

To use the OCR version of the script via the Docker CLI, run the following command:
```bash
docker run --rm -it \
-v ./books:/app/output/books \
ghcr.io/lasser15/bibox-to-pdf:latest \
'{USERNAME}' '{PASSWORD}' {BOOK_ID}
```
<!-- CURRENTLY NOT AVAILABLE
To use it without OCR, run the following command.

You can also simply add `--no-ocr` before the username in the above command.
```bash
docker run --rm -it \
-v ./books:/app/output/books \
ghcr.io/lasser15/bibox-to-pdf:latest-non-ocr \
'{USERNAME}' '{PASSWORD}' {BOOK_ID}
```
-->

<p align="right">(<a href="#readme-top">back to top</a>)</p>


### Usage with manual setup

Manual setup is currently not supported. Please use Docker instead.


<p align="right">(<a href="#readme-top">back to top</a>)</p>




<!-- LICENSE -->
## License

Distributed under the GPL 3.0 License. See `LICENSE` for more information.

<p align="right">(<a href="#readme-top">back to top</a>)</p>




<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[stars-shield]: https://img.shields.io/github/stars/LasseR15/bibox-to-pdf.svg?style=for-the-badge
[stars-url]: https://github.com/LasseR15/bibox-to-pdf/stargazers
[issues-shield]: https://img.shields.io/github/issues/LasseR15/bibox-to-pdf.svg?style=for-the-badge
[issues-url]: https://github.com/LasseR15/bibox-to-pdf/issues
[license-shield]: https://img.shields.io/github/license/LasseR15/bibox-to-pdf.svg?style=for-the-badge
[license-url]: https://github.com/LasseR15/bibox-to-pdf/blob/release/LICENSE
9 changes: 9 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,9 @@
version: '3.9'

services:
  bibox-to-pdf:
    image: ghcr.io/lasser15/bibox-to-pdf:latest
    build:
      context: .
    volumes:
      - ./books:/app/output/books
File renamed without changes.
@@ -13,8 +13,10 @@ def get_bibox_images(access_token: str, book_id: int):
     response = requests.get(url, headers=headers)
 
     if response.status_code != 200:
-        print("Response code from server was not 200. Exiting!")
-        typer.Exit(1)
+        print(f"Response code from server was not 200. "
+              f"Either the book id '{book_id}' doesn't exist or the login wasn't successful. "
+              f"Exiting!")
+        raise typer.Exit(1)
 
     return pages_to_image_array(response.json().get("pages", []))
 
@@ -52,7 +54,7 @@ def pages_to_image_array(pages: []):
         image = page["images"][0] if page.get("images") else None
         if image is None:
             print("At least one image was null. Maybe script needs updating. Exiting!")
-            typer.Exit(1)
+            raise typer.Exit(1)
         images.append(image)
 
     return images
@@ -25,7 +25,7 @@ def login_to_bibox(username: str, password: str) -> str:
             page.wait_for_selector(BiboxSelectors.logoutBtn, timeout=10000)
         except:
             print('Login credentials incorrect or a network error occurred.')
-            typer.Exit(1)
+            raise typer.Exit(1)
 
         access_token = page.evaluate('() => window.localStorage.getItem("oauth.accessToken")')
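
The three hunks above replace `typer.Exit(1)` with `raise typer.Exit(1)`. `typer.Exit` is an exception class, so calling it without `raise` only builds an object that is immediately discarded, and execution continues. The following is a minimal sketch of that difference. It uses a hypothetical stand-in class `Exit` so that it runs without Typer installed; the function names are illustrative and not part of this repository.

```python
class Exit(Exception):
    """Hypothetical stand-in for an exit-style exception such as typer.Exit."""

    def __init__(self, code: int = 0):
        super().__init__(code)
        self.exit_code = code


def check_without_raise(status_code: int) -> str:
    if status_code != 200:
        Exit(1)  # builds the exception object and discards it; nothing stops
    return "continued"


def check_with_raise(status_code: int) -> str:
    if status_code != 200:
        raise Exit(1)  # propagates to the caller and aborts this function
    return "continued"


if __name__ == "__main__":
    # The version without `raise` silently carries on after the error.
    print(check_without_raise(404))

    # The version with `raise` stops at the error.
    try:
        check_with_raise(404)
    except Exit as exc:
        print(f"aborted with exit code {exc.exit_code}")
```

With the old code, a failed request would have fallen through to `response.json()` on an error response, which is presumably what this change is meant to prevent.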

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,9 +1,12 @@
+import os
+
+
 class Constants:
     biboxLoginUrl = 'https://bibox2.westermann.de'
     biboxBookInfoUrl = 'https://backend.bibox2.westermann.de/v1/api/sync/{}?materialtypes[]=default&materialtypes[]=addon'
     # biboxBookPageUrl = 'https://bibox2.westermann.de/book/{{bookId}}/page/{pageNumber}'
 
-    bookBaseOutputDir = '../books/{}'
+    baseOutputPath = os.getenv('BASE_OUTPUT_PATH', default='.')
+    bookBaseOutputDir = baseOutputPath + '/books/{}'
     imageOutputDir = bookBaseOutputDir + '/images/'
     imageOutputFile = imageOutputDir + '{}.png'
     pdfOutputDir = bookBaseOutputDir + '/pdfs/'
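
The hunk above makes the output location configurable through the `BASE_OUTPUT_PATH` environment variable, which the Dockerfile sets to `/app/output`. As a rough sketch of how the resulting path resolves, assuming a hypothetical helper `book_output_dir` that is not part of the repository:

```python
import os


def book_output_dir(book_id: int) -> str:
    """Build the per-book output directory from BASE_OUTPUT_PATH (default '.')."""
    base_output_path = os.getenv("BASE_OUTPUT_PATH", default=".")
    return (base_output_path + "/books/{}").format(book_id)


if __name__ == "__main__":
    # Without the variable set, paths are relative to the working directory.
    os.environ.pop("BASE_OUTPUT_PATH", None)
    print(book_output_dir(42))

    # With the value from the Dockerfile, paths land under the mounted volume.
    os.environ["BASE_OUTPUT_PATH"] = "/app/output"
    print(book_output_dir(42))
```

Unlike this helper, `Constants` reads the variable once, when the class body is evaluated at import time, so changing the environment afterwards would have no effect there.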
File renamed without changes.
File renamed without changes.