Merge pull request #9 from openzim/cli
Added CLI
benoit74 authored Aug 2, 2024
2 parents c93c1a5 + 5d6e35b commit 64a1c39
Showing 10 changed files with 729 additions and 45 deletions.
63 changes: 50 additions & 13 deletions README.md
@@ -77,24 +77,61 @@ There are three main ways to install and use `devdocs2zim` from most recommended
```sh
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim [--all|--slug=SLUG]
```
# Usage
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim [--all|--slug=SLUG|--first=N]
**Flags**
# Fetch all documents
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --all
* `--all`: Fetch all Devdocs resources, and produce one ZIM per resource.
* `--slug`: Fetch the provided Devdocs resource, producing a single ZIM.
Slugs are the first path entry in the Devdocs URL. For example, the slug for: `https://devdocs.io/gcc~12/` is `gcc~12`.
* `--title`: (Optional) Set the title for the ZIM, supports the placeholders listed below.
* `--description`: (Optional) Set the description for the ZIM, supports the placeholders listed below.
* `--devdocs-endpoint`: (Optional) Override the Devdocs URL endpoint.
* `--filename`: (Optional) Set the output file name, supports the placeholders listed below.
# Fetch all documents except Ansible
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --all --skip-slug-regex "^ansible.*"
**Placeholders**
# Fetch Vue related documents
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --slug vue~3 --slug vue_router~4
# Fetch the docs for the two most recent versions of each piece of software
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --first=2
```
* `{name}`: Human readable name of the Devdocs resource e.g. `Python 3.12`.
**One of the following flags is required:**
* `--all`: Fetch all Devdocs resources, and produce one ZIM per resource.
* `--slug SLUG`: Fetch the provided Devdocs resource. Slugs are the first path entry in the Devdocs URL.
For example, the slug for: `https://devdocs.io/gcc~12/` is `gcc~12`. Use --slug several times to add multiple.
* `--first N`: Fetch the first N versions of each resource, in the order shown in the DevDocs UI.
**Optional Flags:**
* `--skip-slug-regex REGEX`: Skips slugs matching the given regular expression.
* `--output OUTPUT_FOLDER`: Output folder for ZIMs. Default: /output
* `--creator CREATOR`: Name of content creator. Default: 'DevDocs'
* `--publisher PUBLISHER`: Custom publisher name. Default: 'openZIM'
* `--name-format FORMAT`: Custom name format for individual ZIMs.
Default: 'devdocs_{slug_without_version}_{version}'
* `--title-format FORMAT`: Custom title format for individual ZIMs.
Value will be truncated to 30 chars. Default: '{full_name} Documentation'
* `--description-format FORMAT`: Custom description format for individual ZIMs.
Value will be truncated to 80 chars. Default: '{full_name} Documentation'
* `--long-description-format FORMAT`: Custom long description format for your ZIM.
Value will be truncated to 4000 chars. Default: '{full_name} documentation by DevDocs'
* `--tag TAG`: Add tag to the ZIM. Use --tag several times to add multiple.
Formatting is supported. Default: ['devdocs', '{slug_without_version}']
**Formatting Placeholders**
The following formatting placeholders are supported:
* `{name}`: Human readable name of the resource e.g. `Python`.
* `{full_name}`: Name with optional version for the resource e.g. `Python 3.12`.
* `{slug}`: Devdocs slug for the resource e.g. `python~3.12`.
* `{license}`: License information about the resource.
* `{slug_without_version}`: Devdocs slug for the resource without the version e.g. `python`.
* `{version}`: Shortened version displayed in devdocs, if any e.g. `3.12`.
* `{release}`: Specific release of the software the documentation is for, if any e.g. `3.12.1`.
* `{attribution}`: License and attribution information about the resource.
* `{home_link}`: Link to the project's home page, if any: e.g. `https://python.org`.
* `{code_link}`: Link to the project's source, if any: e.g. `https://github.com/python/cpython`.
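
As a rough illustration of how these placeholders behave, the sketch below expands format strings with Python's built-in `str.format` and truncates the title to 30 characters. The `placeholders` values and the `expand` helper are hypothetical, and plain slicing is only an assumption about how truncation is performed; the scraper's own implementation may differ.

```python
from typing import Dict, Optional

# Hypothetical placeholder values for one DevDocs resource (illustrative only).
placeholders: Dict[str, str] = {
    "name": "Python",
    "full_name": "Python 3.12",
    "slug": "python~3.12",
    "slug_without_version": "python",
    "version": "3.12",
}


def expand(fmt: str, values: Dict[str, str], max_len: Optional[int] = None) -> str:
    """Expand {placeholders} in fmt, optionally truncating the result."""
    text = fmt.format(**values)
    return text if max_len is None else text[:max_len]


zim_name = expand("devdocs_{slug_without_version}_{version}", placeholders)
zim_title = expand("{full_name} Documentation", placeholders, max_len=30)
print(zim_name)   # devdocs_python_3.12
print(zim_title)  # Python 3.12 Documentation
```

A format string that references a placeholder missing from the dictionary raises `KeyError` in this sketch, which is one reason to validate custom formats before starting a long scrape.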
## Developing
11 changes: 11 additions & 0 deletions codecov.yml
@@ -0,0 +1,11 @@
coverage:
  status:
    project:
      default:
        informational: true
    patch:
      default:
        informational: true
    changes:
      default:
        informational: true
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -39,7 +39,7 @@ dev = [
]

[project.scripts]
devdocs2zim = "devdocs2zim:entrypoint"
devdocs2zim = "devdocs2zim.entrypoint:main"

[tool.hatch.metadata.hooks.openzim-metadata]
kind = "scraper"
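
The `[project.scripts]` value uses the `module:attribute` form, so after this change the `devdocs2zim` command calls `main` in the `devdocs2zim.entrypoint` module. A minimal sketch of how such a reference can be resolved follows. The `resolve_entry_point` helper is hypothetical (installers generate their own wrapper scripts), and the standard library's `json:dumps` stands in for `devdocs2zim.entrypoint:main` so the example runs without the package installed.

```python
import importlib


def resolve_entry_point(reference: str):
    """Resolve a 'package.module:attribute' reference to the object it names."""
    module_name, _, attribute = reference.partition(":")
    target = importlib.import_module(module_name)
    # The attribute part may be dotted (e.g. "Class.method"), so walk each piece.
    for part in filter(None, attribute.split(".")):
        target = getattr(target, part)
    return target


# Stand-in for "devdocs2zim.entrypoint:main".
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```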
14 changes: 0 additions & 14 deletions src/devdocs2zim/__init__.py
Original file line number Diff line number Diff line change
@@ -1,14 +0,0 @@
# pyright: strict, reportUnnecessaryIsInstance=false

from devdocs2zim.__about__ import __version__


def compute(a: int, b: int) -> int:
    if not isinstance(a, int) or not isinstance(b, int):
        msg = "int only"
        raise TypeError(msg)
    return a + b


def entrypoint():
    print(f"Hello from {__version__}")  # noqa: T201
5 changes: 5 additions & 0 deletions src/devdocs2zim/constants.py
@@ -1,4 +1,5 @@
import logging
import pathlib

from zimscraperlib.logging import (  # pyright: ignore[reportMissingTypeStubs]
    getLogger,  # pyright: ignore[reportUnknownVariableType]
@@ -8,8 +9,12 @@

NAME = "devdocs2zim"
VERSION = __version__
ROOT_DIR = pathlib.Path(__file__).parent

DEVDOCS_FRONTEND_URL = "https://devdocs.io"
DEVDOCS_DOCUMENTS_URL = "https://documents.devdocs.io"

# As of 2024-07-28 all documentation appears to be in English.
LANGUAGE_ISO_639_3 = "eng"

logger = getLogger(NAME, level=logging.DEBUG)
93 changes: 93 additions & 0 deletions src/devdocs2zim/entrypoint.py
@@ -0,0 +1,93 @@
import argparse
import logging

from devdocs2zim.client import DevdocsClient
from devdocs2zim.constants import (
    DEVDOCS_DOCUMENTS_URL,
    DEVDOCS_FRONTEND_URL,
    NAME,
    VERSION,
    logger,
)
from devdocs2zim.generator import DocFilter, Generator, ZimConfig


def main() -> None:
    parser = argparse.ArgumentParser(
        prog=NAME,
    )

    parser.add_argument(
        "--debug", help="Enable verbose output", action="store_true", default=False
    )

    parser.add_argument(
        "--version",
        help="Display scraper version and exit",
        action="version",
        version=VERSION,
    )

    parser.add_argument(
        "--output",
        help="Output folder for ZIMs. Default: /output",
        default="/output",
        dest="output_folder",
    )

    # ZIM configuration flags
    ZimConfig.add_flags(
        parser,
        ZimConfig(
            name_format="devdocs_{slug_without_version}_{version}",
            creator="DevDocs",
            publisher="openZIM",
            title_format="{full_name} Docs",
            description_format="{full_name} docs by DevDocs",
            long_description_format=None,
            tags="devdocs;{slug_without_version}",
        ),
    )

    # Document selection flags
    DocFilter.add_flags(parser)

    # Client configuration flags
    parser.add_argument(
        "--devdocs-frontend-url",
        help="Scheme and hostname for the devdocs frontend.",
        default=DEVDOCS_FRONTEND_URL,
    )

    parser.add_argument(
        "--devdocs-documents-url",
        help="Scheme and hostname for the devdocs documents server.",
        default=DEVDOCS_DOCUMENTS_URL,
    )

    args = parser.parse_args()

    logger.setLevel(level=logging.DEBUG if args.debug else logging.INFO)

    try:
        zim_config = ZimConfig.of(args)
        doc_filter = DocFilter.of(args)
        devdocs_client = DevdocsClient(
            documents_url=args.devdocs_documents_url,
            frontend_url=args.devdocs_frontend_url,
        )

        Generator(
            devdocs_client=devdocs_client,
            zim_config=zim_config,
            output_folder=args.output_folder,
            doc_filter=doc_filter,
        ).run()
    except Exception as e:
        logger.exception(e)
        logger.error(f"Generation failed with the following error: {e}")
        raise SystemExit(1) from e


if __name__ == "__main__":
    main()
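
`DocFilter.add_flags` is not shown in this part of the diff. As a hedged sketch of how the selection flags described in the README could be declared with `argparse`, the hypothetical `build_parser` and `select_slugs` helpers below use `action="append"` so that `--slug` can be repeated, and `re.match` for `--skip-slug-regex`. The names, the `slugs` destination, and the filtering order are assumptions; `--first` is left out to keep the example short.

```python
import argparse
import re
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring some selection flags listed in the README."""
    parser = argparse.ArgumentParser(prog="devdocs2zim")
    parser.add_argument("--all", action="store_true", help="Fetch all resources.")
    # action="append" collects every occurrence of --slug into one list.
    parser.add_argument("--slug", action="append", dest="slugs", metavar="SLUG")
    parser.add_argument("--skip-slug-regex", metavar="REGEX")
    return parser


def select_slugs(available: List[str], args: argparse.Namespace) -> List[str]:
    """Apply --all / --slug, then drop slugs matching --skip-slug-regex."""
    wanted = args.slugs or []
    chosen = list(available) if args.all else [s for s in available if s in wanted]
    if args.skip_slug_regex:
        skip = re.compile(args.skip_slug_regex)
        chosen = [s for s in chosen if not skip.match(s)]
    return chosen


available = ["ansible~2.11", "vue~3", "vue_router~4"]
args = build_parser().parse_args(["--all", "--skip-slug-regex", "^ansible.*"])
print(select_slugs(available, args))  # ['vue~3', 'vue_router~4']
```

Passing `--slug vue~3 --slug vue_router~4` to the same parser yields `args.slugs == ["vue~3", "vue_router~4"]`, matching the "use `--slug` several times" behaviour the README describes.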