Add instructions how to run ingestion script from command line #4083

4 changes: 4 additions & 0 deletions catalog/justfile
@@ -172,3 +172,7 @@ generate-docs doc="dag" fail_on_diff="false":
# Generate files for a new provider
add-provider provider_name endpoint +media_types="image":
    python3 templates/create_provider_ingester.py "{{ provider_name }}" "{{ endpoint }}" -m {{ media_types }}

# Run bash in the container set in the SERVICE env-var
run:
    just ../run {{ SERVICE }} bash
13 changes: 13 additions & 0 deletions documentation/catalog/guides/adding_a_new_provider.md
@@ -97,6 +97,19 @@ in the
as well as a corresponding test file. Complete the TODOs detailed in the
generated files to implement behavior specific to your API.

You can run the provider script directly from the command line. First, use the
following just recipe to open a bash shell in a container of the catalog's
Docker stack:

```
just catalog/run
```

From inside that shell, run the script:

```
python catalog/dags/providers/provider_api_scripts/<script_you_want_to_run>.py
```
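
For `python <script>.py` to do anything, the script needs an entry point that
runs when the file is executed directly. The following is only a minimal sketch
of that pattern, not the generated template itself: `ExampleDataIngester`, its
`sample_records`, and the `--limit` flag are hypothetical stand-ins for the
class and options in your own provider script.

```python
import argparse


class ExampleDataIngester:
    """Hypothetical stand-in for a generated provider ingester class."""

    # Fake records so the sketch runs without network access.
    sample_records = [{"id": 1}, {"id": 2}, {"id": 3}]

    def ingest_records(self, limit=None):
        """Pretend to ingest records and return how many were processed."""
        records = self.sample_records[:limit]
        for record in records:
            print(f"Ingested record {record['id']}")
        return len(records)


def main(argv=None):
    """Parse command-line arguments and run one ingestion pass."""
    parser = argparse.ArgumentParser(description="Run the example ingester.")
    parser.add_argument(
        "--limit",
        type=int,
        default=None,
        help="Maximum number of records to ingest (hypothetical flag).",
    )
    args = parser.parse_args(argv)
    return ExampleDataIngester().ingest_records(limit=args.limit)


# Only runs when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
```

Keeping the work in `main()` behind the `__name__` guard means the same file
can be imported by a DAG or a test without triggering ingestion as a side
effect.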

Some APIs may not fit perfectly into the established `ProviderDataIngester`
pattern. For advanced use cases and examples of how to modify the ingestion
flow, see the