Use correct path when checking if DAGs.md needs to be regenerated #3697

Merged: 1 commit, Jan 25, 2024
2 changes: 1 addition & 1 deletion catalog/justfile
@@ -134,7 +134,7 @@ generate-dag-docs fail_on_diff="false":
echo "Done!"
if {{ fail_on_diff }}; then
set +e
-git diff --exit-code -- documentation/catalog/reference/DAGs.md
+git diff --exit-code -- ../documentation/catalog/reference/DAGs.md
if [ $? -ne 0 ]; then
printf "\n\n\e[31m!! Changes found in DAG documentation, please run 'just generate-dag-docs' locally and commit difference !!\n\n"
exit 1
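
Why the old path let the check pass silently: by default `just` runs a recipe with the working directory set to the directory containing the justfile (here `catalog/`), and `git diff -- <pathspec>` resolves the pathspec relative to the current directory. A pathspec that matches nothing produces an empty diff, so `--exit-code` returns 0 even when the docs have changed. The following is a minimal sketch of that behaviour in a throwaway repository; the shortened `documentation/DAGs.md` layout, the file contents, and the commit identity are placeholders chosen for illustration, not taken from the PR.

```shell
#!/usr/bin/env bash
# Sketch only: repository layout and file contents are illustrative placeholders.
set -u

repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q .
mkdir -p catalog documentation
echo "v1" > documentation/DAGs.md
git add documentation/DAGs.md
git -c user.name="example" -c user.email="example@example.com" \
    -c commit.gpgsign=false commit -q -m "initial docs"

# Simulate regenerated docs that differ from the committed copy.
echo "v2" > documentation/DAGs.md

# Run the checks from catalog/, as a recipe in catalog/justfile would by default.
cd catalog || exit 1

# Pathspec relative to catalog/ matches nothing, so the diff is empty: status 0.
git diff --exit-code -- documentation/DAGs.md > /dev/null
wrong_status=$?

# Pathspec pointing one level up matches the modified file: status 1.
git diff --exit-code -- ../documentation/DAGs.md > /dev/null
right_status=$?

echo "wrong path: $wrong_status, corrected path: $right_status"
```

With the uncorrected path the `if [ $? -ne 0 ]` branch in the recipe is never taken, which is why stale DAG documentation could go unnoticed when `fail_on_diff` was enabled.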
22 changes: 21 additions & 1 deletion documentation/catalog/reference/DAGs.md
@@ -85,6 +85,7 @@ The following are DAGs grouped by their primary tag:

| DAG ID | Schedule Interval | Dated | Media Type(s) |
| --------------------------------------------------------------- | ----------------- | ------- | ------------- |
+| [`auckland_museum_workflow`](#auckland_museum_workflow) | `@daily` | `True` | image |
| `brooklyn_museum_workflow` | `@monthly` | `False` | image |
| [`cc_mixter_workflow`](#cc_mixter_workflow) | `@monthly` | `False` | audio |
| `cleveland_museum_workflow` | `@monthly` | `False` | image |
@@ -106,7 +107,7 @@ The following are DAGs grouped by their primary tag:
| [`smk_workflow`](#smk_workflow) | `@monthly` | `False` | image |
| [`stocksnap_workflow`](#stocksnap_workflow) | `@monthly` | `False` | image |
| [`wikimedia_commons_workflow`](#wikimedia_commons_workflow) | `@daily` | `True` | image, audio |
-| [`wordpress_workflow`](#wordpress_workflow) | `@monthly` | `False` | image |
+| [`wordpress_workflow`](#wordpress_workflow) | `@weekly` | `False` | image |

### Provider Reingestion

@@ -123,6 +124,7 @@ The following is documentation associated with each DAG (where available):

1. [`add_license_url`](#add_license_url)
1. [`airflow_log_cleanup`](#airflow_log_cleanup)
+1. [`auckland_museum_workflow`](#auckland_museum_workflow)
1. [`audio_data_refresh`](#audio_data_refresh)
1. [`audio_popularity_refresh`](#audio_popularity_refresh)
1. [`batched_update`](#batched_update)
@@ -203,6 +205,24 @@ airflow dags trigger --conf
- maxLogAgeInDays:<INT> - Optional
- enableDelete:<BOOLEAN> - Optional

+### `auckland_museum_workflow`
+
+Content Provider: Auckland War Memorial Museum Tāmaki Paenga Hira
+
+ETL Process: Use the API to identify all CC licensed media.
+
+Output: TSV file containing the media and the respective meta-data.
+
+Notes: https://api.aucklandmuseum.com/
+
+Resource: https://api.aucklandmuseum.com/
+https://github.com/AucklandMuseum/API/wiki/Tutorial
+
+| Resource | Requests per second | Requests per day |
+| ------------ | ------------------- | ---------------- |
+| /search, /id | 10 | 1000 |
+| /id/media | 10 | 1000 |
+
### `audio_data_refresh`

#### Data Refresh DAG Factory