Commit 4aae7de

Remove temporary Science Museum DAG now that it is no longer necessary (#4314)
* Remove DAG now that it is no longer necessary
* Remove documentation for deleted dag
1 parent e123cdb commit 4aae7de

2 files changed: +3 -228 lines changed

catalog/dags/maintenance/update_science_museum_urls.py

Lines changed: 0 additions & 203 deletions
This file was deleted.

documentation/catalog/reference/DAGs.md

Lines changed: 3 additions & 25 deletions
@@ -26,10 +26,9 @@ The following are DAGs grouped by their primary tag:
 
 ### Data Normalization
 
-| DAG ID | Schedule Interval |
-| ----------------------------------------------------------- | ----------------- |
-| [`add_license_url`](#add_license_url) | `None` |
-| [`update_science_museum_urls`](#update_science_museum_urls) | `None` |
+| DAG ID | Schedule Interval |
+| ------------------------------------- | ----------------- |
+| [`add_license_url`](#add_license_url) | `None` |
 
 ### Data Refresh
 
@@ -171,7 +170,6 @@ The following is documentation associated with each DAG (where available):
 1. [`smk_workflow`](#smk_workflow)
 1. [`staging_database_restore`](#staging_database_restore)
 1. [`stocksnap_workflow`](#stocksnap_workflow)
-1. [`update_science_museum_urls`](#update_science_museum_urls)
 1. [`wikimedia_commons_workflow`](#wikimedia_commons_workflow)
 1. [`wikimedia_reingestion_workflow`](#wikimedia_commons_workflow)
 1. [`wordpress_workflow`](#wordpress_workflow)
@@ -1057,26 +1055,6 @@ authorization required. API is undocumented.
 
 ----
 
-### `update_science_museum_urls`
-
-#### Update Science Museum URLs
-
-One-time maintenance DAG to update Science Museum records to have valid URLs.
-See https://github.com/WordPress/openverse/issues/4261.
-
-For each Science Museum record, this DAG:
-
-- updates the url to the new format, excluding `/images/` in the path if it
-  exists
-- validates whether the url is reachable. If not, the record ID is added to an
-  `invalid_science_musem_ids` table.
-
-Once complete, we can use the `science_museum_invalid_ids` to identify records
-to delete. They are not automatically deleted by this DAG, in order to give us
-an opportunity to first see how many there are.
-
-----
-
 ### `wikimedia_commons_workflow`
 
 **Content Provider:** Wikimedia Commons