Update for recent version of migrate_source_csv (#1)
seth-shaw-asu authored Mar 1, 2023
1 parent bcd8b97 commit d8f0372
Showing 5 changed files with 96 additions and 94 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -1,8 +1,8 @@
# Proof of Concept: Islandora 8 Migrate Files (Apollo Edition)

This repository consists of two modules:
This repository consists of two directories:

1. migrate_apollo: Uses the Migrate API to load Tiff masters, metadata from a CSV, and MADS RDF XML authority records from the Library of Congress.
1. migrate_apollo: A Drupal module holding the Migrate API configs to load Tiff masters, metadata from a CSV, and MADS RDF XML authority records from the Library of Congress.
2. data: the source data we will migrate (see below).

# Source Data
@@ -14,10 +14,10 @@ The source data used for this proof of concept came from the [Project Apollo Arc
Note: using drush with migrate_tools is optional, but the instructions assume it is installed.

0. Install Islandora 8, including islandora_defaults.
0. Clone this git repo to your modules directory. (`git clone https://github.com/seth-shaw-asu/migrate-apollo.git`)
0. Copy the data directory to your drupal web root (e.g. for islandora the default drupal web root is `/var/www/html/drupal/web` and the data directory is `/var/www/html/drupal/web/data`).
0. Clone this git repo to your modules directory. (`git clone https://github.com/seth-shaw-unlv/claw-migrate-files-poc.git`)
0. Enable the modules. E.g. `drush en -y migrate_apollo`.
0. Run the migration. E.g. `drush -l http://localhost:8000 mim --userid=1 --all`. *Note: drush must be run by the webserver user because the claw_file migration copies files to the "public://directory". E.g.* `sudo -u www-data drush -l http://localhost:8000 mim --userid=1 --all` *if you are using Ubuntu. Also, the userid flag is specific to the migrate:import command; it provides the necessary user information to the JWT authentication to enable derivatives.*
0. Run the migration. E.g. `drush -l http://localhost:8000 mim --userid=1 --all`. *Note: drush must be run by the webserver user because the file migration copies files to the "public://directory". E.g.* `sudo -u www-data drush -l http://localhost:8000 mim --userid=1 --all` *if you are using Ubuntu. Also, the userid flag is specific to the migrate:import command; it provides the necessary user information to the JWT authentication to enable derivatives.*
0. See a wonderful list of the newly migrated images on your Drupal site's front page!

# Combining People and Subject entities in a single column
@@ -27,7 +27,7 @@ into separate columns. This allows us to perform entity lookups on each column
for the matching content type.

In some cases, however, topic columns include items that could be either people or topics.
This requires us to perform a single lookup across multiple content types,
This requires us to perform a single lookup across multiple vocabularies,
something the existing migrate_plus module doesn't support. I've
[created a patch and issue](https://www.drupal.org/project/migrate_plus/issues/2960251) to address the issue.
Until it is merged or some other solution is found, we will either have to
52 changes: 25 additions & 27 deletions config/install/migrate_plus.migration.islandora8_file.yml
@@ -1,62 +1,60 @@
langcode: en
status: true
dependencies: { }
id: islandora8_file
label: Import Image Files
class: null
field_plugin_method: null
cck_plugin_method: null
migration_tags: null
migration_group: apollo

label: 'Import Image Files'
source:
plugin: csv
path: 'data/apollo.csv' # Path relative to Drupal site root
path: data/apollo.csv
delimiter: ','
header_row_count: 0 # 1 with headers, 0 if there are no headers
keys:
header_row_count: 0
ids:
- digital_id
constants:
source_base_dir: 'data/images'
collection_alias: 'apollo'
source_base_dir: data/images
collection_alias: apollo
dest_base_dir: 'fedora://masters'
extension: 'tiff'
column_names:
0:
digital_id: 'Digital ID' # identifier key and basename of the file
1:
title: 'Title' # Used for title and alt-text


extension: tiff
fields:
-
name: digital_id
label: 'Digital ID'
-
name: title
label: Title
process:
settings:
plugin: skip_row_if_not_set
source: digital_id

type:
plugin: default_value
default_value: image

filename:
plugin: concat
delimiter: '.'
delimiter: .
source:
- digital_id
- constants/extension

source_file_path:
plugin: concat
delimiter: /
source:
- constants/source_base_dir
- '@filename'

destination_file_path:
plugin: concat
delimiter: /
source:
- constants/dest_base_dir
- constants/collection_alias
- '@filename'

uri:
plugin: file_copy
source:
- '@source_file_path' #where it is
- '@destination_file_path' #where we want it

- '@source_file_path'
- '@destination_file_path'
destination:
plugin: 'entity:file'
migration_dependencies: null
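
The `filename`, `source_file_path`, and `destination_file_path` steps in this migration are chained `concat` calls, so the resulting paths can be worked out by hand. The following is a minimal Python sketch of that chain, not the Drupal implementation; the function names and the sample `digital_id` value are hypothetical, while the constants are copied from the config above.

```python
# Constants copied from the islandora8_file migration's source section.
CONSTANTS = {
    "source_base_dir": "data/images",
    "collection_alias": "apollo",
    "dest_base_dir": "fedora://masters",
    "extension": "tiff",
}


def concat(values, delimiter=""):
    """Join values with a delimiter, mimicking the idea of the concat process plugin."""
    return delimiter.join(str(value) for value in values)


def derive_paths(row, constants):
    """Compute the three derived values in the same order the config does."""
    filename = concat([row["digital_id"], constants["extension"]], ".")
    source_file_path = concat([constants["source_base_dir"], filename], "/")
    destination_file_path = concat(
        [constants["dest_base_dir"], constants["collection_alias"], filename], "/"
    )
    return {
        "filename": filename,
        "source_file_path": source_file_path,
        "destination_file_path": destination_file_path,
    }


if __name__ == "__main__":
    # "example-001" is a made-up identifier, not a row from apollo.csv.
    for key, value in derive_paths({"digital_id": "example-001"}, CONSTANTS).items():
        print(f"{key}: {value}")
```

Under these assumptions, `file_copy` would receive a source path under `data/images` and a destination under `fedora://masters/apollo`, both ending in the same `<digital_id>.tiff` filename.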
43 changes: 20 additions & 23 deletions config/install/migrate_plus.migration.islandora8_media.yml
@@ -1,36 +1,34 @@
langcode: en
status: true
dependencies: { }
id: islandora8_media
label: Import Media
class: null
field_plugin_method: null
cck_plugin_method: null
migration_tags: null
migration_group: apollo

migration_dependencies:
required:
- islandora8_files
- islandora8_metadata # so we can lookup the value for field_media_of

label: 'Import Media'
source:
plugin: csv
path: 'data/apollo.csv' # Path relative to Drupal site root
path: data/apollo.csv
delimiter: ','
header_row_count: 0 # 1 with headers, 0 if there are no headers
keys:
header_row_count: 0
ids:
- digital_id
constants:
media_use: 'Original File'
uid: 1 # UID of Admin user, may be changed to uid of someone with permission to create items
column_names:
0:
digital_id: 'Digital ID' # identifier key

uid: 1
fields:
-
name: digital_id
label: 'Digital ID'
process:
mid:
plugin: migration_lookup
migration: claw_file
source: digital_id
no_stub: true

uid: constants/uid

# Lookup the Tiff we just migrated
field_media_file/target_id:
plugin: migration_lookup
migration: islandora8_file
@@ -42,15 +40,11 @@ process:
field_media_file/description:
plugin: default_value
default_value: ''

# Lookup the metadata record we just created
field_media_of:
plugin: migration_lookup
migration: islandora8_metadata
source: digital_id
no_stub: true

# Set as Preservation Master
field_media_use:
plugin: entity_lookup
source: constants/media_use
@@ -59,7 +53,10 @@ process:
bundle: islandora_media_use
entity_type: taxonomy_term
ignore_case: true

destination:
plugin: 'entity:media'
default_bundle: file
migration_dependencies:
required:
- islandora8_files
- islandora8_metadata
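
The source-section change repeated across these migration files is mechanical: `keys` becomes `ids`, and the `column_names` mapping becomes a `fields` list of `name`/`label` pairs. A rough Python sketch of that rewrite follows, operating on an already-parsed source section as a plain dict. The helper name `upgrade_csv_source` is hypothetical, and the sketch only covers the two keys this commit changes.

```python
def upgrade_csv_source(source):
    """Return a copy of a CSV source config with keys/column_names rewritten.

    Assumes `column_names` is either a list of single-entry dicts or a dict
    keyed by column index, as in the old configs shown in this commit.
    """
    upgraded = {
        key: value
        for key, value in source.items()
        if key not in ("keys", "column_names")
    }
    if "keys" in source:
        upgraded["ids"] = list(source["keys"])
    if "column_names" in source:
        columns = source["column_names"]
        if isinstance(columns, dict):
            # Index-keyed form: order the columns by their index.
            columns = [columns[index] for index in sorted(columns)]
        upgraded["fields"] = [
            {"name": name, "label": label}
            for column in columns
            for name, label in column.items()
        ]
    return upgraded


if __name__ == "__main__":
    old_source = {
        "plugin": "csv",
        "path": "data/apollo.csv",
        "keys": ["digital_id"],
        "column_names": {0: {"digital_id": "Digital ID"}, 1: {"title": "Title"}},
    }
    print(upgrade_csv_source(old_source))
```

Other settings, such as `path`, `delimiter`, and `constants`, pass through untouched, which matches what the hunks above show.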
82 changes: 44 additions & 38 deletions config/install/migrate_plus.migration.islandora8_metadata.yml
@@ -1,61 +1,55 @@
langcode: en
status: true
dependencies: { }
id: islandora8_metadata
label: 'Import Metadata'
class: null
field_plugin_method: null
cck_plugin_method: null
migration_tags: null
migration_group: apollo

migration_dependencies:
required:
# Loading authorities first allows us to look them up
- auth_person
- auth_complex
- auth_geographic
- auth_topic

label: 'Import Metadata'
source:
plugin: csv
path: 'data/apollo.csv' # Path relative to Drupal site root
path: data/apollo.csv # Path relative to Drupal site root
delimiter: ','
header_row_count: 0 # 1 with headers, 0 if there are no headers
keys:
ids:
- digital_id
constants:
collection_alias: 'apollo'
image: 'Image'
uid: 1 # UID of Admin user, may be changed to uid of someone with permission to create items
column_names:
collection_alias: apollo
image: Image
uid: 1
fields:
-
digital_id: 'Digital ID'
name: digital_id
label: 'Digital ID'
-
title: 'Title'
name: title
label: Title
-
description: 'Description'
name: description
label: Description
-
subject_person: 'Identified Individual'
name: subject_person
label: 'Identified Individual'
-
subjects: 'Subjects'

destination: # We're creating nodes, y'all.
plugin: entity:node

name: subjects
label: Subjects
process:
type: # The content type of the nodes we are creating
plugin: default_value
default_value: islandora_object

# One-to-One mappings
uid: constants/uid
field_identifier: digital_id
title: title
field_description: description

path: # Path Alias
plugin: concat
delimiter: '/'
delimiter: /
source:
- '' # Gives us a '/' prefix for the server root
- constants/collection_alias
- digital_id

# Type Tags
field_model:
plugin: entity_lookup
source: constants/image
@@ -70,14 +64,17 @@ process:
# lookups for each type, assign them to a temp array, and recombine
# them all before assigning them to the appropriate entity reference field.

temp_subjects_person: # Temporary array of person entity refs
temp_subjects_person: # Temporary array of person entity refs
-
plugin: skip_on_empty # Don't bother if there aren't any values
source: subject_person
method: process # Only this field, not the whole CSV row
method: process # Skip only this field, not the whole CSV row
- # Account for multiple entries in a cell delimited by ;
plugin: explode # Note: no whitespace trimming or quoting support is provided! Be careful with leading or trailing spaces between values in your source data!
delimiter: ';'
plugin: explode
delimiter: ;
- # Note: explode doesn't trim whitespace, so we'll do it here.
plugin: callback
callable: trim
-
plugin: entity_generate
value_key: name
@@ -86,15 +83,14 @@ process:
entity_type: taxonomy_term
default_values:
vid: person

temp_subjects: # Temporary field of subjects
-
plugin: skip_on_empty
source: subjects
method: process
-
plugin: explode
delimiter: ';'
delimiter: ;
-
plugin: entity_generate
value_key: name
@@ -103,7 +99,6 @@ process:
entity_type: taxonomy_term
default_values:
vid: subject

field_subjects: # Gather temp arrays into the destination field
-
plugin: get
@@ -112,3 +107,14 @@ process:
- '@temp_subjects'
-
plugin: flatten # an array of arrays to a flat array of entity refs

destination: # We're creating nodes, y'all.
plugin: 'entity:node'

migration_dependencies:
required:
# Loading authorities first allows us to look them up
- auth_person
- auth_complex
- auth_geographic
- auth_topic
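
The subject handling in this migration splits a delimited cell, trims each piece, and later flattens the per-type arrays into one list for `field_subjects`. The Python sketch below imitates that data flow on plain strings under simplifying assumptions: it stands in for the `skip_on_empty`, `explode`, `callback: trim`, and `flatten` steps, leaves out `entity_generate` entirely, and uses hypothetical function names and sample values. Python's `str.strip` is used as an approximation of PHP's `trim`.

```python
def split_cell(cell, delimiter=";"):
    """Split a CSV cell on the delimiter and trim each value.

    An empty cell yields an empty list, which plays the role of
    skip_on_empty with method: process (skip the field, keep the row).
    """
    if not cell:
        return []
    return [part.strip() for part in cell.split(delimiter)]


def gather(*term_lists):
    """Flatten several lists of terms into one, like get followed by flatten."""
    return [term for terms in term_lists for term in terms]


if __name__ == "__main__":
    # Sample values are invented for illustration, not taken from apollo.csv.
    people = split_cell("Armstrong, Neil; Aldrin, Buzz")
    subjects = split_cell("Lunar surface;Spacesuits")
    print(gather(people, subjects))
```

Note that in the hunks shown above only the `temp_subjects_person` pipeline gains the trim step. If the `subjects` column can also contain spaces after the `;` delimiter, the same step would presumably be needed in `temp_subjects`.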
3 changes: 2 additions & 1 deletion migrate_apollo.info.yml
@@ -3,9 +3,10 @@ description: Migrates Master Tiff Images into an Islandora Claw Image derivative
package: custom
type: module
core: 8.x
core_version_requirement: ^8 || ^9

dependencies:
- drupal:migrate
- migrate_plus:migrate_plus
- migrate_source_csv
- islandora_defaults
- islandora:islandora
