Merge pull request seth-shaw-unlv#3 from seth-shaw-unlv/content-modeling-overhaul

Content modeling overhaul
seth-shaw-unlv authored May 4, 2018
2 parents 3b8bc93 + 497f4d4 commit b3074c7
Showing 19 changed files with 385 additions and 305 deletions.
34 changes: 17 additions & 17 deletions README.md
Original file line number Diff line number Diff line change
@@ -16,22 +16,22 @@ The source data used for this proof of concept came from the [Project Apollo Arc
Note: using drush with migrate_tools is optional, but the instructions assume it is installed.

0. Install the prerequisite modules (islandora_image, migrate_plus, and migrate_source_csv) and their dependencies. E.g. `composer require islandora/islandora_image drupal/migrate_tools:^4.0 drupal/migrate_source_csv`.
1. [Patch migrate_plus to allow looking up entities across multiple content types](https://www.drupal.org/project/migrate_plus/issues/2960251).
2. Copy the data directory to your drupal web root (e.g. in my tests the drupal web root is `/var/www/drupalvm/drupal/web` and the data directory is `/var/www/drupalvm/drupal/web/data`).
3. Copy the migrate_cdm and unlv_image directories to your modules directory.
4. Enable the modules. E.g. `drush en -y migrate_tools migrate_apollo`.
5. Run the migration. E.g. `drush mim --all`.
6. See a wonderful list of the newly migrated images on your Drupal site's front page!

# The migrate_plus Patch

Previously this example split out people that were subjects from topics that
were subjects. In that case we could perform entity lookups on each column for
the matching content type. To see this strategy use the [pre-mik branch](https://github.com/seth-shaw-unlv/claw-migrate-files-poc/tree/pre-mik).

The [Move to Islandora Kit sample metadata](https://github.com/MarcusBarnes/mik/blob/master/tests/assets/csv/sample_metadata.csv),
however, combines them into a single column. This requires us to perform a
single lookup across multiple content types, something the existing migrate_plus
module doesn't support. I've created a patch and issue to address the issue.
0. Copy the data directory to your drupal web root (e.g. in my tests the drupal web root is `/var/www/drupalvm/drupal/web` and the data directory is `/var/www/drupalvm/drupal/web/data`).
0. Copy the migrate_apollo and unlv_image directories to your modules directory.
0. Enable the modules. E.g. `drush en -y migrate_tools migrate_apollo`.
0. Run the migration. E.g. `drush mim --all`. *Note: drush must be run as the webserver user because the claw_file migration copies files into the `public://` directory. E.g.* `sudo -u www-data drush mim --all` *if you are using Vagrant.*
0. Generate service images. *Migrate is not triggering context actions. Until we figure out that problem you will need to do it yourself: go to the content page, select all the items you migrated, and apply the "Generate a service file from image preservation master" action. This will trigger the service image generation and, as a chain reaction, the thumbnail generation.*
0. See a wonderful list of the newly migrated images on your Drupal site's front page!

# Combining People and Subject entities in a single column

This example splits out people that are subjects from topics that are subjects
into separate columns. This allows us to perform entity lookups on each column
for the matching content type.

In some cases, however, topic columns include items that could be either people or topics.
This requires us to perform a single lookup across multiple content types,
something the existing migrate_plus module doesn't support. I've
[created a patch and issue](https://www.drupal.org/project/migrate_plus/issues/2960251) to address this.
Until it is merged or some other solution is found, we will have to either
patch migrate_plus or extend the process plugin for this small modification.
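
As a rough illustration of the lookup problem described above (this is neither the patch nor Drupal code), the following Python sketch models entities as a dictionary and searches several bundles for a title. The `ENTITIES` data, the node IDs, and the `lookup_entity` helper are hypothetical names invented for this example.

```python
# Hypothetical in-memory stand-in for Drupal nodes: (bundle, title) -> node ID.
ENTITIES = {
    ("person", "Aldrin, Buzz"): 1,
    ("person", "Armstrong, Neil"): 2,
    ("subject", "Moonwalk"): 3,
}


def lookup_entity(title, bundles):
    """Return the ID of the first entity whose title matches in any given bundle."""
    for bundle in bundles:
        entity_id = ENTITIES.get((bundle, title))
        if entity_id is not None:
            return entity_id
    return None


# A single-bundle lookup misses people stored in a mixed column...
print(lookup_entity("Aldrin, Buzz", ["subject"]))            # None
# ...while a lookup across both bundles finds them.
print(lookup_entity("Aldrin, Buzz", ["subject", "person"]))  # 1
```

Splitting people and topics into separate columns, as this example does, sidesteps the problem because each column only ever needs a single-bundle lookup.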
15 changes: 7 additions & 8 deletions data/apollo.csv
@@ -1,8 +1,7 @@
ID,Title,Date,Location,Subjects,Description,File
AS11-36-5390,Apollo 11 Hasselblad image from film magazine 36/N - Trans-Lunar,,,"Aldrin, Buzz","Neil took this picture of Buzz during their initial inspection of the LM at about 057:03. Journal Contributor David Sander notes that ""Buzz is wearing his intravehicular suit, a specially made set of garments designed to be as flame retardant as the rest of the ship, and made from the same fabric as the outer layer of the spacesuits"". Paolo Attivissimo notes that Buzz's watch reads 5:35 (Houston time), which is 57:03 GET (Ground Elapsed Time)",AS11-36-5390.tiff
AS11-37-5528,"Apollo 11 Hasselblad image from film magazine 37/R - Orbit, Post-Landing, Post-EVA",,,"Armstrong, Neil",,AS11-37-5528.tiff
AS11-37-5545,"Apollo 11 Hasselblad image from film magazine 37/R - Orbit, Post-Landing, Post-EVA",,,Flags--United States,,AS11-37-5545.tiff
AS11-40-5850,Apollo 11 Hasselblad image from film magazine 40/S - EVA,,,Lunar excursion module;Moonwalk,"First EVA picture. Neil's first frame in a pan taken west of the ladder. Jettison bag under the Descent Stage, south footpad, bent probe, strut supports. The view is more or less up-Sun, so we are seeing the shadowed faces of boulders. 20 July 1969.",AS11-40-5850.tiff
AS11-40-5875,Apollo 11 Hasselblad image from film magazine 40/S - EVA,,,"Aldrin, Buzz;Lunar excursion module;Flags--United States;Moonwalk",,AS11-40-5875.tiff
AS11-40-5903,Apollo 11 Hasselblad image from film magazine 40/S - EVA,,,"Aldrin, Buzz;Moonwalk",,AS11-40-5903.tiff
AS11-44-6665,"Apollo 11 Hasselblad image from film magazine 44/V - LM inspection, rendezvous",,,Moon,,AS11-44-6665.tiff
AS11-36-5390,Apollo 11 Hasselblad image from film magazine 36/N - Trans-Lunar,"Neil took this picture of Buzz during their initial inspection of the LM at about 057:03. Journal Contributor David Sander notes that ""Buzz is wearing his intravehicular suit, a specially made set of garments designed to be as flame retardant as the rest of the ship, and made from the same fabric as the outer layer of the spacesuits"". Paolo Attivissimo notes that Buzz's watch reads 5:35 (Houston time), which is 57:03 GET (Ground Elapsed Time)","Aldrin, Buzz",
AS11-37-5528,"Apollo 11 Hasselblad image from film magazine 37/R - Orbit, Post-Landing, Post-EVA",,"Armstrong, Neil",
AS11-37-5545,"Apollo 11 Hasselblad image from film magazine 37/R - Orbit, Post-Landing, Post-EVA",,,Flags--United States
AS11-40-5850,Apollo 11 Hasselblad image from film magazine 40/S - EVA,"First EVA picture. Neil's first frame in a pan taken west of the ladder. Jettison bag under the Descent Stage, south footpad, bent probe, strut supports. The view is more or less up-Sun, so we are seeing the shadowed faces of boulders. 20 July 1969.",,Lunar excursion module;Moonwalk
AS11-40-5875,Apollo 11 Hasselblad image from film magazine 40/S - EVA,,"Aldrin, Buzz",Lunar excursion module;Flags--United States;Moonwalk
AS11-40-5903,Apollo 11 Hasselblad image from film magazine 40/S - EVA,,"Aldrin, Buzz",Moonwalk
AS11-44-6665,"Apollo 11 Hasselblad image from film magazine 44/V - LM inspection, rendezvous",,,Moon
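
To make the new column layout concrete, here is a minimal Python sketch that parses two rows copied from the new `data/apollo.csv`. It assumes the file no longer has a header row and that the columns follow the order mapped by `column_names` in the claw_image migration; the `read_rows` helper and `COLUMNS` list are hypothetical names used only for this illustration.

```python
import csv
import io

# Two rows copied from the new data/apollo.csv (no header row).
SAMPLE = '''AS11-37-5528,"Apollo 11 Hasselblad image from film magazine 37/R - Orbit, Post-Landing, Post-EVA",,"Armstrong, Neil",
AS11-40-5875,Apollo 11 Hasselblad image from film magazine 40/S - EVA,,"Aldrin, Buzz",Lunar excursion module;Flags--United States;Moonwalk
'''

# Assumed column order, following column_names in the claw_image migration.
COLUMNS = ["digital_id", "title", "description", "subject_person", "subjects"]


def read_rows(text):
    """Parse headerless CSV text into dicts, splitting multi-valued cells on ';'."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        row = dict(zip(COLUMNS, record))
        for key in ("subject_person", "subjects"):
            # Empty cells become empty lists, loosely mirroring skip_on_empty.
            row[key] = row[key].split(";") if row[key] else []
        rows.append(row)
    return rows


rows = read_rows(SAMPLE)
print(rows[1]["subject_person"])  # ['Aldrin, Buzz']
print(rows[1]["subjects"])        # ['Lunar excursion module', 'Flags--United States', 'Moonwalk']
```

Note that the comma inside "Aldrin, Buzz" survives only because the cell is quoted, and that the semicolon split does no whitespace trimming.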
18 changes: 12 additions & 6 deletions migrate_apollo/config/install/migrate_plus.migration.claw_file.yml
@@ -6,20 +6,19 @@ source:
plugin: csv
path: 'data/apollo.csv' # Path relative to Drupal site root
delimiter: ','
header_row_count: 1 # headers, 0 if there are no headers
header_row_count: 0 # 1 with headers, 0 if there are no headers
keys:
- digital_id
constants:
source_base_dir: 'data/images'
collection_alias: 'apollo'
dest_base_dir: 'public://masters'
extension: 'tiff'
column_names:
0:
digital_id: 'Digital ID' # identifier key
digital_id: 'Digital ID' # identifier key and basename of the file
1:
title: 'Title' # Used for title and alt-text
6:
file: 'File'


process:
@@ -31,20 +30,27 @@ process:
plugin: default_value
default_value: image

filename:
plugin: concat
delimiter: '.'
source:
- digital_id
- constants/extension

source_file_path:
plugin: concat
delimiter: /
source:
- constants/source_base_dir
- file
- '@filename'

destination_file_path:
plugin: concat
delimiter: /
source:
- constants/dest_base_dir
- constants/collection_alias
- file
- '@filename'

uri:
plugin: file_copy
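
The path handling in this migration can be sketched in Python as follows. This is only an approximation of how the chained `concat` steps appear to combine values, not Drupal code; the `concat` and `build_paths` helpers are hypothetical, and the constants are copied from the `source` section above.

```python
# Constants copied from the claw_file migration's source section.
CONSTANTS = {
    "source_base_dir": "data/images",
    "collection_alias": "apollo",
    "dest_base_dir": "public://masters",
    "extension": "tiff",
}


def concat(parts, delimiter=""):
    """Join values with a delimiter, loosely like the migrate 'concat' plugin."""
    return delimiter.join(parts)


def build_paths(digital_id):
    """Derive the filename, source path, and destination path for one CSV row."""
    filename = concat([digital_id, CONSTANTS["extension"]], ".")
    source = concat([CONSTANTS["source_base_dir"], filename], "/")
    destination = concat(
        [CONSTANTS["dest_base_dir"], CONSTANTS["collection_alias"], filename], "/"
    )
    return filename, source, destination


print(build_paths("AS11-36-5390"))
# ('AS11-36-5390.tiff', 'data/images/AS11-36-5390.tiff', 'public://masters/apollo/AS11-36-5390.tiff')
```

Deriving the filename from `digital_id` plus a fixed extension is what lets the CSV drop its separate File column, at the cost of assuming every master is a `.tiff`.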
71 changes: 49 additions & 22 deletions migrate_apollo/config/install/migrate_plus.migration.claw_image.yml
@@ -4,7 +4,6 @@ migration_group: test_digital

migration_dependencies:
required:
- claw_media
# Loading authorities first allows us to look them up
- auth_person
- auth_complex
@@ -15,26 +14,23 @@ source:
plugin: csv
path: 'data/apollo.csv' # Path relative to Drupal site root
delimiter: ','
header_row_count: 1 # headers, 0 if there are no headers
header_row_count: 0 # 1 with headers, 0 if there are no headers
keys:
- digital_id
constants:
collection_alias: 'apollo'
image: 'Image'
column_names:
0:
digital_id: 'Digital ID'
1:
title: 'Title'
2:
date: 'Date' # Ignoring for now.
description: 'Description'
3:
location: 'Location' # Ignoring for now.
subject_person: 'Identified Individual'
4:
subjects: 'Subjects'
5:
description: 'Description'
6:
file: 'File'

destination: # We're creating nodes, y'all.
plugin: entity:node
@@ -57,13 +53,39 @@ process:
- constants/collection_alias
- digital_id

# SUBJECTS
# For subjects we can't find in the system, we can't determine by their value
# if they should be persons, corporate, families, or topics. This defaults to
# creating them as subject nodes (topics). We may opt for NOT creating them
# and reporting out the issue at a later time, if that makes sense.
# Type Tags
field_tags:
plugin: entity_lookup
source: constants/image
value_key: name
bundle_key: vid
bundle: tags
entity_type: taxonomy_term
ignore_case: true

# SUBJECTS
# Since subjects can be of multiple content types we need to perform
# lookups for each type, assign them to a temp array, and recombine
# them all before assigning them to the appropriate entity reference field.

temp_subjects_person: # Temporary array of person entity refs
-
plugin: skip_on_empty # Don't bother if there aren't any values
source: subject_person
method: process # Only this field, not the whole CSV row
- # Account for multiple entries in a cell delimited by ;
plugin: explode # Note: no whitespace trimming or quoting support is provided! Be careful with leading or trailing spaces between values in your source data!
delimiter: ';'
-
plugin: entity_generate
value_key: title
bundle_key: type
bundle: person
entity_type: node
default_values:
type: person

field_subjects: # Temporary field of subjects
temp_subjects: # Temporary field of subjects
-
plugin: skip_on_empty
source: subjects
@@ -72,14 +94,19 @@ process:
plugin: explode
delimiter: ';'
-
plugin: entity_generate # Create a subject entity if it doesn't already exist
plugin: entity_generate
value_key: title
bundle: subject
bundle_key: type
entity_type: node
default_values:
type: subject

# Now the TIFF entity references
field_tiff/target_id:
plugin: migration_lookup
migration: claw_media
source: digital_id
no_stub: true
field_tiff/alt: title
field_subjects: # Gather temp arrays into the destination field
-
plugin: get
source:
- '@temp_subjects_person'
- '@temp_subjects'
-
plugin: flatten # an array of arrays to a flat array of entity refs
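
The gather-and-flatten step at the end of this pipeline can be pictured with a small Python sketch. It is an approximation of the behaviour described in the comments above rather than the plugins' actual implementation; the `flatten` helper and the node IDs are hypothetical.

```python
def flatten(values):
    """Collapse nested lists into one flat list, loosely like the 'flatten' plugin."""
    flat = []
    for value in values:
        if isinstance(value, list):
            flat.extend(flatten(value))
        else:
            flat.append(value)
    return flat


# Hypothetical node IDs produced by the two temporary lookups.
temp_subjects_person = [1]   # e.g. "Aldrin, Buzz"
temp_subjects = [3, 4, 5]    # e.g. three topic subjects

# 'get' with multiple sources yields an array of arrays...
gathered = [temp_subjects_person, temp_subjects]
# ...which 'flatten' turns into a single list for field_subjects.
field_subjects = flatten(gathered)
print(field_subjects)  # [1, 3, 4, 5]
```

Keeping the per-type results in temporary fields means each lookup can target a single bundle, and only the final step needs to know that both feed the same entity reference field.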
Original file line number Diff line number Diff line change
@@ -5,14 +5,17 @@ migration_group: test_digital
migration_dependencies:
required:
- claw_file
- claw_image # so we can lookup the value for field_media_of

source:
plugin: csv
path: 'data/apollo.csv' # Path relative to Drupal site root
delimiter: ','
header_row_count: 1 # headers, 0 if there are no headers
header_row_count: 0 # 1 with headers, 0 if there are no headers
keys:
- digital_id
constants:
preservation_master: 'Preservation Master'
column_names:
0:
digital_id: 'Digital ID' # identifier key
@@ -24,19 +27,36 @@ process:
source: digital_id
no_stub: true

field_file/target_id:
# Lookup the Tiff we just migrated
field_media_file/target_id:
plugin: migration_lookup
migration: claw_file
source: digital_id
no_stub: true

field_file/display:
field_media_file/display:
plugin: default_value
default_value: 1
field_file/description:
field_media_file/description:
plugin: default_value
default_value: ''

# Lookup the UNLV_Image we just created
field_media_of:
plugin: migration_lookup
migration: claw_image
source: digital_id
no_stub: true

# Set as Preservation Master
field_tags:
plugin: entity_lookup
source: constants/preservation_master
value_key: name
bundle_key: vid
bundle: tags
entity_type: taxonomy_term
ignore_case: true

destination:
plugin: 'entity:media'
default_bundle: image_tiff
default_bundle: file
3 changes: 0 additions & 3 deletions unlv_image/composer.json
@@ -7,8 +7,5 @@
"keywords": ["Drupal", "Islandora"],
"license": "GPL-2.0+",
"require": {
"islandora/islandora_collection": "dev-8.x-1.x",
"islandora/islandora_image": "dev-8.x-1.x",
"drupal/media_entity_image": "^1.2"
}
}
Original file line number Diff line number Diff line change
@@ -2,19 +2,14 @@ langcode: en
status: true
dependencies:
config:
- core.entity_form_mode.media.inline
- field.field.node.unlv_image.field_creator
- field.field.node.unlv_image.field_description
- field.field.node.unlv_image.field_digital_id
- field.field.node.unlv_image.field_jp2
- field.field.node.unlv_image.field_memberof
- field.field.node.unlv_image.field_member_of
- field.field.node.unlv_image.field_subjects
- field.field.node.unlv_image.field_tiff
- field.field.node.unlv_image.field_tn
- field.field.node.unlv_image.field_web_content
- field.field.node.unlv_image.field_tags
- node.type.unlv_image
module:
- inline_entity_form
- path
enforced:
module:
@@ -55,20 +50,7 @@ content:
size: 60
placeholder: ''
third_party_settings: { }
field_jp2:
weight: 9
settings:
form_mode: inline
label_singular: ''
label_plural: ''
allow_new: true
match_operator: CONTAINS
override_labels: false
allow_existing: false
third_party_settings: { }
type: inline_entity_form_complex
region: content
field_memberof:
field_member_of:
weight: 7
settings:
match_operator: CONTAINS
@@ -86,44 +68,14 @@ content:
third_party_settings: { }
type: entity_reference_autocomplete
region: content
field_tiff:
weight: 8
settings:
form_mode: inline
label_singular: ''
label_plural: ''
allow_new: true
match_operator: CONTAINS
override_labels: false
allow_existing: false
third_party_settings: { }
type: inline_entity_form_complex
region: content
field_tn:
weight: 11
field_tags:
weight: 123
settings:
form_mode: inline
label_singular: ''
label_plural: ''
allow_new: true
match_operator: CONTAINS
override_labels: false
allow_existing: false
third_party_settings: { }
type: inline_entity_form_complex
region: content
field_web_content:
weight: 10
settings:
form_mode: inline
label_singular: ''
label_plural: ''
allow_new: true
match_operator: CONTAINS
override_labels: false
allow_existing: false
size: 60
placeholder: ''
third_party_settings: { }
type: inline_entity_form_complex
type: entity_reference_autocomplete
region: content
path:
type: path