🎁 Introduce .named_derivatives_and_generators_filter
Prior to this commit, IIIF Print assumed that every file of a given
mime-type would use all of the same generators.  However, that is not
necessarily the case.

With this commit:

- Updated documentation based on a read of the generated Yardoc
- Added `DerivativeRodeoService.named_derivatives_and_generators_filter`
- Added a `clone` of attributes
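
As a rough sketch of what a custom filter might look like: the lambda
below takes the same three keyword arguments as the default added in
this commit. The `.reader.pdf` rule, the lambda's name, and the `nil`
stand-in for the file set are hypothetical, chosen only for
illustration. The `IiifPrint::` namespace in the closing comment is an
assumption based on the file path.

```ruby
# Hypothetical rule: skip the thumbnail for any file named "*.reader.pdf".
# The keyword arguments mirror the default lambda added in this commit.
skip_reader_thumbnails = lambda do |file_set:, filename:, named_derivatives_and_generators:|
  next named_derivatives_and_generators unless filename.end_with?(".reader.pdf")

  # Hash#reject returns a new Hash, so the argument itself is left untouched.
  named_derivatives_and_generators.reject { |name, _generator| name == :thumbnail }
end

candidates = { thumbnail: "DerivativeRodeo::Generators::ThumbnailGenerator" }

# file_set is unused by this rule, so nil stands in for a real FileSet here.
filtered = skip_reader_thumbnails.call(
  file_set: nil, filename: "sample.reader.pdf", named_derivatives_and_generators: candidates
)
unchanged = skip_reader_thumbnails.call(
  file_set: nil, filename: "sample.pdf", named_derivatives_and_generators: candidates
)

# In an application the lambda would be assigned to the new class attribute,
# presumably along these lines (namespace assumed, not confirmed here):
#   IiifPrint::DerivativeRodeoService.named_derivatives_and_generators_filter = skip_reader_thumbnails
```

Returning a new hash (rather than mutating the argument) keeps the
filter safe even if a caller passes in an object it still cares about.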

The clone is in place to help ensure that as we apply the filter we
don't accidentally delete the application's configuration for mime
category and expected derivatives.

For example, let's say I have the following nested hash:

```ruby
nested_hash = {
  pdf: {
    thumbnail: "DerivativeRodeo::Generators::ThumbnailGenerator"
  },
  image: {
    thumbnail: "DerivativeRodeo::Generators::ThumbnailGenerator",
    json: "DerivativeRodeo::Generators::WordCoordinatesGenerator",
    xml: "DerivativeRodeo::Generators::AltoGenerator",
    txt: "DerivativeRodeo::Generators::PlainTextGenerator"
  }
}
```

If I then call the following:

```ruby
nested_hash.fetch(:pdf).delete_if { |key, value| key == :thumbnail }
```

and then inspect `nested_hash`, I will see that the `:pdf` entry has been emptied:

```ruby
pp nested_hash

{:pdf=>{},
 :image=>
  {:thumbnail=>"DerivativeRodeo::Generators::ThumbnailGenerator",
   :json=>"DerivativeRodeo::Generators::WordCoordinatesGenerator",
   :xml=>"DerivativeRodeo::Generators::AltoGenerator",
   :txt=>"DerivativeRodeo::Generators::PlainTextGenerator"}}
```

Why? Because `fetch` hands back the very Hash stored in the
configuration, so `delete_if` mutates that configuration in place.
It's possible that Rails's `class_attribute` deep-clones hashes, but
with the explicit clone we no longer depend on that.
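
A minimal sketch of the difference the clone makes, reusing a trimmed
version of the hash above (the `candidates` variable name is only for
illustration):

```ruby
nested_hash = {
  pdf: {
    thumbnail: "DerivativeRodeo::Generators::ThumbnailGenerator"
  }
}

# Hash#clone makes a shallow copy: a new Hash holding the same keys and values.
candidates = nested_hash.fetch(:pdf).clone

# Deleting from the copy leaves the configured hash alone.
candidates.delete_if { |key, _value| key == :thumbnail }

pdf_still_configured = nested_hash.fetch(:pdf).key?(:thumbnail)
```

The copy is shallow, so the generator name strings are still shared
between the two hashes; that is fine here because the filter only adds
or removes keys.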

Related to:

- notch8/adventist-dl#684
- https://github.com/scientist-softserv/adventist-dl/issues/676
jeremyf committed Nov 29, 2023
1 parent cd96a59 commit 3fc7a2e
Showing 3 changed files with 63 additions and 17 deletions.
72 changes: 56 additions & 16 deletions app/services/iiif_print/derivative_rodeo_service.rb
@@ -18,25 +18,31 @@ class DerivativeRodeoService
##
# @!group Class Attributes
#
# @attr parent_work_identifier_property_name [String] the property we use to identify the unique
# identifier of the parent work as it went through the SpaceStone pre-process.
# @!attribute parent_work_identifier_property_name [r|w]
# @return [String] the property we use to identify the unique identifier of the parent work as
# it went through the SpaceStone pre-process.
#
# TODO: The default of :aark_id is a quick hack for adventist. By exposing a configuration
# value, my hope is that this becomes easier to configure.
# @todo The default of :aark_id is a quick hack for adventist. By exposing a configuration
# value, my hope is that this becomes easier to configure.
# @api public
class_attribute :parent_work_identifier_property_name, default: 'aark_id'

##
# @attr preprocessed_location_adapter_name [String] The name of a derivative rodeo storage location;
# this will must be a registered with the DerivativeRodeo::StorageLocations::BaseLocation.
# @!attribute preprocessed_location_adapter_name [r|w]
# @return [String] The name of a derivative rodeo storage location; this must be
# registered with the DerivativeRodeo::StorageLocations::BaseLocation.
# @api public
class_attribute :preprocessed_location_adapter_name, default: 's3'

##
# @attr named_derivatives_and_generators_by_type [Hash<Symbol, #constantize>] the named
# derivative and it's associated generator. The "name" is important for Hyrax or IIIF
# Print implementations. The generator is one that exists in the DerivativeRodeo.
# @!attribute named_derivatives_and_generators_by_type [r|w]
# @return [Hash<Symbol, #constantize>] the named derivative and its associated generator.
# The "name" is important for Hyrax or IIIF Print implementations. The generator is
# one that exists in the DerivativeRodeo.
#
# TODO: Could be nice to have a registry for the DerivativeRodeo::Generators; but that's a
# tomorrow wish.
# @todo Could be nice to have a registry for the DerivativeRodeo::Generators; but that's a
# tomorrow wish.
# @api public
class_attribute(:named_derivatives_and_generators_by_type, default: {
pdf: {
thumbnail: "DerivativeRodeo::Generators::ThumbnailGenerator"
@@ -48,18 +54,46 @@ class DerivativeRodeoService
txt: "DerivativeRodeo::Generators::PlainTextGenerator"
}
})

##
# @!attribute named_derivatives_and_generators_filter [r|w]
# @return [#call] with three named parameters: :file_set, :filename, :named_derivatives_and_generators
#
# - :file_set is a {FileSet}
# - :filename is a String
# - :named_derivatives_and_generators is an entry from
# {.named_derivatives_and_generators_by_type} as pulled from
# {#named_derivatives_and_generators}
#
# The lambda is responsible for filtering any named generators that should or should not
# be run. It should return a data structure similar to the provided
# :named_derivatives_and_generators
#
# @see .named_derivatives_and_generators_by_type
# @see #named_derivatives_and_generators
# @api public
# rubocop:disable Lint/UnusedBlockArgument
class_attribute(:named_derivatives_and_generators_filter,
default: ->(file_set:, filename:, named_derivatives_and_generators:) { named_derivatives_and_generators })

# rubocop:enable Lint/UnusedBlockArgument
# @!endgroup Class Attributes
##

##
# @see .named_derivatives_and_generators_by_type
#
# @return [Hash<Symbol,String>] The named derivative types and their corresponding generators.
# @raise [IiifPrint::UnexpectedMimeTypeError] when the {#file_set}'s {#mime_type} is not one
# that is part of {.named_derivatives_and_generators_by_type}
def named_derivatives_and_generators
@named_derivatives_and_generators ||=
if file_set.class.pdf_mime_types.include?(mime_type)
named_derivatives_and_generators_by_type.fetch(:pdf)
named_derivatives_and_generators_by_type.fetch(:pdf).clone
elsif file_set.class.image_mime_types.include?(mime_type)
named_derivatives_and_generators_by_type.fetch(:image)
named_derivatives_and_generators_by_type.fetch(:image).clone
else
raise "Unexpected mime_type #{mime_type} for #{file_set.class} ID=#{file_set.id.inspect}"
raise UnexpectedMimeTypeError.new(file_set: file_set, mime_type: mime_type)
end
end

@@ -194,9 +228,15 @@ def valid?
# @note We write derivatives to the {#absolute_derivative_path_for} and should likewise clean
# them up when deleted.
# @see #cleanup_derivatives
#
# @param filename [String]
#
# @see .named_derivatives_and_generators_filter
# @see #named_derivatives_and_generators
def create_derivatives(filename)
# TODO: Do we need to handle "impending derivatives?" as per {IiifPrint::PluggableDerivativeService}?
named_derivatives_and_generators.flat_map do |named_derivative, generator_name|
named_derivatives_and_generators_filter
.call(file_set: file_set, filename: filename, named_derivatives_and_generators: named_derivatives_and_generators)
.flat_map do |named_derivative, generator_name|
lasso_up_some_derivatives(
named_derivative: named_derivative,
generator_name: generator_name,
6 changes: 6 additions & 0 deletions lib/iiif_print/errors.rb
@@ -18,4 +18,10 @@ def initialize(file_set:, work:)
super(message)
end
end

class UnexpectedMimeTypeError < IiifPrintError
def initialize(file_set:, mime_type:)
super "Unexpected mime_type #{mime_type} for #{file_set.class} ID=#{file_set.id.inspect}"
end
end
end
2 changes: 1 addition & 1 deletion lib/iiif_print/split_pdfs/derivative_rodeo_splitter.rb
@@ -71,7 +71,7 @@ def initialize(filename, file_set:, output_tmp_dir: Dir.tmpdir)
# bucket that we then use for IIIF Print.
#
# @note The preprocessed_location_template should end in `.pdf`. The
# {DerivativeRodeo::BaseGenerator::PdfSplitGenerator#derive_preprocessed_template_from}
# DerivativeRodeo::BaseGenerator::PdfSplitGenerator#derive_preprocessed_template_from
# will coerce the template into one that represents the split pages.
#
# @return [String]