SentencePieceTokenizer inside a keras.models.Model fails to be reconstructed during keras.saving.load_model() #1522
Labels: type:Bug
Describe the bug

When a SentencePieceTokenizer is integrated into a model using the functional API, a.k.a. keras.models.Model, it cannot be properly reconstructed from a saved model.keras file. While untested, I would expect the same behavior from any other custom Keras object that relies on load_assets() before it can compute an output spec for a given input tensor.

To Reproduce
https://colab.research.google.com/drive/1XMNYLQrJo25_BkIv8GT02bMJZMjw5RoC?usp=sharing
Refer to cell no. 6
Expected behavior

Proper reconstruction of SentencePieceTokenizer.

Additional context
When keras.saving.load_model() is called on a saved functional model, the model is reconstructed by running a KerasTensor through it. Because this happens before the vocabulary is loaded via SentencePieceTokenizer.load_assets(), an error is raised when the tokenizer is encountered in the model. The above functionality can be found in keras.saving.saving_lib, where _load_state() is responsible for calling load_assets() on L178, later than deserialize_keras_object() on L155.

Would you like to help us fix it?
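The load order described above can be sketched as a minimal, dependency-free toy. All names below are illustrative stand-ins, not real Keras internals:

```python
# Toy model of the load order in keras.saving.saving_lib._load_state().
# All names here are hypothetical stand-ins, not real Keras internals.

class TokenizerStandIn:
    """Needs a vocabulary before it can process (even symbolic) inputs."""

    def __init__(self):
        self.vocab = None  # populated later by load_assets()

    def __call__(self, inputs):
        if self.vocab is None:
            raise ValueError("Vocabulary not set: call load_assets() first.")
        return [self.vocab.index(tok) for tok in inputs]

    def load_assets(self, vocab):
        self.vocab = vocab


def load_model_stand_in(tokenizer, inputs, assets):
    # Step 1 (cf. deserialize_keras_object, L155): rebuild the graph by
    # running inputs through each layer -- the tokenizer is hit here.
    outputs = tokenizer(inputs)
    # Step 2 (cf. load_assets, L178): only now are assets restored,
    # which is too late for step 1.
    tokenizer.load_assets(assets)
    return outputs


try:
    load_model_stand_in(TokenizerStandIn(), ["hello", "world"], ["hello", "world"])
except ValueError as e:
    print(e)  # Vocabulary not set: call load_assets() first.
```

The point of the sketch is only the ordering: the graph-rebuild step runs the tokenizer before its assets exist, so it cannot succeed regardless of what the saved file contains.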
Defining SentencePieceTokenizer.compute_output_spec() seems to be sufficient to construct the model graph, allowing the loading function to continue to _load_state(). Cell no. 3 in the colab notebook is a working example.
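A dependency-free sketch of why this fix works, again with hypothetical stand-in names rather than the real keras_nlp API: a compute_output_spec() that describes the output from shape information alone can be called safely before the vocabulary is loaded, so graph reconstruction no longer needs the assets.

```python
# Hypothetical sketch of the proposed fix -- stand-in names, not the
# actual keras_nlp implementation.

class TokenizerWithSpec:
    def __init__(self):
        self.vocab = None  # restored later by load_assets()

    def compute_output_spec(self, input_spec):
        # The output spec depends only on the input shape and dtype,
        # not on the vocabulary, so calling this before load_assets()
        # is safe -- which is exactly what graph reconstruction needs.
        return {"shape": input_spec["shape"], "dtype": "int32"}

    def load_assets(self, vocab):
        self.vocab = vocab


tok = TokenizerWithSpec()
# Graph reconstruction: succeeds even though no vocabulary is loaded yet.
spec = tok.compute_output_spec({"shape": (None,), "dtype": "string"})
print(spec["dtype"])  # int32
# Assets are restored afterwards, as in _load_state().
tok.load_assets(["hello", "world"])
```

Under this assumption, the loading function can trace the tokenizer symbolically during deserialization and still fill in the vocabulary later, matching the order of calls in _load_state().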