v0.15.0
Summary
📢 KerasNLP is becoming KerasHub 📢, read more about it here.
This release contains a number of feature improvements:
- Added int8 quantization support.
  - Use the `quantize()` method to quantize any model.
  - Llama 2 and Llama 3 pre-quantized presets are available.
- PaliGemmaCausalLM will automatically resize input images during preprocessing (see the sketch after the code examples below).
- Added more converters for huggingface/transformers checkpoints.
  - Gemma 2, PaliGemma, GPT2, Bert, Albert, DistilBert, Bart.
- Class detection for huggingface/transformers checkpoints.
  - Call `from_preset()` on a base class, and we will find the correct subclass to create.
- Added Vicuna presets.
- Alias `Classifier` as `TextClassifier`, `BertClassifier` as `BertTextClassifier`.
- Added `tokenizer.special_tokens` and `tokenizer.special_token_ids` as convenient properties to view all special tokens on a pretrained tokenizer.
```python
# Quantize an unquantized model.
lm = keras_nlp.models.CausalLM.from_preset(
    "gemma2_instruct_2b_en",
    dtype="bfloat16",
)
lm.quantize("int8")

# Load a pre-quantized model.
lm = keras_nlp.models.CausalLM.from_preset(
    "llama3_instruct_8b_en_int8",
    dtype="bfloat16",
)

# Convert a bert model in the huggingface/transformers format.
classifier = keras_nlp.models.TextClassifier.from_preset(
    "hf://google-bert/bert-base-uncased",
    num_classes=2,
)

# View all special tokens.
print(classifier.preprocessor.tokenizer.special_tokens)
print(classifier.preprocessor.tokenizer.special_token_ids)
```
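The automatic image resizing in PaliGemmaCausalLM can be exercised by passing an image whose resolution does not match the backbone. A minimal sketch, assuming the `pali_gemma_3b_mix_224` preset and a randomly generated image; the preset id and prompt are illustrative, not prescriptive:

```python
import numpy as np
import keras_nlp

# Load a PaliGemma model trained at 224x224 resolution.
pali_gemma_lm = keras_nlp.models.PaliGemmaCausalLM.from_preset(
    "pali_gemma_3b_mix_224"
)

# An image that does not match the model's native resolution; the
# preprocessor is expected to resize it automatically.
image = np.random.uniform(0, 255, size=(512, 512, 3)).astype("float32")

output = pali_gemma_lm.generate(
    inputs={
        "images": image,
        "prompts": "describe the image\n",
    }
)
print(output)
```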
Breaking changes
- On all backends, all string and ragged output will be returned as Python strings or Python lists respectively.
  - This includes preprocessing methods like `tokenize()` and `detokenize()` (see the sketch after this list).
  - This may break code that depended on `tf.Tensor` output on the `tensorflow` backend, but will lead to consistent output on all backends, which we believe will be an overall improvement.
  - Preprocessing layers can still always be included in a `tf.data` preprocessing pipeline, on any backend.
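A minimal sketch of the new behavior, assuming a pretrained GPT-2 tokenizer loaded from the `gpt2_base_en` preset; the preset choice is illustrative, the point is the plain Python return types:

```python
import keras_nlp

tokenizer = keras_nlp.models.GPT2Tokenizer.from_preset("gpt2_base_en")

# tokenize() now returns a plain Python list of token ids on every
# backend, rather than a tf.Tensor on the tensorflow backend.
token_ids = tokenizer.tokenize("The quick brown fox.")
print(type(token_ids), token_ids)

# detokenize() returns a Python string.
print(tokenizer.detokenize(token_ids))
```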
What's Changed
- Version bump to 0.14.0.dev0 by @grasskin in #1675
- Revert "Version bump to 0.14.0.dev0" by @grasskin in #1676
- Remove Keras pin, fix tests by @mattdangerw in #1681
- Add quantization support for `Gemma`, `Gemma2` and `PaliGemma` by @james77777778 in #1670
- add vicuna preset by @sineeli in #1672
- Porting Gemma 2 transformers checkpoint by @ariG23498 in #1678
- Improve CI speed and resolve issues of `run_quantization_check` by @james77777778 in #1682
- Remove build_from_signature from MHA layers by @mattdangerw in #1687
- Refactoring: in CachedMultiHeadAttention call MHA methods instead of recoding the attention calculation by @apehex in #1684
- Porting PaliGemma transformers checkpoint by @ariG23498 in #1686
- Allow importing keras_nlp without tensorflow by @mattdangerw in #1660
- Add flag to gemma conversion script to specify local orbax by @mattdangerw in #1688
- Fix compatibility for earlier versions of Keras by @james77777778 in #1690
- Add a test against keras-nightly by @mattdangerw in #1693
- Fix dtype bugs in `ReversibleEmbedding` and `LayerNorm` by @james77777778 in #1692
- Partially revert #1687 by @mattdangerw in #1695
- Fix quantization test for `XLNet` by @james77777778 in #1699
- Add a HF BERT converter, improve safetensor loading by @mattdangerw in #1694
- Add a subtle fix for gemma 2 conversions by @mattdangerw in #1701
- One more small Gemma conversion fix by @mattdangerw in #1702
- Slightly more defensive handling of type for backbone by @mattdangerw in #1703
- Add support for converting Gemma 2 checkpoints by @mattdangerw in #1700
- Make it clearer what is running in the github action UI by @mattdangerw in #1707
- Try upgrading tensorflow pin by @mattdangerw in #1706
- Bump version to fix query norm in Gemma 2 9b by @mattdangerw in #1709
- Gemma: Add logit soft-capping to score function. by @RyanMullins in #1712
- Version bump HEAD to 0.15 by @mattdangerw in #1713
- Port gpt2 transformers checkpoint by @cosmo3769 in #1704
- Add soft capping to reversible embedding layer by @mattdangerw in #1718
- Add presets for gemma 2 2b by @mattdangerw in #1721
- Utilize `to_numpy=True` in `quantize` if available by @james77777778 in #1725
- Dynamic int8 quantization for Llama2 and Llama3 by @james77777778 in #1720
- Bump the python group with 2 updates by @dependabot in #1726
- Shield gemma shortnames by @mattdangerw in #1731
- Sliding window fixes by @mattdangerw in #1738
- Add int8 models to Llama2 and Llama3 by @james77777778 in #1734
- Port distilbert transformer checkpoint by @cosmo3769 in #1736
- Add support of `kwargs` to `Backbone.from_preset` and fix the dtype forwarding in `Task.from_preset` by @james77777778 in #1742
- Remove src init file contents by @mattdangerw in #1743
- Remove ROADMAP.md by @mattdangerw in #1773
- Fix nested list in args on keras.io by @mattdangerw in #1772
- Remove stale tf only examples by @mattdangerw in #1771
- Limit the default sequence length to 1024 for all models by @mattdangerw in #1770
- Consistent preprocessing output on all backends by @mattdangerw in #1777
- Port albert transformer checkpoint by @cosmo3769 in #1767
- Lower the default learning rate for albert by @mattdangerw in #1786
- Port bart transformer checkpoint by @cosmo3769 in #1783
- Add an option to disable default compilation by @mattdangerw in #1787
- Port mistral transformer checkpoint by @cosmo3769 in #1768
- [Bart]Fix missing weight port by @cosmo3769 in #1789
- Remove python 3.8 version in setup.py by @mattdangerw in #1792
- Class detection works for huggingface checkpoints by @mattdangerw in #1800
- Rename KerasNLP symbols for a multi-modal future by @mattdangerw in #1803
- Move preprocessing to base classes by @mattdangerw in #1807
- Add `add_bos=False, add_eos=False` to SentencePieceTokenizer.__init__() by @briango28 in #1811
- Only load a full task config when `load_task_extras` is passed by @mattdangerw in #1812
- Add image and audio converter classes by @mattdangerw in #1813
- Simplify registering "built-in" presets by @mattdangerw in #1818
- Support image and audio information in task summaries by @mattdangerw in #1819
- Take two of #1812, simpler classifier head loading by @mattdangerw in #1823
- Remove preprocessing layers we no longer use by @mattdangerw in #1824
- Version bump for dev release by @mattdangerw in #1825
- Version bump for dev release by @mattdangerw in #1830
- Version bump for 0.15.0 release by @mattdangerw in #1832
New Contributors
- @apehex made their first contribution in #1684
- @cosmo3769 made their first contribution in #1704
Full Changelog: v0.14.4...v0.15.0