Releases: allenai/allennlp
v2.10.1
What's new
Fixed ✅
- Updated dependencies
Commits
c51707e Add a shout to allennlp-light to the README
928df39 Be flexible about rich (#5719)
d5f8e0c Update torch requirement from <1.12.0,>=1.10.0 to >=1.10.0,<1.13.0 (#5680)
9f879b0 Add flair as an alternative (#5712)
b2eb036 Allowed transitions (#5706)
c6b248f Relax requirements on protobuf to allow for minor changes and patches. (#5694)
8571d93 Add a list of alternatives for people to try (#5691)
v2.10.0
What's new
Added 🎉
- Added metric `FBetaVerboseMeasure`, which extends `FBetaMeasure` to ensure compatibility with logging plugins and adds some options. (See the usage sketch after this list.)
- Added three sample weighting techniques to `ConditionalRandomField` by supplying three new subclasses: `ConditionalRandomFieldWeightEmission`, `ConditionalRandomFieldWeightTrans`, and `ConditionalRandomFieldWeightLannoy`.
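The release notes describe `FBetaVerboseMeasure` as a drop-in extension of `FBetaMeasure`, so the standard AllenNLP metric call pattern should apply. Below is a minimal sketch using the base `FBetaMeasure` class; the extra options of the new metric are not shown because they are not detailed here.

```python
# Minimal sketch of the standard AllenNLP Metric call pattern (shown with the
# base FBetaMeasure; FBetaVerboseMeasure is described as an extension of it).
import torch
from allennlp.training.metrics import FBetaMeasure

metric = FBetaMeasure(beta=1.0, average="macro")
predictions = torch.rand(4, 3)            # (batch_size, num_classes) class scores
gold_labels = torch.tensor([0, 2, 1, 0])  # (batch_size,) gold class indices
metric(predictions, gold_labels)
print(metric.get_metric(reset=True))      # dict with precision / recall / fscore
```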
Fixed ✅
- Fixed an error from the `cached-path` version update.
Commits
b7f1cb4 Prepare for release v2.10.0
5a3acba Implementation of Weighted CRF Tagger (handling unbalanced datasets) (#5676)
20df7cd Disable dependabot, add notice of future sunsetting (#5685)
7bcbb5a Update transformers requirement from <4.20,>=4.1 to >=4.1,<4.21 (#5675)
0770b00 Bump black from 22.3.0 to 22.6.0 (#5679)
38b9a9e Update transformers requirement from <4.19,>=4.1 to >=4.1,<4.20 (#5636)
ed89a2e Update twine requirement from <4.0.0,>=1.11.0 to >=1.11.0,<5.0.0 (#5669)
5ace70e Bump actions/setup-python from 2 to 4 (#5662)
bd6b060 Update mkdocs-material requirement from <8.3.0,>=5.5.0 to >=5.5.0,<8.4.0 (#5655)
55663be Bump actions/checkout from 2 to 3 (#5647)
fb92b76 Bump actions/upload-artifact from 1 to 3 (#5644)
775919a Bump actions/cache from 2 to 3 (#5645)
f71eca9 Bump webfactory/ssh-agent from 0.4.1 to 0.5.4 (#5643)
62d413d Bump codecov/codecov-action from 1 to 3 (#5642)
428cb7d Bump actions/download-artifact from 1 to 3 (#5641)
eac4829 Bump mypy from 0.960 to 0.961 (#5658)
df9d7ca Make saliency interpreter GPU compatible (#5656)
ea4a53c Update cached_path version (#5665)
a6271a3 FBetaMeasure metric with one value per key (#5638)
8b5ccc4 Bump mypy from 0.950 to 0.960 (#5639)
39b3c96 Dependabot GitHub Actions (#5640)
60fae31 Update filelock requirement from <3.7,>=3.3 to >=3.3,<3.8 (#5635)
67f32d3 Update spacy requirement from <3.3,>=2.1.0 to >=2.1.0,<3.4 (#5631)
2fd8dfa Bump mypy from 0.942 to 0.950 (#5630)
d7409d2 Missing `f` prefix on f-strings fix (#5629)
v2.9.3
What's new
Added 🎉
- Added `verification_tokens` argument to `TestPretrainedTransformerTokenizer`.
Fixed ✅
- Updated various dependencies
Commits
0c4983a Prepare for release v2.9.3
426d894 Docspec2 (#5618)
1be8855 Add `custom_dummy_tokens` to `PretrainedTransformerTokenizer` (#5608)
d21854b Update transformers requirement from <4.18,>=4.1 to >=4.1,<4.19 (#5617)
8684412 Bump mkdocs from 1.2.3 to 1.3.0 (#5609)
24e48a3 Bump docspec-python from 1.2.0 to 1.3.0 (#5611)
edafff1 Bump docspec from 1.2.0 to 1.3.0 (#5610)
b66e5f8 Bump black from 22.1.0 to 22.3.0 (#5613)
089d03c Bump mypy from 0.941 to 0.942 (#5606)
dfb438a Bump mypy from 0.931 to 0.941 (#5599)
v2.9.2
What's new
Fixed ✅
- Removed unnecessary dependencies
- Restored functionality of the CLI in the absence of the now-optional checklist package
Commits
f6866f9 Fix CLI and install instructions in case optional checklists is not present (#5589)
e1c6935 Update torch requirement from <1.11.0,>=1.6.0 to >=1.6.0,<1.12.0 (#5595)
5f5f8c3 Updated the docs for `PytorchSeq2VecWrapper` to specify that `mask` is required (#5386)
2426ce3 Dependencies (#5593)
2d9fe79 Bump fairscale from 0.4.5 to 0.4.6 (#5590)
ab37da7 Update transformers requirement from <4.17,>=4.1 to >=4.1,<4.18 (#5583)
v2.9.1
What's new
Fixed ✅
- Updated dependencies, especially around doc creation.
- Running the test suite out-of-tree (e.g. after installation) is now possible by pointing the environment variable `ALLENNLP_SRC_DIR` to the sources.
- Silenced a warning that happens when you inappropriately clone a tensor.
- Added more clarification to the `Vocabulary` documentation around `min_pretrained_embeddings` and `only_include_pretrained_words`.
- Fixed a bug with a type mismatch caused by the latest release of `cached-path`, which now returns a `Path` instead of a `str`.
Added 🎉
- We can now transparently read compressed input files during prediction.
- LZMA compression is now supported.
- Added a way to give JSON blobs as input to dataset readers in the `evaluate` command.
- Added the argument `sub_module` in `PretrainedTransformerMismatchedEmbedder` (see the sketch below).
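A hedged sketch of the new `sub_module` argument: the release notes only name the argument, so the model name and the `"encoder"` value below are illustrative assumptions (the typical use case is embedding with just the encoder of an encoder-decoder model).

```python
# Sketch: select only one sub-module of the wrapped transformer via sub_module.
# "facebook/bart-base" is just an illustrative encoder-decoder model choice.
from allennlp.modules.token_embedders import PretrainedTransformerMismatchedEmbedder

embedder = PretrainedTransformerMismatchedEmbedder(
    model_name="facebook/bart-base",
    sub_module="encoder",  # new argument: use only this sub-module of the transformer
)
```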
Changed ⚠️
- You can automatically include all words from a pretrained file when building a vocabulary by setting the value in `min_pretrained_embeddings` to `-1` for that particular namespace (see the config sketch below).
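A hedged sketch of the relevant `vocabulary` section of a training config. The embedding file path is a placeholder; the surrounding keys follow the usual `Vocabulary.from_instances` parameters.

```python
# Sketch: include every word from the pretrained file for the "tokens" namespace.
vocabulary = {
    "type": "from_instances",
    "pretrained_files": {"tokens": "/path/to/glove.txt"},  # placeholder path
    "min_pretrained_embeddings": {"tokens": -1},  # -1: include all words from the file
    "only_include_pretrained_words": False,
}
```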
Commits
3547bfb pin cached-path tighter, make sure our cached-path wrapper still returns `str` (#5587)
99c9343 Clarify Vocabulary documentation, add -1 option for `min_pretrained_embeddings` (#5581)
3fa5193 Makes the `evaluate` command work for the multitask case (Second Edition) (#5579)
9f03803 Add "sub_module" argument in PretrainedTransformerMismatchedEmbedder (#5580)
92e54cc Open Compressed (#5578)
5b3352c Clone warns (#5575)
9da4b0f Add Wassterstein Distance calculation option for fairness metrics (#5546)
b8f92f0 Update mkdocs-material requirement from <8.2.0,>=5.5.0 to >=5.5.0,<8.3.0 (#5572)
a21c0b4 Update filelock requirement from <3.5,>=3.3 to >=3.3,<3.7 (#5571)
6614077 Make tests runnable out-of-tree for help with conda-packaging (#5560)
e679213 Fix CITATION.cff and add automatic validation of your citation metadata (#5561)
efa9f1d try to unpin nltk (#5563)
d01179b Small typo fix (#5555)
3c2299a tighten test_sampled_equals_unsampled_when_biased_against_non_sampled_positions bound (#5549)
e463084 Bump black from 21.12b0 to 22.1.0 (#5554)
8226e87 Making checklist optional (#5507)
a76bf1e Update transformers requirement from <4.16,>=4.1 to >=4.1,<4.17 (#5553)
v2.9.0
What's new
Added 🎉
- Added an `Evaluator` class to make comparing source, target, and predictions easier.
- Added a way to resize the vocabulary in the T5 module.
- Added an argument `reinit_modules` to `cached_transformers.get()` that allows you to re-initialize the pretrained weights of a transformer model using layer indices or regex strings (see the sketch below).
- Added attribute `_should_validate_this_epoch` to `GradientDescentTrainer` that controls whether validation is run at the end of each epoch.
- Added `ShouldValidateCallback` that can be used to configure the frequency of validation during training.
- Added a `MaxPoolingSpanExtractor`. This `SpanExtractor` represents each span by a component-wise max-pooling operation.
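A hedged sketch of the new `reinit_modules` argument. The notes say it accepts layer indices or regex strings; the tuple-of-indices form shown here is an assumption about how indices are passed.

```python
# Sketch: re-initialize the pretrained weights of the top two transformer layers.
from allennlp.common import cached_transformers

model = cached_transformers.get(
    "bert-base-uncased",
    make_copy=False,
    reinit_modules=(10, 11),  # assumed format: layer indices to re-initialize
)
```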
Fixed ✅
- Fixed the docstring information for the `FBetaMultiLabelMeasure` metric.
- Various fixes for Python 3.9.
- Fixed the name that the `push-to-hf` command uses to store weights.
- `FBetaMultiLabelMeasure` now works with multiple dimensions.
- Support for inferior operating systems when making hardlinks.
- Use `,` as a separator for filenames in the `evaluate` command, thus allowing for URLs (e.g. `gs://...`) as input files.
- Removed a spurious error message, "'torch.cuda' has no attribute '_check_driver'", that would appear in the logs when a `ConfigurationError` for missing GPU was raised.
- Load model on CPU post training to save GPU memory.
- Fixed a bug in `ShouldValidateCallback` that leads to validation occurring after the first epoch regardless of the `validation_start` value.
- Fixed a bug in `ShouldValidateCallback` that leads to validation occurring every `validation_interval + 1` epochs, instead of every `validation_interval` epochs.
- Fixed a bug in `ShouldValidateCallback` that leads to validation never occurring at the end of training.
Removed 👋
- Removed Tango components, since they now live at https://github.com/allenai/tango.
- Removed dependency on the `overrides` package.
Commits
dd5a010 Evaluator (#5445)
0b54fb0 Bump fairscale from 0.4.4 to 0.4.5 (#5545)
2deacfe Fix should validate callback train end (#5542)
2cdb874 Bump mypy from 0.910 to 0.931 (#5538)
a91946a Keep NLTK down. They broke the download of omw. (#5540)
73a5cfc Removes stuff that now lives in the tango repo (#5482)
1278f16 Move changes from #5534 to correct place. (#5535)
a711703 Fix ShouldValidateCallback (#5536)
b0b3ad4 Update mkdocs-material requirement from <8.1.0,>=5.5.0 to >=5.5.0,<8.2.0 (#5503)
a3d7125 Max out span extractor (#5520)
515fe9b Configure validation frequency (#5534)
d7e0c87 Update transformers requirement from <4.15,>=4.1 to >=4.1,<4.16 (#5528)
4233247 Bump fairscale from 0.4.3 to 0.4.4 (#5525)
71f2d79 fix 'check_for_gpu' (#5522)
06ec7f9 Reinit layers of pretrained transformer in cached_transformers.get() (#5505)
ec1fb69 add missing nltk download in CI (#5529)
ab4f7b5 Fix model loading on GPU post training (#5518)
3552842 Fix moving average args not rendering properly in docs (#5516)
87ad006 Update transformers requirement from <4.13,>=4.1 to >=4.1,<4.15 (#5515)
39f4f4c tick version for nightly releases
38436d8 Use comma as filename separator (#5506)
e0ee7f4 Dimensions in `FBetaMultiLabelMeasure` (#5501)
d77ba3d Hardlink or copy (#5502)
dbcbcf1 Add installation instructions through conda-forge (#5498)
ebad9ee Bump black from 21.11b1 to 21.12b0 (#5496)
82b1f4f Use the correct filename when uploading models to the HF Hub (#5499)
19f6c8f Resize T5 Vocab (#5497)
c557d51 enforce reading in utf-8 encoding (#5476)
1caf0da Removes dependency on the overrides package (#5490)
b99376f Python 3.9 (#5489)
666eaa5 Update mkdocs-material requirement from <7.4.0,>=5.5.0 to >=5.5.0,<8.1.0 (#5486)
64b2c07 Bump fairscale from 0.4.2 to 0.4.3 (#5474)
0a794c6 Fix metric docstring (#5475)
f86ff9f Bump black from 21.10b0 to 21.11b1 (#5473)
a7f6cdf update cached-path (#5477)
844acfa Update filelock requirement from <3.4,>=3.3 to >=3.3,<3.5 (#5469)
05fc7f6 Bump fairscale from 0.4.0 to 0.4.2 (#5461)
923dbde Bump black from 21.9b0 to 21.10b0 (#5453)
09e22aa Update spacy requirement from <3.2,>=2.1.0 to >=2.1.0,<3.3 (#5460)
54b92ae HF now raises ValueError (#5464)
v2.8.0
What's new
Added 🎉
- Added support to push models directly to the Hugging Face Hub with the command `allennlp push-to-hf`.
- More default tests for the `TextualEntailmentSuite`.
Changed ⚠️
- The behavior of `--overrides` has changed. Previously, the final configuration params were simply taken as the union of the original params and the `--overrides` params. Now you can use `--overrides` to completely replace any part of the original config. For example, passing `--overrides '{"model":{"type":"foo"}}'` will completely replace the "model" part of the original config. However, when you just want to change a single field in the JSON structure without removing / replacing adjacent fields, you can still use the "dot" syntax. For example, `--overrides '{"model.num_layers":3}'` will only change the `num_layers` parameter of the "model" part of the config, leaving everything else unchanged.
- Integrated the `cached_path` library to replace existing functionality in `common.file_utils`. This introduces some improvements without any breaking changes.
Fixed ✅
- Fixed the implementation of `PairedPCABiasDirection` in `allennlp.fairness.bias_direction`, where the difference vectors should not be centered when performing the PCA.
Commits
7213d52 Update transformers requirement from <4.12,>=4.1 to >=4.1,<4.13 (#5452)
1b02227 bug fix (#5447)
0d8c0fc Update torch requirement from <1.10.0,>=1.6.0 to >=1.6.0,<1.11.0 (#5442)
0c79807 Checklist update (#5438)
ebd6b5b integrate cached_path (#5418)
dcd8d9e Update mkdocs-material requirement from <7.3.0,>=5.5.0 to >=5.5.0,<7.4.0 (#5419)
362349b Registrable _to_params default functionality (#5403)
17ef1aa fix a bug when using fp16 training & gradient clipping (#5426)
a63e28c Update transformers requirement from <4.11,>=4.1 to >=4.1,<4.12 (#5422)
603552f Add utility function and command to push models to 🤗 Hub (#5370)
e5d332a Update filelock requirement from <3.1,>=3.0 to >=3.0,<3.2 (#5421)
44155ac Make `--overrides` more flexible (#5399)
43fd982 Fix PairedPCABiasDirection (#5396)
7785068 Bump black from 21.7b0 to 21.9b0 (#5408)
a09d057 Update transformers requirement from <4.10,>=4.1 to >=4.1,<4.11 (#5393)
527e43d require Python>=3.7 (#5400)
5338bd8 Add scaling to tqdm bar when downloading files (#5397)
v2.7.0
What's new
Added 🎉
- Added support to evaluate multiple datasets and produce corresponding output files in the `evaluate` command.
- Added more documentation to the learning rate schedulers to include a sample config object for how to use them.
- Moved the PyTorch learning rate scheduler wrappers to their own file called `pytorch_lr_schedulers.py` so that they will have their own documentation page.
- Added a module `allennlp.nn.parallel` with a new base class, `DdpAccelerator`, which generalizes PyTorch's `DistributedDataParallel` wrapper to support other implementations. Two implementations of this class are provided. The default is `TorchDdpAccelerator` (registered as "torch"), which is just a thin wrapper around `DistributedDataParallel`. The other is `FairScaleFsdpAccelerator`, which wraps FairScale's `FullyShardedDataParallel`. You can specify the `DdpAccelerator` in the "distributed" section of a configuration file under the key "ddp_accelerator" (see the config sketch below).
- Added a module `allennlp.nn.checkpoint` with a new base class, `CheckpointWrapper`, for implementations of activation/gradient checkpointing. Two implementations are provided. The default implementation is `TorchCheckpointWrapper` (registered as "torch"), which exposes PyTorch's checkpoint functionality. The other is `FairScaleCheckpointWrapper`, which exposes the more flexible checkpointing functionality from FairScale.
- The `Model` base class now takes a `ddp_accelerator` parameter (an instance of `DdpAccelerator`) which will be available as `self.ddp_accelerator` during distributed training. This is useful when, for example, instantiating submodules in your model's `__init__()` method by wrapping them with `self.ddp_accelerator.wrap_module()`. See `allennlp.modules.transformer.t5` for an example.
- We now log batch metrics to tensorboard and wandb.
- Added Tango components, to be explored in detail in a later post.
- Added `ScaledDotProductMatrixAttention`, and converted the transformer toolkit to use it.
- Added tests to ensure that all `Attention` and `MatrixAttention` implementations are interchangeable.
- Added a way for AllenNLP Tango to read and write datasets lazily.
- Added a way to remix datasets flexibly.
- Added `from_pretrained_transformer_and_instances` constructor to `Vocabulary`.
- `TransformerTextField` now supports `__len__`.
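A hedged sketch of the "distributed" config section with the new "ddp_accelerator" key. "torch" is the default accelerator named above; the registered name of the FairScale FSDP accelerator is not given here, so it is only mentioned in a comment.

```python
# Sketch: distributed section of a training config selecting a DdpAccelerator.
distributed = {
    "cuda_devices": [0, 1],
    "ddp_accelerator": {
        "type": "torch",  # default; swap for the FairScale FSDP accelerator to shard params
    },
}
```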
Fixed ✅
- Fixed a bug in `ConditionalRandomField`: `transitions` and `tag_sequence` tensors were not initialized on the desired device, causing high CPU usage (see #2884).
- Fixed a misspelling: the parameter `contructor_extras` in `Lazy()` is now correctly called `constructor_extras`.
- Fixed broken links in `allennlp.nn.initializers` docs.
- Fixed a bug in `BeamSearch` where `last_backpointers` was not being passed to any `Constraint`s.
- `TransformerTextField` can now take tensors of shape `(1, n)` like the tensors produced from a HuggingFace tokenizer.
- `tqdm` lock is now set inside `MultiProcessDataLoading` when new workers are spawned to avoid contention when writing output.
- `ConfigurationError` is now pickleable.
- Checkpointer cleaning was fixed to work on Windows paths.
- Multitask models now support `TextFieldTensor` in heads, not just in the backbone.
- Fixed the signature of `ScaledDotProductAttention` to match the other `Attention` classes.
- `allennlp` commands will now catch `SIGTERM` signals and handle them similar to `SIGINT` (keyboard interrupt).
- The `MultiProcessDataLoader` will properly shut down its workers when a `SIGTERM` is received.
- Fixed the way names are applied to Tango `Step` instances.
- Fixed a bug in calculating loss in the distributed setting.
- Fixed a bug when extending a sparse sequence by 0 items.
Changed ⚠️
- The type of the `grad_norm` parameter of `GradientDescentTrainer` is now `Union[float, bool]`, with a default value of `False`. `False` means gradients are not rescaled and the gradient norm is never even calculated. `True` means the gradients are still not rescaled but the gradient norm is calculated and passed on to callbacks. A `float` value means gradients are rescaled (see the sketch below).
- `TensorCache` now supports more concurrent readers and writers.
- We no longer log parameter statistics to tensorboard or wandb by default.
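The three `grad_norm` modes above, restated as a small sketch of the values you would pass to `GradientDescentTrainer` (or put under the "grad_norm" key of the trainer config):

```python
# Sketch: the three grad_norm modes of GradientDescentTrainer.
trainer_fragments = [
    {"grad_norm": False},  # default: gradients not rescaled, norm never computed
    {"grad_norm": True},   # gradients not rescaled, but norm computed and passed to callbacks
    {"grad_norm": 5.0},    # gradients rescaled so their norm does not exceed 5.0
]
```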
Commits
48af9d3 Multiple datasets and output files support for the evaluate command (#5340)
60213cd Tiny tango tweaks (#5383)
2895021 improve signal handling and worker cleanup (#5378)
b41cb3e Fix distributed loss (#5381)
6355f07 Fix Checkpointer cleaner regex on Windows (#5361)
27da04c Dataset remix (#5372)
75af38e Create Vocabulary from both pretrained transformers and instances (#5368)
5dc80a6 Adds a dataset that can be read and written lazily (#5344)
01e8a35 Improved Documentation For Learning Rate Schedulers (#5365)
8370cfa skip loading t5-base in CI (#5371)
13de38d Log batch metrics (#5362)
1f5c6e5 Use our own base images to build allennlp Docker images (#5366)
bffdbfd Bugfix: initializing all tensors and parameters of the `ConditionalRandomField` model on the proper device (#5335)
d45a2da Make sure that all attention works the same (#5360)
c1edaef Update google-cloud-storage requirement (#5357)
524244b Update wandb requirement from <0.12.0,>=0.10.0 to >=0.10.0,<0.13.0 (#5356)
90bf33b small fixes for tango (#5350)
2e11a15 tick version for nightly releases
311f110 Tango (#5162)
1df2e51 Bump fairscale from 0.3.8 to 0.3.9 (#5337)
b72bbfc fix constraint bug in beam search, clean up tests (#5328)
ec3e294 Create CITATION.cff (#5336)
8714aa0 This is a desperate attempt to make TensorCache a little more stable (#5334)
fd429b2 Update transformers requirement from <4.9,>=4.1 to >=4.1,<4.10 (#5326)
1b5ef3a Update spacy requirement from <3.1,>=2.1.0 to >=2.1.0,<3.2 (#5305)
1f20513 TextFieldTensor in multitask models (#5331)
76f2487 set tqdm lock when new workers are spawned (#5330)
67add9d Fix `ConfigurationError` deserialization (#5319)
42d8529 allow TransformerTextField to take input directly from HF tokenizer (#5329)
64043ac Bump black from 21.6b0 to 21.7b0 (#5320)
3275055 Update mkdocs-material requirement from <7.2.0,>=5.5.0 to >=5.5.0,<7.3.0 (#5327)
5b1da90 Update links in initializers documentation (#5317)
ca656fc FairScale integration (#5242)
v2.6.0
What's new
Added 🎉
- Added `on_backward` training callback which allows for control over backpropagation and gradient manipulation.
- Added `AdversarialBiasMitigator`, a Model wrapper to adversarially mitigate biases in predictions produced by a pretrained model for a downstream task.
- Added `which_loss` parameter to `ensure_model_can_train_save_and_load` in `ModelTestCase` to specify which loss to test.
- Added `**kwargs` to `Predictor.from_path()`. These keyword arguments will be passed on to the `Predictor`'s constructor (see the sketch below).
- The activation layer in the transformer toolkit can now be queried for its output dimension.
- `TransformerEmbeddings` now takes, but ignores, a parameter for the attention mask. This is needed for compatibility with some other modules that get called the same way and use the mask.
- `TransformerPooler` can now be instantiated from a pretrained transformer module, just like the other modules in the transformer toolkit.
- Added `TransformerTextField`, for cases where you don't care about AllenNLP's advanced text handling capabilities.
- Added `TransformerModule._post_load_pretrained_state_dict_hook()` method. Can be used to modify `missing_keys` and `unexpected_keys` after loading a pretrained state dictionary. This is useful when tying weights, for example.
- Added an end-to-end test for the Transformer Toolkit.
- Added `vocab` argument to `BeamSearch`, which is passed to each constraint in `constraints` (if provided).
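A hedged sketch of the `Predictor.from_path()` pass-through: the archive path is a placeholder, and `language` is used only as an example of a constructor parameter that some predictors (such as the spaCy-backed sentence tagger) accept.

```python
# Sketch: extra keyword arguments are forwarded to the Predictor's constructor.
from allennlp.predictors import Predictor

predictor = Predictor.from_path(
    "/path/to/model.tar.gz",          # placeholder archive path
    predictor_name="sentence_tagger",
    language="en_core_web_sm",        # forwarded via **kwargs to the constructor
)
```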
Fixed ✅
- Fixed missing device mapping in the `allennlp.modules.conditional_random_field.py` file.
- Fixed broken link in `allennlp.fairness.fairness_metrics.Separation` docs.
- Ensured all `allennlp` submodules are imported with `allennlp.common.plugins.import_plugins()`.
- Fixed `IndexOutOfBoundsException` in `MultiOptimizer` when checking if optimizer received any parameters.
- Removed confusing zero mask from VilBERT.
- Ensured `ensure_model_can_train_save_and_load` is consistently random.
- Fixed weight tying logic in the `T5` transformer module. Previously input/output embeddings were always tied. Now this is optional, and the default behavior is taken from the `config.tie_word_embeddings` value when instantiating `from_pretrained_module()`.
- Implemented slightly faster label smoothing.
- Fixed the docs for `PytorchTransformerWrapper`.
- Fixed recovering training jobs with models that expect `get_metrics()` to not be called until they have seen at least one batch.
- Made the Transformer Toolkit compatible with transformers that don't start their positional embeddings at 0.
- Weights & Biases training callback ("wandb") now works when resuming training jobs.
Changed ⚠️
- Changed behavior of `MultiOptimizer` so that while a default optimizer is still required, an error is not thrown if the default optimizer receives no parameters.
- Made the epsilon parameter for the layer normalization in token embeddings configurable.
Removed 👋
- Removed `TransformerModule._tied_weights`. Weights should now just be tied directly in the `__init__()` method. You can also override `TransformerModule._post_load_pretrained_state_dict_hook()` to remove keys associated with tied weights from `missing_keys` after loading a pretrained state dictionary.
Commits
ef5400d make W&B callback resumable (#5312)
9629340 Update google-cloud-storage requirement (#5309)
f8fad9f Provide vocab as param to constraints (#5321)
56e1f49 Fix training Conditional Random Fields on GPU (#5313) (#5315)
3c1ac03 Update wandb requirement from <0.11.0,>=0.10.0 to >=0.10.0,<0.12.0 (#5316)
7d4a672 Transformer Toolkit fixes (#5303)
aaa816f Faster label smoothing (#5294)
436c52d Docs update for `PytorchTransformerWrapper` (#5295)
3d92ac4 Update google-cloud-storage requirement (#5296)
5378533 Fixes recovering when the model expects metrics to be ready (#5293)
7428155 ensure torch always up-to-date in CI (#5286)
3f307ee Update README.md (#5288)
672485f only run CHANGELOG check when source files are modified (#5287)
c6865d7 use smaller snapshot for HFHub integration test
ad54d48 Bump mypy from 0.812 to 0.910 (#5283)
42d96df typo: missing "if" in `drop_last` doc (#5284)
a246e27 TransformerTextField (#5280)
82053a9 Improve weight tying logic in transformer module (#5282)
c936da9 Update transformers requirement from <4.8,>=4.1 to >=4.1,<4.9 (#5281)
e8f816d Update google-cloud-storage requirement (#5277)
86504e6 Making model test case consistently random (#5278)
5a7844b add kwargs to Predictor.from_path() (#5275)
8ad562e Update transformers requirement from <4.7,>=4.1 to >=4.1,<4.8 (#5273)
c8b8ed3 Transformer toolkit updates (#5270)
6af9069 update Python environment setup in GitHub Actions (#5272)
f1f51fc Adversarial bias mitigation (#5269)
af101d6 Removes confusing zero mask from VilBERT (#5264)
a1d36e6 Update torchvision requirement from <0.10.0,>=0.8.1 to >=0.8.1,<0.11.0 (#5266)
e5468d9 Bump black from 21.5b2 to 21.6b0 (#5255)
b37686f Update torch requirement from <1.9.0,>=1.6.0 to >=1.6.0,<1.10.0 (#5267)
5da5b5b Upload code coverage reports from different jobs, other CI improvements (#5257)
a6cfb12 added `on_backward` trainer callback (#5249)
8db45e8 Ensure all relevant allennlp submodules are imported with `import_plugins()` (#5246)
57df0e3 [Docs] Fixes broken link in Fairness_Metrics (#5245)
154f75d Bump black from 21.5b1 to 21.5b2 (#5236)
7a5106d tick version for nightly release
v2.5.0
🆕 AllenNLP v2.5.0 comes with a few big new features and improvements 🆕
There is a whole new module, `allennlp.fairness`, that contains implementations of fairness metrics, bias metrics, and bias mitigation tools for your models, thanks to @ArjunSubramonian. For a great introduction, check out the corresponding chapter of the guide: https://guide.allennlp.org/fairness.
Another major addition is the `allennlp.confidence_checks.task_checklists` submodule, thanks to @AkshitaB, which provides an automated way to run behavioral tests of your models using the `checklist` library.
`BeamSearch` also has several important new features, including an easy way to add arbitrary constraints, thanks to @danieldeutsch.
See below for a comprehensive list of updates 👇
What's new
Added 🎉
- Added `TaskSuite` base class and command line functionality for running `checklist` test suites, along with implementations for `SentimentAnalysisSuite`, `QuestionAnsweringSuite`, and `TextualEntailmentSuite`. These can be found in the `allennlp.confidence_checks.task_checklists` module.
- Added `BiasMitigatorApplicator`, which wraps any Model and mitigates biases by finetuning on a downstream task.
- Added `allennlp diff` command to compute a diff on model checkpoints, analogous to what `git diff` does on two files.
- Meta data defined by the class `allennlp.common.meta.Meta` is now saved in the serialization directory and archive file when training models from the command line. This is also now part of the `Archive` named tuple that's returned from `load_archive()`.
- Added `nn.util.distributed_device()` helper function.
- Added `allennlp.nn.util.load_state_dict` helper function.
- Added a way to avoid downloading and loading pretrained weights in modules that wrap transformers such as the `PretrainedTransformerEmbedder` and `PretrainedTransformerMismatchedEmbedder`. You can do this by setting the parameter `load_weights` to `False`. See PR #5172 for more details.
- Added `SpanExtractorWithSpanWidthEmbedding`, putting specific span embedding computations into the `_embed_spans` method and leaving the common code in `SpanExtractorWithSpanWidthEmbedding` to unify the arguments, and modified `BidirectionalEndpointSpanExtractor`, `EndpointSpanExtractor` and `SelfAttentiveSpanExtractor` accordingly. Now, `SelfAttentiveSpanExtractor` can also embed span widths.
- Added a `min_steps` parameter to `BeamSearch` to set a minimum length for the predicted sequences (see the sketch after this list).
- Added the `FinalSequenceScorer` abstraction to calculate the final scores of the generated sequences in `BeamSearch`.
- Added `shuffle` argument to `BucketBatchSampler` which allows for disabling shuffling.
- Added `allennlp.modules.transformer.attention_module` which contains a generalized `AttentionModule`. `SelfAttention` and `T5Attention` both inherit from this.
- Added a `Constraint` abstract class to `BeamSearch`, which allows for incorporating constraints on the predictions found by `BeamSearch`, along with a `RepeatedNGramBlockingConstraint` constraint implementation, which allows for preventing repeated n-grams in the output from `BeamSearch`.
- Added `DataCollator` for dynamic operations for each batch.
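A hedged sketch combining the new `min_steps` parameter and the `Constraint` mechanism. The constructor details (e.g. the `ngram_size` argument of `RepeatedNGramBlockingConstraint`) are assumptions based on the descriptions above, not taken verbatim from the release notes.

```python
# Sketch: BeamSearch with a minimum sequence length and a repeated-n-gram constraint.
from allennlp.nn.beam_search import BeamSearch, RepeatedNGramBlockingConstraint

beam_search = BeamSearch(
    end_index=2,      # index of the end-of-sequence symbol in the target vocabulary
    max_steps=50,
    beam_size=5,
    min_steps=5,      # new: predicted sequences must be at least 5 steps long
    constraints=[RepeatedNGramBlockingConstraint(ngram_size=3)],  # new: block repeated 3-grams
)
```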
Changed ⚠️
- Use `dist_reduce_sum` in distributed metrics.
- Allow Google Cloud Storage paths in `cached_path` ("gs://...") (see the sketch below).
- Renamed `nn.util.load_state_dict()` to `read_state_dict` to avoid confusion with `torch.nn.Module.load_state_dict()`.
- `TransformerModule.from_pretrained_module` now only accepts a pretrained model ID (e.g. "bert-base-cased") instead of an actual `torch.nn.Module`. Other parameters to this method have changed as well.
- Print the first batch to the console by default.
- Renamed `sanity_checks` to `confidence_checks` (`sanity_checks` is deprecated and will be removed in AllenNLP 3.0).
- Trainer callbacks can now store and restore state in case a training run gets interrupted.
- VilBERT backbone now rolls and unrolls extra dimensions to handle input with > 3 dimensions.
- `BeamSearch` is now a `Registrable` class.
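A small sketch of the Google Cloud Storage support in `cached_path`; the bucket and object name are placeholders.

```python
# Sketch: cached_path can now resolve gs:// URLs to a local cached copy.
from allennlp.common.file_utils import cached_path

local_copy = cached_path("gs://my-bucket/checkpoints/model.tar.gz")  # placeholder URL
print(local_copy)  # local filesystem path of the downloaded, cached file
```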
Fixed ✅
- When `PretrainedTransformerIndexer` folds long sequences, it no longer loses the information from token type ids.
- Fixed documentation for `GradientDescentTrainer.cuda_device`.
- Re-starting a training run from a checkpoint in the middle of an epoch now works correctly.
- When using the "moving average" weights smoothing feature of the trainer, training checkpoints would also get smoothed, with strange results for resuming a training job. This has been fixed.
- When re-starting an interrupted training job, the trainer will now read out the data loader even for epochs and batches that can be skipped. We do this to try to get any random number generators used by the reader or data loader into the same state as they were the first time the training job ran.
- Fixed the potential for a race condition with `cached_path()` when extracting archives, although the race condition is still possible if used with `force_extract=True`.
- Fixed `wandb` callback to work in distributed training.
- Fixed `tqdm` logging into multiple files with `allennlp-optuna`.
Commits
b92fd9a Contextualized bias mitigation (#5176)
aa52a9a Checklist fixes (#5239)
6206797 Fix tqdm logging into multiple files with allennlp-optuna (#5235)
b0aa1d4 Generalize T5 modules (#5166)
5b111d0 tick version for nightly release
39d7e5a Make BeamSearch Registrable (#5231)
c014232 Add constraints to beam search (#5216)
98dae7f Emergency fix. I forgot to take this out.
c5bff8b Fixes Checkpointing (#5220)
3d5799d Roll backbone (#5229)
babc450 Added `DataCollator` for dynamic operations for each batch. (#5221)
d97ed40 Bump checklist from 0.0.10 to 0.0.11 (#5222)
12155c4 fix race condition when extracting files with cached_path (#5227)
d662977 cancel redundant GH Actions workflows (#5226)
2d8f390 Fix W&B callback for distributed training (#5223)
59df2ad Update nr-interface requirement from <0.0.4 to <0.0.6 (#5213)
3e1b553 Bump black from 20.8b1 to 21.5b1 (#5195)
d2840cb save meta data with model archives (#5209)
bd941c6 added shuffle disable option in BucketBatchSampler (#5212)
3585c9f Implementing abstraction to score final sequences in `BeamSearch` (#5208)
79d16af Add a `min_steps` parameter to `BeamSearch` (#5207)
cf113d7 Changes and improvements to how we initialize transformer modules from pretrained models (#5200)
cccb35d Rename sanity_checks to confidence_checks (#5201)
db8ff67 Update transformers requirement from <4.6,>=4.1 to >=4.1,<4.7 (#5199)
fd5c9e4 Bias Metrics (#5139)
d9b19b6 Bias Mitigation and Direction Methods (#5130)
7473737 add diff command (#5109)
d85c5c3 Explicitly pass serialization directory and local rank to trainer in train command (#5180)
96c3caf fix nltk downloads in install (#5189)
b1b455a improve contributing guide / PR template (#5185)
7a260da fix cuda_device docs (#5188)
0bf590d Update Makefile (#5183)
3335700 Default print first batch (#5175)
b533733 Refactor span extractors and unify forward. (#5160)
01b232f Allow google cloud storage locations for cached_path (#5173)
eb2ae30 Update README.md (#5165)
55efa68 fix dataclasses import (#5169)
a463e0e Add way of skipping pretrained weights download (#5172)
c71bb46 improve err msg for PolynomialDecay LR scheduler (#5143)
530dae4 Simplify metrics (#5154)
12f5b0f Run some slow tests on the self-hosted runner (#5161)
9091580 Fixes token type ids for folded sequences (#5149)
10400e0 Run checklist suites in AllenNLP (#5065)
d11359e make dist_reduce_sum work for tensors (#5147)
9184fbc Fixes Backbone / Model MRO inconsistency (#5148)