v0.0.7: Rewrite SBERT loading; reformatting examples; solved logging bug;

@kwang2049 kwang2049 released this 17 Dec 16:07
· 34 commits to main since this release
2cf268c

Rewrite SBERT loading

Previously, GPL loaded the starting checkpoint (--base_ckpt) by constructing the SBERT model from scratch. This approach lost some of the checkpoint's information (e.g. pooling and max_seq_length), so users had to specify these settings carefully themselves.

We have now added a method called load_sbert. It uses SentenceTransformer(base_ckpt) to load the checkpoint directly and performs some checks and assertions. Loading from a Huggingface-format checkpoint (e.g. "distilbert-base-uncased") is still possible in many cases, as before, but we recommend loading from an SBERT-format checkpoint where possible, since this makes it less likely that the starting checkpoint is misused.
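As a rough illustration of the idea (not the actual load_sbert implementation), loading directly and then comparing the restored settings against what the caller expects could look like the sketch below. The names check_sbert and load_checked are hypothetical, and the example assumes the sentence-transformers package, whose SentenceTransformer exposes max_seq_length and whose Pooling module exposes get_pooling_mode_str().

```python
def check_sbert(model, max_seq_length=None, pooling=None):
    """Compare a loaded model's settings with the caller's expectations.

    Hypothetical helper: `model` is expected to behave like a
    SentenceTransformer (iterable over its modules, with `max_seq_length`).
    """
    if max_seq_length is not None and model.max_seq_length != max_seq_length:
        raise ValueError(
            f"checkpoint has max_seq_length={model.max_seq_length}, "
            f"expected {max_seq_length}"
        )
    if pooling is not None:
        # Pooling modules are the ones that can report their pooling mode.
        modes = [
            module.get_pooling_mode_str()
            for module in model
            if hasattr(module, "get_pooling_mode_str")
        ]
        if not modes:
            raise ValueError("checkpoint has no pooling module")
        if modes[-1] != pooling:
            raise ValueError(f"checkpoint uses {modes[-1]} pooling, expected {pooling}")
    return model


def load_checked(base_ckpt, max_seq_length=None, pooling=None):
    """Load an SBERT-format checkpoint directly, then verify its settings."""
    # Imported here so the checking logic can be used without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(base_ckpt)
    return check_sbert(model, max_seq_length=max_seq_length, pooling=pooling)
```

Because the pooling mode and max_seq_length are read back from the checkpoint rather than supplied again by hand, a mismatch surfaces as an error at load time instead of silently changing training behaviour.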

Reformatting examples

In some cases, a Huggingface-format checkpoint cannot be loaded directly by SBERT, e.g. "facebook/dpr-question_encoder-single-nq-base". This is because:

  1. These checkpoints are in Huggingface-format, not SBERT-format;
  2. For Huggingface-format checkpoints, SBERT only works when the last layer is a Transformer layer, i.e. the outputs must contain hidden states with shape (batch_size, sequence_length, hidden_dimension).
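To see why the shape in point 2 matters: SBERT places a pooling step on top of the transformer output, and that step reduces per-token hidden states to one vector per input. The sketch below shows the reduction with NumPy. The function names are illustrative only and are not part of GPL or SBERT.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average hidden states over real (non-padding) tokens.

    token_embeddings: (batch_size, sequence_length, hidden_dimension)
    attention_mask:   (batch_size, sequence_length), 1 for real tokens
    returns:          (batch_size, hidden_dimension)
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for fully padded rows.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


def cls_pool(token_embeddings):
    """Use the first token's hidden state as the sentence vector."""
    return token_embeddings[:, 0, :]


hidden = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])
sentence_vectors = mean_pool(hidden, mask)  # shape (2, 4)
```

A checkpoint whose final output is already a single vector per input has no sequence_length axis left to pool over, so it cannot be plugged in as the transformer part without reformatting.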

To use these checkpoints, one needs to reformat them into SBERT-format. We have provided two examples/templates in the new toolkit source file, gpl/toolkit/reformat.py. Please refer to its readme here.

Solved logging bug

Previously, logging in GPL was overridden by other loggers, so the output was not formatted as intended. We have fixed this by configuring the root logger. The new format shows many useful details:

fmt='[%(asctime)s] %(levelname)s [%(name)s.%(funcName)s:%(lineno)d] %(message)s'
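One way to apply such a format at the root logger, using only the standard logging module, is sketched below. This is an assumption about the general approach and not necessarily how GPL implements it. setup_root_logger is a hypothetical name.

```python
import logging

LOG_FORMAT = "[%(asctime)s] %(levelname)s [%(name)s.%(funcName)s:%(lineno)d] %(message)s"


def setup_root_logger(level=logging.INFO):
    """Install a single handler with LOG_FORMAT on the root logger."""
    root = logging.getLogger()
    # Remove handlers that other libraries may have attached earlier,
    # so their formatting no longer takes precedence.
    for handler in list(root.handlers):
        root.removeHandler(handler)
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    root.addHandler(handler)
    root.setLevel(level)
    return root


setup_root_logger()
logging.getLogger("gpl.example").info("logging configured")
```

Since module-level loggers propagate their records to the root logger by default, configuring the root once is enough for every module's messages to share the same format.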