This repository contains finetuning code and checkpoints for ElasticBERT.
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
We recommend using Anaconda to set up the environment for the experiments:
```bash
conda create -n elasticbert python=3.8.8
conda activate elasticbert
conda install pytorch==1.8.1 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install -r requirements.txt
```
We provide the pre-trained weights of ElasticBERT-BASE and ElasticBERT-LARGE, which can be used directly with Huggingface-Transformers.

- ElasticBERT-BASE: 12 layers, 12 heads, and 768 hidden size.
- ElasticBERT-LARGE: 24 layers, 16 heads, and 1024 hidden size.
- ElasticBERT-Chinese-BASE: ElasticBERT-Chinese has been uploaded to the Huggingface model hub; feel free to download and use it.
The pre-trained weights can be downloaded here.
| Model | MODEL_NAME |
|---|---|
| ElasticBERT-BASE | fnlp/elasticbert-base |
| ElasticBERT-LARGE | fnlp/elasticbert-large |
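As a quick-start sketch, the weights can be loaded roughly as follows. The `models.*` import paths are assumptions about this repo's layout; adjust them to your checkout.

```python
import torch
from transformers import BertTokenizer

# Assumption: the ElasticBERT model classes live in this repo's `models`
# package; the exact module paths may differ in your checkout.
from models.configuration_elasticbert import ElasticBertConfig
from models.modeling_elasticbert import ElasticBertForSequenceClassification

MODEL_NAME = "fnlp/elasticbert-base"  # or "fnlp/elasticbert-large"

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
config = ElasticBertConfig.from_pretrained(MODEL_NAME, num_labels=2)
model = ElasticBertForSequenceClassification.from_pretrained(MODEL_NAME, config=config)

inputs = tokenizer("ElasticBERT is a multi-exit pre-trained model.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
```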
The GLUE task datasets can be downloaded from the GLUE leaderboard.
The ELUE task datasets can be downloaded from the ELUE leaderboard.
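As an aside, the GLUE tasks can also be fetched programmatically via the Huggingface datasets library. This is only a convenience sketch; the leaderboard downloads remain the source used here, and ELUE is not assumed to be on the hub.

```python
from datasets import load_dataset

# Convenience alternative for GLUE only (here SST-2); ELUE tasks
# should still be downloaded from the ELUE leaderboard.
sst2 = load_dataset("glue", "sst2")
print(sst2["train"][0])
```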
We provide the finetuning code for both GLUE tasks and ELUE tasks in the static usage of ElasticBERT; a conceptual sketch of the static setting follows the commands below.
For GLUE:
```bash
cd finetune-static
bash finetune_glue.sh
```
For ELUE:
```bash
cd finetune-static
bash finetune_elue.sh
```
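Conceptually, static usage truncates ElasticBERT to its first k layers and finetunes that sub-model as an ordinary encoder. A hypothetical sketch of the idea (the class names and the `num_hidden_layers` override are assumptions; the shell scripts above are the supported entry points):

```python
# Hypothetical sketch of the static setting: the ElasticBert* classes and
# the `num_hidden_layers` override are assumptions about this repo; use
# finetune_glue.sh / finetune_elue.sh for real runs.
from models.configuration_elasticbert import ElasticBertConfig
from models.modeling_elasticbert import ElasticBertForSequenceClassification

k = 6  # number of layers to keep for the static sub-model
config = ElasticBertConfig.from_pretrained(
    "fnlp/elasticbert-base", num_hidden_layers=k, num_labels=2
)
model = ElasticBertForSequenceClassification.from_pretrained(
    "fnlp/elasticbert-base", config=config
)
# `model` can now be finetuned like any k-layer BERT-style classifier.
```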
We provide finetuning code to apply two kinds of early exiting methods to ElasticBERT.
For early exit using entropy criterion:
```bash
cd finetune-dynamic
bash finetune_elue_entropy.sh
```
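The entropy criterion halts inference at the first exit classifier that is confident enough: if the entropy of its softmax distribution falls below a threshold, the prediction is emitted from that layer. A minimal sketch of the decision rule (the threshold value and the per-exit logits interface are illustrative, not the repo's exact API):

```python
import torch

def entropy_early_exit(exit_logits, threshold=0.2):
    """Return (layer_index, probs) for the first exit whose prediction
    entropy is below `threshold`; fall back to the last exit otherwise.

    exit_logits: list of [num_labels] tensors, one per exit classifier.
    """
    for i, logits in enumerate(exit_logits):
        probs = torch.softmax(logits, dim=-1)
        entropy = -(probs * torch.log(probs + 1e-12)).sum()
        if entropy < threshold:
            return i, probs  # confident enough: exit early
    return len(exit_logits) - 1, torch.softmax(exit_logits[-1], dim=-1)
```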
For early exit using patience criterion:
```bash
cd finetune-dynamic
bash finetune_elue_patience.sh
```
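The patience criterion (in the style of PABEE) exits once several consecutive exit classifiers agree on the same label, regardless of confidence. A minimal sketch of the rule (the `patience` value and logits interface are illustrative):

```python
import torch

def patience_early_exit(exit_logits, patience=3):
    """Return the layer index to exit at: stop once `patience` consecutive
    exit classifiers predict the same label (PABEE-style rule)."""
    streak, last_pred = 0, None
    for i, logits in enumerate(exit_logits):
        pred = int(torch.argmax(logits, dim=-1))
        streak = streak + 1 if pred == last_pred else 1
        last_pred = pred
        if streak >= patience:
            return i  # predictions stable across layers: exit early
    return len(exit_logits) - 1  # no early exit triggered
```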
Please see our paper for more details!
If you have any problems, raise an issue or contact Xiangyang Liu.
If you find this repo helpful, we'd appreciate it if you could cite the corresponding paper:
```bibtex
@inproceedings{liu-etal-2022-towards-efficient,
    title = "Towards Efficient {NLP}: A Standard Evaluation and A Strong Baseline",
    author = "Liu, Xiangyang and
      Sun, Tianxiang and
      He, Junliang and
      Wu, Jiawen and
      Wu, Lingling and
      Zhang, Xinyu and
      Jiang, Hao and
      Cao, Zhao and
      Huang, Xuanjing and
      Qiu, Xipeng",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.naacl-main.240",
    pages = "3288--3303",
}
```